postgresql

Commit Graph

Author	SHA1	Message	Date
Bruce Momjian	6b82f78ff9	Again move function where we set effective_cache_size's default	2013-10-08 23:12:45 -04:00
Bruce Momjian	cbafd6618a	Move new effective_cache_size function Previously set_default_effective_cache_size() could not handle fork, non-fork, and bootstrap cases.	2013-10-08 22:41:23 -04:00
Robert Haas	568d4138c6	Use an MVCC snapshot, rather than SnapshotNow, for catalog scans. SnapshotNow scans have the undesirable property that, in the face of concurrent updates, the scan can fail to see either the old or the new versions of the row. In many cases, we work around this by requiring DDL operations to hold AccessExclusiveLock on the object being modified; in some cases, the existing locking is inadequate and random failures occur as a result. This commit doesn't change anything related to locking, but will hopefully pave the way to allowing lock strength reductions in the future. The major issue has held us back from making this change in the past is that taking an MVCC snapshot is significantly more expensive than using a static special snapshot such as SnapshotNow. However, testing of various worst-case scenarios reveals that this problem is not severe except under fairly extreme workloads. To mitigate those problems, we avoid retaking the MVCC snapshot for each new scan; instead, we take a new snapshot only when invalidation messages have been processed. The catcache machinery already requires that invalidation messages be sent before releasing the related heavyweight lock; else other backends might rely on locally-cached data rather than scanning the catalog at all. Thus, making snapshot reuse dependent on the same guarantees shouldn't break anything that wasn't already subtly broken. Patch by me. Review by Michael Paquier and Andres Freund.	2013-07-02 09:47:01 -04:00
Heikki Linnakangas	15386281a6	Put back allow_system_table_mods check in heap_create(). This reverts commit `a475c60367`. Erik Rijkers reported back in January 2013 that after the patch, if you do "pg_dump -t myschema.mytable" to dump a single table, and restore that in a database where myschema does not exist, the table is silently created in pg_catalog instead. That is because pg_dump uses "SET search_path=myschema, pg_catalog" to set schema the table is created in. While allow_system_table_mods is not a very elegant solution to this, we can't leave it as it is, so for now, revert it back to the way it was previously.	2013-06-03 17:22:31 +03:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Simon Riggs	443951748c	Record data_checksum_version in control file. The value is not used anywhere in code, but will allow future changes to the checksum version should that become necessary in the future.	2013-04-30 12:27:12 +01:00
Simon Riggs	96ef3b8ff1	Allow I/O reliability checks using 16-bit checksums Checksums are set immediately prior to flush out of shared buffers and checked when pages are read in again. Hint bit setting will require full page write when block is dirtied, which causes various infrastructure changes. Extensive comments, docs and README. WARNING message thrown if checksum fails on non-all zeroes page; ERROR thrown but can be disabled with ignore_checksum_failure = on. Feature enabled by an initdb option, since transition from option off to option on is long and complex and has not yet been implemented. Default is not to use checksums. Checksum used is WAL CRC-32 truncated to 16-bits. Simon Riggs, Jeff Davis, Greg Smith Wide input and assistance from many community members. Thank you.	2013-03-22 13:54:07 +00:00
Tom Lane	b853eb9718	Improve handling of ereport(ERROR) and elog(ERROR). In commit `71450d7fd6`, we added code to inform suitably-intelligent compilers that ereport() doesn't return if the elevel is ERROR or higher. This patch extends that to elog(), and also fixes a double-evaluation hazard that the previous commit created in ereport(), as well as reducing the emitted code size. The elog() improvement requires the compiler to support __VA_ARGS__, which should be available in just about anything nowadays since it's required by C99. But our minimum language baseline is still C89, so add a configure test for that. The previous commit assumed that ereport's elevel could be evaluated twice, which isn't terribly safe --- there are already counterexamples in xlog.c. On compilers that have __builtin_constant_p, we can use that to protect the second test, since there's no possible optimization gain if the compiler doesn't know the value of elevel. Otherwise, use a local variable inside the macros to prevent double evaluation. The local-variable solution is inferior because (a) it leads to useless code being emitted when elevel isn't constant, and (b) it increases the optimization level needed for the compiler to recognize that subsequent code is unreachable. But it seems better than not teaching non-gcc compilers about unreachability at all. Lastly, if the compiler has __builtin_unreachable(), we can use that instead of abort(), resulting in a noticeable code savings since no function call is actually emitted. However, it seems wise to do this only in non-assert builds. In an assert build, continue to use abort(), so that the behavior will be predictable and debuggable if the "impossible" happens. These changes involve making the ereport and elog macros emit do-while statement blocks not just expressions, which forces small changes in a few call sites. Andres Freund, Tom Lane, Heikki Linnakangas	2013-01-13 18:40:09 -05:00
Alvaro Herrera	84f6fb81b8	Fix IsUnderPostmaster/EXEC_BACKEND confusion	2013-01-02 18:39:20 -03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Alvaro Herrera	f4c4335a4a	Add context info to OAT_POST_CREATE security hook ... and have sepgsql use it to determine whether to check permissions during certain operations. Indexes that are being created as a result of REINDEX, for instance, do not need to have their permissions checked; they were already checked when the index was created. Author: KaiGai Kohei, slightly revised by me	2012-10-23 18:24:24 -03:00
Bruce Momjian	49ec613201	In our source code, make a copy of getopt's 'optarg' string arguments, rather than just storing a pointer.	2012-10-12 13:35:43 -04:00
Peter Eisentraut	8521d13194	Refactor flex and bison make rules Numerous flex and bison make rules have appeared in the source tree over time, and they are all virtually identical, so we can replace them by pattern rules with some variables for customization. Users of pgxs will also be able to benefit from this.	2012-10-11 06:57:04 -04:00
Alvaro Herrera	c219d9b0a5	Split tuple struct defs from htup.h to htup_details.h This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.	2012-08-30 16:52:35 -04:00
Tom Lane	4a9c30a8a1	Fix management of pendingOpsTable in auxiliary processes. mdinit() was misusing IsBootstrapProcessingMode() to decide whether to create an fsync pending-operations table in the current process. This led to creating a table not only in the startup and checkpointer processes as intended, but also in the bgwriter process, not to mention other auxiliary processes such as walwriter and walreceiver. Creation of the table in the bgwriter is fatal, because it absorbs fsync requests that should have gone to the checkpointer; instead they just sit in bgwriter local memory and are never acted on. So writes performed by the bgwriter were not being fsync'd which could result in data loss after an OS crash. I think there is no live bug with respect to walwriter and walreceiver because those never perform any writes of shared buffers; but the potential is there for future breakage in those processes too. To fix, make AuxiliaryProcessMain() export the current process's AuxProcType as a global variable, and then make mdinit() test directly for the types of aux process that should have a pendingOpsTable. Having done that, we might as well also get rid of the random bool flags such as am_walreceiver that some of the aux processes had grown. (Note that we could not have fixed the bug by examining those variables in mdinit(), because it's called from BaseInit() which is run by AuxiliaryProcessMain() before entering any of the process-type-specific code.) Back-patch to 9.2, where the problem was introduced by the split-up of bgwriter and checkpointer processes. The bogus pendingOpsTable exists in walwriter and walreceiver processes in earlier branches, but absent any evidence that it causes actual problems there, I'll leave the older branches alone.	2012-07-18 15:28:10 -04:00
Tom Lane	c92be3c059	Avoid pre-determining index names during CREATE TABLE LIKE parsing. Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE had to pre-assign names to indexes that had comments, because it made up an explicit CommentStmt command to apply the comment and so it had to know the name for the index. This creates bad interactions with other indexes, as shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't take any other indexes into account so it could choose a conflicting name. To fix, add a field to IndexStmt that allows it to carry a comment to be assigned to the new index. (This isn't a user-exposed feature of CREATE INDEX, only an internal option.) Now we don't need preassignment of index names in any situation. I also took the opportunity to refactor DefineIndex to accept the IndexStmt as such, rather than passing all its fields individually in a mile-long parameter list. Back-patch to 9.2, but no further, because it seems too dangerous to change IndexStmt or DefineIndex's API in released branches. The bug exists back to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given the lack of prior complaints we'll just let it go unfixed before 9.2.	2012-07-16 13:25:18 -04:00
Robert Haas	a475c60367	Remove misplaced sanity check from heap_create(). Even when allow_system_table_mods is not set, we allow creation of any type of SQL object in pg_catalog, except for relations. And you can get relations into pg_catalog, too, by initially creating them in some other schema and then moving them with ALTER .. SET SCHEMA. So this restriction, which prevents relations (only) from being created in pg_catalog directly, is fairly pointless. If we need a safety mechanism for this, it should be placed further upstream, so that it affects all SQL objects uniformly, and picks up both CREATE and SET SCHEMA. For now, just rip it out, per discussion with Tom Lane.	2012-06-14 09:58:53 -04:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Simon Riggs	9aceb6ab3c	Refactor xlog.c to create src/backend/postmaster/startup.c Startup process now has its own dedicated file, just like all other special/background processes. Reduces role and size of xlog.c	2011-11-02 14:25:01 +00:00
Simon Riggs	806a2aee37	Split work of bgwriter between 2 processes: bgwriter and checkpointer. bgwriter is now a much less important process, responsible for page cleaning duties only. checkpointer is now responsible for checkpoints and so has a key role in shutdown. Later patches will correct doc references to the now old idea that bgwriter performs checkpoints. Has beneficial effect on performance at high write rates, but mainly refactoring to more easily allow changes for power reduction by simplifying previously tortuous code around required to allow page cleaning and checkpointing to time slice in the same process. Patch by me, Review by Dickson Guedes	2011-11-01 17:14:47 +00:00
Tom Lane	ca4af308c3	Simplify handling of the timezone GUC by making initdb choose the default. We were doing some amazingly complicated things in order to avoid running the very expensive identify_system_timezone() procedure during GUC initialization. But there is an obvious fix for that, which is to do it once during initdb and have initdb install the system-specific default into postgresql.conf, as it already does for most other GUC variables that need system-environment-dependent defaults. This means that the timezone (and log_timezone) settings no longer have any magic behavior in the server. Per discussion.	2011-09-09 17:59:11 -04:00
Peter Eisentraut	e6d800981e	Correct ancient logic mistake in assertion Found by gcc -Wlogical-op	2011-09-06 23:05:02 +03:00
Tom Lane	1609797c25	Clean up the #include mess a little. walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.	2011-09-04 01:13:16 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Tom Lane	2e95f1f002	Add "%option warn" to all flex input files that lacked it. This is recommended in the flex manual, and there seems no good reason not to use it everywhere.	2011-08-25 13:55:57 -04:00
Robert Haas	367bc426a1	Avoid index rebuild for no-rewrite ALTER TABLE .. ALTER TYPE. Noah Misch. Review and minor cosmetic changes by me.	2011-07-18 11:04:43 -04:00
Alvaro Herrera	b93f5a5673	Move Trigger and TriggerDesc structs out of rel.h into a new reltrigger.h This lets us stop including rel.h into execnodes.h, which is a widely used header.	2011-07-04 14:35:58 -04:00
Peter Eisentraut	8a8fbe7e79	Capitalization fixes	2011-06-19 00:37:30 +03:00
Robert Haas	68ef051f5c	Refactor broken CREATE TABLE IF NOT EXISTS support. Per bug #5988, reported by Marko Tiikkaja, and further analyzed by Tom Lane, the previous coding was broken in several respects: even if the target table already existed, a subsequent CREATE TABLE IF NOT EXISTS might try to add additional constraints or sequences-for-serial specified in the new CREATE TABLE statement. In passing, this also fixes a minor information leak: it's no longer possible to figure out whether a schema to which you don't have CREATE access contains a sequence named like "x_y_seq" by attempting to create a table in that schema called "x" with a serial column called "y". Some more refactoring of this code in the future might be warranted, but that will need to wait for a later major release.	2011-04-25 16:55:11 -04:00
Tom Lane	8c19977e9c	Avoid changing an index's indcheckxmin horizon during REINDEX. There can never be a need to push the indcheckxmin horizon forward, since any HOT chains that are actually broken with respect to the index must pre-date its original creation. So we can just avoid changing pg_index altogether during a REINDEX operation. This offers a cleaner solution than my previous patch for the problem found a few days ago that we mustn't try to update pg_index while we are reindexing it. System catalog indexes will always be created with indcheckxmin = false during initdb, and with this modified code we should never try to change their pg_index entries. This avoids special-casing system catalogs as the former patch did, and should provide a performance benefit for many cases where REINDEX formerly caused an index to be considered unusable for a short time. Back-patch to 8.3 to cover all versions containing HOT. Note that this patch changes the API for index_build(), but I believe it is unlikely that any add-on code is calling that directly.	2011-04-19 18:50:56 -04:00
Tom Lane	0c9d9e8dd6	More collations cleanup, from trawling for missed collation assignments. Mostly cosmetic, though I did find that generateClonedIndexStmt failed to clone the index's collations.	2011-03-26 16:35:25 -04:00
Peter Eisentraut	414c5a2ea6	Per-column collation support This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch	2011-02-08 23:04:18 +02:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Robert Haas	5f7b58fad8	Generalize concept of temporary relations to "relation persistence". This commit replaces pg_class.relistemp with pg_class.relpersistence; and also modifies the RangeVar node type to carry relpersistence rather than istemp. It also removes removes rd_istemp from RelationData and instead performs the correct computation based on relpersistence. For clarity, we add three new macros: RelationNeedsWAL(), RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we can clarify the purpose of each check that previous depended on rd_istemp. This is intended as infrastructure for the upcoming unlogged tables patch, as well as for future possible work on global temporary tables.	2010-12-13 12:34:26 -05:00
Peter Eisentraut	fc946c39ae	Remove useless whitespace at end of lines	2010-11-23 22:34:55 +02:00
Magnus Hagander	fe9b36fd59	Convert cvsignore to gitignore, and add .gitignore for build targets.	2010-09-22 12:57:04 +02:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	303696c3b4	Install a data-type-based solution for protecting pg_get_expr(). Since the code underlying pg_get_expr() is not secure against malformed input, and can't practically be made so, we need to prevent miscreants from feeding arbitrary data to it. We can do this securely by declaring pg_get_expr() to take a new datatype "pg_node_tree" and declaring the system catalog columns that hold nodeToString output to be of that type. There is no way at SQL level to create a non-null value of type pg_node_tree. Since the backend-internal operations that fill those catalog columns operate below the SQL level, they are oblivious to the datatype relabeling and don't need any changes.	2010-09-03 01:34:55 +00:00
Robert Haas	a3b012b560	CREATE TABLE IF NOT EXISTS. Reviewed by Bernd Helmle.	2010-07-25 23:21:22 +00:00
Tom Lane	c670410e7f	Move the responsibility for calling StartupXLOG into InitPostgres, for those process types that go through InitPostgres; in particular, bootstrap and standalone-backend cases. This ensures that we have set up a PGPROC and done some other basic initialization steps (corresponding to the if (IsUnderPostmaster) block in AuxiliaryProcessMain) before we attempt to run WAL recovery in a standalone backend. As was discovered last September, this is necessary for some corner-case code paths during WAL recovery, particularly end-of-WAL cleanup. Moving the bootstrap case here too is not necessary for correctness, but it seems like a good idea since it reduces the number of distinct code paths.	2010-04-20 01:38:52 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Tom Lane	b9b8831ad6	Create a "relation mapping" infrastructure to support changing the relfilenodes of shared or nailed system catalogs. This has two key benefits: * The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs. * We no longer have to use an unsafe reindex-in-place approach for reindexing shared catalogs. CLUSTER on nailed catalogs now works too, although I left it disabled on shared catalogs because the resulting pg_index.indisclustered update would only be visible in one database. Since reindexing shared system catalogs is now fully transactional and crash-safe, the former special cases in REINDEX behavior have been removed; shared catalogs are treated the same as non-shared. This commit does not do anything about the recently-discussed problem of deadlocks between VACUUM FULL/CLUSTER on a system catalog and other concurrent queries; will address that in a separate patch. As a stopgap, parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid such failures during the regression tests.	2010-02-07 20:48:13 +00:00
Peter Eisentraut	e7b3349a8a	Type table feature This adds the CREATE TABLE name OF type command, per SQL standard.	2010-01-28 23:21:13 +00:00
Robert Haas	76a47c0e74	Replace ALTER TABLE ... SET STATISTICS DISTINCT with a more general mechanism. Attributes can now have options, just as relations and tablespaces do, and the reloptions code is used to parse, validate, and store them. For simplicity and because these options are not performance critical, we store them in a separate cache rather than the main relcache. Thanks to Alex Hunsaker for the review.	2010-01-22 16:40:19 +00:00
Heikki Linnakangas	32bc08b1d4	Rethink the way walreceiver is linked into the backend. Instead than shoving walreceiver as whole into a dynamically loaded module, split the libpq-specific parts of it into dynamically loaded module and keep the rest in the main backend binary. Although Tom fixed the Windows compilation problems with the old walreceiver module already, this is a cleaner division of labour and makes the code more readable. There's also the prospect of adding new transport methods as pluggable modules in the future, which this patch makes easier, though for now the API between libpqwalreceiver and walreceiver process should be considered private. The libpq-specific module is now in src/backend/replication/libpqwalreceiver, and the part linked with postgres binary is in src/backend/replication/walreceiver.c.	2010-01-20 09:16:24 +00:00
Heikki Linnakangas	40f908bdcd	Introduce Streaming Replication. This includes two new kinds of postmaster processes, walsenders and walreceiver. Walreceiver is responsible for connecting to the primary server and streaming WAL to disk, while walsender runs in the primary server and streams WAL from disk to the client. Documentation still needs work, but the basics are there. We will probably pull the replication section to a new chapter later on, as well as the sections describing file-based replication. But let's do that as a separate patch, so that it's easier to see what has been added/changed. This patch also adds a new section to the chapter about FE/BE protocol, documenting the protocol used by walsender/walreceivxer. Bump catalog version because of two new functions, pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for monitoring the progress of replication. Fujii Masao, with additional hacking by me	2010-01-15 09:19:10 +00:00
Tom Lane	daf5b0f297	Fix a few places where we needed -I. in CPPFLAGS to work properly in VPATH builds. We had this already in several places, but not all.	2010-01-05 03:56:52 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	cfc5008a51	Adjust naming of indexes and their columns per recent discussion. Index expression columns are now named after the FigureColname result for their expressions, rather than always being "pg_expression_N". Digits are appended to this name if needed to make the column name unique within the index. (That happens for regular columns too, thus fixing the old problem that CREATE INDEX fooi ON foo (f1, f1) fails. Before exclusion indexes there was no real reason to do such a thing, but now maybe there is.) Default names for indexes and associated constraints now include the column names of all their columns, not only the first one as in previous practice. (Of course, this will be truncated as needed to fit in NAMEDATALEN. Also, pkey indexes retain the historical behavior of not naming specific columns at all.) An example of the results: regression=# create table foo (f1 int, f2 text, regression(# exclude (f1 with =, lower(f2) with =)); NOTICE: CREATE TABLE / EXCLUDE will create implicit index "foo_f1_lower_exclusion" for table "foo" CREATE TABLE regression=# \d foo_f1_lower_exclusion Index "public.foo_f1_lower_exclusion" Column \| Type \| Definition --------+---------+------------ f1 \| integer \| f1 lower \| text \| lower(f2) btree, for table "public.foo"	2009-12-23 02:35:25 +00:00
Tom Lane	0cb65564e5	Add exclusion constraints, which generalize the concept of uniqueness to support any indexable commutative operator, not just equality. Two rows violate the exclusion constraint if "row1.col OP row2.col" is TRUE for each of the columns in the constraint. Jeff Davis, reviewed by Robert Haas	2009-12-07 05:22:23 +00:00
Tom Lane	249724cb01	Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust the privileges that will be applied to subsequently-created objects. Such adjustments are always per owning role, and can be restricted to objects created in particular schemas too. A notable benefit is that users can override the traditional default privilege settings, eg, the PUBLIC EXECUTE privilege traditionally granted by default for functions. Petr Jelinek	2009-10-05 19:24:49 +00:00
Tom Lane	12d8fae4cd	Simplify the bootstrap (BKI) code by getting rid of a useless table of all the strings seen during the bootstrap run. There might have been some actual point to doing that, many years ago, but as far as I can see the only value now is to conserve a bit of memory. Even if we cared about wasting a megabyte or so during the initdb run, it'd be far more effective to arrange to release memory at the end of each BKI command, instead of intentionally hanging onto strings that might never be used again. Not maintaining the table probably makes it faster too; but the main point of this patch is to get rid of a couple hundred lines of unnecessary and rather crufty code.	2009-09-27 01:32:11 +00:00
Tom Lane	4985635230	Extend the BKI infrastructure to allow system catalogs to be given hand-assigned rowtype OIDs, even when they are not "bootstrapped" catalogs that have handmade type rows in pg_type.h. Give pg_database such an OID. Restore the availability of C macros for the rowtype OIDs of the bootstrapped catalogs. (These macros are now in the individual catalogs' .h files, though, not in pg_type.h.) This commit doesn't do anything especially useful by itself, but it's necessary infrastructure for reverting some ill-considered changes in relcache.c.	2009-09-26 22:42:03 +00:00
Peter Eisentraut	234c7ce9f2	Derived files that are shipped in the distribution used to be built in the source directory even for out-of-tree builds. They are now alsl built in the build tree. This should be more convenient for certain developers' workflows, and shouldn't really break anything else.	2009-08-28 20:26:19 +00:00
Tom Lane	9072592946	Add ALTER TABLE ... ALTER COLUMN ... SET STATISTICS DISTINCT Robert Haas	2009-08-02 22:14:53 +00:00
Tom Lane	2487d872e0	Create a multiplexing structure for signals to Postgres child processes. This patch gets us out from under the Unix limitation of two user-defined signal types. We already had done something similar for signals directed to the postmaster process; this adds multiplexing for signals directed to backends and auxiliary processes (so long as they're connected to shared memory). As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2 for backends with use of the multiplexing mechanism. There are still some hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types, but getting rid of those doesn't seem interesting at the moment. Fujii Masao	2009-07-31 20:26:23 +00:00
Tom Lane	25d9bf2e3e	Support deferrable uniqueness constraints. The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed	2009-07-29 20:56:21 +00:00
Heikki Linnakangas	cdd46c7654	Start background writer during archive recovery. Background writer now performs its usual buffer cleaning duties during archive recovery, and it's responsible for performing restartpoints. This requires some changes in postmaster. When the startup process has done all the initialization and is ready to start WAL redo, it signals the postmaster to launch the background writer. The postmaster is signaled again when the point in recovery is reached where we know that the database is in consistent state. Postmaster isn't interested in that at the moment, but that's the point where we could let other backends in to perform read-only queries. The postmaster is signaled third time when the recovery has ended, so that postmaster knows that it's safe to start accepting connections. The startup process now traps SIGTERM, and performs a "clean" shutdown. If you do a fast shutdown during recovery, a shutdown restartpoint is performed, like a shutdown checkpoint, and postmaster kills the processes cleanly. You still have to continue the recovery at next startup, though. Currently, the background writer is only launched during archive recovery. We could launch it during crash recovery as well, but it seems better to keep that codepath as simple as possible, for the sake of robustness. And it couldn't do any restartpoints during crash recovery anyway, so it wouldn't be that useful. log_restartpoints is gone. Use log_checkpoints instead. This is yet to be documented. This whole operation is a pre-requisite for Hot Standby, but has some value of its own whether the hot standby patch makes 8.4 or not. Simon Riggs, with lots of modifications by me.	2009-02-18 15:58:41 +00:00
Tom Lane	3cb5d6580a	Support column-level privileges, as required by SQL standard. Stephen Frost, with help from KaiGai Kohei and others	2009-01-22 20:16:10 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Peter Eisentraut	a53536d031	Add %expect 0 to all parser input files to prevent conflicts slipping by.	2008-11-26 08:45:12 +00:00
Tom Lane	902d1cb35f	Remove all uses of the deprecated functions heap_formtuple, heap_modifytuple, and heap_deformtuple in favor of the newer functions heap_form_tuple et al (which do the same things but use bool control flags instead of arbitrary char values). Eliminate the former duplicate coding of these functions, reducing the deprecated functions to mere wrappers around the newer ones. We can't get rid of them entirely because add-on modules probably still contain many instances of the old coding style. Kris Jurka	2008-11-02 01:45:28 +00:00
Heikki Linnakangas	15c121b3ed	Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the free space information is stored in a dedicated FSM relation fork, with each relation (except for hash indexes; they don't use FSM). This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any trace of them from the backend, initdb, and documentation. Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also introduce a new variant of the get_raw_page(regclass, int4, int4) function in contrib/pageinspect that let's you to return pages from any relation fork, and a new fsm_page_contents() function to inspect the new FSM pages.	2008-09-30 10:52:14 +00:00
Tom Lane	fbb2b69c8f	Prevent memory leaks in our various bison parsers when an error occurs during parsing. Formerly the parser's stack was allocated with malloc and so wouldn't be reclaimed; this patch makes it use palloc instead, so that flushing the current context will reclaim the memory. Per Marko Kreen.	2008-09-02 20:37:55 +00:00
Tom Lane	b153c09209	Add a bunch of new error location reports to parse-analysis error messages. There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.	2008-09-01 20:42:46 +00:00
Peter Eisentraut	7c31742a07	Remove all traces that suggest that a non-Bison yacc might be supported, and change build system to use only Bison. Simplify build rules, make file names uniform. Don't build the token table header file where it is not needed.	2008-08-29 13:02:33 +00:00
Tom Lane	5f6f840e93	Reduce the alignment requirement of type "name" from int to char, and arrange to suppress zero-padding of "name" entries in indexes. The alignment change is unlikely to save any space, but it is really needed anyway to make the world safe for our widespread practice of passing plain old C strings to functions that are declared as taking Name. In the previous coding, the C compiler was entitled to assume that a Name pointer was word-aligned; but we were failing to guarantee that. I think the reason we'd not seen failures is that usually the only thing that gets done with such a pointer is strcmp(), which is hard to optimize in a way that exploits word-alignment. Still, some enterprising compiler guy will probably think of a way eventually, or we might change our code in a way that exposes more-obvious optimization opportunities. The padding change is accomplished in one-liner fashion by declaring the "name" index opclasses to use storage type "cstring" in pg_opclass.h. Normally btree and hash don't allow a nondefault storage type, because they don't have any provisions for converting the input datum to another type. However, because name and cstring are effectively the same thing except for padding, no conversion is needed --- we only need index_form_tuple() to treat the datum as being cstring not name, and this is sufficient. This seems to make for about a one-third reduction in the typical sizes of system catalog indexes that involve "name" columns, of which we have many. These two changes are only weakly related, but the alignment change makes me feel safer that the padding change won't introduce problems, so I'm committing them together.	2008-06-24 17:58:27 +00:00
Alvaro Herrera	f8c4d7db60	Restructure some header files a bit, in particular heapam.h, by removing some unnecessary #include lines in it. Also, move some tuple routine prototypes and macros to htup.h, which allows removal of heapam.h inclusion from some .c files. For this to work, a new header file access/sysattr.h needed to be created, initially containing attribute numbers of system columns, for pg_dump usage. While at it, make contrib ltree, intarray and hstore header files more consistent with our header style.	2008-05-12 00:00:54 +00:00
Tom Lane	cd902b331d	Change the rules for inherited CHECK constraints to be essentially the same as those for inherited columns; that is, it's no longer allowed for a child table to not have a check constraint matching one that exists on a parent. This satisfies the principle of least surprise (rows selected from the parent will always appear to meet its check constraints) and eliminates some longstanding bogosity in pg_dump, which formerly had to guess about whether check constraints were really inherited or not. The implementation involves adding conislocal and coninhcount columns to pg_constraint (paralleling attislocal and attinhcount in pg_attribute) and refactoring various ALTER TABLE actions to be more like those for columns. Alex Hunsaker, Nikhil Sontakke, Tom Lane	2008-05-09 23:32:05 +00:00
Peter Eisentraut	d35c56ed9f	Add "%option noinput" to the scanners to avoid compiler warnings. GCC 4.3 began to realize that the input() function isn't used and printed warnings.	2008-05-09 15:36:31 +00:00
Tom Lane	8472bf7a73	Allow float8, int8, and related datatypes to be passed by value on machines where Datum is 8 bytes wide. Since this will break old-style C functions (those still using version 0 calling convention) that have arguments or results of these types, provide a configure option to disable it and retain the old pass-by-reference behavior. Likewise, provide a configure option to disable the recently-committed float4 pass-by-value change. Zoltan Boszormenyi, plus configurability stuff by me.	2008-04-21 00:26:47 +00:00
Alvaro Herrera	7861d72ea2	Modify the float4 datatype to be pass-by-val. Along the way, remove the last uses of the long-deprecated float32 in contrib/seg; the definitions themselves are still there, but no longer used. fmgr/README updated to match. I added a CREATE FUNCTION to account for existing seg_center() code in seg.c too, and some tests for it and the neighbor functions. At the same time, remove checks for NULL which are not needed (because the functions are declared STRICT). I had to do some adjustments to contrib's btree_gist too. The choices for representation there are not ideal for changing the underlying types :-( Original patch by Zoltan Boszormenyi, with some adjustments by me.	2008-04-18 18:43:09 +00:00
Alvaro Herrera	73b0300b2a	Move the HTSU_Result enum definition into snapshot.h, to avoid including tqual.h into heapam.h. This makes all inclusion of tqual.h explicit. I also sorted alphabetically the includes on some source files.	2008-03-26 21:10:39 +00:00
Peter Eisentraut	0474dcb608	Refactor backend makefiles to remove lots of duplicate code	2008-02-19 10:30:09 +00:00
Tom Lane	8b63aa1ffc	Add back #include <time.h> in a couple of files that seem to need it on Linux.	2008-02-17 04:21:05 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Tom Lane	265f904d8f	Code review for LIKE ... INCLUDING INDEXES patch. Fix failure to propagate constraint status of copied indexes (bug #3774), as well as various other small bugs such as failure to pstrdup when needed. Allow INCLUDING INDEXES indexes to be merged with identical declared indexes (perhaps not real useful, but the code is there and having it not apply to LIKE indexes seems pretty unorthogonal). Avoid useless work in generateClonedIndexStmt(). Undo some poorly chosen API changes, and put a couple of routines in modules that seem to be better places for them.	2007-12-01 23:44:44 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Andrew Dunstan	63872601e8	Move session_start out of MyProcPort stucture and make it a global called MyStartTime, so that we will be able to create a cookie for all processes for CSVlogs. It is set wherever MyProcPid is set. Take the opportunity to remove the now unnecessary session-only restriction on the %s and %c escapes in log_line_prefix.	2007-08-02 23:39:45 +00:00
Tom Lane	ad4295728e	Create a new dedicated Postgres process, "wal writer", which exists to write and fsync WAL at convenient intervals. For the moment it just tries to offload this work from backends, but soon it will be responsible for guaranteeing a maximum delay before asynchronously-committed transactions will be flushed to disk. This is a portion of Simon Riggs' async-commit patch, committed to CVS separately because a background WAL writer seems like it might be a good idea independently of the async-commit feature. I rebased walwriter.c on bgwriter.c because it seemed like a more appropriate way of handling signals; while the startup/shutdown logic in postmaster.c is more like autovac because we want walwriter to quit before we start the shutdown checkpoint.	2007-07-24 04:54:09 +00:00
Neil Conway	474774918b	Implement CREATE TABLE LIKE ... INCLUDING INDEXES. Patch from NikhilS, based in part on an earlier patch from Trevor Hardcastle, and reviewed by myself.	2007-07-17 05:02:03 +00:00
Tom Lane	867e2c91a0	Implement "distributed" checkpoints in which the checkpoint I/O is spread over a fairly long period of time, rather than being spat out in a burst. This happens only for background checkpoints carried out by the bgwriter; other cases, such as a shutdown checkpoint, are still done at full speed. Remove the "all buffers" scan in the bgwriter, and associated stats infrastructure, since this seems no longer very useful when the checkpoint itself is properly throttled. Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas, and some minor API editorialization by me.	2007-06-28 00:02:40 +00:00
Tom Lane	b9527e9840	First phase of plan-invalidation project: create a plan cache management module and teach PREPARE and protocol-level prepared statements to use it. In service of this, rearrange utility-statement processing so that parse analysis does not assume table schemas can't change before execution for utility statements (necessary because we don't attempt to re-acquire locks for utility statements when reusing a stored plan). This requires some refactoring of the ProcessUtility API, but it ends up cleaner anyway, for instance we can get rid of the QueryContext global. Still to do: fix up SPI and related code to use the plan cache; I'm tempted to try to make SQL functions use it too. Also, there are at least some aspects of system state that we want to ensure remain the same during a replan as in the original processing; search_path certainly ought to behave that way for instance, and perhaps there are others.	2007-03-13 00:33:44 +00:00
Alvaro Herrera	626eb02198	Cleanup the bootstrap code a little, and rename "dummy procs" in the code comments and variables to "auxiliary proc", per Heikki's request.	2007-03-07 13:35:03 +00:00
Alvaro Herrera	68046a20c7	Remove useless database name from bootstrap argument processing (including startup and bgwriter processes), and the -y flag. It's not used anywhere.	2007-02-16 02:10:07 +00:00
Alvaro Herrera	1820650934	Restructure autovacuum in two processes: a dummy process, which runs continuously, and requests vacuum runs of "autovacuum workers" to postmaster. The workers do the actual vacuum work. This allows for future improvements, like allowing multiple autovacuum jobs running in parallel. For now, the code keeps the original behavior of having a single autovac process at any time by sleeping until the previous worker has finished.	2007-02-15 23:23:23 +00:00
Peter Eisentraut	4ab8fcba8a	StrNCpy -> strlcpy (not complete)	2007-02-10 14:58:55 +00:00
Tom Lane	5a7471c307	Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc columns procost and prorows, to allow simple user adjustment of the estimated cost of a function call, as well as control of the estimated number of rows returned by a set-returning function. We might eventually wish to extend this to allow function-specific estimation routines, but there seems to be consensus that we should try a simple constant estimate first. In particular this provides a relatively simple way to control the order in which different WHERE clauses are applied in a plan node, which is a Good Thing in view of the fact that the recent EquivalenceClass planner rewrite made that much less predictable than before.	2007-01-22 01:35:23 +00:00
Peter Eisentraut	2cc01004c6	Remove remains of old depend target.	2007-01-20 17:16:17 +00:00
Tom Lane	4431758229	Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.	2007-01-09 02:14:16 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Tom Lane	3ad0728c81	On systems that have setsid(2) (which should be just about everything except Windows), arrange for each postmaster child process to be its own process group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole process group not only the direct child process. This provides saner behavior for archive and recovery scripts; in particular, it's possible to shut down a warm-standby recovery server using "pg_ctl stop -m immediate", since delivery of SIGQUIT to the startup subprocess will result in killing the waiting recovery_command. Also, this makes Query Cancel and statement_timeout apply to scripts being run from backends via system(). (There is no support in the core backend for that, but it's widely done using untrusted PLs.) Per gripe from Stephen Harris and subsequent discussion.	2006-11-21 20:59:53 +00:00
Tom Lane	e82d9e6283	Adjust elog.c so that elog(FATAL) exits (including cases where ERROR is promoted to FATAL) end in exit(1) not exit(0). Then change the postmaster to allow exit(1) without a system-wide panic, but not for the startup subprocess or the bgwriter. There were a couple of places that were using exit(1) to deliberately force a system-wide panic; adjust these to be exit(2) instead. This fixes the problem noted back in July that if the startup process exits with elog(ERROR), the postmaster would think everything is hunky-dory and proceed to start up. Alternative solutions such as trying to run the entire startup process as a critical section seem less clean, primarily because of the fact that a fair amount of startup code is shared by all postmaster children in the EXEC_BACKEND case. We'd need an ugly special case somewhere near the head of main.c to make it work if it's the child process's responsibility to determine what happens; and what's the point when the postmaster already treats different children differently?	2006-11-21 00:49:55 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Tom Lane	e093dcdd28	Add the ability to create indexes 'concurrently', that is, without blocking concurrent writes to the table. Greg Stark, with a little help from Tom Lane.	2006-08-25 04:06:58 +00:00
Tom Lane	1395ac6c67	Add a hack so that get_type_io_data() can work from bootstrap.c's internal TypInfo table in bootstrap mode. This allows array_in and array_out to be used during early bootstrap, which eliminates the former obstacle to giving OUT parameters to built-in functions.	2006-08-15 22:36:17 +00:00
Tom Lane	09d3670df3	Change the relation_open protocol so that we obtain lock on a relation (table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.	2006-07-31 20:09:10 +00:00
Tom Lane	6e38e34d64	Change the bootstrap sequence so that toast tables for system catalogs are created in the bootstrap phase proper, rather than added after-the-fact by initdb. This is cleaner than before because it allows us to retire the undocumented ALTER TABLE ... CREATE TOAST TABLE command, but the real reason I'm doing it is so that toast tables of shared catalogs will now have predetermined OIDs. This will allow a reasonably clean solution to the problem of locking tables before we load their relcache entries, to appear in a forthcoming patch.	2006-07-31 01:16:38 +00:00
Tom Lane	033a477e9e	Adjust initialization sequence for timezone_abbreviations so that it's handled just about like timezone; in particular, don't try to read anything during InitializeGUCOptions. Should solve current startup failure on Windows, and avoid wasted cycles if a nondefault setting is specified in postgresql.conf too. Possibly we need to think about a more general solution for handling 'expensive to set' GUC options.	2006-07-29 03:02:56 +00:00
Bruce Momjian	e0522505bd	Remove 576 references of include files that were not needed.	2006-07-14 14:52:27 +00:00

1 2 3 4 5 ...

448 Commits