postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	c1793f2e0c	In SPGiST replay, do conflict resolution before modifying the page. In yesterday's commit `962e0cc71e`, I added the ResolveRecoveryConflictWithSnapshot call in the wrong place. I correctly put it before spgRedoVacuumRedirect itself would modify the index page --- but not before RestoreBkpBlocks, so replay of a record with a full-page image would modify the page before kicking off any conflicting HS transactions. Oops.	2012-08-03 15:23:14 -04:00
Tom Lane	962e0cc71e	Fix race conditions associated with SPGiST redirection tuples. The correct test for whether a redirection tuple is removable is whether tuple's xid < RecentGlobalXmin, not OldestXmin; the previous coding failed to protect index searches being done in concurrent transactions that have no XID. This mirrors the recent fix in btree's page recycling logic made in commit `d3abbbebe5`. Also, WAL-log the newest XID of any removed redirection tuple on an index page, and apply ResolveRecoveryConflictWithSnapshot during InHotStandby WAL replay. This protects against concurrent Hot Standby transactions possibly needing to see the redirection tuple(s). Per my query of 2012-03-12 and subsequent discussion.	2012-08-02 15:34:14 -04:00
Tom Lane	41b9c8452b	Replace libpq's "row processor" API with a "single row" mode. After taking awhile to digest the row-processor feature that was added to libpq in commit `92785dac2e`, we've concluded it is over-complicated and too hard to use. Leave the core infrastructure changes in place (that is, there's still a row processor function inside libpq), but remove the exposed API pieces, and instead provide a "single row" mode switch that causes PQgetResult to return one row at a time in separate PGresult objects. This approach incurs more overhead than proper use of a row processor callback would, since construction of a PGresult per row adds extra cycles. However, it is far easier to use and harder to break. The single-row mode still affords applications the primary benefit that the row processor API was meant to provide, namely not having to accumulate large result sets in memory before processing them. Preliminary testing suggests that we can probably buy back most of the extra cycles by micro-optimizing construction of the extra results, but that task will be left for another day. Marko Kreen	2012-08-02 13:10:30 -04:00
Tom Lane	f6ce81f55a	Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT). Parse analysis neglected to cover the case of a WITH clause attached to an intermediate-level set operation; it only handled WITH at the top level or WITH attached to a leaf-level SELECT. Per report from Adam Mackler. In HEAD, I rearranged the order of SelectStmt's fields to put withClause with the other fields that can appear on non-leaf SelectStmts. In back branches, leave it alone to avoid a possible ABI break for third-party code. Back-patch to 8.4 where WITH support was added.	2012-07-31 17:56:21 -04:00
Tom Lane	b76356ac22	Fix syslogger so that log_truncate_on_rotation works in the first rotation. In the original coding of the log rotation stuff, we did not bother to make the truncation logic work for the very first rotation after postmaster start (or after a syslogger crash and restart). It just always appended in that case. It did not seem terribly important at the time, but we've recently had two separate complaints from people who expected it to work unsurprisingly. (Both users tend to restart the postmaster about as often as a log rotation is configured to happen, which is maybe not typical use, but still...) Since the initial log file is opened in the postmaster, fixing this requires passing down some more state to the syslogger child process. It's always been like this, so back-patch to all supported branches.	2012-07-31 14:36:54 -04:00
Alvaro Herrera	2f29f011c8	pg_basebackup: stylistic adjustments The most user-visible part of this is to change the long options --statusint and --noloop to --status-interval and --no-loop, respectively, per discussion. Also, consistently enclose file names in double quotes, per our conventions; and consistently use the term "transaction log file" to talk about WAL segments. (Someday we may need to go over this terminology and make it consistent across the whole source code.) Finally, reflow the code to better fit in 80 columns, and have pgindent fix it up some more.	2012-07-31 11:02:39 -04:00
Tom Lane	9ae8ebe0b2	Improve reporting of error situations in find_other_exec(). This function suppressed any stderr output from the called program, which is unnecessary in the normal case and unhelpful in error cases. It also gave a rather opaque message along the lines of "fgets failure: Success" in case the called program failed to return anything on stdout. Since we've seen multiple reports of people not understanding what's wrong when pg_ctl reports this, improve the message. Back-patch to all active branches.	2012-07-27 19:31:13 -04:00
Tom Lane	26b438694c	Only allow autovacuum to be auto-canceled by a directly blocked process. In the original coding of the autovacuum cancel feature, commit `acac68b2bc`, an autovacuum process was considered a target for cancellation if it was found to hard-block any process examined in the deadlock search. This patch tightens the test so that the autovacuum must directly hard-block the current process. This should make the behavior more predictable in general, and in particular it ensures that an autovacuum will not be canceled with less than deadlock_timeout grace period. In the old coding, it was possible for an autovacuum to be canceled almost instantly, given unfortunate timing of two or more other processes' lock attempts. This also justifies the logging methodology in the recent commit d7318d43d891bd63e82dcfc27948113ed7b1db80; without this restriction, that patch isn't providing enough information to see the connection of the canceling process to the autovacuum. Like that one, patch all the way back.	2012-07-26 14:29:22 -04:00
Robert Haas	d20cdd31c0	Tab complete table names after ALTER TABLE x [NO] INHERIT. Jeff Janes	2012-07-26 10:16:55 -04:00
Robert Haas	d7318d43d8	Log a better message when canceling autovacuum. The old message was at DEBUG2, so typically it didn't show up in the log at all. As a result, in most cases where autovacuum was canceled, the only information that was logged was the table being vacuumed, with no indication as to what problem caused the cancel. Crank up the level to LOG and add some more details to assist with debugging. Back-patch all the way, per discussion on pgsql-hackers.	2012-07-26 09:19:03 -04:00
Tom Lane	af026b5d9b	Fix longstanding crash-safety bug with newly-created-or-reset sequences. If a crash occurred immediately after the first nextval() call for a serial column, WAL replay would restore the sequence to a state in which it appeared that no nextval() had been done, thus allowing the first sequence value to be returned again by the next nextval() call; as reported in bug #6748 from Xiangming Mei. More generally, the problem would occur if an ALTER SEQUENCE was executed on a freshly created or reset sequence. (The manifestation with serial columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step to serial column creation.) The cause is that sequence creation attempted to save one WAL entry by writing out a WAL record that made it appear that the first nextval() had already happened (viz, with is_called = true), while marking the sequence's in-database state with log_cnt = 1 to show that the first nextval() need not emit a WAL record. However, ALTER SEQUENCE would emit a new WAL entry reflecting the actual in-database state (with is_called = false). Then, nextval would allocate the first sequence value and set is_called = true, but it would trust the log_cnt value and not emit any WAL record. A crash at this point would thus restore the sequence to its post-ALTER state, causing the next nextval() call to return the first sequence value again. To fix, get rid of the idea of logging an is_called status different from reality. This means that the first nextval-driven WAL record will happen at the first nextval call not the second, but the marginal cost of that is pretty negligible. In addition, make sure that ALTER SEQUENCE resets log_cnt to zero in any case where it touches sequence parameters that affect future nextval results. This will result in some user-visible changes in the contents of a sequence's log_cnt column, as reflected in the patch's regression test changes; but no application should be depending on that anyway, since it was already true that log_cnt changes rather unpredictably depending on checkpoint timing. In addition, make some basically-cosmetic improvements to get rid of sequence.c's undesirable intimacy with page layout details. It was always really trying to WAL-log the contents of the sequence tuple, so we should have it do that directly using a HeapTuple's t_data and t_len, rather than backing into it with some magic assumptions about where the tuple would be on the sequence's page. Back-patch to all supported branches.	2012-07-25 17:42:23 -04:00
Alvaro Herrera	58f17dcf83	Add translator comments to module names	2012-07-25 00:02:49 -04:00
Alvaro Herrera	d7b47e5155	Change syntax of new CHECK NO INHERIT constraints The initially implemented syntax, "CHECK NO INHERIT (expr)" was not deemed very good, so switch to "CHECK (expr) NO INHERIT" instead. This way it looks similar to SQL-standards compliant constraint attribute. Backport to 9.2 where the new syntax and feature was introduced. Per discussion.	2012-07-24 16:01:32 -04:00
Peter Eisentraut	d61d9aa750	Update information schema to SQL:2011 This is just a section renumbering for now. Some details might be filled in later.	2012-07-23 22:32:56 +03:00
Tom Lane	b71258af56	Fix name collision between concurrent regression tests. Commit `f5bcd398ad` introduced a test using a table named "circles" in inherit.sql. Unfortunately, the concurrently executed constraints test was already using that table name, so the parallel regression tests would sometimes fail. Rename table to dodge the problem. Per buildfarm.	2012-07-22 00:01:19 -04:00
Tom Lane	2d46a57ddc	Improve copydir() code for the case that fsync is off. We should avoid calling sync_file_range or posix_fadvise in this case, since (a) we don't really care if the data gets synced, and might as well save the kernel calls; (b) at least on Linux we know that the kernel might block us until it's scheduled the write. Also, avoid making a useless second traversal of the directory tree if we're not actually going to call fsync(2) after all.	2012-07-21 20:10:29 -04:00
Tom Lane	2c4f5b4bc5	Use --nosync during make check's initdb call. We left this out of commit `b966dd6c42` so as to get some more buildfarm testing of the new fsync code in initdb. But since no problems have turned up, it's probably time to save the cycles.	2012-07-21 19:56:22 -04:00
Tom Lane	1f115d98b9	Suppress volatile-related warning seen in some compilers. Antique versions of gcc complain about vars that are initialized outside PG_TRY and then modified within it. Rather than marking the var volatile, expend one more line of code.	2012-07-21 19:39:03 -04:00
Tom Lane	31c7c642b6	Account for SRFs in targetlists in planner rowcount estimates. We made use of the ROWS estimate for set-returning functions used in FROM, but not for those used in SELECT targetlists; which is a bit of an oversight considering there are common usages that require the latter approach. Improve that. (I had initially thought it might be worth folding this into cost_qual_eval, but after investigation concluded that that wouldn't be very helpful, so just do it separately.) Per complaint from David Johnston. Back-patch to 9.2, but not further, for fear of destabilizing plan choices in existing releases.	2012-07-21 17:45:07 -04:00
Robert Haas	ed0af33247	Revert temporary patch to debug Windows breakage. This reverts commit `0a248208a0`.	2012-07-20 22:31:19 -04:00
Robert Haas	0635c0b524	Repair plpgsql_validator breakage. Commit `3a0e4d36eb` arranged to reference stack-allocated variables after they were out of scope. That's no good, so let's arrange to not do that after all.	2012-07-20 21:28:26 -04:00
Andrew Dunstan	a1e5705c9f	Remove now unneeded results file for disabled prepared transactions case.	2012-07-20 16:30:34 -04:00
Robert Haas	0a248208a0	Temporary patch to try to debug why event trigger patch broke Windows. Apologies for the ugliness.	2012-07-20 16:22:11 -04:00
Andrew Dunstan	ae55d9fbe3	Remove prepared transactions from main isolation test schedule. There is no point in running this test when prepared transactions are disabled, which is the default. New make targets that include the test are provided. This will save some useless waste of cycles on buildfarm machines. Backpatch to 9.1 where these tests were introduced.	2012-07-20 15:51:40 -04:00
Peter Eisentraut	8ca03aa414	pg_dump: Simplify mkdir() error checking mkdir() can check for errors itself. We don't need to code that ourselves again.	2012-07-20 22:34:11 +03:00
Alvaro Herrera	f5bcd398ad	connoinherit may be true only for CHECK constraints The code was setting it true for other constraints, which is bogus. Doing so caused bogus catalog entries for such constraints, and in particular caused an error to be raised when trying to drop a constraint of types other than CHECK from a table that has children, such as reported in bug #6712. In 9.2, additionally ignore connoinherit=true for other constraint types, to avoid having to force initdb; existing databases might already contain bogus catalog entries. Includes a catversion bump (in HEAD only). Bug report from Miroslav Šulc Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.	2012-07-20 14:08:07 -04:00
Tom Lane	8e617e29aa	Fix whole-row Var evaluation to cope with resjunk columns (again). When a whole-row Var is reading the result of a subquery, we need it to ignore any "resjunk" columns that the subquery might have evaluated for GROUP BY or ORDER BY purposes. We've hacked this area before, in commit `68e40998d0`, but that fix only covered whole-row Vars of named composite types, not those of RECORD type; and it was mighty klugy anyway, since it just assumed without checking that any extra columns in the result must be resjunk. A proper fix requires getting hold of the subquery's targetlist so we can actually see which columns are resjunk (whereupon we can use a JunkFilter to get rid of them). So bite the bullet and add some infrastructure to make that possible. Per report from Andrew Dunstan and additional testing by Merlin Moncure. Back-patch to all supported branches. In 8.3, also back-patch commit `292176a118`, which for some reason I had not done at the time, but it's a prerequisite for this change.	2012-07-20 13:10:58 -04:00
Robert Haas	3a0e4d36eb	Make new event trigger facility actually do something. Commit `3855968f32` added syntax, pg_dump, psql support, and documentation, but the triggers didn't actually fire. With this commit, they now do. This is still a pretty basic facility overall because event triggers do not get a whole lot of information about what the user is trying to do unless you write them in C; and there's still no option to fire them anywhere except at the very beginning of the execution sequence, but it's better than nothing, and a good building block for future work. Along the way, add a regression test for ALTER LARGE OBJECT, since testing of event triggers reveals that we haven't got one. Dimitri Fontaine and Robert Haas	2012-07-20 11:39:01 -04:00
Tom Lane	be86e3dd5b	Rethink checkpointer's fsync-request table representation. Instead of having one hash table entry per relation/fork/segment, just have one per relation, and use bitmapsets to represent which specific segments need to be fsync'd. This eliminates the need to scan the whole hash table to implement FORGET_RELATION_FSYNC, which fixes the O(N^2) behavior recently demonstrated by Jeff Janes for cases involving lots of TRUNCATE or DROP TABLE operations during a single checkpoint cycle. Per an idea from Robert Haas. (FORGET_DATABASE_FSYNC still sucks, but since dropping a database is a pretty expensive operation anyway, we'll live with that.) In passing, improve the delayed-unlink code: remove the pass over the list in mdpreckpt, since it wasn't doing anything for us except supporting a useless Assert in mdpostckpt, and fix mdpostckpt so that it will absorb fsync requests every so often when clearing a large backlog of deletion requests.	2012-07-19 19:28:22 -04:00
Tom Lane	3072b7bade	Send only one FORGET_RELATION_FSYNC request when dropping a relation. We were sending one per fork, but a little bit of refactoring allows us to send just one request with forknum == InvalidForkNumber. This not only reduces pressure on the shared-memory request queue, but saves repeated traversals of the checkpointer's hash table.	2012-07-19 13:07:33 -04:00
Heikki Linnakangas	a7a4add6c4	Refactor the way code is shared between some range type functions. Functions like range_eq, range_before etc. are exposed at the SQL-level, but they're also used internally by the GiST consistent support function. The code sharing was done by a hack, TrickFunctionCall2, which relied on the knowledge that all the functions used fn_extra the same way. This commit splits the functions into internal versions that take a TypeCacheEntry as argument, and thin wrappers to expose the functions at the SQL-level. The internal versions can then be called directly and in a less hacky way from the GiST consistent function. This is just cosmetic, but backpatch to 9.2 anyway, to avoid having a different version of this code in the 9.2 branch. That would make backpatching fixes in this area more difficult. Alexander Korotkov	2012-07-18 23:14:56 +03:00
Tom Lane	80e373c3a8	Fix statistics breakage from bgwriter/checkpointer process split. ForwardFsyncRequest() supposed that it could only be called in regular backends, which used to be true; but since the splitup of bgwriter and checkpointer, it is also called in the bgwriter. We do not want to count such calls in pg_stat_bgwriter.buffers_backend statistics, so fix things so that they aren't. (It's worth noting here that this implies an alarmingly large increase in the expected amount of cross-process fsync request traffic, which may well mean that the process splitup was not such a hot idea.)	2012-07-18 15:40:31 -04:00
Tom Lane	4a9c30a8a1	Fix management of pendingOpsTable in auxiliary processes. mdinit() was misusing IsBootstrapProcessingMode() to decide whether to create an fsync pending-operations table in the current process. This led to creating a table not only in the startup and checkpointer processes as intended, but also in the bgwriter process, not to mention other auxiliary processes such as walwriter and walreceiver. Creation of the table in the bgwriter is fatal, because it absorbs fsync requests that should have gone to the checkpointer; instead they just sit in bgwriter local memory and are never acted on. So writes performed by the bgwriter were not being fsync'd which could result in data loss after an OS crash. I think there is no live bug with respect to walwriter and walreceiver because those never perform any writes of shared buffers; but the potential is there for future breakage in those processes too. To fix, make AuxiliaryProcessMain() export the current process's AuxProcType as a global variable, and then make mdinit() test directly for the types of aux process that should have a pendingOpsTable. Having done that, we might as well also get rid of the random bool flags such as am_walreceiver that some of the aux processes had grown. (Note that we could not have fixed the bug by examining those variables in mdinit(), because it's called from BaseInit() which is run by AuxiliaryProcessMain() before entering any of the process-type-specific code.) Back-patch to 9.2, where the problem was introduced by the split-up of bgwriter and checkpointer processes. The bogus pendingOpsTable exists in walwriter and walreceiver processes in earlier branches, but absent any evidence that it causes actual problems there, I'll leave the older branches alone.	2012-07-18 15:28:10 -04:00
Robert Haas	3855968f32	Syntax support and documentation for event triggers. They don't actually do anything yet; that will get fixed in a follow-on commit. But this gets the basic infrastructure in place, including CREATE/ALTER/DROP EVENT TRIGGER; support for COMMENT, SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP EVENT TRIGGER; pg_dump and psql support; and documentation for the anticipated initial feature set. Dimitri Fontaine, with review and a bunch of additional hacking by me. Thom Brown extensively reviewed earlier versions of this patch set, but there's not a whole lot of that code left in this commit, as it turns out.	2012-07-18 10:16:16 -04:00
Tom Lane	73b796a52c	Improve coding around the fsync request queue. In all branches back to 8.3, this patch fixes a questionable assumption in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are no uninitialized pad bytes in the request queue structs. This would only cause trouble if (a) there were such pad bytes, which could happen in 8.4 and up if the compiler makes enum ForkNumber narrower than 32 bits, but otherwise would require not-currently-planned changes in the widths of other typedefs; and (b) the kernel has not uniformly initialized the contents of shared memory to zeroes. Still, it seems a tad risky, and we can easily remove any risk by pre-zeroing the request array for ourselves. In addition to that, we need to establish a coding rule that struct RelFileNode can't contain any padding bytes, since such structs are copied into the request array verbatim. (There are other places that are assuming this anyway, it turns out.) In 9.1 and up, the risk was a bit larger because we were also effectively assuming that struct RelFileNodeBackend contained no pad bytes, and with fields of different types in there, that would be much easier to break. However, there is no good reason to ever transmit fsync or delete requests for temp files to the bgwriter/checkpointer, so we can revert the request structs to plain RelFileNode, getting rid of the padding risk and saving some marginal number of bytes and cycles in fsync queue manipulation while we are at it. The savings might be more than marginal during deletion of a temp relation, because the old code transmitted an entirely useless but nonetheless expensive-to-process ForgetRelationFsync request to the background process, and also had the background process perform the file deletion even though that can safely be done immediately. In addition, make some cleanup of nearby comments and small improvements to the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.	2012-07-17 16:56:54 -04:00
Peter Eisentraut	71f2dd2321	PL/Python: Remove PLy_result_ass_item It is apparently no longer used after the new slicing support was implemented (`a97207b690`), so let's remove the dead code and see if anything cares.	2012-07-17 23:26:49 +03:00
Alvaro Herrera	65558995a2	Remove recently added PL/Perl encoding tests These only pass cleanly on UTF8 and SQL_ASCII encodings, besides the Japanese encoding in which they were originally written, which is clearly not good enough. Since the functionality they test has not ever been tested from PL/Perl, the best answer seems to be to remove the new tests completely. Per buildfarm results and ensuing discussion.	2012-07-17 13:26:25 -04:00
Tom Lane	57b9bdda39	Put back storage/proc.h in postmaster.c. I took this out thinking it wasn't needed anymore, but the EXEC_BACKEND code still needs it. Per buildfarm.	2012-07-17 10:14:06 -04:00
Alvaro Herrera	f34c68f096	Introduce timeout handling framework Management of timeouts was getting a little cumbersome; what we originally had was more than enough back when we were only concerned about deadlocks and query cancel; however, when we added timeouts for standby processes, the code got considerably messier. Since there are plans to add more complex timeouts, this seems a good time to introduce a central timeout handling module. External modules register their timeout handlers during process initialization, and later enable and disable them as they see fit using a simple API; timeout.c is in charge of keeping track of which timeouts are in effect at any time, installing a common SIGALRM signal handler, and calling setitimer() as appropriate to ensure timely firing of external handlers. timeout.c additionally supports pluggable modules to add their own timeouts, though this capability isn't exercised anywhere yet. Additionally, as of this commit, walsender processes are aware of timeouts; we had a preexisting bug there that made those ignore SIGALRM, thus being subject to unhandled deadlocks, particularly during the authentication phase. This has already been fixed in back branches in commit `0bf8eb2a`, which see for more details. Main author: Zoltán Böszörményi Some review and cleanup by Álvaro Herrera Extensive reworking by Tom Lane	2012-07-16 22:55:33 -04:00
Peter Eisentraut	dd16f9480a	Remove unreachable code The Solaris Studio compiler warns about these instances, unlike more mainstream compilers such as gcc. But manual inspection showed that the code is clearly not reachable, and we hope no worthy compiler will complain about removing this code.	2012-07-16 22:15:03 +03:00
Peter Eisentraut	a76c857eba	Add comment why seemingly dead code is necessary	2012-07-16 22:08:04 +03:00
Tom Lane	c92be3c059	Avoid pre-determining index names during CREATE TABLE LIKE parsing. Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE had to pre-assign names to indexes that had comments, because it made up an explicit CommentStmt command to apply the comment and so it had to know the name for the index. This creates bad interactions with other indexes, as shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't take any other indexes into account so it could choose a conflicting name. To fix, add a field to IndexStmt that allows it to carry a comment to be assigned to the new index. (This isn't a user-exposed feature of CREATE INDEX, only an internal option.) Now we don't need preassignment of index names in any situation. I also took the opportunity to refactor DefineIndex to accept the IndexStmt as such, rather than passing all its fields individually in a mile-long parameter list. Back-patch to 9.2, but no further, because it seems too dangerous to change IndexStmt or DefineIndex's API in released branches. The bug exists back to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given the lack of prior complaints we'll just let it go unfixed before 9.2.	2012-07-16 13:25:18 -04:00
Tom Lane	54fd196ffc	Prevent corner-case core dump in rfree(). rfree() failed to cope with the case that pg_regcomp() had initialized the regex_t struct but then failed to allocate any memory for re->re_guts (ie, the first malloc call in pg_regcomp() failed). It would try to touch the guts struct anyway, and thus dump core. This is a sufficiently narrow corner case that it's not surprising it's never been seen in the field; but still a bug is a bug, so patch all active branches. Noted while investigating whether we need to call pg_regfree after a failure return from pg_regcomp. Other than this bug, it turns out we don't, so adjust comments appropriately.	2012-07-15 13:27:54 -04:00
Heikki Linnakangas	2686da9db2	Don't initialize TLI variable to -1, as TimeLineID is unsigned. This was causing a compiler warning with Solaris compiler. Use 0 instead. The variable is initialized just for the sake of tidiness and/or debugging, it's not used for anything before setting it to a real value. Per report and suggestion from Peter Eisentraut.	2012-07-14 21:04:53 +03:00
Heikki Linnakangas	6c349a565a	Print the name of the WAL file containing latest REDO ptr in pg_controldata. This makes it easier to determine how far back you need to keep archived WAL files, to restore from a backup. Fujii Masao	2012-07-14 14:22:57 +03:00
Tom Lane	b966dd6c42	Add fsync capability to initdb, and use sync_file_range() if available. Historically we have not worried about fsync'ing anything during initdb (in fact, initdb intentionally passes -F to each backend launch to prevent it from fsync'ing). But with filesystems getting more aggressive about caching data, that's not such a good plan anymore. Make initdb do a pass over the finished data directory tree to fsync everything. For testing purposes, the -N/--nosync flag can be used to restore the old behavior. Also, testing shows that on Linux, sync_file_range() is much faster than posix_fadvise() for hinting to the kernel that an fsync is coming, apparently because the latter blocks on a rather small request queue while the former doesn't. So use this function if available in initdb, and also in the backend's pg_flush_data() (where it currently will affect only the speed of CREATE DATABASE's cloning step). We will later make pg_regress invoke initdb with the --nosync flag to avoid slowing down cases such as "make check" in contrib. But let's not do so until we've shaken out any portability issues in this patch. Jeff Davis, reviewed by Andres Freund	2012-07-13 17:16:58 -04:00
Tom Lane	1a9405d265	Cosmetic cleanup of ginInsertValue(). Make it clearer that the passed stack mustn't be empty, and that we are not supposed to fall off the end of the stack in the main loop. Tighten the loop that extracts the root block number, too. Markus Wanner and Tom Lane	2012-07-13 11:37:39 -04:00
Peter Eisentraut	a84bf4922e	Avoid extra newlines in XML mapping in table forest mode found by P. Broennimann	2012-07-12 23:52:50 +03:00
Tom Lane	a36088bcfa	Skip text->binary conversion of unnecessary columns in contrib/file_fdw. When reading from a text- or CSV-format file in file_fdw, the datatype input routines can consume a significant fraction of the runtime. Often, the query does not need all the columns, so we can get a useful speed boost by skipping I/O conversion for unnecessary columns. To support this, add a "convert_selectively" option to the core COPY code. This is undocumented and not accessible from SQL (for now, anyway). Etsuro Fujita, reviewed by KaiGai Kohei	2012-07-12 16:26:59 -04:00
Bruce Momjian	76720bdf1a	Remove 'x =- 1' check for pgindent, not needed, per report from Andrew Dunstan.	2012-07-12 14:37:47 -04:00
Magnus Hagander	058a050ec7	Fix memory and file descriptor leaks in pg_receivexlog/pg_basebackup When the internal loop mode was added, freeing memory and closing file descriptors before returning became important, and a few cases in the code missed that. Fujii Masao	2012-07-12 13:33:58 +02:00
Tom Lane	84a42560c8	Add array_remove() and array_replace() functions. These functions support removing or replacing array element value(s) matching a given search value. Although intended mainly to support a future array-foreign-key feature, they seem useful in their own right. Marco Nenciarini and Gabriele Bartolini, reviewed by Alex Hunsaker	2012-07-11 13:59:35 -04:00
Tom Lane	01215d61a7	Fix bogus macro definition. Per buildfarm complaints.	2012-07-10 22:36:11 -04:00
Tatsuo Ishii	1c7a7faa5b	Add comments about additional mule-internal charsets from emacs's source code(lisp/international/mule-conf.el). These charsets have not been supported up to now anyway, so this is just for adding commentary. Also add mention that we follow emacs's implementation, not xemacs's.	2012-07-11 08:10:50 +09:00
Tom Lane	60e9c224a1	Fix ASCII case in pg_wchar2mule_with_len. Also some cosmetic improvements for wchar-to-mblen patch.	2012-07-10 15:59:39 -04:00
Alvaro Herrera	379607c9e8	plperl: Skip setting UTF8 flag when in SQL_ASCII encoding When in SQL_ASCII encoding, strings passed around are not necessarily UTF8-safe. We had already fixed this in some places, but it looks like we missed some. I had to backpatch Peter Eisentraut's `a8b92b60` to 9.1 in order for this patch to cherry-pick more cleanly. Patch from Alex Hunsaker, tweaked by Kyotaro HORIGUCHI and myself. Some desultory cleanup and comment addition by me, during patch review. Per bug report from Christoph Berg in 20120209102116.GA14429@msgid.df7cb.de	2012-07-10 15:15:16 -04:00
Alvaro Herrera	fc4a8a6d74	perltidy adjustments to new file	2012-07-10 15:15:16 -04:00
Tom Lane	628cbb50ba	Re-implement extraction of fixed prefixes from regular expressions. To generate btree-indexable conditions from regex WHERE conditions (such as WHERE indexed_col ~ '^foo'), we need to be able to identify any fixed prefix that a regex might have; that is, find any string that must be a prefix of all strings satisfying the regex. We used to do that with entirely ad-hoc code that looked at the source text of the regex. It didn't know very much about regex syntax, which mostly meant that it would fail to identify some optimizable cases; but Viktor Rosenfeld reported that it would produce actively wrong answers for quantified parenthesized subexpressions, such as '^(foo)?bar'. Rather than trying to extend the ad-hoc code to cover this, let's get rid of it altogether in favor of identifying prefixes by examining the compiled form of a regex. To do this, I've added a new entry point "pg_regprefix" to the regex library; hopefully it is defined in a sufficiently general fashion that it can remain in the library when/if that code gets split out as a standalone project. Since this bug has been there for a very long time, this fix needs to get back-patched. However it depends on some other recent commits (particularly the addition of wchar-to-database-encoding conversion), so I'll commit this separately and then go to work on back-porting the necessary fixes.	2012-07-10 14:54:37 -04:00
Tom Lane	00dac6000d	Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns. Previously, pattern_fixed_prefix() was defined to return whatever fixed prefix it could extract from the pattern, plus the "rest" of the pattern. That definition was sensible for LIKE patterns, but not so much for regexes, where reconstituting a valid pattern minus the prefix could be quite tricky (certainly the existing code wasn't doing that correctly). Since the only thing that callers ever did with the "rest" of the pattern was to pass it to like_selectivity() or regex_selectivity(), let's cut out the middle-man and just have pattern_fixed_prefix's subroutines do this directly. Then pattern_fixed_prefix can return a simple selectivity number, and the question of how to cope with partial patterns is removed from its API specification. While at it, adjust the API spec so that callers who don't actually care about the pattern's selectivity (which is a lot of them) can pass NULL for the selectivity pointer to skip doing the work of computing a selectivity estimate. This patch is only an API refactoring that doesn't actually change any processing, other than allowing a little bit of useless work to be skipped. However, it's necessary infrastructure for my upcoming fix to regex prefix extraction, because after that change there won't be any simple way to identify the "rest" of the regex, not even to the low level of fidelity needed by regex_selectivity. We can cope with that if regex_fixed_prefix and regex_selectivity communicate directly, but not if we have to work within the old API. Hence, back-patch to all active branches.	2012-07-09 23:22:55 -04:00
Tom Lane	e7ef6d7e24	Fix planner to pass correct collation to operator selectivity estimators. We can do this without creating an API break for estimation functions by passing the collation using the existing fmgr functionality for passing an input collation as a hidden parameter. The need for this was foreseen at the outset, but we didn't get around to making it happen in 9.1 because of the decision to sort all pg_statistic histograms according to the database's default collation. That meant that selectivity estimators generally need to use the default collation too, even if they're estimating for an operator that will do something different. The reason it's suddenly become more interesting is that regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE property), and we no longer want to use the wrong collation when examining regexps during planning. It's not that the selectivity estimate is likely to change much from this; rather that we are thinking of caching compiled regexps during planner estimation, and we won't get the intended benefit if we cache them with a different collation than the executor will use. Back-patch to 9.1, both because the regexp change is likely to get back-patched and because we might as well get this right in all collation-supporting branches, in case any third-party code wants to rely on getting the collation. The patch turns out to be minuscule now that I've done it ...	2012-07-08 23:51:08 -04:00
Tom Lane	c6aae3042b	Simplify and document regex library's compact-NFA representation. The previous coding abused the first element of a cNFA state's arcs list to hold a per-state flag bit, which was confusing, undocumented, and not even particularly efficient. Get rid of that in favor of a separate "stflags" vector. Since there's only one bit in use, I chose to allocate a char per state; we could possibly replace this with a bitmap at some point, but that would make accesses a little slower. It's already about 8X smaller than before, so let's not get overly tense. Also document the representation better than it was before, which is to say not at all. This patch is a byproduct of investigations towards extracting a "fixed prefix" string from the compact-NFA representation of regex patterns. Might need to back-patch it if we decide to back-patch that fix, but for now it's just code cleanup so I'll just put it in HEAD.	2012-07-07 17:39:50 -04:00
Alvaro Herrera	a184e4db83	Convert libpq regress script to Perl. This should ease its use on the Windows build environment.	2012-07-06 16:45:48 -04:00
Alvaro Herrera	adb9b7d53b	Update libpq test expected output. Commit `2b443063` changed wording for some of the error messages, but neglected updating the regress output to match.	2012-07-06 16:45:47 -04:00
Bruce Momjian	3c9b406420	Run updated copyright.pl on HEAD and 9.2 trees, updating the psql \copyright output to 2012. Backpatch to 9.2.	2012-07-06 12:28:18 -04:00
Bruce Momjian	d17c0135cd	Have copyright.pl skip updating something that is just the current year, to avoid producing dups, e.g. 2012-2012 Backpatch to 9.2.	2012-07-06 12:21:43 -04:00
Bruce Momjian	95203e0833	Modify copyright.pl so all lines are processed, not just the first match, so files that contain embedded copyrights are updated, e.g. pgsql/help.c. Backpatch to 9.2.	2012-07-06 11:58:55 -04:00
Bruce Momjian	5198ae8992	Fix copyright.pl to properly skip the .git directory by adding a basename() qualification.	2012-07-06 11:43:59 -04:00
Bruce Momjian	b9eb808bf2	Fix spacing in copyright.pl after being run with missing regex slash (now added). Backpatch to 9.2.	2012-07-06 10:57:08 -04:00
Robert Haas	f6a05fd973	Fix failure of new wchar->mb functions to advance from pointer. Bug spotted by Tom Lane.	2012-07-05 23:47:53 -04:00
Tom Lane	8525419947	Don't try to trim "../" in join_path_components(). join_path_components() tried to remove leading ".." components from its tail argument, but it was not nearly bright enough to do so correctly unless the head argument was (a) absolute and (b) canonicalized. Rather than try to fix that logic, let's just get rid of it: there is no correctness reason to remove "..", and cosmetic concerns can be taken care of by a subsequent canonicalize_path() call. Per bug #6715 from Greg Davidson. Back-patch to all supported branches. It appears that pre-9.2, this function is only used with absolute paths as head arguments, which is why we'd not noticed the breakage before. However, third-party code might be expecting this function to work in more general cases, so it seems wise to back-patch. In HEAD and 9.2, also make some minor cosmetic improvements to callers.	2012-07-05 17:16:11 -04:00
Heikki Linnakangas	de479e2ed2	Revert part of the previous patch that avoided using PLy_elog(). That caused the plpython_unicode regression test to fail on SQL_ASCII encoding, as evidenced by the buildfarm. The reason is that with the patch, you don't get the detail in the error message that you got before. That detail is actually very informative, so rather than just adjust the expected output, let's revert that part of the patch for now to make the buildfarm green again, and figure out some other way to avoid the recursion of PLy_elog() that doesn't lose the detail.	2012-07-05 23:40:25 +03:00
Heikki Linnakangas	b66de4c6d7	Fix mapping of PostgreSQL encodings to Python encodings. Windows encodings, "win1252" and so forth, are named differently in Python, like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails for some reason, use a plain ereport(), not a PLy_elog(), to report that error. That avoids recursion and crash, if PLy_elog() tries to call PLyUnicode_Bytes() again. This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that plpython didn't even try these conversions. Jan Urbański, with minor comment improvements by me.	2012-07-05 22:31:29 +03:00
Tom Lane	fc548b2296	Remove support for using wait3() in place of waitpid(). All Unix-oid platforms that we currently support should have waitpid(), since it's in V2 of the Single Unix Spec. Our git history shows that the wait3 code was added to support NextStep, which we officially dropped support for as of 9.2. So get rid of the configure test, and simplify the macro spaghetti in reaper(). Per suggestion from Fujii Masao.	2012-07-05 14:00:40 -04:00
Magnus Hagander	3644a63984	Fix function argument tab completion for schema-qualified or quoted function names. Dean Rasheed, reviewed by Josh Kupershmidt	2012-07-05 14:06:55 +02:00
Bruce Momjian	539d38757a	Fix missing regex slash that caused perltidy to get confused on copyright.pl. Backpatch to 9.2.	2012-07-04 21:58:48 -04:00
Bruce Momjian	042d9ffc28	Run newly-configured perltidy script on Perl files. Run on HEAD and 9.2.	2012-07-04 21:47:49 -04:00
Robert Haas	d7c734841b	Reduce messages about implicit indexes and sequences to DEBUG1. Per recent discussion on pgsql-hackers, these messages are too chatty for most users.	2012-07-04 20:35:29 -04:00
Bruce Momjian	3e00d33261	Have pg_dump in binary-upgrade mode properly drop user-created extensions that might exist in the new empty cluster databases, like plpgsql. Backpatch to 9.2.	2012-07-04 17:37:01 -04:00
Robert Haas	72dd6291f2	Add wchar -> mb conversion routines. This is infrastructure for Alexander Korotkov's work on indexing regular expression searches. Alexander Korotkov, with a bit of further hackery on the MULE conversion by me	2012-07-04 17:10:10 -04:00
Robert Haas	f358428280	Increase the maximum initdb-configured value for shared_buffers to 128MB. The old value of 32MB has been around for a very long time, and in the meantime typical system memories have become vastly larger. Also, now that we no longer depend on being able to fit the entirety of our shared memory segment into the system's limit on System V shared memory, there's a much better chance of the higher limit actually proving productive. Per recent discussion on pgsql-hackers.	2012-07-04 15:55:21 -04:00
Magnus Hagander	10e0dd8f91	Remove duplicate, unnecessary, variable declaration	2012-07-04 16:17:30 +02:00
Magnus Hagander	dbc6fcf35d	Set the write location in the pg_receivexlog status messages. This makes it possible for the master to track how much data has actually been written by pg_receivexlog - and not just how much has been sent towards it.	2012-07-04 15:14:49 +02:00
Magnus Hagander	0c4b468692	Always treat a standby returning an invalid flush location as async. This ensures that a standby such as pg_receivexlog will not be selected as sync standby - which would cause the master to block waiting for a location that could never happen. Fujii Masao	2012-07-04 15:14:42 +02:00
Tom Lane	09022de1f5	Improve documentation about MULE encoding. This commit improves the comments in pg_wchar.h and creates #define symbols for some formerly hard-coded values. No substantive code changes. Tatsuo Ishii and Tom Lane	2012-07-04 00:29:57 -04:00
Alvaro Herrera	47a2adc83c	Forgot an #include in the previous patch :-(	2012-07-03 16:40:15 -04:00
Alvaro Herrera	0c7b9dc7d0	Have REASSIGN OWNED work on extensions, too. Per bug #6593, REASSIGN OWNED fails when the affected role has created an extension. Even though the user related to the extension is not nominally the owner, its OID appears on pg_shdepend and thus causes problems when the user is to be dropped. This commit adds code to change the "ownership" of the extension itself, not of the contained objects. This is fine because it's currently only called from REASSIGN OWNED, which would also modify the ownership of the contained objects. However, this is not sufficient for a working ALTER OWNER implementation for extensions. Back-patch to 9.1, where extensions were introduced. Bug #6593 reported by Emiliano Leporati.	2012-07-03 15:09:59 -04:00
Bruce Momjian	b33385b89d	Have copyright tool mention that certain files should be updated in back branches.	2012-07-03 12:02:17 -04:00
Robert Haas	6a77bff086	Remove misleading hints about reducing the System V request size. Since the request size will now be ~48 bytes regardless of how shared_buffers et al. are set, much of this advice is no longer relevant.	2012-07-03 10:07:47 -04:00
Robert Haas	3cf39e6ddb	Fix a stupid bug I introduced into XLogFlush(). Commit `f11e8be3e8` broke this; it was right in Peter's original patch, but I messed it up before committing.	2012-07-02 15:33:59 -04:00
Robert Haas	3bb592bb20	Fix position of WalSndWakeupRequest call. This avoids discriminating against wal_sync_method = open_sync or open_datasync. Fujii Masao, reviewed by Andres Freund	2012-07-02 14:44:10 -04:00
Peter Eisentraut	2b44306315	Assorted message style improvements	2012-07-02 21:12:46 +03:00
Tom Lane	41f4a0ab78	Fix to_date's handling of year 519. A thinko in commit `029dfdf115` caused the year 519 to be handled differently from either adjacent year, which was not the intention AFAICS. Report and diagnosis by Marc Cousin. In passing, remove redundant re-tests of year value.	2012-07-02 11:35:35 -04:00
Robert Haas	82cdd2df75	Work a little harder on comments for walsender wakeup patch. Per gripe from Tom Lane.	2012-07-02 11:28:53 -04:00
Robert Haas	f11e8be3e8	Make commit_delay much smarter. Instead of letting every backend participating in a group commit wait independently, have the first one that becomes ready to flush WAL wait for the configured delay, and let all the others wait just long enough for that first process to complete its flush. This greatly increases the chances of being able to configure a commit_delay setting that actually improves performance. As a side consequence of this change, commit_delay now affects all WAL flushes, rather than just commits. There was some discussion on pgsql-hackers about whether to rename the GUC to, say, wal_flush_delay, but in the absence of consensus I am leaving it alone for now. Peter Geoghegan, with some changes, mostly to the documentation, by me.	2012-07-02 10:26:31 -04:00
Robert Haas	f83b59997d	Make walsender more responsive. Per testing by Andres Freund, this improves replication performance and reduces replication latency and latency jitter. I was a bit concerned about moving more work into XLogInsert, but testing seems to show that it's not a problem in practice. Along the way, improve comments for WaitLatchOrSocket. Andres Freund. Review and stylistic cleanup by me.	2012-07-02 09:41:01 -04:00
Tom Lane	9ad45c18b6	Fix race condition in enum value comparisons. When (re) loading the typcache comparison cache for an enum type's values, use an up-to-date MVCC snapshot, not the transaction's existing snapshot. This avoids problems if we encounter an enum OID that was created since our transaction started. Per report from Andres Freund and diagnosis by Robert Haas. To ensure this is safe even if enum comparison manages to get invoked before we've set a transaction snapshot, tweak GetLatestSnapshot to redirect to GetTransactionSnapshot instead of throwing error when FirstSnapshotSet is false. The existing uses of GetLatestSnapshot (in ri_triggers.c) don't care since they couldn't be invoked except in a transaction that's already done some work --- but it seems just conceivable that this might not be true of enums, especially if we ever choose to use enums in system catalogs. Note that the comparable coding in enum_endpoint and enum_range_internal remains GetTransactionSnapshot; this is perhaps debatable, but if we changed it those functions would have to be marked volatile, which doesn't seem attractive. Back-patch to 9.1 where ALTER TYPE ADD VALUE was added.	2012-07-01 17:12:49 -04:00
Tom Lane	39bfc94c86	Suppress compiler warnings in readfuncs.c. Commit `7357558fc8` introduced "(void) token;" into the READ_TEMP_LOCALS() macro, to suppress complaints from gcc 4.6 when the value of token was not used anywhere in a particular node-read function. However, this just moved the warning around: inspection of buildfarm results shows that some compilers are now complaining that token is being read before it's set. Revert the READ_TEMP_LOCALS() macro change and instead put "(void) token;" into READ_NODE_FIELD(), which is the principal culprit for cases where the warning might occur. In principle we might need the same in READ_BITMAPSET_FIELD() and/or READ_LOCATION_FIELD(), but it seems unlikely that a node would consist only of such fields, so I'll leave them alone for now.	2012-06-30 22:27:49 -04:00
Tom Lane	fa188b5ef5	Remove inappropriate semicolons after function definitions. Solaris Studio warns about this, and some compilers might think it's an outright syntax error.	2012-06-30 17:29:39 -04:00
Tom Lane	81e8264383	Declare AnonymousShmem pointer as "void *". The original coding had it as "PGShmemHeader *", but that doesn't offer any notational benefit because we don't dereference it. And it was resulting in compiler warnings on some platforms, notably buildfarm member castoroides, where mmap() and munmap() are evidently declared to take and return "char *".	2012-06-30 17:19:46 -04:00
Tom Lane	541ffa65c3	Prevent CREATE TABLE LIKE/INHERITS from (mis) copying whole-row Vars. If a CHECK constraint or index definition contained a whole-row Var (that is, "table.*"), an attempt to copy that definition via CREATE TABLE LIKE or table inheritance produced incorrect results: the copied Var still claimed to have the rowtype of the source table, rather than the created table. For the LIKE case, it seems reasonable to just throw error for this situation, since the point of LIKE is that the new table is not permanently coupled to the old, so there's no reason to assume its rowtype will stay compatible. In the inheritance case, we should ideally allow such constraints, but doing so will require nontrivial refactoring of CREATE TABLE processing (because we'd need to know the OID of the new table's rowtype before we adjust inherited CHECK constraints). In view of the lack of previous complaints, that doesn't seem worth the risk in a back-patched bug fix, so just make it throw error for the inheritance case as well. Along the way, replace change_varattnos_of_a_node() with a more robust function map_variable_attnos(), which is capable of being extended to handle insertion of ConvertRowtypeExpr whenever we get around to fixing the inheritance case nicely, and in the meantime it returns a failure indication to the caller so that a helpful message with some context can be thrown. Also, this code will do the right thing with subselects (if we ever allow them in CHECK or indexes), and it range-checks varattnos before using them to index into the map array. Per report from Sergey Konoplev. Back-patch to all supported branches.	2012-06-30 16:45:14 -04:00
Peter Eisentraut	e4ffa86b57	initdb: Update check_need_password for new options. Change things so that something like initdb --auth-local=peer --auth-host=md5 does not cause a "must specify a password" error, like initdb -A md5 does.	2012-06-30 23:42:32 +03:00
Heikki Linnakangas	567787f216	Validate xlog record header before enlarging the work area to store it. If the record header is garbled, we're now quite likely to notice it before we try to make a bogus memory allocation and run out of memory. That can still happen, if the xlog record is split across pages (we cannot verify the record header until reading the next page in that scenario), but this reduces the chances. An out-of-memory is treated as a corrupt record anyway, so this isn't a correctness issue, just a case of giving a better error message. Per Amit Kapila's suggestion.	2012-06-30 23:14:35 +03:00
Tom Lane	42e2ce6ae3	Fix confusion between "size" and "AnonymousShmemSize". Noted by Andres Freund. Also improve a couple of comments.	2012-06-29 15:12:10 -04:00
Heikki Linnakangas	7a5c9ca93a	Initialize shared memory copy of ckptXidEpoch correctly when not in recovery. This bug was introduced by commit `20d98ab6e4`, so backpatch this to 9.0-9.2 like that one. This fixes bug #6710, reported by Tarvi Pillessaar	2012-06-29 19:32:15 +03:00
Peter Eisentraut	b344c651fb	Make init-po and update-po recursive make targets. This is for convenience, now that adding recursive targets is much easier than it used to be when the NLS stuff was initially added.	2012-06-29 14:01:54 +03:00
Tom Lane	ae90128dc5	Fix NOTIFY to cope with I/O problems, such as out-of-disk-space. The LISTEN/NOTIFY subsystem got confused if SimpleLruZeroPage failed, which would typically happen as a result of a write() failure while attempting to dump a dirty pg_notify page out of memory. Subsequently, all attempts to send more NOTIFY messages would fail with messages like "Could not read from file "pg_notify/nnnn" at offset nnnnn: Success". Only restarting the server would clear this condition. Per reports from Kevin Grittner and Christoph Berg. Back-patch to 9.0, where the problem was introduced during the LISTEN/NOTIFY rewrite.	2012-06-29 00:51:34 -04:00
Tom Lane	c1494b7330	Provide MAP_FAILED if sys/mman.h doesn't. On old HPUX this has to be #defined to -1. It might be that other values are required on other dinosaur systems, but we'll worry about that when and if we get reports.	2012-06-28 14:19:20 -04:00
Heikki Linnakangas	8f85667a86	Update outdated comment; xlp_rem_len field is in page header now. Spotted by Amit Kapila	2012-06-28 20:35:18 +03:00
Peter Eisentraut	dcd5af6c34	Further fix install program detection. The $(or) make function was introduced in GNU make 3.81, so the previous coding didn't work in 3.80. Write it differently, and improve the variable naming to make more sense in the new coding.	2012-06-28 20:07:02 +03:00
Robert Haas	39715af23a	Fix broken mmap failure-detection code, and improve error message. Per an observation by Thom Brown that my previous commit made an overly large shmem allocation crash the server, on Linux.	2012-06-28 12:57:22 -04:00
Robert Haas	b0fc0df936	Dramatically reduce System V shared memory consumption. Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny amount of System V shared memory (as an interlock to protect the data directory) and allocate the rest as anonymous shared memory via mmap. This will hopefully spare most users the hassle of adjusting operating system parameters before being able to start PostgreSQL with a reasonable value for shared_buffers. There are a bunch of documentation updates needed here, and we might need to adjust some of the HINT messages related to shared memory as well. But it's not 100% clear how portable this is, so before we write the documentation, let's give it a spin on the buildfarm and see what turns red.	2012-06-28 11:05:16 -04:00
Robert Haas	c5b3451a8e	Add missing space in event_source GUC description. This has apparently been wrong since event_source was added. Alexander Lakhin	2012-06-28 08:15:50 -04:00
Tom Lane	bde689f809	Make UtilityContainsQuery recurse until it finds a non-utility Query. The callers of UtilityContainsQuery want it to return a non-utility Query if it returns anything at all. However, since we made CREATE TABLE AS/SELECT INTO into a utility command instead of a variant of SELECT, a command like "EXPLAIN SELECT INTO" results in two nested utility statements. So what we need UtilityContainsQuery to do is drill down to the bottom non-utility Query. I had thought of this possibility in setrefs.c, and fixed it there by looping around the UtilityContainsQuery call; but overlooked that the call sites in plancache.c have a similar issue. In those cases it's notationally inconvenient to provide an external loop, so let's redefine UtilityContainsQuery as recursing down to a non-utility Query instead. Noted by Rushabh Lathia. This is a somewhat cleaned-up version of his proposed patch.	2012-06-27 23:18:30 -04:00
Peter Eisentraut	f786715412	Fix install program detection. configure handles INSTALL as a substitution variable specially, and apparently it gets confused when it's set to empty. Use INSTALL_ instead as a workaround to avoid the issue.	2012-06-27 21:22:41 +03:00
Heikki Linnakangas	a8f97b39c7	Fix two more neglected comments, still referring to log/seg. Fujii Masao	2012-06-27 19:11:26 +03:00
Heikki Linnakangas	ec786c6c81	I neglected many comments in the log+seg -> 64-bit segno patch. Fix. Reported by Amit Kapila.	2012-06-27 17:53:53 +03:00
Peter Eisentraut	9db7ccae20	Use system install program when available and usable. In `a3176dac22` we switched to using install-sh unconditionally, because the configure check AC_PROG_INSTALL would pick up any random program named install, which has caused failure reports (http://archives.postgresql.org/pgsql-hackers/2001-03/msg00312.php). Now the configure check is much improved and should avoid false positives. It has also been shown that using a system install program can significantly reduce "make install" times, so it's worth trying.	2012-06-27 13:40:51 +03:00
Robert Haas	c60ca19de9	Allow pg_terminate_backend() to be used on backends with matching role. A similar change was made previously for pg_cancel_backend, so now it all matches again. Dan Farina, reviewed by Fujii Masao, Noah Misch, and Jeff Davis, with slight kibitzing on the doc changes by me.	2012-06-26 16:16:52 -04:00
Robert Haas	b79ab00144	When LWLOCK_STATS is defined, count spindelays. When LWLOCK_STATS is not defined, the only change is that SpinLockAcquire now returns the number of delays. Patch by me, review by Jeff Janes.	2012-06-26 16:06:07 -04:00
Tom Lane	757773602c	Cope with smaller-than-normal BLCKSZ setting in SPGiST indexes on text. The original coding failed miserably for BLCKSZ of 4K or less, as reported by Josh Kupershmidt. With the present design for text indexes, a given inner tuple could have up to 256 labels (requiring either 3K or 4K bytes depending on MAXALIGN), which means that we can't positively guarantee no failures for smaller blocksizes. But we can at least make it behave sanely so long as there are few enough labels to fit on a page. Considering that btree is also more prone to "index tuple too large" failures when BLCKSZ is small, it's not clear that we should expend more work than this on this case.	2012-06-26 14:36:25 -04:00
Robert Haas	0caa0d04db	Make DROP FUNCTION hint more informative. If you decide you want to take the hint, this gives you something you can paste right back to the server. Dean Rasheed	2012-06-26 13:33:23 -04:00
Robert Haas	76837c1507	Reduce use of heavyweight locking inside hash AM. Avoid using LockPage(rel, 0, lockmode) to protect against changes to the bucket mapping. Instead, an exclusive buffer content lock is now viewed as sufficient permission to modify the metapage, and a shared buffer content lock is used when such modifications need to be prevented. This more relaxed locking regimen makes it possible that, when we're busy getting a heavyweight lock on the bucket we intend to search or insert into, a bucket split might occur underneath us. To compensate for that possibility, we use a loop-and-retry system: release the metapage content lock, acquire the heavyweight lock on the target bucket, and then reacquire the metapage content lock and check that the bucket mapping has not changed. Normally it hasn't, and we're done. But if by chance it has, we simply unlock the metapage, release the heavyweight lock we acquired previously, lock the new bucket, and loop around again. Even in the worst case we cannot loop very many times here, since we don't split the same bucket again until we've split all the other buckets, and 2^N gets big pretty fast. This results in greatly improved concurrency, because we're effectively replacing two lwlock acquire-and-release cycles in exclusive mode (on one of the lock manager locks) with a single acquire-and-release cycle in shared mode (on the metapage buffer content lock). Testing shows that it's still not quite as good as btree; for that, we'd probably have to find some way of getting rid of the heavyweight bucket locks as well, which does not appear straightforward. Patch by me, review by Jeff Janes.	2012-06-26 06:56:10 -04:00
Heikki Linnakangas	038f3a0509	Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring. The xlogid + segno representation of a particular WAL segment doesn't make much sense in pg_resetxlog anymore, now that we don't use that anywhere else. Use the WAL filename instead, since that's a convenient way to name a particular WAL segment. I did this partially for pg_resetxlog in the original xlogid/segno -> uint64 patch, but I neglected pg_upgrade and the docs. This should now be more complete.	2012-06-26 07:49:02 +03:00
Tom Lane	8a504a3639	Make pg_dump emit more accurate dependency information. While pg_dump has included dependency information in archive-format output ever since 7.3, it never made any large effort to ensure that that information was actually useful. In particular, in common situations where dependency chains include objects that aren't separately emitted in the dump, the dependencies shown for objects that were emitted would reference the dump IDs of these un-dumped objects, leaving no clue about which other objects the visible objects indirectly depend on. So far, parallel pg_restore has managed to avoid tripping over this misfeature, but only by dint of some crude hacks like not trusting dependency information in the pre-data section of the archive. It seems prudent to do something about this before it rises up to bite us, so instead of emitting the "raw" dependencies of each dumped object, recursively search for its actual dependencies among the subset of objects that are being dumped. Back-patch to 9.2, since that code hasn't yet diverged materially from HEAD. At some point we might need to back-patch further, but right now there are no known cases where this is actively necessary. (The one known case, bug #6699, is fixed in a different way by my previous patch.) Since this patch depends on 9.2 changes that made TOC entries be marked before output commences as to whether they'll be dumped, back-patching further would require additional surgery; and as of now there's no evidence that it's worth the risk.	2012-06-25 21:21:18 -04:00
Tom Lane	a1ef01fe16	Improve pg_dump's dependency-sorting logic to enforce section dump order. As of 9.2, with the --section option, it is very important that the concept of "pre data", "data", and "post data" sections of the output be honored strictly; else a dump divided into separate sectional files might be unrestorable. However, the dependency-sorting logic knew nothing of sections and would happily select output orderings that didn't fit that structure. Doing so was mostly harmless before 9.2, but now we need to be sure it doesn't do that. To fix, create dummy objects representing the section boundaries and add dependencies between them and all the normal objects. (This might sound expensive but it seems to only add a percent or two to pg_dump's runtime.) This also fixes a problem introduced in 9.1 by the feature that allows incomplete GROUP BY lists when a primary key is given in GROUP BY. That means that views can depend on primary key constraints. Previously, pg_dump would deal with that by simply emitting the primary key constraint before the view definition (and hence before the data section of the output). That's bad enough for simple serial restores, where creating an index before the data is loaded works, but is undesirable for speed reasons. But it could lead to outright failure of parallel restores, as seen in bug #6699 from Joe Van Dyk. That happened because pg_restore would switch into parallel mode as soon as it reached the constraint, and then very possibly would try to emit the view definition before the primary key was committed (as a consequence of another bug that causes the view not to be correctly marked as depending on the constraint). Adding the section boundary constraints forces the dependency-sorting code to break the view into separate table and rule declarations, allowing the rule, and hence the primary key constraint it depends on, to revert to their intended location in the post-data section. This also somewhat accidentally works around the bogus-dependency-marking problem, because the rule will be correctly shown as depending on the constraint, so parallel pg_restore will now do the right thing. (We will fix the bogus-dependency problem for real in a separate patch, but that patch is not easily back-portable to 9.1, so the fact that this patch is enough to dodge the only known symptom is fortunate.) Back-patch to 9.1, except for the hunk that adds verification that the finished archive TOC list is in correct section order; the place where it was convenient to add that doesn't exist in 9.1.	2012-06-25 21:21:17 -04:00
Alvaro Herrera	77ed0c6950	Tighten up includes in sinvaladt.h, twophase.h, proc.h Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in proc.h with xlogdefs.h.	2012-06-25 18:40:40 -04:00
Peter Eisentraut	eeece9e609	Unify calling conventions for postgres/postmaster sub-main functions There was a wild mix of calling conventions: Some were declared to return void and didn't return, some returned an int exit code, some claimed to return an exit code, which the callers checked, but actually never returned, and so on. Now all of these functions are declared to return void and decorated with attribute noreturn and don't return. That's easiest, and most code already worked that way.	2012-06-25 21:30:12 +03:00
Robert Haas	c7d47abd04	Fix typo in DEBUG message, introduced by recent WAL refactoring. Fujii Masao	2012-06-25 14:00:35 -04:00
Robert Haas	a6427f1f47	Unbreak pg_resetxlog -l. Fujii Masao	2012-06-25 13:58:38 -04:00
Robert Haas	2dfa87bcb6	Remove sanity test in XRecOffIsValid. Commit `061e7efb1b` changed the rules for splitting xlog records across pages, but neglected to update this test. It's possible that there's some better action here than just removing the test completely, but this at least appears to get some of the things that are currently broken (like initdb on MacOS X) working again.	2012-06-25 12:14:43 -04:00
Kevin Grittner	5c7f954d31	Fix warning for 64-bit literal on 32-bit build.	2012-06-25 07:25:00 -05:00
Peter Eisentraut	b8b2e3b2de	Replace int2/int4 in C code with int16/int32 The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.	2012-06-25 01:51:46 +03:00
Heikki Linnakangas	7eb8c78514	I missed some references to xlogid/xrecoff in Win32-only code. Fix.	2012-06-24 22:14:31 +03:00
Heikki Linnakangas	0687a26002	Use UINT64CONST for 64-bit integer constants. Peter Eisentraut advised me that UINT64CONST is the proper way to do that, not LL suffix.	2012-06-24 21:56:45 +03:00
Heikki Linnakangas	a218e23a08	Oops. Remove stray paren. I didn't notice this on my laptop as I don't HAVE_FSYNC_WRITETHROUGH.	2012-06-24 20:03:57 +03:00
Heikki Linnakangas	96ff85e2dd	Use LL suffix for 64-bit constants. Per warning from buildfarm member 'locust'. At least I think this is what's making it upset.	2012-06-24 20:01:55 +03:00
Heikki Linnakangas	0ab9d1c4b3	Replace XLogRecPtr struct with a 64-bit integer. This simplifies code that needs to do arithmetic on XLogRecPtrs. To avoid changing on-disk format of data pages, the LSN on data pages is still stored in the old format. That should keep pg_upgrade happy. However, we have XLogRecPtrs embedded in the control file, and in the structs that are sent over the replication protocol, so this change breaks compatibility of pg_basebackup and server. I didn't do anything about this in this patch; per discussion on -hackers, the right thing to do would be to change the replication protocol to be architecture-independent, so that you could use a newer version of pg_receivexlog, for example, against an older server version.	2012-06-24 19:19:45 +03:00
Heikki Linnakangas	061e7efb1b	Allow WAL record header to be split across pages. This saves a few bytes of WAL space, but the real motivation is to make it predictable how much WAL space a record requires, as it no longer depends on whether we need to waste the last few bytes at end of WAL page because the header doesn't fit. The total length field of WAL record, xl_tot_len, is moved to the beginning of the WAL record header, so that it is still always found on the first page where a WAL record begins. Bump WAL version number again as this is an incompatible change.	2012-06-24 18:35:56 +03:00
Heikki Linnakangas	20ba5ca64c	Move WAL continuation record information to WAL page header. The continuation record only contained one field, xl_rem_len, so it makes things simpler to just include it in the WAL page header. This wastes four bytes on pages that don't begin with a continuation from previous page, plus four bytes on every page, because of padding. The motivation of this is to make it easier to calculate how much space a WAL record needs. Before this patch, it depended on how many page boundaries the record crosses. The motivation of that, in turn, is to separate the allocation of space in the WAL from the copying of the record data to the allocated space. Keeping the calculation of space required simple helps to keep the critical section of allocating the space from WAL short. But that's not included in this patch yet. Bump WAL version number again, as this is an incompatible change.	2012-06-24 18:35:30 +03:00
Heikki Linnakangas	dfda6ebaec	Don't waste the last segment of each 4GB logical log file. The comments claimed that wasting the last segment made it easier to do calculations with XLogRecPtrs, because you don't have problems representing last-byte-position-plus-1 that way. In my experience, however, it only made things more complicated, because there were two ways to represent the boundary at the beginning of a logical log file: as xlogid = n+1 and xrecoff = 0, or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were picky about which representation was used. Also, use a 64-bit segment number instead of the log/seg combination, to point to a certain WAL segment. We assume that all platforms have a working 64-bit integer type nowadays. This is an incompatible change in WAL format, so bumping WAL version number.	2012-06-24 18:35:29 +03:00
Tom Lane	d14241c2cf	Fix memory leak in ARRAY(SELECT ...) subqueries. Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which I think can only happen if the sub-select is embedded in a larger, correlated subquery) would leak memory for the duration of the query, due to not reclaiming the array generated in the previous execution. Per bug #6698 from Armando Miraglia. Diagnosis and fix idea by Heikki, patch itself by me. This has been like this all along, so back-patch to all supported versions.	2012-06-21 17:27:19 -04:00
Alvaro Herrera	68d0e3cbf9	Repair comment mangled by a pgindent run long ago	2012-06-21 15:37:05 -04:00
Heikki Linnakangas	eeb6f37d89	Add a small cache of locks owned by a resource owner in ResourceOwner. This speeds up reassigning locks to the parent owner, when the transaction holds a lot of locks, but only a few of them belong to the current resource owner. This particularly helps pg_dump when dumping a large number of objects. The cache can hold up to 15 locks in each resource owner. After that, the cache is marked as overflowed, and we fall back to the old method of scanning the whole local lock table. The tradeoff here is that the cache has to be scanned whenever a lock is released, so if the cache is too large, lock release becomes more expensive. 15 seems enough to cover pg_dump, and doesn't have much impact on lock release. Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.	2012-06-21 15:30:26 +03:00
Tom Lane	dfd9c116cc	Remove incomplete/incorrect support for zero-column foreign keys. The original coding in ri_triggers.c had partial support for the concept of zero-column foreign key constraints. But this is not defined in the SQL standard, nor was it ever allowed by any other part of Postgres, nor was it very fully implemented even here (eg there was no support for preventing PK-table deletions that would violate the constraint). Doesn't seem very useful to carry 100-plus lines of code for a corner case that no one is interested in making work. Instead, just add a check that the column list read from pg_constraint is non-empty.	2012-06-20 20:15:02 -04:00
Tom Lane	0ce4459a36	Increase MAX_SYSCACHE_CALLBACKS from 20 to 32. By my count there are 18 callers of CacheRegisterSyscacheCallback in the core code in HEAD, so we are potentially leaving as few as 2 slots for any add-on code to use (though possibly not all these callers would actually activate in any particular session). That doesn't seem like a lot of headroom, so let's pump it up a little.	2012-06-20 19:47:37 -04:00
Tom Lane	45ba424f33	Cache the results of ri_FetchConstraintInfo in a backend-local cache. Extracting data from pg_constraint turned out to take as much as 10% of the runtime in a bulk-update case where the foreign key column wasn't changing, because we did it over again for each tuple. Fix that by maintaining a backend-local cache of the results. This is really a pretty small patch, but converting the trigger functions to work with pointers rather than local struct variables requires a lot of mechanical changes.	2012-06-20 17:24:14 -04:00
Tom Lane	cfa0f4255b	Improve tests for whether we can skip queueing RI enforcement triggers. During an update of a PK row, we can skip firing the RI trigger if any old key value is NULL, because then the row could not have had any matching rows in the FK table. Conversely, during an update of an FK row, the outcome is determined if any new key value is NULL. In either case it becomes unnecessary to compare individual key values. This patch was inspired by discussion of Vik Reykja's patch to use IS NOT DISTINCT semantics for the key comparisons. In the event there is no need for that and so this patch looks nothing like his, but he should still get credit for having re-opened consideration of the trigger skip logic.	2012-06-19 20:07:33 -04:00
Alvaro Herrera	11b335ac4c	pg_dump: Fix verbosity level in LO progress messages In passing, reword another instance of the same message that was gratuitously different. Author: Josh Kupershmidt after a bug report by Bosco Rama	2012-06-19 17:20:23 -04:00
Tom Lane	fe3db74002	Share RI trigger code between NO ACTION and RESTRICT cases. These triggers are identical except for whether ri_Check_Pk_Match is to be called, so factor out the common code to save a couple hundred lines. Also, eliminate null-column checks in ri_Check_Pk_Match, since they're duplicate with the calling functions and require unnecessary complication in its API statement. Simplify the way code is shared between RI_FKey_check_ins and RI_FKey_check_upd, too.	2012-06-19 14:31:54 -04:00
Tom Lane	48756be9cf	Improve comments about why SET DEFAULT triggers must recheck for matches. I was confused about this, so try to make it clearer for the next person. (This seems like a fairly inefficient way of dealing with a corner case, but I don't have a better idea offhand. Maybe if there were a way to turn off the RI_FKey_keyequal_upd_fk event filter temporarily?)	2012-06-18 22:45:07 -04:00
Tom Lane	e8c9fd5fdf	Allow ON UPDATE/DELETE SET DEFAULT plans to be cached. Once upon a time, somebody was worried that cached RI plans wouldn't get remade with new default values after ALTER TABLE ... SET DEFAULT, so they didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions. That time is long gone, though (and even at the time I doubt this was the greatest hazard posed by ALTER TABLE...). So allow these triggers to cache their plans just like the others. The cache_plan argument to ri_PlanCheck is now vestigial, since there are no callers that don't pass "true"; but I left it alone in case there is any future need for it.	2012-06-18 19:37:23 -04:00
Tom Lane	03a5ba24b0	Remove derived fields from RI_QueryKey, and do a bit of other cleanup. We really only need the foreign key constraint's OID and the query type code to uniquely identify each plan we are caching for FK checks. The other stuff that was in the struct had no business being used as part of a hash key, and was all just being copied from struct RI_ConstraintInfo anyway. Get rid of the unnecessary fields, and readjust various function APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source. I'd be surprised if this makes any measurable performance difference, but it certainly feels cleaner.	2012-06-18 18:50:29 -04:00
Peter Eisentraut	e1e97e9313	pg_dump: Add missing newlines at end of messages	2012-06-18 23:57:00 +03:00
Tom Lane	f9429746c9	Update SQL spec references in ri_triggers code to match SQL:2008. Now that what we're implementing isn't SQL92, we probably shouldn't cite chapter and verse in that spec anymore. Also fix some comments that talked about MATCH FULL but in fact were in code that's also used for MATCH SIMPLE. No code changes in this commit, just comments.	2012-06-18 12:19:38 -04:00
Tom Lane	c75be2ad60	Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec. Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for a multicolumn MATCH SIMPLE foreign key constraint, we would set only those referencing columns corresponding to referenced columns that were changed. This is what the SQL92 standard said to do --- but more recent versions of the standard say that all referencing columns should be set to null or their default values, no matter exactly which referenced columns changed. At least for SET DEFAULT, that is clearly saner behavior. It's somewhat debatable whether it's an improvement for SET NULL, but it appears that other RDBMS systems read the spec this way. So let's do it like that. This is a release-notable behavioral change, although considering that our documentation already implied it was done this way, the lack of complaints suggests few people use such cases.	2012-06-18 12:12:52 -04:00
Tom Lane	f5297bdfe4	Refer to the default foreign key match style as MATCH SIMPLE internally. Previously we followed the SQL92 wording, "MATCH <unspecified>", but since SQL99 there's been a less awkward way to refer to the default style. In addition to the code changes, pg_constraint.confmatchtype now stores this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED). This doesn't affect pg_dump or psql because they use pg_get_constraintdef() to reconstruct foreign key definitions. But other client-side code might examine that column directly, so this change will have to be marked as an incompatibility in the 9.3 release notes.	2012-06-17 20:16:44 -04:00
Peter Eisentraut	bb7520cc26	Make documentation of --help and --version options more consistent Before, some places didn't document the short options (-? and -V), some documented both, some documented nothing, and they were listed in various orders. Now this is hopefully more consistent and complete.	2012-06-18 02:46:59 +03:00
Tom Lane	9e18eacbdf	Fix stats collector to recover nicely when system clock goes backwards. Formerly, if the system clock went backwards, the stats collector would fail to update the stats file any more until the clock reading again exceeds whatever timestamp was last written into the stats file. Such glitches in the clock's behavior are not terribly unlikely on machines not using NTP. Such a scenario has been observed to cause regression test failures in the buildfarm, and it could have bad effects on the behavior of autovacuum, so it seems prudent to install some defenses. We could directly detect the clock going backwards by adding GetCurrentTimestamp calls in the stats collector's main loop, but that would hurt performance on platforms where GetCurrentTimestamp is expensive. To minimize the performance hit in normal cases, adopt a more complicated scheme wherein backends check for clock skew when reading the stats file, and if they see it, signal the stats collector by sending an extra stats inquiry message. The stats collector does an extra GetCurrentTimestamp only when it receives an inquiry with an apparently out-of-order timestamp. To avoid unnecessary GetCurrentTimestamp calls, expand the inquiry messages to carry the backend's current clock reading as well as its stats cutoff time. The latter, being intentionally slightly in-the-past, would trigger more clock rechecks than we need if it were used for this purpose. We might want to backpatch this change at some point, but let's let it shake out in the buildfarm for awhile first.	2012-06-17 17:11:49 -04:00
Bruce Momjian	47463a8098	Remove 'for' loop perltidy argument, and move args to perltidyrc file. Backpatch to 9.2. Per suggestion from Noah Misch	2012-06-16 10:12:50 -04:00
Bruce Momjian	0acd978259	In pgindent, suppress reading the perltidy RC file using --noprofile.	2012-06-15 22:50:02 -04:00
Bruce Momjian	d6e0207437	Update pgindent Perl indentation instructions based on feedback from Álvaro and Noah Misch. Backpatch to 9.2.	2012-06-15 22:43:23 -04:00
Peter Eisentraut	15b1918e7d	Improve reporting of permission errors for array types Because permissions are assigned to element types, not array types, complaining about permission denied on an array type would be misleading to users. So adjust the reporting to refer to the element type instead. In order not to duplicate the required logic in two dozen places, refactor the permission denied reporting for types a bit. pointed out by Yeb Havinga during the review of the type privilege feature	2012-06-15 22:55:03 +03:00
Peter Eisentraut	d933092e0a	Add more message pluralization Even though we can't do much about the case with multiple plurals in one sentence, we can fix the other cases.	2012-06-15 02:02:02 +03:00
Robert Haas	8507c2f856	Improve readability and error messages in pg_backup_start_time. Gurjeet Singh, with corrections by me.	2012-06-14 15:20:08 -04:00
Robert Haas	68de499bda	New SQL functions pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me.	2012-06-14 13:25:43 -04:00
Robert Haas	cd80073445	During transaction cleanup, release locks before deleting files. There's no need to hold onto the locks until the files are needed, and by doing it this way, we reduce the impact on other backends who may be awaiting locks we hold. Noah Misch	2012-06-14 10:19:33 -04:00
Robert Haas	6cd015bea3	Add new function log_newpage_buffer. When I implemented the ginbuildempty() function as part of implementing unlogged tables, I falsified the note in the header comment for log_newpage. Although we could fix that up by changing the comment, it seems cleaner to add a new function which is specifically intended to handle this case. So do that.	2012-06-14 10:11:16 -04:00
Robert Haas	a475c60367	Remove misplaced sanity check from heap_create(). Even when allow_system_table_mods is not set, we allow creation of any type of SQL object in pg_catalog, except for relations. And you can get relations into pg_catalog, too, by initially creating them in some other schema and then moving them with ALTER .. SET SCHEMA. So this restriction, which prevents relations (only) from being created in pg_catalog directly, is fairly pointless. If we need a safety mechanism for this, it should be placed further upstream, so that it affects all SQL objects uniformly, and picks up both CREATE and SET SCHEMA. For now, just rip it out, per discussion with Tom Lane.	2012-06-14 09:58:53 -04:00
Robert Haas	d2c86a1ccd	Remove RELKIND_UNCATALOGED. This may have been important at some point in the past, but it no longer does anything useful. Review by Tom Lane.	2012-06-14 09:47:30 -04:00
Robert Haas	7582e0be78	Make \conninfo print SSL information. Alastair Turner, per suggestion from Bruce Momjian.	2012-06-14 09:43:14 -04:00
Tom Lane	80491a1983	Add 9.2 branch to git_changelog's list.	2012-06-13 22:23:31 -04:00
Tom Lane	f32609db72	Flesh out RELEASE_CHANGES instructions for branching in git. We have this info in the wiki, but it should be here too.	2012-06-13 22:11:06 -04:00
Tom Lane	357c549334	Stamp library minor versions for 9.3. This includes fixing the MSVC copy of ecpg/preproc's version info, which seems to have been overlooked repeatedly. Can't we fix that so there are not two copies??	2012-06-13 22:06:26 -04:00
Tom Lane	bed88fceac	Stamp HEAD as 9.3devel. Let the hacking begin ...	2012-06-13 20:03:02 -04:00
Tom Lane	80edfd7659	Revisit error message details for JSON input parsing. Instead of identifying error locations only by line number (which could be entirely unhelpful with long input lines), provide a fragment of the input text too, placing this info in a new CONTEXT entry. Make the error detail messages conform more closely to style guidelines, fix failure to expose some of them for translation, ensure compiler can check formats against supplied parameters.	2012-06-13 19:43:35 -04:00
Tom Lane	b8b69d8990	Revert "Reduce checkpoints and WAL traffic on low activity database server" This reverts commit `18fb9d8d21`. Per discussion, it does not seem like a good idea to allow committed changes to go un-checkpointed indefinitely, as could happen in a low-traffic server; that makes us entirely reliant on the WAL stream with no redundancy that might aid data recovery in case of disk failure. This re-introduces the original problem of hot-standby setups generating a small continuing stream of WAL traffic even when idle, but there are other ways to address that without compromising crash recovery, so we'll revisit that issue in a future release cycle.	2012-06-13 18:48:44 -04:00
Tom Lane	c3bc76bdb0	Deprecate use of GLOBAL and LOCAL in temp table creation. Aside from adjusting the documentation to say that these are deprecated, we now report a warning (not an error) for use of GLOBAL, since it seems fairly likely that we might change that to request SQL-spec-compliant temp table behavior in the foreseeable future. Although our handling of LOCAL is equally nonstandard, there is no evident interest in ever implementing SQL modules, and furthermore some other products interpret LOCAL as behaving the same way we do. So no expectation of change and no warning for LOCAL; but it still seems a good idea to deprecate writing it. Noah Misch	2012-06-13 17:48:42 -04:00
Tom Lane	93f4d7f806	Support Linux's oom_score_adj API as well as the older oom_adj API. The simplest way to handle this is just to copy-and-paste the relevant code block in fork_process.c, so that's what I did. (It's possible that something more complicated would be useful to packagers who want to work with either the old or the new API; but at this point the number of such people is rapidly approaching zero, so let's just get the minimal thing done.) Update relevant documentation as well.	2012-06-13 15:35:52 -04:00
Peter Eisentraut	c0a6f9c84b	Improve documentation of postgres -C option Clarify help (s/return/print/), and explain that this option is for use by other programs, not for user-facing use (it does not print units).	2012-06-13 13:41:25 +03:00
Tom Lane	f871ef74a5	Minor code review for json.c. Improve commenting, conform to project style for use of ++ etc. No functional changes.	2012-06-12 16:23:45 -04:00
Robert Haas	36b7e3da17	Mark JSON error detail messages for translation. Per gripe from Tom Lane.	2012-06-12 10:41:38 -04:00
Tom Lane	51e61b04f8	Ensure pg_ctl behaves sanely when data directory is not specified. Commit `aaa6e1def2` introduced multiple hazards in the case where pg_ctl is executed with neither a -D switch nor any PGDATA environment variable. It would dump core on machines which are unforgiving about printf("%s", NULL), or failing that possibly give a rather unhelpful complaint about being unable to execute "postgres -C", rather than the logically prior complaint about not being told where the data directory is. Edmund Horner's report suggests that there is another, Windows-specific hazard here, but I'm not the person to fix that; it would in any case only be significant when trying to use a config-only PGDATA pointer.	2012-06-11 22:47:16 -04:00
Tom Lane	bf0945e863	Fix pg_dump output to a named tar-file archive. "pg_dump -Ft -f filename ..." got broken by my recent commit `4317e0246c`, which I fear I only tested in the output-to-stdout variant. Report and fix by Muhammad Asif Naeem.	2012-06-11 21:55:48 -04:00
Peter Eisentraut	7d754961f7	pg_receivexlog: Rename option --dir to --directory getopt_long() allows abbreviating long options, so we might as well give the option the full name, and users can abbreviate it how they like. Do some general polishing of the --help output at the same time.	2012-06-12 00:55:27 +03:00
Magnus Hagander	3595a71e9c	Prevent non-streaming replication connections from being selected sync slave This prevents a pg_basebackup backup session that just does a base backup (no xlog involved at all) from becoming the synchronous slave and thus blocking all access while it runs. Also fixes the problem when a higher priority slave shows up it would become the sync standby before it has reached the STREAMING state, by making sure we can only switch to a walsender that's actually STREAMING. Fujii Masao	2012-06-11 15:17:38 +02:00
Magnus Hagander	9af34cdec8	Revert behaviour of -x/--xlog to 9.1 semantics. To replace it, add -X/--xlog-method that allows the specification of fetch or stream. Do this to avoid unnecessary backwards-incompatibility. Spotted and suggested by Peter Eisentraut.	2012-06-11 14:58:35 +02:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Bruce Momjian	60801944fa	Update pgindent install instructions and update typedef list.	2012-06-10 15:15:31 -04:00
Magnus Hagander	a0b4c5a20a	Fix pg_basebackup/pg_receivexlog for floating point timestamps Since the replication protocol deals with TimestampTz, we need to care for the floating point case as well in the frontend tools. Fujii Masao, with changes from Magnus Hagander	2012-06-10 12:12:36 +02:00
Magnus Hagander	7c1abc00fa	Error message capitalization fix	2012-06-10 12:02:52 +02:00
Peter Eisentraut	8570114dc1	Make include files work without having to include other ones first	2012-06-10 12:46:14 +03:00
Simon Riggs	28ac797287	Revert error message on GLOBAL/LOCAL pending further discussion	2012-06-10 08:41:01 +01:00
Simon Riggs	72335a2015	Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented	2012-06-09 16:35:26 +01:00
Simon Riggs	3725570539	Fix bug in early startup of Hot Standby with subtransactions. When HS startup is deferred because of overflowed subtransactions, ensure that we re-initialize KnownAssignedXids for when both existing and incoming snapshots have non-zero qualifying xids. Fixes bug #6661 reported by Valentine Gogichashvili. Analysis and fix by Andres Freund	2012-06-08 17:34:04 +01:00
Robert Haas	3b5548a3d5	When using libpq URI syntax, error out on invalid parameter names. Dan Farina	2012-06-08 08:47:24 -04:00
Tom Lane	ece01aae47	Scan the buffer pool just once, not once per fork, during relation drop. This provides a speedup of about 4X when NBuffers is large enough. There is also a useful reduction in sinval traffic, since we only do CacheInvalidateSmgr() once not once per fork. Simon Riggs, reviewed and somewhat revised by Tom Lane	2012-06-07 17:43:11 -04:00
Peter Eisentraut	5d0109bd27	Message style improvements	2012-06-07 23:54:59 +03:00
Tom Lane	e8d029a30b	Do unlocked prechecks in bufmgr.c loops that scan the whole buffer pool. DropRelFileNodeBuffers, DropDatabaseBuffers, FlushRelationBuffers, and FlushDatabaseBuffers have to scan the whole shared_buffers pool because we have no index structure that would find the target buffers any more efficiently than that. This gets expensive with large NBuffers. We can shave some cycles from these loops by prechecking to see if the current buffer is interesting before we acquire the buffer header lock. Ordinarily such a test would be unsafe, but in these cases it should be safe because we are already assuming that the caller holds a lock that prevents any new target pages from being loaded into the buffer pool concurrently. Therefore, no buffer tag should be changing to a value of interest, only away from a value of interest. So a false negative match is impossible, while a false positive is safe because we'll recheck after acquiring the buffer lock. Initial testing says that this speeds these loops by a factor of 2X to 3X on common Intel hardware. Patch for DropRelFileNodeBuffers by Jeff Janes (based on an idea of Heikki's); extended to the remaining sequential scans by Tom Lane	2012-06-07 16:46:26 -04:00
Simon Riggs	2c8a4e9be2	Wake WALSender to reduce data loss at failover for async commit. WALSender is now woken up after each background flush by WALwriter, avoiding multi-second replication delay for an all-async commit workload. Replication delay reduced from 7s with default settings to 200ms and often much less, allowing significantly reduced data loss at failover. Andres Freund and Simon Riggs	2012-06-07 19:22:47 +01:00
Robert Haas	b50991eedb	Fix more crash-safe visibility map bugs, and improve comments. In lazy_scan_heap, we could issue bogus warnings about incorrect information in the visibility map, because we checked the visibility map bit before locking the heap page, creating a race condition. Fix by rechecking the visibility map bit before we complain. Rejigger some related logic so that we rely on the possibly-outdated all_visible_according_to_vm value as little as possible. In heap_multi_insert, it's not safe to clear the visibility map bit before beginning the critical section. The visibility map is not crash-safe unless we treat clearing the bit as a critical operation. Specifically, if the transaction were to error out after we set the bit and before entering the critical section, we could end up writing the heap page to disk (with the bit cleared) and crashing before the visibility map page made it to disk. That would be bad. heap_insert has this correct, but somehow the order of operations got rearranged when heap_multi_insert was added. Also, add some more comments to visibilitymap_test, lazy_scan_heap, and IndexOnlyNext, expounding on concurrency issues. Per extensive code review by Andres Freund, and further review by Tom Lane, who also made the original report about the bogus warnings.	2012-06-07 12:48:13 -04:00
Magnus Hagander	92135ea0ed	Use strerror(errno) instead of %m Found by Fujii Masao	2012-06-05 15:52:08 +02:00
Tom Lane	3dd8e59681	Fix bogus handling of control characters in json_lex_string(). The original coding misbehaved if "char" is signed, and also made the extremely poor decision to print control characters literally when trying to complain about them. Report and patch by Shigeru Hanada. In passing, also fix core dump risk in report_parse_error() should the parse state be something other than what it expects.	2012-06-04 20:43:57 -04:00
Tom Lane	d73b7f973d	Fix memory leaks in failure paths in buildACLCommands and parseAclItem. This is currently only cosmetic, since all the call sites just curl up and die in event of a failure return. It might be important for some future use-case, though, and in any case it quiets warnings from the clang static analyzer (as reported by Anna Zaks). Josh Kupershmidt	2012-06-03 11:52:52 -04:00
Simon Riggs	d3abbbebe5	Avoid early reuse of btree pages, causing incorrect query results. When we allowed read-only transactions to skip assigning XIDs we introduced the possibility that a fully deleted btree page could be reused. This broke the index link sequence which could then lead to indexscans silently returning fewer rows than would have been correct. The actual incidence of silent errors from this is thought to be very low because of the exact workload required and locking pre-conditions. Fix is to remove pages only if index page opaque->btpo.xact precedes RecentGlobalXmin. Noah Misch, reviewed by Simon Riggs	2012-06-01 12:21:45 +01:00
Simon Riggs	055c352abb	After any checkpoint, close all smgr files handles in bgwriter	2012-06-01 09:24:53 +01:00
Simon Riggs	a297d64d92	Checkpointer starts before bgwriter to avoid missing fsync requests. Noted while testing Hot Standby startup.	2012-06-01 08:25:17 +01:00
Simon Riggs	1ec6a2bbc9	Provide interim statistics while in mid-checkpoint. Re-implements similar functionality in 9.1 and previously which was removed during split of checkpointer and bgwriter. Requested/spotted by Magnus Hagander	2012-06-01 08:19:06 +01:00
Tom Lane	4bec93ac0f	Stamp 9.2beta2.	2012-05-31 19:16:55 -04:00
Tom Lane	a04dc87db1	Improve comment for GetStableLatestTransactionId().	2012-05-31 11:20:02 -04:00
Simon Riggs	a2b516dab9	Only throw recovery conflicts when InHotStandby. Bug fix to recent patch to allow Index Only Scans on Hot Standby. Bug report from Jaime Casanova	2012-05-31 13:11:47 +01:00
Tom Lane	c8105e62bb	Update time zone data files to tzdata release 2012c. DST law changes in Antarctica, Armenia, Chile, Cuba, Falkland Islands, Gaza, Haiti, Hebron, Morocco, Syria, Tokelau Islands. Historical corrections for Canada.	2012-05-31 00:47:57 -04:00
Tom Lane	ad0009e7be	Force PL and range-type support functions to be owned by a superuser. We allow non-superusers to create procedural languages (with restrictions) and range datatypes. Previously, the automatically-created support functions for these objects ended up owned by the creating user. This represents a rather considerable security hazard, because the owning user might be able to alter a support function's definition in such a way as to crash the server, inject trojan-horse SQL code, or even execute arbitrary C code directly. It appears that right now the only actually exploitable problem is the infinite-recursion bug fixed in the previous patch for CVE-2012-2655. However, it's not hard to imagine that future additions of more ALTER FUNCTION capability might unintentionally open up new hazards. To forestall future problems, cause these support functions to be owned by the bootstrap superuser, not the user creating the parent object.	2012-05-30 23:47:57 -04:00
Tom Lane	33c6eaf78e	Ignore SECURITY DEFINER and SET attributes for a PL's call handler. It's not very sensible to set such attributes on a handler function; but if one were to do so, fmgr.c went into infinite recursion because it would call fmgr_security_definer instead of the handler function proper. There is no way for fmgr_security_definer to know that it ought to call the handler and not the original function referenced by the FmgrInfo's fn_oid, so it tries to do the latter, causing the whole process to start over again. Ordinarily such misconfiguration of a procedural language's handler could be written off as superuser error. However, because we allow non-superuser database owners to create procedural languages and the handler for such a language becomes owned by the database owner, it is possible for a database owner to crash the backend, which ideally shouldn't be possible without superuser privileges. In 9.2 and up we will adjust things so that the handler functions are always owned by superusers, but in existing branches this is a minor security fix. Problem noted by Noah Misch (after several of us had failed to detect it :-(). This is CVE-2012-2655.	2012-05-30 23:27:57 -04:00
Tom Lane	cd0ff9c0f4	Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich. We used to only allow offsets less than +/-13 hours, then it was +/-14, then it was +/-15. That's still not good enough though, as per today's bug report from Patric Bechtel. This time I actually looked through the Olson timezone database to find the largest offsets used anywhere. The winners are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at +15:13:42 until 1867. So we'd better allow offsets less than +/-16 hours. Given the history, we are way overdue to have some greppable #define symbols controlling this, so make some ... and also remove an obsolete comment that didn't get fixed the last time. Back-patch to all supported branches.	2012-05-30 19:58:35 -04:00
Robert Haas	07ab1383e3	Fix two more bugs in fast-path relation locking. First, the previous code failed to account for the fact that, during Hot Standby operation, the startup process takes AccessExclusiveLocks on relations without setting MyDatabaseId. This resulted in fast path strong lock counts failing to be incremented when the startup process took locks, which in turn allowed conflicting lock requests to succeed when they should not have. Report by Erik Rijkers, diagnosis by Heikki Linnakangas. Second, LockReleaseAll() failed to honor the allLocks and lockmethodid restrictions with respect to fast-path locks. It's not clear to me whether this produces any user-visible breakage at the moment, but it's certainly wrong. Rearrange order of operations in LockReleaseAll to fix. Noted by Tom Lane.	2012-05-30 16:17:46 -04:00
Heikki Linnakangas	d1996ed5e8	Change the way parent pages are tracked during buffered GiST build. We used to mimic the way a stack is constructed when descending the tree during normal GiST inserts, but that was quite complicated during a buffered build. It was also wrong: in GiST, the left-to-right relationships on different levels might not match each other, so that when you know the parent of a child page, you won't necessarily find the parent of the page to the right of the child page by following the rightlinks at the parent level. This sometimes led to "could not re-find parent" errors while building a GiST index. We now use a simple hash table to track the parent of every internal page. Whenever a page is split, and downlinks are moved from one page to another, we update the hash table accordingly. This is also better for performance than the old method, as we never need to move right to re-find the parent page, which could take a significant amount of time for buffers that were created much earlier in the index build.	2012-05-30 12:05:57 +03:00
Heikki Linnakangas	be02b16826	Delete the temporary file used in buffered GiST build, after the build. There were two bugs here: We forgot to call gistFreeBuildBuffers() function at the end of build, and we passed interXact == true to BufFileCreateTemp, so the file wasn't automatically cleaned up at end-of-transaction either.	2012-05-30 12:05:57 +03:00
Tom Lane	4317e0246c	Rewrite --section option to decouple it from --schema-only/--data-only. The initial implementation of pg_dump's --section option supposed that the existing --schema-only and --data-only options could be made equivalent to --section settings. This is wrong, though, due to dubious but long since set-in-stone decisions about where to dump SEQUENCE SET items, as seen in bug report from Martin Pitt. (And I'm not totally convinced there weren't other bugs, either.) Undo that coupling and instead drive --section filtering off current-section state tracked as we scan through the TOC list to call _tocEntryRequired(). To make sure those decisions don't shift around and hopefully save a few cycles, run _tocEntryRequired() only once per TOC entry and save the result in a new TOC field. This required minor rejiggering of ACL handling but also allows a far cleaner implementation of inhibit_data_for_failed_table. Also, to ensure that pg_dump and pg_restore have the same behavior with respect to the --section switches, add _tocEntryRequired() filtering to WriteToc() and WriteDataChunks(), rather than trying to implement section filtering in an entirely orthogonal way in dumpDumpableObject(). This required adjusting the handling of the special ENCODING and STDSTRINGS items, but they were pretty weird before anyway. Minor other code review for the patch, too.	2012-05-29 23:22:14 -04:00
Heikki Linnakangas	4bc6fb57f7	Fix integer overflow bug in GiST buffering build calculations. The result of (maintenance_work_mem * 1024) / BLCKSZ doesn't fit in a signed 32-bit integer, if maintenance_work_mem >= 2GB. Use double instead. And while we're at it, write the calculations in an easier to understand form, with the intermediary steps written out and commented.	2012-05-29 22:27:42 +03:00
Tom Lane	2755abf386	Teach AbortOutOfAnyTransaction to clean up partially-started transactions. AbortOutOfAnyTransaction failed to do anything if the state it saw on entry corresponded to failing partway through StartTransaction. I fixed AbortCurrentTransaction to cope with that case way back in commit `60b2444cc3`, but evidently overlooked that AbortOutOfAnyTransaction should do likewise. Back-patch to all supported branches. It's not clear that this omission has any more-than-cosmetic consequences, but it's also not clear that it doesn't, so back-patching seems the least risky choice.	2012-05-28 23:57:06 -04:00
Tom Lane	c89bdf7690	Eliminate some more O(N^2) behaviors in pg_dump/pg_restore. This patch fixes three places (which AFAICT is all of them) where runtime was O(N^2) in the number of TOC entries, by using an index array to replace linear searches of the TOC list. This performance issue is a bit less bad than those recently fixed, because it depends on the number of items dumped not the number in the source database, so the problem can be dodged by doing partial dumps. The previous coding already had an instance of one of the two index arrays needed, but it was only calculated in parallel-restore cases; now we need it all the time. I also chose to move the arrays into the ArchiveHandle data structure, to make this code a bit more ready for the day that we try to sling multiple ArchiveHandles around in pg_dump or pg_restore. Since we still need some server-side work before pg_dump can really cope nicely with tens of thousands of tables, there's probably little point in back-patching.	2012-05-28 20:38:28 -04:00
Peter Eisentraut	2d612abd4d	libpq: URI parsing fixes. Drop special handling of host component with slashes to mean Unix-domain socket. Specify it as separate parameter or using percent-encoding now. Allow omitting username, password, and port even if the corresponding designators are present in URI. Handle percent-encoding in query parameter keywords. Alex Shulgin; some documentation improvements by myself.	2012-05-28 22:44:34 +03:00
Peter Eisentraut	388d251679	Update SQL features list Set E081 Basic Privileges to supported, since by the letter of it, we support it, even though not all possible forms of USAGE privileges are implemented.	2012-05-27 23:34:16 +03:00
Peter Eisentraut	8e497c731b	psql: Remove notice about readline from --version output This was from a time when readline support wasn't standard. And it doesn't help analyzing current line editing library problems.	2012-05-27 22:48:20 +03:00
Peter Eisentraut	27314d32a8	Suppress -Wunused-result warning about write() This is related to `aa90e148ca`, but this code is only used under -DLINUX_OOM_ADJ, so it was apparently overlooked then.	2012-05-27 22:35:01 +03:00
Peter Eisentraut	a8b92b6090	PL/Perl: Avoid compiler warning from clang Use SvREFCNT_inc_simple_void() instead of SvREFCNT_inc() to avoid warning about unused return value.	2012-05-27 22:30:34 +03:00
Magnus Hagander	16282ae688	Make pg_receivexlog by default loop on connection failures. Avoids the need for an external script in the most common scenario. Behavior can be overridden using the -n/--noloop commandline parameter.	2012-05-27 11:05:24 +02:00
Tom Lane	532fe28dad	Prevent synchronized scanning when systable_beginscan chooses a heapscan. The only interesting-for-performance case wherein we force heapscan here is when we're rebuilding the relcache init file, and the only such case that is likely to be examining a catalog big enough to be syncscanned is RelationBuildTupleDesc. But the early-exit optimization in that code gets broken if we start the scan at a random place within the catalog, so that allowing syncscan is actually a big deoptimization if pg_attribute is large (at least for the normal case where the rows for core system catalogs have never been changed since initdb). Hence, prevent syncscan here. Per my testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane, though neither of them seem to have actually hit this specific problem. Back-patch to 8.3, where syncscan was introduced.	2012-05-26 19:09:52 -04:00
Tom Lane	d3b97d1488	Fix string truncation to be multibyte-aware in text_name and bpchar_name. Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane	2012-05-25 17:34:51 -04:00
Tom Lane	73cc7d3b24	Use binary search instead of brute-force scan in findNamespace(). The previous coding presented a significant bottleneck when dumping databases containing many thousands of schemas, since the total time spent searching would increase roughly as O(N^2) in the number of objects. Noted by Jeff Janes, though I rewrote his proposed patch to use the existing findObjectByOid infrastructure. Since this is a longstanding performance bug, backpatch to all supported versions.	2012-05-25 14:35:37 -04:00
Magnus Hagander	31d965819b	Fix base backup streaming xlog from standby. When backing up from a standby server, the backup process will not automatically switch xlog segment. So we must accept a partially transferred xlog file in this case, but rename it into position anyway. In passing, merge the two callbacks for segment end and stop stream into a single callback, since their implementations were close to identical, and rename this callback to reflect that it stops streaming rather than continues it. Patch by Magnus Hagander, review by Fujii Masao	2012-05-25 11:36:22 +02:00
Tom Lane	2a4c46e0ba	Fix array overrun in regex code. zaptreesubs() was coded to unconditionally reset a capture subre's corresponding pmatch[] entry. However, in regexes without backrefs, that array is caller-supplied and might not have as many entries as the regex has capturing parens. So check the array length and do nothing if there is no corresponding entry, much as subset() does. Failure to check this resulted in a stack clobber in the case reported by Marko Kreen. This bug appears to have been latent in the regex library from the beginning. It was not exposed because find() called dissect() not cdissect(), and the dissect() code path didn't ever call zaptreesubs() (formerly zapmem()). When I unified dissect() and cdissect() in commit `4dd78bf37a`, the problem was exposed. Now that I've seen this, I'm rather suspicious that we might need to back-patch it; but will refrain for now, for lack of evidence that the case can be hit in the previous coding.	2012-05-24 13:56:16 -04:00
Magnus Hagander	77f93cb32d	Add missing PQfinish() calls Fujii Masao	2012-05-23 21:52:23 +02:00
Tom Lane	ed962fd712	Ensure that seqscans check for interrupts at least once per page. If a seqscan encounters many consecutive pages containing only dead tuples, it can remain in the loop in heapgettup for a long time, and there was no CHECK_FOR_INTERRUPTS anywhere in that loop. This meant there were real-world situations where a query would be effectively uncancelable for long stretches. Add a check placed to occur once per page, which should be enough to provide reasonable response time without adding any measurable overhead. Report and patch by Merlin Moncure (though I tweaked it a bit). Back-patch to all supported branches.	2012-05-22 19:42:05 -04:00
Robert Haas	8fbe5a317d	Fix error message for COMMENT/SECURITY LABEL ON COLUMN xxx IS 'yyy' When the column name is an unqualified name, rather than table.column, the error message complains about too many dotted names, which is wrong. Report by Peter Eisentraut based on examination of the sepgsql regression test output, but the problem also affects COMMENT. New wording as suggested by Tom Lane.	2012-05-22 11:23:36 -04:00
Robert Haas	304aa339b2	Prevent pg_basebackup when integer_datetimes flag doesn't match. Magnus Hagander, reviewed by Fujii Masao, with slight wording changes by me.	2012-05-22 10:02:47 -04:00
Robert Haas	219c024c64	Repair out-of-date information in src/backend/storage/buffer/README. In commit `d526575f89`, we changed things so that buffer usage counts are incremented when the buffer is pinned, rather than when it is unpinned, but the README file didn't get the memo. Report by Amit Kapila.	2012-05-22 09:32:09 -04:00
Tom Lane	b94ce6e807	Move postmaster's RemovePgTempFiles call to a less randomly chosen place. There is no reason to do this as early as possible in postmaster startup, and good reason not to do it until we have completely created the postmaster's lock file, namely that it might contribute to pg_ctl thinking that postmaster startup has timed out. (This would require a rather unusual amount of time to be spent scanning temp file directories, but we have at least one field report of it happening reproducibly.) Back-patch to 9.1. Before that, pg_ctl didn't wait for additional info to be added to the lock file, so it wasn't a problem. Note that this is not a complete fix to the slow-start issue in 9.1, because we still had identify_system_timezone being run during postmaster start in 9.1. But that's at least a reasonably well-defined delay, with an easy workaround if needed, whereas the temp-files scan is not so predictable and cannot be avoided.	2012-05-21 22:50:30 -04:00
Tom Lane	efae4653c9	Update woefully-obsolete comment. The accurate info about what's in a lock file has been in miscadmin.h for some time, so let's just make this comment point there instead of maintaining a duplicative copy.	2012-05-21 22:11:00 -04:00
Peter Eisentraut	cdf8bcb8d9	pg_ctl: Sort signal list in --help output The list was neither logical nor numerical nor alphabetical. Let's go with alphabetical.	2012-05-21 20:12:30 +03:00
Peter Eisentraut	4c39a09089	libpq: Add missing file to GETTEXT_FILES list For the record, fe-print.c is also missing, but it's sort of deprecated, and the string internationalization there has some issues, and it doesn't seem worth fixing that. So let's leave that out.	2012-05-21 20:08:50 +03:00
Peter Eisentraut	f1f6737e15	Fix incorrect logic in JSON number lexer Detectable by gcc -Wlogical-op. Add two regression test cases that would previously allow incorrect values to pass.	2012-05-20 02:24:46 +03:00
Peter Eisentraut	2273a50364	Realign some --help output to have better spacing between columns	2012-05-18 20:34:14 +03:00
Heikki Linnakangas	1d27dcf578	Fix bug in gistRelocateBuildBuffersOnSplit(). When we create a temporary copy of the old node buffer, in stack, we mustn't leak that into any of the long-lived data structures. Before this patch, when we called gistPopItupFromNodeBuffer(), it got added to the array of "loaded buffers". After gistRelocateBuildBuffersOnSplit() exits, the pointer added to the loaded buffers array points to garbage. Often that goes unnoticed, because when we go through the array of loaded buffers to unload them, buffers with a NULL pageBuffer are ignored, which can often happen by accident even if the pointer points to garbage. This patch fixes that by marking the temporary copy in stack explicitly as temporary, and refraining from adding buffers marked as temporary to the array of loaded buffers. While we're at it, initialize nodeBuffer->pageBlocknum to InvalidBlockNumber and improve comments a bit. This isn't strictly necessary, but makes debugging easier.	2012-05-18 19:38:32 +03:00
Peter Eisentraut	939ec9b8a4	Update SQL features/conformance information to SQL:2011	2012-05-17 09:50:04 +03:00
Peter Eisentraut	be6d1c88a4	Change COLLATION keyword category It was changed from unreserved to reserved as part of the COLLATION FOR syntax, but it turns out that type_func_name_keyword is sufficient.	2012-05-16 20:19:44 +03:00
Tom Lane	488c6dd170	Improve error message for ALTER COLUMN TYPE coercion failure. Per recent discussion, the error message for this was actually a trifle inaccurate, since it said "cannot be cast" which might be incorrect. Adjust that wording, and add a HINT suggesting that a USING clause might be needed.	2012-05-16 07:28:25 -04:00
Heikki Linnakangas	6593c5b5dc	Fix bug in freespace calculation in heap_multi_insert(). If the amount of freespace on page was less than the amount reserved by fillfactor, the calculation would underflow. This fixes bug #6643 reported by Tomonari Katsumata.	2012-05-16 14:13:06 +03:00
Peter Eisentraut	c8e086795a	Remove whitespace from end of lines pgindent and perltidy should clean up the rest.	2012-05-15 22:19:41 +03:00
Peter Eisentraut	8afb026e57	Remove stray nbsp character	2012-05-15 21:38:59 +03:00
Heikki Linnakangas	d2495f272c	Fix bug in to_tsquery(). We were using memcpy() to copy to a possibly overlapping memory region, which is a no-no. Use memmove() instead.	2012-05-15 19:27:34 +03:00
Tom Lane	9b63e9869f	In pgstat.c, use a timeout in WaitLatchOrSocket only on Windows. We have no need for a timeout here really, but some broken products from Redmond seem to lose FD_READ events occasionally, and waking up and retrying the recv() is the only known way to work around that. Perhaps somebody will be motivated to figure out a better answer here; but not I.	2012-05-14 23:51:34 -04:00
Tom Lane	5a2bb06012	Revert "Add some temporary instrumentation to pgstat.c." This reverts commit `7d88bb73f7`. That instrumentation has served its purpose.	2012-05-14 23:08:10 -04:00
Tom Lane	f667747b6d	Put back AC_REQUIRE([AC_STRUCT_TM]). The BSD-ish members of the buildfarm all seem to think removing this was a bad idea. It looks to me like it resulted in omitting the system header inclusion necessary to detect the fields of struct tm correctly.	2012-05-14 23:06:48 -04:00
Tom Lane	e42a21b9e6	Assert that WaitLatchOrSocket callers cannot wait only for writability. Since we have chosen to report socket EOF and error conditions via the WL_SOCKET_READABLE flag bit, it's unsafe to wait only for WL_SOCKET_WRITEABLE; the caller would never be notified of the socket condition, and in some of these implementations WaitLatchOrSocket would busy-wait until something else happens. Add this restriction to the API specification, and add Asserts to check that callers don't try to do that. At some point we might want to consider adjusting the API to relax this restriction, but until we have an actual use case for waiting on a write-only socket, it seems premature to design a solution.	2012-05-14 16:12:28 -04:00
Peter Eisentraut	ff4628f37a	Remove unused AC_DEFINE symbols: ENABLE_DTRACE (unused as of `a7b7b07af3`); HAVE_ERR_SET_MARK (unused as of `4ed4b6c54e`); HAVE_FCVT (unused as of `4553e1d80f`); HAVE_STRUCT_SOCKADDR_UN (unused as of `b4cea00a1f`); HAVE_SYSCONF (unused as of `f83356c7f5`); TM_IN_SYS_TIME (never used, obsolescent per Autoconf documentation).	2012-05-14 22:51:21 +03:00
Tom Lane	d461d0502b	For testing purposes, reinsert a timeout in pgstat.c's wait call. Test results from buildfarm members mastodon/narwhal (Windows Server 2003) make it look like that platform just plain loses FD_READ events occasionally, and the only reason our previous coding seemed to work was that it timed out every couple of seconds and retried the whole operation. Try to verify this by reinserting a finite timeout into the pgstat loop. This isn't meant to be a permanent patch either, just to confirm or disprove a theory.	2012-05-14 15:03:14 -04:00
Tom Lane	f1ca51549e	Force pgwin32_recv into nonblock mode when called from pgstat.c. This should get rid of the usage of pgwin32_waitforsinglesocket entirely, and perhaps thereby remove the race condition that's evidently still present on some versions of Windows. The previous arrangement was a bit unsafe anyway, since waiting at the recv() would not allow pgstat to notice postmaster death.	2012-05-14 10:57:07 -04:00
Heikki Linnakangas	f15c2eae9c	Remove unnecessary pg_verifymbstr() calls from tsvector/query in functions. The input should've been validated well before it hits the input function. Doing so again is a waste of cycles.	2012-05-14 14:30:32 +03:00
Heikki Linnakangas	9e4637bf89	Update comments that became out-of-date with the PGXACT struct. When the "hot" members of PGPROC were split off to separate PGXACT structs, many PGPROC fields referred to in comments were moved to PGXACT, but the comments were neglected in the commit. Mostly this is just a search/replace of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for prepared transactions changed more, making some of the comments totally bogus. Noah Misch	2012-05-14 10:28:55 +03:00
Peter Eisentraut	64f09ca386	Remove leftovers of BeOS port These should have been removed when the BeOS port was removed in `44f9021223`.	2012-05-14 04:50:39 +03:00
Peter Eisentraut	6bf1e7668d	Small punctuation editing of postgresql.conf.sample	2012-05-14 04:50:39 +03:00
Peter Eisentraut	2a7f636640	pg_ctl: Improve --help output All other --help output has = signs between long options and their arguments, so do it here as well.	2012-05-14 04:50:39 +03:00
Tom Lane	7d88bb73f7	Add some temporary instrumentation to pgstat.c. Log main-loop blocking events and the results of inquiry messages. This is to get some clarity as to what's happening on those Windows buildfarm members that still don't like the latch-ified stats collector. This bulks up the postmaster log a tad, so I won't leave it in place for long.	2012-05-13 21:11:31 -04:00
Tom Lane	b8347138e9	Fix DROP TABLESPACE to unlink symlink when directory is not there. If the tablespace directory is missing entirely, we allow DROP TABLESPACE to go through, on the grounds that it should be possible to clean up the catalog entry in such a situation. However, we forgot that the pg_tblspc symlink might still be there. We should try to remove the symlink too (but not fail if it's no longer there), since not doing so can lead to weird behavior subsequently, as per report from Michael Nolan. There was some discussion of adding dependency links to prevent DROP TABLESPACE when the catalogs still contain references to the tablespace. That might be worth doing too, but it's an orthogonal question, and in any case wouldn't be back-patchable. Back-patch to 9.0, which is as far back as the logic looks like this. We could possibly do something similar in 8.x, but given the lack of reports I'm not sure it's worth the trouble, and anyway the case could not arise in the form the logic is meant to cover (namely, a post-DROP transaction rollback having resurrected the pg_tablespace entry after some or all of the filesystem infrastructure is gone).	2012-05-13 18:06:52 -04:00
Tom Lane	966970ed63	Re-revert stats collector latch changes. This reverts commit `cb2f2873d6`, restoring the latch-ified stats collector logic. We'll soon see if this works any better on the Windows buildfarm machines.	2012-05-13 14:44:39 -04:00
Tom Lane	b85427f227	Attempt to fix some issues in our Windows socket code. Make sure WaitLatchOrSocket regards FD_CLOSE as a read-ready condition. We might want to tweak this further, but it was surely wrong as-is. Make pgwin32_waitforsinglesocket detach its private event object from the passed socket before returning. I suspect that failure to do so leads to race conditions when other code (such as WaitLatchOrSocket) attaches a different event object to the same socket. Moreover, the existing coding meant that repeated calls to pgwin32_waitforsinglesocket would perform ResetEvent on an event actively connected to a socket, which is rumored to be an unsafe practice; the WSAEventSelect documentation appears to recommend against this, though it does not say not to do it in so many words. Also, uniformly use the coding pattern "WSAEventSelect(s, NULL, 0)" to detach events from sockets, rather than passing the event in the second parameter. The WSAEventSelect documentation says that the second parameter is ignored if the third is 0, so theoretically this should make no difference. However, elsewhere on the same reference page the use of NULL in this context is recommended, and I have found suggestions on the net that some versions of Windows have bugs with a non-NULL second parameter in this usage. Some other mostly-cosmetic cleanup, such as using the right one of WSAGetLastError and GetLastError for reporting errors from these functions.	2012-05-13 14:35:40 -04:00
Tom Lane	fd350ef395	Fix bogus declaration of local variable. rc should be an int here, not a pgsocket. Fairly harmless as long as pgsocket is an integer type, but nonetheless wrong. Error introduced in commit `87091cb1f1`.	2012-05-13 00:30:32 -04:00
Tom Lane	398b240151	Avoid unnecessary process wakeups in the log collector. syslogger was coded to wake up once per second whether there was anything useful to do or not. As part of our campaign to reduce the server's idle power consumption, change it to use a latch for waiting. Now, in the absence of any data to log or any signals to service, it will only wake up at the programmed logfile rotation times (if any).	2012-05-12 19:21:54 -04:00
Tom Lane	31ad655364	Fix WaitLatchOrSocket to handle EOF on socket correctly. When using poll(), EOF on a socket is reported with the POLLHUP not POLLIN flag (at least on Linux). WaitLatchOrSocket failed to check this bit, causing it to go into a busy-wait loop if EOF occurs. We earlier fixed the same mistake in the test for the state of the postmaster_alive socket, but missed it for the caller-supplied socket. Fortunately, this error is new in 9.2, since 9.1 only had a select() based code path not a poll() based one.	2012-05-12 16:36:47 -04:00
Simon Riggs	867540b49c	Ensure backwards compatibility for GetStableLatestTransactionId()	2012-05-12 13:26:10 +01:00
Peter Eisentraut	afe86a9e73	Fix obsolescent C declaration syntax gcc -Wextra/-Wold-style-declaration thinks that "inline" should go before the function return type.	2012-05-12 12:52:02 +03:00
Tom Lane	d0c231d132	Cosmetic adjustments for postmaster's handling of checkpointer. Correct some comments, order some operations a bit more consistently. No functional changes.	2012-05-11 17:46:37 -04:00
Peter Eisentraut	2cfb1c6f77	PL/Python: Adjust the regression tests for Python 3.3. The string representation of ImportError changed. Remove printing that; it's not necessary for the test. The order in which members of a dict are printed changed. But this was always implementation-dependent, so we have just been lucky for a long time. Do the printing the hard way to ensure sorted order.	2012-05-11 23:04:47 +03:00
Robert Haas	1331cc6c1a	Prevent loss of init fork when truncating an unlogged table. Fixes bug #6635, reported by Akira Kurosawa.	2012-05-11 09:48:56 -04:00
Simon Riggs	b762e8f50b	Remove extraneous #include "storage/proc.h"	2012-05-11 14:46:46 +01:00
Simon Riggs	b06679e012	Ensure age() returns a stable value rather than the latest value	2012-05-11 14:36:24 +01:00
Heikki Linnakangas	3652d72dd4	On GiST page split, release the locks on child pages before recursing up. When inserting the downlinks for a split gist page, we used to hold the locks on the child pages until the insertion into the parent - and recursively its parent if it had to be split too - were all completed. Change that so that the locks on child pages are released after the insertion in the immediate parent is done, before recursing further up the tree. This reduces the number of lwlocks that are held simultaneously. Holding many locks is bad for concurrency, and in extreme cases you can even hit the limit of 100 simultaneously held lwlocks in a backend. If you're really unlucky, you can hit the limit while in a critical section, which brings down the whole system. This fixes bug #6629 reported by Tom Forbes. Backpatch to 9.1. The page splitting code was rewritten in 9.1, and the old code did not have this problem.	2012-05-11 12:35:28 +03:00
Bruce Momjian	ee24de4001	Revert catalog bump; was post-beta1, and unnecessary.	2012-05-10 18:44:47 -04:00
Bruce Momjian	d2fe836cd2	Update comment for 'name' data type to say 63 "bytes". Catalog version bump so everyone has the same comment for beta1.	2012-05-10 18:40:40 -04:00
Tom Lane	f70fa835e0	Stamp 9.2beta1.	2012-05-10 18:35:09 -04:00
Tom Lane	cb2f2873d6	Temporarily revert stats collector latch changes so we can ship beta1. This patch reverts commit `49340037ee` and some follow-on tweaking in pgstat.c. While the basic scheme of latch-ifying the stats collector seems sound enough, it's failing on most Windows buildfarm members for unknown reasons, and there's no time left to debug that before 9.2beta1. Better to ship a beta version without this improvement. I hope to re-revert this once beta1 is out, though.	2012-05-10 17:26:33 -04:00
Tom Lane	f40022f1ad	Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers. Per a suggestion from Peter Geoghegan, make WaitLatch responsible for verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by testing PostmasterIsAlive). Then simplify its callers, who no longer need to do that for themselves. Remove weasel wording about falsely-set result bits from WaitLatch's API contract.	2012-05-10 14:34:53 -04:00
Peter Eisentraut	a97207b690	PL/Python: Fix slicing support for result objects for Python 3 The old way of implementing slicing support by implementing PySequenceMethods.sq_slice no longer works in Python 3. You now have to implement PyMappingMethods.mp_subscript. Do this by simply proxying the call to the wrapped list of result dictionaries. Consolidate some of the subscripting regression tests. Jan Urbański	2012-05-10 20:40:30 +03:00
Peter Eisentraut	1540d3bf4d	PL/Python: Update incorrect comment Jan Urbański	2012-05-10 20:40:30 +03:00
Tom Lane	ada8fa08fc	Fix Windows implementation of PGSemaphoreLock. The original coding failed to reset ImmediateInterruptOK before returning, which would potentially allow a subsequent query-cancel interrupt to be accepted at an unsafe point. This is a really nasty bug since it's so hard to predict the consequences, but they could be unpleasant. Also, ensure that signal handlers are serviced before this function returns, even if the semaphore is already set. This should make the behavior more like Unix. Back-patch to all supported versions.	2012-05-10 13:36:14 -04:00
Tom Lane	8ebc908c57	Improve Windows implementation of WaitLatch/WaitLatchOrSocket. Ensure that signal handlers are serviced before this function returns. This should make the behavior more like Unix. Also, add some more error checking, and make some other cosmetic improvements. No back-patch since it's not clear whether this is fixing any live bug that would affect 9.1. I'm more concerned about 9.2 anyway given our considerable recent expansions in the usage of WaitLatch.	2012-05-10 13:26:47 -04:00
Peter Eisentraut	1d158d7f98	Python 2.2 is no longer supported It was already on its last legs, and it turns out that it was accidentally broken in commit `89e850e6fd` and no one cared. So remove the rest of the support for it and update the documentation to indicate that Python 2.3 is now required.	2012-05-10 20:02:57 +03:00
Magnus Hagander	f33c5d471c	Only attempt to show collations on servers >= 9.1. Show a proper error message instead of a SQL error. Josh Kupershmidt	2012-05-10 09:12:26 +02:00
Heikki Linnakangas	60a3dffb72	Fix outdated comment. Multi-insert records observe XLOG_HEAP_INIT_PAGE flag too, as Andres Freund pointed out.	2012-05-10 09:55:48 +03:00
Joe Conway	b58bacdacb	PL/pgSQL RETURN NEXT was leaking converted tuples, causing out of memory when looping through large numbers of rows. Flag the converted tuples to be freed. Complaint and patch by Joe.	2012-05-09 22:57:19 -07:00
Tom Lane	fd71421b01	Improve tests for postmaster death in auxiliary processes. In checkpointer and walwriter, avoid calling PostmasterIsAlive unless WaitLatch has reported WL_POSTMASTER_DEATH. This saves a kernel call per iteration of the process's outer loop, which is not all that much, but a cycle shaved is a cycle earned. I had already removed the unconditional PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy (per comment in unix_latch.c); so adjust those two cases to match. There are a few other places where the same idea might be applied, but only after substantial code rearrangement, so I didn't bother.	2012-05-10 00:54:56 -04:00
Tom Lane	d3ae406f54	Further tweaking of nomenclature in checkpointer.c. Get rid of some more naming choices that only make sense if you know that this code used to be in the bgwriter, as well as some stray comments referencing the bgwriter.	2012-05-10 00:01:10 -04:00
Tom Lane	6308ba05a7	Improve control logic for bgwriter hibernation mode. Commit `6d90eaaa89` added a hibernation mode to the bgwriter to reduce the server's idle-power consumption. However, its interaction with the detailed behavior of BgBufferSync's feedback control loop wasn't very well thought out. That control loop depends primarily on the rate of buffer allocation, not the rate of buffer dirtying, so the hibernation mode has to be designed to operate only when no new buffer allocations are happening. Also, the check for whether the system is effectively idle was not quite right and would fail to detect a constant low level of activity, thus allowing the bgwriter to go into hibernation mode in a way that would let the cycle time vary quite a bit, possibly further confusing the feedback loop. To fix, move the wakeup support from MarkBufferDirty and SetBufferCommitInfoNeedsSave into StrategyGetBuffer, and prevent the bgwriter from entering hibernation mode unless no buffer allocations have happened recently. In addition, fix the delaying logic to remove the problem of possibly not responding to signals promptly, which was basically caused by trying to use the process latch's is_set flag for multiple purposes. I can't prove it but I'm suspicious that that hack was responsible for the intermittent "postmaster does not shut down" failures we've been seeing in the buildfarm lately. In any case it did nothing to improve the readability or robustness of the code. In passing, express the hibernation sleep time as a multiplier on BgWriterDelay, not a constant. I'm not sure whether there's any value in exposing the longer sleep time as an independently configurable setting, but we can at least make it act like this for little extra code.	2012-05-09 23:37:10 -04:00
Peter Eisentraut	5d39807a00	Add make dependency so that postgres.bki is rebuilt in major version change Every time since the current rule for postgres.bki was put in place when we change the major version, people complain that their tests fail in strange ways. This is because the version number in postgres.bki is not updated, because it has no dependency for that. And you can't even force the rebuild manually if you don't happen to know which file has the problem. Fix that now before it will happen again. The only remaining problem with switching major versions, as far as the regression tests are concerned, is that contrib needs to be rebuilt. But that's easily invoked, and in any case the failure modes are more friendly if you forget that.	2012-05-09 20:45:56 +03:00
Simon Riggs	8f28789bff	Rename BgWriterShmem/Request to CheckpointerShmem/Request	2012-05-09 14:23:45 +01:00
Simon Riggs	bbd3ec9dce	Rename BgWriterCommLock to CheckpointerCommLock	2012-05-09 14:11:48 +01:00
Simon Riggs	5829387381	Avoid xid error from age() function when run on Hot Standby	2012-05-09 13:56:24 +01:00
Tom Lane	acd4c7d58b	Fix an issue in recent walwriter hibernation patch. Users of asynchronous-commit mode expect there to be a guaranteed maximum delay before an async commit's WAL records get flushed to disk. The original version of the walwriter hibernation patch broke that. Add an extra shared-memory flag to allow async commits to kick the walwriter out of hibernation mode, without adding any noticeable overhead in cases where no action is needed.	2012-05-08 23:06:40 -04:00
Tom Lane	49340037ee	Reduce idle power consumption of stats collector process. Latch-ify the stats collector, so that it does not need an arbitrary wakeup cycle to check for postmaster death. The incremental savings in idle power is pretty marginal, since we only had it waking every two seconds; but I believe that this patch may also improve the collector's performance under load, by reducing the number of kernel calls made per message when messages are arriving constantly (we now avoid a select/poll call except when we need to sleep). The change also reduces the time needed for a normal database shutdown on platforms where signals don't interrupt select().	2012-05-08 21:26:46 -04:00
Tom Lane	5461564a9d	Reduce idle power consumption of walwriter and checkpointer processes. This patch modifies the walwriter process so that, when it has not found anything useful to do for many consecutive wakeup cycles, it extends its sleep time to reduce the server's idle power consumption. It reverts to normal as soon as it's done any successful flushes. It's still true that during any async commit, backends check for completed, unflushed pages of WAL and signal the walwriter if there are any; so that in practice the walwriter can get awakened and returned to normal operation sooner than the sleep time might suggest. Also, improve the checkpointer so that it uses a latch and a computed delay time to not wake up at all except when it has something to do, replacing a previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for reducing the server's power consumption when idle. In passing, get rid of the dedicated latch for signaling the walwriter in favor of using its procLatch, since that comports better with possible generic signal handlers using that latch. Also, fix a pre-existing bug with failure to save/restore errno in walwriter's signal handlers. Peter Geoghegan, somewhat simplified by Tom	2012-05-08 20:03:26 -04:00
Peter Eisentraut	db84ba65ab	psql: Add variable to control keyword case in tab completion This adds the variable COMP_KEYWORD_CASE, which controls in what case keywords are completed. This is partially to let users configure the change from commit `69f4f1c357`, but it also offers more behaviors than were available before.	2012-05-08 21:06:08 +03:00
Peter Eisentraut	3420b241a7	Fix dependency tracking for src/port/%_srv.o files Because they use their own compilation rule, they don't use the dependency tracking logic from Makefile.global. To make sure that dependency tracking works anyway for the _srv.o files, depend on their .o siblings as well, which do have proper dependencies. It's a hack that might fail someday if there is a _srv.o without a corresponding .o, but it works for now (and those would probably go into src/backend/port/ anyway).	2012-05-08 20:10:50 +03:00
Peter Eisentraut	dcb2c58381	Fix misleading comments Josh Kupershmidt	2012-05-08 19:35:22 +03:00
Peter Eisentraut	3284e03d5d	Remove strdup, strtol, strtoul from libpgport These should not be needed anymore, at least after the recent port removals. So let's see whether we can do without them.	2012-05-07 23:10:28 +03:00
Peter Eisentraut	d7b2cd9d40	Fix pg_config.h make rule According to the Autoconf documentation, there should be a make rule pg_config.h: stamp-h so that with the right setup around this, a change in pg_config.h.in will trigger a rebuild of everything that depends on pg_config.h. But this doesn't always work, sometimes you need to run make twice to get everything up to date after a change of pg_config.h.in. The fix is to write the rule as pg_config.h: stamp-h ; instead (with an empty command instead of no command). This is what Automake-generated makefiles effectively do, so it seems safe to be on this side. It's not actually clear why this is (apparently) more correct. It's been posted to <http://lists.gnu.org/archive/html/help-make/2012-04/msg00058.html> without response so far.	2012-05-07 21:28:38 +03:00
Magnus Hagander	916d589a10	Make "unexpected EOF" messages DEBUG1 unless in an open transaction "Unexpected EOF on client connection" without an open transaction is mostly noise, so turn it into DEBUG1. With an open transaction it's still indicating a problem, so keep those as ERROR, and change the message to indicate that it happened in a transaction.	2012-05-07 18:50:44 +02:00
Tom Lane	71b9549d05	Overdue code review for transaction-level advisory locks patch. Commit `62c7bd31c8` had assorted problems, most visibly that it broke PREPARE TRANSACTION in the presence of session-level advisory locks (which should be ignored by PREPARE), as per a recent complaint from Stephen Rees. More abstractly, the patch made the LockMethodData.transactional flag not merely useless but outright dangerous, because in point of fact that flag no longer tells you anything at all about whether a lock is held transactionally. This fix therefore removes that flag altogether. We now rely entirely on the convention already in use in lock.c that transactional lock holds must be owned by some ResourceOwner, while session holds are never so owned. Setting the locallock struct's owner link to NULL thus denotes a session hold, and there is no redundant marker for that. PREPARE TRANSACTION now works again when there are session-level advisory locks, and it is also able to transfer transactional advisory locks to the prepared transaction, but for implementation reasons it throws an error if we hold both types of lock on a single lockable object. Perhaps it will be worth improving that someday. Assorted other minor cleanup and documentation editing, as well. Back-patch to 9.1, except that in the 9.1 branch I did not remove the LockMethodData.transactional flag for fear of causing an ABI break for any external code that might be examining those structs.	2012-05-04 17:44:31 -04:00
Bruce Momjian	ebcaa5fcde	Remove BSD/OS (BSDi) port. There are no known users upgrading to Postgres 9.2, and perhaps no existing users either.	2012-05-03 10:58:44 -04:00
Bruce Momjian	7490c48f1e	Mark git_changelog examples with the proper executable names.	2012-05-02 20:42:44 -04:00
Robert Haas	8e0c5195df	Add missing parenthesis in comment.	2012-05-02 14:30:58 -04:00
Peter Eisentraut	e6c2e8cb87	PL/Python: Improve test coverage Add test cases for inline handler of plpython2u (when using that language name), and for result object element assignment. There is now at least one test case for every top-level functionality, except plpy.Fatal (annoying to use in regression tests) and result object slice retrieval and slice assignment (which are somewhat broken).	2012-05-02 21:09:03 +03:00
Peter Eisentraut	52aa334fcd	PL/Python: Fix crash in functions returning SETOF and using SPI Allocate PLyResultObject.tupdesc in TopMemoryContext, because its lifetime is the lifetime of the Python object and it shouldn't be freed by some other memory context, such as one controlled by SPI. We trust that the Python object will clean up its own memory. Before, this would crash the included regression test case by trying to use memory that was already freed. reported by Asif Naeem, analysis by Tom Lane	2012-05-02 20:59:51 +03:00
Peter Eisentraut	e9605a039b	Even more duplicate word removal, in the spirit of the season	2012-05-02 20:56:03 +03:00
Robert Haas	0038110421	Avoid repeated CLOG access from heap_hot_search_buffer. At the time we check whether the tuple is dead to all running transactions, we've already verified that it isn't visible to our scan, setting hint bits if appropriate. So there's no need to recheck CLOG for the all-dead test we do just a moment later. So, add HeapTupleIsSurelyDead() to test the appropriate condition under the assumption that all relevant hint bits are already set. Review by Tom Lane.	2012-05-02 12:40:07 -04:00
Robert Haas	1b4998fd44	Further corrections from the department of redundancy department. Thom Brown	2012-05-02 11:11:25 -04:00
Robert Haas	e01e66f808	More duplicate word removal.	2012-05-02 09:28:16 -04:00
Heikki Linnakangas	f291ccd43e	Remove duplicate words in comments. Found these with grep -r "for for ".	2012-05-02 10:20:27 +03:00
Tom Lane	50c2d6a1a6	Kill some remaining references to SVR4 and univel. Both terms still appear in a few places, but I thought it best to leave those alone in context.	2012-05-02 00:29:17 -04:00
Robert Haas	9b7a84f2a4	Tweak psql to print row counts when \x auto chooses non-expanded output. Noah Misch	2012-05-01 16:05:01 -04:00
Peter Eisentraut	f2f9439fbf	Remove dead ports. Remove the following ports: dgux, nextstep, sunos4, svr4, ultrix4, univel. These are obsolete and not worth rescuing. In most cases, there is circumstantial evidence that they wouldn't work anymore anyway.	2012-05-01 22:11:12 +03:00
Tom Lane	809e7e21af	Converge all SQL-level statistics timing values to float8 milliseconds. This patch adjusts the core statistics views to match the decision already taken for pg_stat_statements, that values representing elapsed time should be represented as float8 and measured in milliseconds. By using float8, we are no longer tied to a specific maximum precision of timing data. (Internally, it's still microseconds, but we could now change that without needing changes at the SQL level.) The columns affected are: pg_stat_bgwriter.checkpoint_write_time, pg_stat_bgwriter.checkpoint_sync_time, pg_stat_database.blk_read_time, pg_stat_database.blk_write_time, pg_stat_user_functions.total_time, pg_stat_user_functions.self_time, pg_stat_xact_user_functions.total_time, pg_stat_xact_user_functions.self_time. The first four of these are new in 9.2, so there is no compatibility issue from changing them. The others require a release note comment that they are now double precision (and can show a fractional part) rather than bigint as before; also their underlying statistics functions now match the column definitions, instead of returning bigint microseconds.	2012-04-30 14:03:33 -04:00
Peter Eisentraut	26471a51fc	Mark ReThrowError() with attribute noreturn All related functions were already so marked.	2012-04-30 20:23:41 +03:00
Robert Haas	0d2235a25b	Remove duplicate word in comment. Noted by Peter Geoghegan.	2012-04-30 13:14:46 -04:00
Bruce Momjian	f33fe47a91	Add comments suggesting usage of git_changelog to generate release notes.	2012-04-30 11:05:34 -04:00
Tom Lane	1dd89eadcd	Rename I/O timing statistics columns to blk_read_time and blk_write_time. This seems more consistent with the pre-existing choices for names of other statistics columns. Rename assorted internal identifiers to match.	2012-04-29 18:13:33 -04:00
Tom Lane	309c64745e	Rename track_iotiming GUC to track_io_timing. This spelling seems significantly more readable to me.	2012-04-29 16:23:54 -04:00
Peter Eisentraut	81107282a5	Change return type of ExceptionalCondition to void and mark it noreturn In ancient times, it was thought that this wouldn't work because of TrapMacro/AssertMacro, but changing those to use a comma operator appears to work without compiler warnings.	2012-04-29 21:20:14 +03:00
Peter Eisentraut	2227bb9c94	Simplify makefile rule Instead of writing out the .c -> .o rule, use the default one, so that dependency tracking can be used.	2012-04-29 21:20:14 +03:00
Tom Lane	cdbad241f4	Clear I/O timing counters after sending them to the stats collector. This oversight caused the reported times to accumulate in an O(N^2) fashion the longer a backend runs.	2012-04-28 15:11:13 -04:00
Tom Lane	d6f7d4fdc5	Fix printing of whole-row Vars at top level of a SELECT targetlist. Normally whole-row Vars are printed as "tabname.*". However, that does not work at top level of a targetlist, because per SQL standard the parser will think that the "*" should result in column-by-column expansion; which is not at all what a whole-row Var implies. We used to just print the table name in such cases, which works most of the time; but it fails if the table name matches a column name available anywhere in the FROM clause. This could lead for instance to a view being interpreted differently after dump and reload. Adding parentheses doesn't fix it, but there is a reasonably simple kluge we can use instead: attach a no-op cast, so that the "*" isn't syntactically at top level anymore. This makes the printing of such whole-row Vars a lot more consistent with other Vars, and may indeed fix more cases than just the reported one; I'm suspicious that cases involving schema qualification probably didn't work properly before, either. Per bug report and fix proposal from Abbas Butt, though this patch is quite different in detail from his. Back-patch to all supported versions.	2012-04-27 19:49:18 -04:00
Bruce Momjian	993ce4e6c9	Add options to git_changelog for use in major release note creation: --details-after --master-only --oldest-first	2012-04-27 17:15:41 -04:00
Tom Lane	537b266953	Fix syslogger's rotation disable/re-enable logic. If it fails to open a new log file, the syslogger assumes there's something wrong with its parameters (such as log_directory), and stops attempting automatic time-based or size-based log file rotations. Sending it SIGHUP is supposed to start that up again. However, the original coding for that was really bogus, involving clobbering a couple of GUC variables and hoping that SIGHUP processing would restore them. Get rid of that technique in favor of maintaining a separate flag showing we've turned rotation off. Per report from Mark Kirkwood. Also, the syslogger will automatically attempt to create the log_directory directory if it doesn't exist, but that was only happening at startup. For consistency and ease of use, it should do the same whenever the value of log_directory is changed by SIGHUP. Back-patch to all supported branches.	2012-04-27 00:12:42 -04:00
Robert Haas	3424bff90f	Prevent index-only scans from returning wrong answers under Hot Standby. The alternative of disallowing index-only scans in HS operation was discussed, but the consensus was that it was better to treat marking a page all-visible as a recovery conflict for snapshots that could still fail to see XIDs on that page. We may in the future try to soften this, so that we simply force index scans to do heap fetches in cases where this may be an issue, rather than throwing a hard conflict.	2012-04-26 20:00:21 -04:00
Tom Lane	7c85aa39fc	Fix oversight in recent parameterized-path patch. bitmap_scan_cost_est() has to be able to cope with a BitmapOrPath, but I'd taken a shortcut that didn't work for that case. Noted by Heikki. Add some regression tests since this area is evidently under-covered.	2012-04-26 14:17:44 -04:00
Peter Eisentraut	ba3e4157a7	PL/Python: Accept strings in functions returning composite types Before 9.1, PL/Python functions returning composite types could return a string and it would be parsed using record_in. The 9.1 changes made PL/Python only expect dictionaries, tuples, or objects supporting getattr as output of composite functions, resulting in a regression and a confusing error message, as the strings were interpreted as sequences and the code for transforming lists to database tuples was used. Fix this by treating strings separately as before, before checking for the other types. The reason why it's important to support string to database tuple conversion is that trigger functions on tables with composite columns get the composite row passed in as a string (from record_out). Without supporting converting this back using record_in, this makes it impossible to implement pass-through behavior for these columns, as PL/Python no longer accepts strings for composite values. A better solution would be to fix the code that transforms composite inputs into Python objects to produce dictionaries that would then be correctly interpreted by the Python->PostgreSQL counterpart code. But that would be too invasive to backpatch to 9.1, and it is too late in the 9.2 cycle to attempt it. It should be revisited in the future, though. Reported as bug #6559 by Kirill Simonov. Jan Urbański	2012-04-26 21:03:48 +03:00
Peter Eisentraut	cc71ceab57	psql: Tab completion updates. Add/complete support for: ALTER DOMAIN / VALIDATE CONSTRAINT; ALTER DOMAIN / RENAME; ALTER DOMAIN / RENAME CONSTRAINT; ALTER TABLE / RENAME CONSTRAINT.	2012-04-26 20:07:40 +03:00
Tom Lane	d6d5f67b5b	Modify create_index regression test to avoid intermittent failures. We have been seeing intermittent buildfarm failures due to a query sometimes not using an index-only scan plan, because a background auto-ANALYZE prevented the table's all-visible bits from being set immediately, thereby causing the estimated cost of an index-only scan to go up considerably. Adjust the test case so that a bitmap index scan is preferred instead, which serves equally well for the purpose the test case is actually meant for. (Of course, it would be better to eliminate the interference from auto-ANALYZE, but I see no low-risk way to do that, so any such fix will have to be left for 9.3 or later.)	2012-04-25 22:57:48 -04:00
Tom Lane	9fa82c9809	Fix planner's handling of RETURNING lists in writable CTEs. setrefs.c failed to do "rtoffset" adjustment of Vars in RETURNING lists, which meant they were left with the wrong varnos when the RETURNING list was in a subquery. That was never possible before writable CTEs, of course, but now it's broken. The executor fails to notice any problem because ExecEvalVar just references the ecxt_scantuple for any normal varno; but EXPLAIN breaks when the varno is wrong, as illustrated in a recent complaint from Bartosz Dmytrak. Since the eventual rtoffset of the subquery is not known at the time we are preparing its plan node, the previous scheme of executing set_returning_clause_references() at that time cannot handle this adjustment. Fortunately, it turns out that we don't really need to do it that way, because all the needed information is available during normal setrefs.c execution; we just have to dig it out of the ModifyTable node. So, do that, and get rid of the kluge of early setrefs processing of RETURNING lists. (This is a little bit of a cheat in the case of inherited UPDATE/DELETE, because we are not passing a "root" struct that corresponds exactly to what the subplan was built with. But that doesn't matter, and anyway this is less ugly than early setrefs processing was.) Back-patch to 9.1, where the problem became possible to hit.	2012-04-25 20:20:33 -04:00
Tom Lane	c62b8eaae1	Fix edge-case behavior of pg_next_dst_boundary(). Due to rather sloppy thinking (on my part, I'm afraid) about the appropriate behavior for boundary conditions, pg_next_dst_boundary() gave undefined, platform-dependent results when the input time is exactly the last recorded DST transition time for the specified time zone, as a result of fetching values one past the end of its data arrays. Change its specification to be that it always finds the next DST boundary after the input time, and adjust code to match that. The sole existing caller, DetermineTimeZoneOffset, doesn't actually care about this distinction, since it always uses a probe time earlier than the instant that it does care about. So it seemed best to me to change the API to make the result=1 and result=0 cases more consistent, specifically to ensure that the "before" outputs always describe the state at the given time, rather than hacking the code to obey the previous API comment exactly. Per bug #6605 from Sergey Burladyan. Back-patch to all supported versions.	2012-04-25 17:26:10 -04:00
Robert Haas	ca1e1a8da1	Remove prototype for nonexistent function.	2012-04-25 15:32:15 -04:00
Tom Lane	9873001e6d	Another trivial comment-typo fix.	2012-04-25 14:28:58 -04:00
Peter Eisentraut	65ca8e68b7	PL/Python: Improve error messages	2012-04-25 21:11:59 +03:00
Peter Eisentraut	8bd44677df	entab: Improve makefile A few simplifications and stylistic improvements, found while grepping around for makefile problems elsewhere.	2012-04-24 21:20:55 +03:00
Robert Haas	3ce7f18e92	Casts to or from a domain type are ignored; warn and document. Prohibiting this outright would break dumps taken from older versions that contain such casts, which would create far more pain than is justified here. Per report by Jaime Casanova and subsequent discussion.	2012-04-24 09:20:53 -04:00
Robert Haas	5d4b60f2f2	Lots of doc corrections. Josh Kupershmidt	2012-04-23 22:43:09 -04:00
Robert Haas	7ab9b2f3b7	Rearrange lazy_scan_heap to avoid visibility map race conditions. We must set the visibility map bit before releasing our exclusive lock on the heap page; otherwise, someone might clear the heap page bit before we set the visibility map bit, leading to a situation where the visibility map thinks the page is all-visible but it's really not. This problem has existed since 8.4, but it wasn't critical before we had index-only scans, since the worst case scenario was that the page wouldn't get vacuumed until the next scan_all vacuum. Along the way, a couple of minor, related improvements: (1) if we pause the heap scan to do an index vac cycle, release any visibility map page we're holding, since really long-running pins are not good for a variety of reasons; and (2) warn if we see a page that's marked all-visible in the visibility map but not on the page level, since that should never happen any more (it was allowed in previous releases, but not in 9.2).	2012-04-23 22:08:06 -04:00
Robert Haas	85efd5f065	Reduce hash size for compute_array_stats, compute_tsvector_stats. The size is only a hint, but a big hint chews up a lot of memory without apparently improving performance much. Analysis and patch by Noah Misch.	2012-04-23 22:05:41 -04:00
Peter Eisentraut	48658a1b81	Fix some typos Josh Kupershmidt	2012-04-22 19:23:47 +03:00
Tom Lane	33e99153e9	Use fuzzy not exact cost comparison for the final tie-breaker in add_path. Instead of an exact cost comparison, use a fuzzy comparison with 1e-10 delta after all other path metrics have proved equal. This is to avoid having platform-specific roundoff behaviors determine the choice when two paths are really the same to our cost estimators. Adjust the recently-added test case that made it obvious we had a problem here.	2012-04-21 00:51:14 -04:00
Alvaro Herrera	09ff76fcdb	Recast "ONLY" column CHECK constraints as NO INHERIT The original syntax wasn't universally loved, and it didn't allow its usage in CREATE TABLE, only ALTER TABLE. It now works everywhere, and it also allows using ALTER TABLE ONLY to add an uninherited CHECK constraint, per discussion. The pg_constraint column has accordingly been renamed connoinherit. This commit partly reverts some of the changes in `61d81bd28d`, particularly some pg_dump and psql bits, because now pg_get_constraintdef includes the necessary NO INHERIT within the constraint definition. Author: Nikhil Sontakke Some tweaks by me	2012-04-20 23:56:57 -03:00
Tom Lane	1f03630011	Adjust join_search_one_level's handling of clauseless joins. For an initial relation that lacks any join clauses (that is, it has to be cartesian-product-joined to the rest of the query), we considered only cartesian joins with initial rels appearing later in the initial-relations list. This creates an undesirable dependency on FROM-list order. We would never fail to find a plan, but perhaps we might not find the best available plan. Noted while discussing the logic with Amit Kapila. Improve the comments a bit in this area, too. Arguably this is a bug fix, but given the lack of complaints from the field I'll refrain from back-patching.	2012-04-20 20:10:46 -04:00
Tom Lane	5b7b5518d0	Revise parameterized-path mechanism to fix assorted issues. This patch adjusts the treatment of parameterized paths so that all paths with the same parameterization (same set of required outer rels) for the same relation will have the same rowcount estimate. We cache the rowcount estimates to ensure that property, and hopefully save a few cycles too. Doing this makes it practical for add_path_precheck to operate without a rowcount estimate: it need only assume that paths with different parameterizations never dominate each other, which is close enough to true anyway for coarse filtering, because normally a more-parameterized path should yield fewer rows thanks to having more join clauses to apply. In add_path, we do the full nine yards of comparing rowcount estimates along with everything else, so that we can discard parameterized paths that don't actually have an advantage. This fixes some issues I'd found with add_path rejecting parameterized paths on the grounds that they were more expensive than not-parameterized ones, even though they yielded many fewer rows and hence would be cheaper once subsequent joining was considered. To make the same-rowcounts assumption valid, we have to require that any parameterized path enforce all join clauses that could be obtained from the particular set of outer rels, even if not all of them are useful for indexing. This is required at both base scans and joins. It's a good thing anyway since the net impact is that join quals are checked at the lowest practical level in the join tree. Hence, discard the original rather ad-hoc mechanism for choosing parameterization joinquals, and build a better one that has a more principled rule for when clauses can be moved. The original rule was actually buggy anyway for lack of knowledge about which relations are part of an outer join's outer side; getting this right requires adding an outer_relids field to RestrictInfo.	2012-04-19 15:53:47 -04:00
Robert Haas	293ec33c32	Remove bogus comment from HeapTupleSatisfiesNow. This has been wrong for a really long time. We don't use two-phase locking to protect against serialization anomalies. Per discussion on pgsql-hackers about 2011-03-07; original report by Dan Ports.	2012-04-18 11:50:45 -04:00
Robert Haas	4a6fab03f2	Finish rename of FastPathStrongLocks to FastPathStrongRelationLocks. Commit `8e5ac74c12` tried to do this renaming, but I relied on gcc to tell me where I needed to make changes, instead of grep. Noted by Jeff Davis.	2012-04-18 11:29:34 -04:00
Robert Haas	53c5b869b4	Tighten up error recovery for fast-path locking. The previous code could cause a backend crash after BEGIN; SAVEPOINT a; LOCK TABLE foo (interrupted by ^C or statement timeout); ROLLBACK TO SAVEPOINT a; LOCK TABLE foo, and might have leaked strong-lock counts in other situations. Report by Zoltán Böszörményi; patch review by Jeff Davis.	2012-04-18 11:17:30 -04:00
Robert Haas	ab77b2da8b	Fix incorrect comment in SetBufferCommitInfoNeedsSave(). Noah Misch spotted the fact that the old comment is in fact incorrect, due to memory ordering hazards.	2012-04-18 10:55:40 -04:00
Robert Haas	e93c0b820f	After PageSetAllVisible, use MarkBufferDirty. Previously, we used SetBufferCommitInfoNeedsSave, but that's really intended for dirty-marks we can theoretically afford to lose, such as hint bits. As of 9.2, the PD_ALL_VISIBLE bit mustn't be lost in this way, since we could then end up with a heap page that isn't all-visible and a visibility map page that is all-visible, causing index-only scans to return wrong answers.	2012-04-18 10:49:37 -04:00
Robert Haas	b5eccaef2c	Fix copyfuncs/equalfuncs support for ReassignOwnedStmt. Noah Misch	2012-04-18 10:45:18 -04:00
Robert Haas	53bbc681ca	Fix various infelicities in node functions. Mostly, this consists of adding support for fields which exist in the structure but aren't handled by copy/equal/outfuncs; but the create foreign table case can actually produce garbage output. Noah Misch	2012-04-18 10:43:16 -04:00
Peter Eisentraut	1fd832ddff	psql: Add tab completion for CREATE/ALTER ROLE name WITH. Previously, the use of the optional key word WITH was not supported. Josh Kupershmidt	2012-04-18 16:55:01 +03:00
Andrew Dunstan	1b37a8c3cc	Don't override arguments set via options with positional arguments. A number of utility programs were rather careless about parameters that can be set via both an option argument and a positional argument. This leads to results which can violate the Principle of Least Astonishment. These changes refuse to use positional arguments to override settings that have been made via options. The changes are backpatched to all live branches.	2012-04-17 18:30:34 -04:00
Heikki Linnakangas	fe546f3da6	Don't wait for the commit record to be replicated if we wrote no WAL. When using synchronous replication, we waited for the commit record to be replicated, but if our transaction didn't write any other WAL records, that's not required because we don't even flush the WAL locally to disk in that case. This led to long waits when committing a transaction that only modified a temporary table. Bug spotted by Thom Brown.	2012-04-17 16:28:31 +03:00
Peter Eisentraut	a33fcd7e79	Fix typo. Kyotaro HORIGUCHI	2012-04-16 15:36:40 +03:00
Heikki Linnakangas	49440fff08	Install plpgsql.h to include/server at "make install". The header file is needed by any module that wants to use the PL/pgSQL instrumentation plugin interface. Most notably, the pldebugger plugin needs this. With this patch, it can be built using pgxs, without having the full server source tree available.	2012-04-16 13:03:16 +03:00
Peter Eisentraut	0f48e06751	PL/Python: Improve documentation of nrows() method. Clarify that nrows() is the number of rows processed, versus the number of rows returned, which can be obtained using len. Also add tests about that.	2012-04-16 11:30:32 +03:00
Peter Eisentraut	c03523ed3f	PL/Python: Fix crash when colnames() etc. called without result set. The result object methods colnames() etc. would crash when called after a command that did not produce a result set. Now they throw an exception. Discovery and initial patch by Jean-Baptiste Quenot	2012-04-15 20:23:08 +03:00
Tatsuo Ishii	4efbb7d04f	Add missing descriptions about '--timeout' and '--mode' to help message. They are already implemented in the source code. Suggestions about the message formatting from Tom Lane.	2012-04-15 09:17:12 +09:00
Robert Haas	ea6a2d8d47	Rename synchronous_commit='write' to 'remote_write'. Fujii Masao, per discussion on pgsql-hackers	2012-04-14 10:53:22 -04:00
Robert Haas	4a2d7ad76f	pg_size_pretty(numeric). The output of the new pg_xlog_location_diff function is of type numeric, since it could theoretically overflow an int8 due to signedness; this provides a convenient way to format such values. Fujii Masao, with some beautification by me.	2012-04-14 08:07:25 -04:00
Tom Lane	e54b10a62d	Remove the "last ditch" code path in join_search_one_level(). So far as I can tell, it is no longer possible for this heuristic to do anything useful, because the new weaker definition of have_relevant_joinclause means that any relation with a joinclause must be considered joinable to at least one other relation. It would still be possible for the code block to be entered, for example if there are join order restrictions that prevent any join of the current level from being formed; but in that case it's just a waste of cycles to attempt to form cartesian joins, since the restrictions will still apply. Furthermore, IMO the existence of this code path can mask bugs elsewhere; we would have noticed the problem with cartesian joins a lot sooner if this code hadn't compensated for it in the simplest case. Accordingly, let's remove it and see what happens. I'm committing this separately from the prerequisite changes in have_relevant_joinclause, just to make the question easier to revisit if there is some fault in my logic.	2012-04-13 16:07:18 -04:00
Tom Lane	e3ffd05b02	Weaken the planner's tests for relevant joinclauses. We should be willing to cross-join two small relations if that allows us to use an inner indexscan on a large relation (that is, the potential indexqual for the large table requires both smaller relations). This worked in simple cases but fell apart as soon as there was a join clause to a fourth relation, because the existence of any two-relation join clause caused the planner to not consider clauseless joins between other base relations. The added regression test shows an example case adapted from a recent complaint from Benoit Delbosc. Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, and has_relevant_eclass_joinclause to consider that a join clause mentioning three or more relations is sufficient grounds for joining any subset of those relations, even if we have to do so via a cartesian join. Since such clauses are relatively uncommon, this shouldn't affect planning speed on typical queries; in fact it should help a bit, because the latter two functions in particular get significantly simpler. Although this is arguably a bug fix, I'm not going to risk back-patching it, since it might have currently-unforeseen consequences.	2012-04-13 16:07:17 -04:00
Peter Eisentraut	c0cc526e8b	Rename bytea_agg to string_agg and add delimiter argument. Per mailing list discussion, we would like to keep the bytea functions parallel to the text functions, so rename bytea_agg to string_agg, which already exists for text. Also, to satisfy the rule that we don't want aggregate functions of the same name with a different number of arguments, add a delimiter argument, just like string_agg for text already has.	2012-04-13 21:36:59 +03:00
Peter Eisentraut	64e1309c76	Consistently quote encoding and locale names in messages	2012-04-13 20:37:07 +03:00
Robert Haas	61167bfaf2	Fix typo in comment.	2012-04-13 08:54:13 -04:00
Robert Haas	5630eddf1e	Update lazy_scan_heap header comment. The previous comment described how things worked in PostgreSQL 8.2 and prior.	2012-04-13 08:51:19 -04:00
Tom Lane	732bfa2448	Fix cost estimation for indexscan filter conditions. cost_index's method for estimating per-tuple costs of evaluating filter conditions (a/k/a qpquals) was completely wrong in the presence of derived indexable conditions, such as range conditions derived from a LIKE clause. This was largely masked in common cases as a result of all simple operator clauses having about the same costs, but it could show up in a big way when dealing with functional indexes containing expensive functions, as seen for example in bug #6579 from Istvan Endredy. Rejigger the calculation to give sane answers when the indexquals aren't a subset of the baserestrictinfo list. As a side benefit, we now do the calculation properly for cases involving join clauses (ie, parameterized indexscans), which we always overestimated before. There are still cases where this is an oversimplification, such as clauses that can be dropped because they are implied by a partial index's predicate. But we've never accounted for that in cost estimates before, and I'm not convinced it's worth the cycles to try to do so.	2012-04-11 20:24:17 -04:00
Tom Lane	880bfc3287	Silently ignore any nonexistent schemas that are listed in search_path. Previously we attempted to throw an error or at least warning for missing schemas, but this was done inconsistently because of implementation restrictions (in many cases, GUC settings are applied outside transactions so that we can't do system catalog lookups). Furthermore, there were exceptions to the rule even in the beginning, and we'd been poking more and more holes in it as time went on, because it turns out that there are lots of use-cases for having some irrelevant items in a common search_path value. It seems better to just adopt a philosophy similar to what's always been done with Unix PATH settings, wherein nonexistent or unreadable directories are silently ignored. This commit also fixes the documentation to point out that schemas for which the user lacks USAGE privilege are silently ignored. That's always been true but was previously not documented. This is mostly in response to Robert Haas' complaint that 9.1 started to throw errors or warnings for missing schemas in cases where prior releases had not. We won't adopt such a significant behavioral change in a back branch, so something different will be needed in 9.1.	2012-04-11 12:02:50 -04:00
Alvaro Herrera	b035cb9db7	Accept postgres:// URIs in libpq connection functions. postgres:// URIs are an attempt to "stop the bleeding" in this general area that has been said to occur due to external projects adopting their own syntaxes. The syntaxes supported by this patch: postgres://[user[:pwd]@][unix-socket][:port[/dbname]][?param1=value1&...] and postgres://[user[:pwd]@][net-location][:port][/dbname][?param1=value1&...] should be enough to cover most interesting cases without having to resort to "param=value" pairs, but those are provided for the cases that need them regardless. libpq documentation has been shuffled around a bit, to avoid stuffing all the format details into the PQconnectdbParams description, which was already a bit overwhelming. The list of keywords has moved to its own subsection, and the details on the URI format live in another subsection. This includes a simple test program, as requested in discussion, to ensure that interesting corner cases continue to work appropriately in the future. Author: Alexander Shulgin. Some tweaking by Álvaro Herrera, Greg Smith, Daniel Farina, Peter Eisentraut. Reviewed by Robert Haas, Alexey Klyukin (offlist), Heikki Linnakangas, Marko Kreen, and others. Oh, it also supports postgresql:// but that's probably just an accident.	2012-04-11 04:33:51 -03:00
Tom Lane	3769fa5fc6	Make pg_tablespace_location(0) return the database's default tablespace. This definition is convenient when applying the function to the reltablespace column of pg_class, since that's what zero means there; and it doesn't interfere with any other plausible use of the function. Per gripe from Bruce Momjian.	2012-04-10 21:43:14 -04:00
Peter Eisentraut	eb821b91c8	NLS: Initialize Project-Id-Version field by xgettext. Since xgettext provides options to do this now, we might as well use them.	2012-04-10 21:26:17 +03:00
Peter Eisentraut	6b8c99c386	psql: Improve tab completion of WITH. Only match when WITH is the first word, as WITH may appear in many other contexts. Josh Kupershmidt	2012-04-10 20:35:39 +03:00
Tom Lane	0d9819f7e3	Measure epoch of timestamp-without-time-zone from local not UTC midnight. This patch reverts commit `191ef2b407` and thereby restores the pre-7.3 behavior of EXTRACT(EPOCH FROM timestamp-without-tz). Per discussion, the more recent behavior was misguided on a couple of grounds: it makes it hard to get a non-timezone-aware epoch value for a timestamp, and it makes this one case dependent on the value of the timezone GUC, which is incompatible with having timestamp_part() labeled as immutable. The other behavior is still available (in all releases) by explicitly casting the timestamp to timestamp with time zone before applying EXTRACT. This will need to be called out as an incompatible change in the 9.2 release notes. Although having mutable behavior in a function marked immutable is clearly a bug, we're not going to back-patch such a change.	2012-04-10 12:04:42 -04:00
Tom Lane	65fd91333e	Fix an Assert that turns out to be reachable after all. estimate_num_groups() gets unhappy with `create table empty(); select * from empty except select * from empty e2;` I can't see any actual use-case for such a query (and the table is illegal per SQL spec), but it seems like a good idea that it not cause an assert failure.	2012-04-09 11:58:24 -04:00
Tom Lane	d515365a61	Don't bother copying empty support arrays in a zero-column MergeJoin. The case could not arise when this code was originally written, but it can now (since we made zero-column MergeJoins work for the benefit of FULL JOIN ON TRUE). I don't think there is any actual bug here, but we might as well treat it consistently with other uses of COPY_POINTER_FIELD(). Per comment from Ashutosh Bapat.	2012-04-09 11:41:54 -04:00
Robert Haas	3ae5133b1c	Teach SLRU code to avoid replacing I/O-busy pages. Patch by me; review by Tom Lane and others.	2012-04-08 23:05:55 -04:00
Heikki Linnakangas	03529a3ff9	set_stack_base() no longer needs to be called in PostgresMain. This was a thinko in the previous commit. Now that the stack base pointer is set in PostmasterMain and SubPostmasterMain, it doesn't need to be set in PostgresMain anymore.	2012-04-08 19:39:12 +03:00
Heikki Linnakangas	ef3883d130	Do stack-depth checking in all postmaster children. We used to only initialize the stack base pointer when starting up a regular backend, not in other processes. In particular, autovacuum workers can run arbitrary user code, and without stack-depth checking, infinite recursion in e.g. an index expression will bring down the whole cluster. The comment about PL/Java using set_stack_base() is not yet true. As the code stands, PL/Java still modifies the stack_base_ptr variable directly. However, it's been discussed on the PL/Java mailing list that it should be changed to use the function, because PL/Java is currently oblivious to the register stack used on Itanium. There's another issue with PL/Java, namely that the stack base pointer it sets is not really the base of the stack; it could be something close to the bottom of the stack. That's a separate issue that might need some further changes to this code. Backpatch to all supported releases.	2012-04-08 19:07:55 +03:00
Tom Lane	7feecedcce	Fix incorrect make maintainer-clean rule.	2012-04-07 18:16:50 -04:00
Tom Lane	95b9c333b2	Further adjustment of comment about qsort_tuple.	2012-04-07 17:48:40 -04:00
Tom Lane	a25ef7a5f6	Remove useless variable to suppress compiler warning.	2012-04-07 16:44:43 -04:00
Bruce Momjian	d24ac36f4f	Stamp library versions for 9.2 (better late than never).	2012-04-07 16:19:43 -04:00
Tom Lane	0ab4db52c0	Fix misleading output from gin_desc(). XLOG_GIN_UPDATE_META_PAGE and XLOG_GIN_DELETE_LISTPAGE records were printed with a list link field labeled as "blkno", which was confusing, especially when the link was empty (InvalidBlockNumber). Print the metapage block number instead, since that's what's actually being updated. We could include the link values too as a separate field, but it's not clear that's worth the trouble. Back-patch to 8.4 where the dubious code was added.	2012-04-06 18:10:21 -04:00
Tom Lane	17b985b1a0	Fix broken comparetup_datum code. Commit `337b6f5ecf` contained the entirely fanciful assumption that it had made comparetup_datum unreachable. Reported and patched by Takashi Yamamoto. Fix up some not terribly accurate/useful comments from that commit, too.	2012-04-06 16:58:50 -04:00
Tom Lane	cea49fe82f	Dept of second thoughts: improve the API for AnalyzeForeignTable. If we make the initially-called function return the table physical-size estimate, acquire_inherited_sample_rows will be able to use that to allocate numbers of samples among child tables, when the day comes that we want to support foreign tables in inheritance trees.	2012-04-06 16:04:10 -04:00
Tom Lane	263d9de66b	Allow statistics to be collected for foreign tables. ANALYZE now accepts foreign tables and allows the table's FDW to control how the sample rows are collected. (But only manual ANALYZEs will touch foreign tables, for the moment, since among other things it's not very clear how to handle remote permissions checks in an auto-analyze.) contrib/file_fdw is extended to support this. Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.	2012-04-06 15:02:35 -04:00
Simon Riggs	8cb53654db	Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock	2012-04-06 10:21:40 +01:00
Robert Haas	21cc529698	checkopint -> checkpoint. Report by Guillaume Lelarge.	2012-04-05 21:37:33 -04:00
Robert Haas	662ca285a6	Put back code inadvertently deleted from exit_nicely. Report by Andrew Dunstan.	2012-04-05 21:37:33 -04:00
Peter Eisentraut	05261ab624	NLS: Use msgmerge/xgettext --no-wrap and --sort-by-file. The option --no-wrap prevents wars with (most?) editors about proper line wrapping. --sort-by-file ensures consistent file order, for easier diffing.	2012-04-05 22:28:13 +03:00

23760 Commits