postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-08 11:56:51 +02:00

Author	SHA1	Message	Date
Alvaro Herrera	13aa624431	Optimize updating a row that's locked by same xid Updating or locking a row that was already locked by the same transaction under the same Xid caused a MultiXact to be created; but this is unnecessary, because there's no usefulness in being able to differentiate two locks by the same transaction. In particular, if a transaction executed SELECT FOR UPDATE followed by an UPDATE that didn't modify columns of the key, we would dutifully represent the resulting combination as a multixact -- even though a single key-update is sufficient. Optimize the case so that only the strongest of both locks/updates is represented in Xmax. This can save some Xmax's from becoming MultiXacts, which can be a significant optimization. This missed optimization opportunity was spotted by Andres Freund while investigating a bug reported by Oliver Seemann in message CANCipfpfzoYnOz5jj=UZ70_R=CwDHv36dqWSpwsi27vpm1z5sA@mail.gmail.com and also directly as a performance regression reported by Dong Ye in message d54b8387.000012d8.00000010@YED-DEVD1.vmware.com Reportedly, this patch fixes the performance regression. Since the missing optimization was reported as a significant performance regression from 9.2, backpatch to 9.3. Andres Freund, tweaked by Álvaro Herrera	2013-12-19 16:53:49 -03:00
Fujii Masao	084e385a2f	Add tab completion for ALTER SYSTEM SET in psql.	2013-12-20 02:33:27 +09:00
Peter Eisentraut	94b899b829	Upgrade to Autoconf 2.69	2013-12-18 20:53:23 -05:00
Robert Haas	001a573a20	Allow on-detach callbacks for dynamic shared memory segments. Just as backends must clean up their shared memory state (releasing lwlocks, buffer pins, etc.) before exiting, they must also perform any similar cleanups related to dynamic shared memory segments they have mapped before unmapping those segments. So add a mechanism to ensure that. Existing on_shmem_exit hooks include both "user level" cleanup such as transaction abort and removal of leftover temporary relations and also "low level" cleanup that forcibly released leftover shared memory resources. On-detach callbacks should run after the first group but before the second group, so create a new before_shmem_exit function for registering the early callbacks and keep on_shmem_exit for the regular callbacks. (An earlier draft of this patch added an additional argument to on_shmem_exit, but that had a much larger footprint and probably a substantially higher risk of breaking third party code for no real gain.) Patch by me, reviewed by KaiGai Kohei and Andres Freund.	2013-12-18 13:09:09 -05:00
Bruce Momjian	613c6d26bd	Fix incorrect error message reported for non-existent users Previously, lookups of non-existent user names could return "Success"; it will now return "User does not exist" by resetting errno. This also centralizes the user name lookup code in libpgport. Report and analysis by Nicolas Marchildon; patch by me	2013-12-18 12:16:21 -05:00
Alvaro Herrera	11ac4c73cb	Don't ignore tuple locks propagated by our updates If a tuple was locked by transaction A, and transaction B updated it, the new version of the tuple created by B would be locked by A, yet visible only to B; due to an oversight in HeapTupleSatisfiesUpdate, the lock held by A wouldn't get checked if transaction B later deleted (or key-updated) the new version of the tuple. This might cause referential integrity checks to give false positives (that is, allow deletes that should have been rejected). This is an easy oversight to have made, because prior to improved tuple locks in commit `0ac5ad5134` it wasn't possible to have tuples created by our own transaction that were also locked by remote transactions, and so locks weren't even considered in that code path. It is recommended that foreign keys be rechecked manually in bulk after installing this update, in case some referenced rows are missing with some referencing row remaining. Per bug reported by Daniel Wood in CAPweHKe5QQ1747X2c0tA=5zf4YnS2xcvGf13Opd-1Mq24rF1cQ@mail.gmail.com	2013-12-18 13:45:51 -03:00
Tatsuo Ishii	65d6e4cb5c	Add ALTER SYSTEM command to edit the server configuration file. Patch contributed by Amit Kapila. Reviewed by Hari Babu, Masao Fujii, Boszormenyi Zoltan, Andres Freund, Greg Smith and others.	2013-12-18 23:42:44 +09:00
Bruce Momjian	dba5a9dda9	Comment: COPY comment improvement Etsuro Fujita	2013-12-17 12:51:16 -05:00
Alvaro Herrera	3b97e6823b	Rework tuple freezing protocol Tuple freezing was broken in connection to MultiXactIds; commit `8e53ae025d` tried to fix it, but didn't go far enough. As noted by Noah Misch, freezing a tuple whose Xmax is a multi containing an aborted update might cause locks in the multi to go ignored by later transactions. This is because the code depended on a multixact above their cutoff point not having any lock-only member older than the cutoff point for Xids, which is easily defeated in READ COMMITTED transactions. The fix for this involves creating a new MultiXactId when necessary. But this cannot be done during WAL replay, and moreover multixact examination requires using CLOG access routines which are not supposed to be used during WAL replay either; so tuple freezing cannot be done with the old freeze WAL record. Therefore, separate the freezing computation from its execution, and change the WAL record to carry all necessary information. At WAL replay time, it's easy to re-execute freezing because we don't need to re-compute the new infomask/Xmax values but just take them from the WAL record. While at it, restructure the coding to ensure all page changes occur in a single critical section without much room for failures. The previous coding wasn't using a critical section, without any explanation as to why this was acceptable. In replication scenarios using the 9.3 branch, standby servers must be upgraded before their master, so that they are prepared to deal with the new WAL record once the master is upgraded; failure to do so will cause WAL replay to die with a PANIC message. Later upgrade of the standby will allow the process to continue where it left off, so there's no disruption of the data in the standby in any case. Standbys know how to deal with the old WAL record, so it's okay to keep the master running the old code for a while. In master, the old freeze WAL record is gone, for cleanliness' sake; there's no compatibility concern there. Backpatch to 9.3, where the original bug was introduced and where the previous fix was backpatched. Álvaro Herrera and Andres Freund	2013-12-16 11:29:50 -03:00
Heikki Linnakangas	30b96549ab	Mark variables 'static' where possible. Move GinFuzzySearchLimit to ginget.c Per "clang -Wmissing-variable-declarations" output, posted by Andres Freund. I didn't silence all those warnings, though, only the most obvious cases.	2013-12-16 11:41:17 +02:00
Tatsuo Ishii	1f0626ee40	Add "SHIFT_JIS" as an accepted encoding name for locale checking. When locale is "ja_JP.SJIS", nl_langinfo(CODESET) returns "SHIFT_JIS" on some platforms, at least on RedHat Linux. So the encoding/locale match table (encoding_match_list) needs the entry. Otherwise client encoding is set to SQL_ASCII. Back patch to all supported branches.	2013-12-15 11:09:05 +09:00
Tom Lane	1b4f7f93b4	Allow empty target list in SELECT. This fixes a problem noted as a followup to bug #8648: if a query has a semantically-empty target list, e.g. SELECT * FROM zero_column_table, ruleutils.c will dump it as a syntactically-empty target list, which was not allowed. There doesn't seem to be any reliable way to fix this by hacking ruleutils (note in particular that the originally zero-column table might since have had columns added to it); and even if we had such a fix, it would do nothing for existing dump files that might contain bad syntax. The best bet seems to be to relax the syntactic restriction. Also, add parse-analysis errors for SELECT DISTINCT with no columns (after *-expansion) and RETURNING with no columns. These cases previously produced unexpected behavior because the parsed Query looked like it had no DISTINCT or RETURNING clause, respectively. If anyone ever offers a plausible use-case for this, we could work a bit harder on making the situation distinguishable. Arguably this is a bug fix that should be back-patched, but I'm worried that there may be client apps or PLs that expect "SELECT ;" to throw a syntax error. The issue doesn't seem important enough to risk changing behavior in minor releases.	2013-12-14 20:23:26 -05:00
Tom Lane	c03ad5602f	Fix inherited UPDATE/DELETE with UNION ALL subqueries. Fix an oversight in commit `b3aaf9081a`: we do indeed need to process the planner's append_rel_list when copying RTE subqueries, because if any of them were flattenable UNION ALL subqueries, the append_rel_list shows which subquery RTEs were pulled up out of which other ones. Without this, UNION ALL subqueries aren't correctly inserted into the update plans for inheritance child tables after the first one, typically resulting in no update happening for those child table(s). Per report from Victor Yegorov. Experimentation with this case also exposed a fault in commit `a7b965382c`: if an inherited UPDATE/DELETE was proven totally dummy by constraint exclusion, we might arrive at add_rtes_to_flat_rtable with root->simple_rel_array being NULL. This should be interpreted as not having any RelOptInfos. I chose to code the guard as a check against simple_rel_array_size, so as to also provide some protection against indexing off the end of the array. Back-patch to 9.2 where the faulty code was added.	2013-12-14 17:33:53 -05:00
Alvaro Herrera	60eea3780c	Fix typo	2013-12-13 17:27:16 -03:00
Alvaro Herrera	d881dd6233	Rework MultiXactId cache code The original performs too poorly; in some scenarios it shows way too high while profiling. Try to make it a bit smarter to avoid excessive cosst. In particular, make it have a maximum size, and have entries be sorted in LRU order; once the max size is reached, evict the oldest entry to avoid it from growing too large. Per complaint from Andres Freund in connection with new tuple freezing code.	2013-12-13 17:16:25 -03:00
Tom Lane	2efc6dc256	Add HOLD/RESUME_INTERRUPTS in HandleCatchupInterrupt/HandleNotifyInterrupt. This prevents a possible longjmp out of the signal handler if a timeout or SIGINT occurs while something within the handler has transiently set ImmediateInterruptOK. For safety we must hold off the timeout or cancel error until we're back in mainline, or at least till we reach the end of the signal handler when ImmediateInterruptOK was true at entry. This syncs these functions with the logic now present in handle_sig_alarm. AFAICT there is no live bug here in 9.0 and up, because I don't think we currently can wait for any heavyweight lock inside these functions, and there is no other code (except read-from-client) that will turn on ImmediateInterruptOK. However, that was not true pre-9.0: in older branches ProcessIncomingNotify might block trying to lock pg_listener, and then a SIGINT could lead to undesirable control flow. It might be all right anyway given the relatively narrow code ranges in which NOTIFY interrupts are enabled, but for safety's sake I'm back-patching this.	2013-12-13 14:05:51 -05:00
Heikki Linnakangas	dde6282500	Fix more instances of "the the" in comments. Plus one instance of "to to" in the docs.	2013-12-13 20:02:01 +02:00
Tom Lane	e8312b4f03	Don't let timeout interrupts happen unless ImmediateInterruptOK is set. Serious oversight in commit `16e1b7a1b7`: we should not allow an interrupt to take control away from mainline code except when ImmediateInterruptOK is set. Just to be safe, let's adopt the same save-clear-restore dance that's been used for many years in HandleCatchupInterrupt and HandleNotifyInterrupt, so that nothing bad happens if a timeout handler invokes code that tests or even manipulates ImmediateInterruptOK. Per report of "stuck spinlock" failures from Christophe Pettus, though many other symptoms are possible. Diagnosis by Andres Freund.	2013-12-13 11:50:15 -05:00
Heikki Linnakangas	50e547096c	Add GUC to enable WAL-logging of hint bits, even with checksums disabled. WAL records of hint bit updates is useful to tools that want to examine which pages have been modified. In particular, this is required to make the pg_rewind tool safe (without checksums). This can also be used to test how much extra WAL-logging would occur if you enabled checksums, without actually enabling them (which you can't currently do without re-initdb'ing). Sawada Masahiko, docs by Samrat Revagade. Reviewed by Dilip Kumar, with further changes by me.	2013-12-13 16:26:14 +02:00
Heikki Linnakangas	a49633d8dc	Fix WAL-logging of setting the visibility map bit. The operation that removes the remaining dead tuples from the page must be WAL-logged before the setting of the VM bit. Otherwise, if you replay the WAL to between those two records, you end up with the VM bit set, but the dead tuples are still there. Backpatch to 9.3, where this bug was introduced.	2013-12-13 14:15:04 +02:00
Tom Lane	ccca6f56f5	Fix ancient docs/comments thinko: XID comparison is mod 2^32, not 2^31. Pointed out by Gianni Ciolli.	2013-12-12 12:39:48 -05:00
Tom Lane	f26099057a	Improve EXPLAIN to print the grouping columns in Agg and Group nodes. Per request from Kevin Grittner.	2013-12-12 11:24:38 -05:00
Simon Riggs	8693559cac	New autovacuum_work_mem parameter If autovacuum_work_mem is set, autovacuum workers now use this parameter in preference to maintenance_work_mem. Peter Geoghegan	2013-12-12 11:42:39 +00:00
Simon Riggs	36da3cfb45	Allow time delayed standbys and recovery Set min_recovery_apply_delay to force a delay in recovery apply for commit and restore point WAL records. Other records are replayed immediately. Delay is measured between WAL record time and local standby time. Robert Haas, Fabrízio de Royes Mello and Simon Riggs Detailed review by Mitsumasa Kondo	2013-12-12 10:53:20 +00:00
Heikki Linnakangas	108e3992cd	Display old and new values in pg_resetxlog -n output. For extra clarity. Rajeev Rastogi, reviewed by Amit Kapila	2013-12-12 11:57:18 +02:00
Tom Lane	22310b808d	Remove bogus executable permissions on xlog.c. Apparently fat-fingered in `1a3d104475`. Noted by Peter Geoghegan.	2013-12-11 22:12:25 -05:00
Tom Lane	6bff0e7d92	Add a regression test case for plpython function returning setof RECORD. We had coverage for functions returning setof a named composite type, but not for anonymous records, which is a somewhat different code path. In view of recent crash report from Sergey Konoplev, this seems worth testing, though I doubt there's any deterministic bug here today.	2013-12-11 17:22:55 -05:00
Simon Riggs	cf589c9c1f	Regression tests for SCHEMA commands Hari Babu Kommi reviewed by David Rowley	2013-12-11 20:45:15 +00:00
Simon Riggs	b921a26fb8	Regression tests for ALTER TABLESPACE RENAME,OWNER Hari Babu Kommi reviewed by David Rowley	2013-12-11 20:42:58 +00:00
Tom Lane	b5e0a2a384	Tweak placement of explicit ANALYZE commands in the regression tests. Make the COPY test, which loads most of the large static tables used in the tests, also explicitly ANALYZE those tables. This allows us to get rid of various ad-hoc, and rather redundant, ANALYZE commands that had gotten stuck into various test scripts over time to ensure we got consistent plan choices. (We could have done a database-wide ANALYZE, but that would cause stats to get attached to the small static tables too, which results in plan changes compared to the historical behavior. I'm not sure that's a good idea, so not going that far for now.) Back-patch to 9.0, since 9.0 and 9.1 are currently sometimes failing regression tests for lack of an "ANALYZE tenk1" in the subselect test. There's no need for this in 8.4 since we didn't print any plans back then.	2013-12-11 15:09:15 -05:00
Robert Haas	60dd40bbda	Under wal_level=logical, when saving old tuples, always save OID. There's no real point in not doing this. It doesn't cost anything in performance or space. So let's go wild. Andres Freund, with substantial editing as to style by me.	2013-12-11 13:19:31 -05:00
Kevin Grittner	09df854b8a	Add table name to VACUUM statement in matview.c. The test only needs the one table to be vacuumed. Vacuuming the database may affect other tests. Per gripe from Tom Lane. Back-patch to 9.3, where the test was was added.	2013-12-11 08:53:03 -06:00
Peter Eisentraut	e5dc4cc24d	PL/Perl: Add event trigger support From: Dimitri Fontaine <dimitri@2ndQuadrant.fr>	2013-12-11 08:11:59 -05:00
Robert Haas	6bea96dd49	Add a new option, -g, to createuser, to add membership in a role. Chistopher Browne, reviewed by Sameer Thakur, Amit Kapila, and Peter Eisentraut.	2013-12-11 07:50:36 -05:00
Robert Haas	66abc2608c	Add a new reloption, user_catalog_table. When this reloption is set and wal_level=logical is configured, we'll record the CIDs stamped by inserts, updates, and deletes to the table just as we would for an actual catalog table. This will allow logical decoding to use historical MVCC snapshots to access such tables just as they access ordinary catalog tables. Replication solutions built around the logical decoding machinery will likely need to set this operation for their configuration tables; it might also be needed by extensions which perform table access in their output functions. Andres Freund, reviewed by myself and others.	2013-12-10 19:17:34 -05:00
Robert Haas	e55704d8b2	Add new wal_level, logical, sufficient for logical decoding. When wal_level=logical, we'll log columns from the old tuple as configured by the REPLICA IDENTITY facility added in commit `07cacba983`. This makes it possible a properly-configured logical replication solution to correctly follow table updates even if they change the chosen key columns, or, with REPLICA IDENTITY FULL, even if the table has no key at all. Note that updates which do not modify the replica identity column won't log anything extra, making the choice of a good key (i.e. one that will rarely be changed) important to performance when wal_level=logical is configured. Each insert, update, or delete to a catalog table will also log the CMIN and/or CMAX values of stamped by the current transaction. This is necessary because logical decoding will require access to historical snapshots of the catalog in order to decode some data types, and the CMIN/CMAX values that we may need in order to judge row visibility may have been overwritten by the time we need them. Andres Freund, reviewed in various versions by myself, Heikki Linnakangas, KONDO Mitsumasa, and many others.	2013-12-10 19:01:40 -05:00
Tom Lane	9ec6199d18	Fix possible crash with nested SubLinks. An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...) could produce an invalid plan that results in a crash at execution time, if the planner attempts to flatten the outer IN into a semi-join. This happens because convert_testexpr() was not expecting any nested SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging to the inner SubLink. (I think the comment denying that this case could happen was wrong when written; it's certainly been wrong for quite a long time, since very early versions of the semijoin flattening logic.) Per report from Teodor Sigaev. Back-patch to all supported branches.	2013-12-10 16:10:17 -05:00
Noah Misch	53685d7981	Rename TABLE() to ROWS FROM(). SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and other collection types. This feature, however, deals with set-returning functions. Use a different syntax for this feature to keep open the possibility of implementing the standard TABLE().	2013-12-10 09:34:37 -05:00
Robert Haas	d9250da032	Fixups for dsm.c's file descriptor handling. Per complaint from Tom Lane.	2013-12-09 11:15:19 -05:00
Peter Eisentraut	3164721462	SSL: Support ECDH key exchange This sets up ECDH key exchange, when compiling against OpenSSL that supports EC. Then the ECDHE-RSA and ECDHE-ECDSA cipher suites can be used for SSL connections. The latter one means that EC keys are now usable. The reason for EC key exchange is that it's faster than DHE and it allows to go to higher security levels where RSA will be horribly slow. There is also new GUC option ssl_ecdh_curve that specifies the curve name used for ECDH. It defaults to "prime256v1", which is the most common curve in use in HTTPS. From: Marko Kreen <markokr@gmail.com> Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>	2013-12-07 15:11:44 -05:00
Peter Eisentraut	ef3267523d	SSL: Add configuration option to prefer server cipher order By default, OpenSSL (and SSL/TLS in general) lets the client cipher order take priority. This is OK for browsers where the ciphers were tuned, but few PostgreSQL client libraries make the cipher order configurable. So it makes sense to have the cipher order in postgresql.conf take priority over client defaults. This patch adds the setting "ssl_prefer_server_ciphers" that can be turned on so that server cipher order is preferred. Per discussion, this now defaults to on. From: Marko Kreen <markokr@gmail.com> Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>	2013-12-07 08:13:50 -05:00
Alvaro Herrera	312bde3d40	Fix improper abort during update chain locking In `247c76a989`, I added some code to do fine-grained checking of MultiXact status of locking/updating transactions when traversing an update chain. There was a thinko in that patch which would have the traversing abort, that is return HeapTupleUpdated, when the other transaction is a committed lock-only. In this case we should ignore it and return success instead. Of course, in the case where there is a committed update, HeapTupleUpdated is the correct return value. A user-visible symptom of this bug is that in REPEATABLE READ and SERIALIZABLE transaction isolation modes spurious serializability errors can occur: ERROR: could not serialize access due to concurrent update In order for this to happen, there needs to be a tuple that's key-share- locked and also updated, and the update must abort; a subsequent transaction trying to acquire a new lock on that tuple would abort with the above error. The reason is that the initial FOR KEY SHARE is seen as committed by the new locking transaction, which triggers this bug. (If the UPDATE commits, then the serialization error is correctly reported.) When running a query in READ COMMITTED mode, what happens is that the locking is aborted by the HeapTupleUpdated return value, then EvalPlanQual fetches the newest version of the tuple, which is then the only version that gets locked. (The second time the tuple is checked there is no misbehavior on the committed lock-only, because it's not checked by the code that traverses update chains; so no bug.) Only the newest version of the tuple is locked, not older ones, but this is harmless. The isolation test added by this commit illustrates the desired behavior, including the proper serialization errors that get thrown. Backpatch to 9.3.	2013-12-05 17:47:51 -03:00
Tom Lane	74242c23c1	Clear retry flags properly in replacement OpenSSL sock_write function. Current OpenSSL code includes a BIO_clear_retry_flags() step in the sock_write() function. Either we failed to copy the code correctly, or they added this since we copied it. In any case, lack of the clear step appears to be the cause of the server lockup after connection loss reported in bug #8647 from Valentine Gogichashvili. Assume that this is correct coding for all OpenSSL versions, and hence back-patch to all supported branches. Diagnosis and patch by Alexander Kukushkin.	2013-12-05 12:48:28 -05:00
Alvaro Herrera	07aeb1fec5	Avoid resetting Xmax when it's a multi with an aborted update HeapTupleSatisfiesUpdate can very easily "forget" tuple locks while checking the contents of a multixact and finding it contains an aborted update, by setting the HEAP_XMAX_INVALID bit. This would lead to concurrent transactions not noticing any previous locks held by transactions that might still be running, and thus being able to acquire subsequent locks they wouldn't be normally able to acquire. This bug was introduced in commit 1ce150b7bb; backpatch this fix to 9.3, like that commit. This change reverts the change to the delete-abort-savept isolation test in `1ce150b7bb`, because that behavior change was caused by this bug. Noticed by Andres Freund while investigating a different issue reported by Noah Misch.	2013-12-05 12:21:55 -03:00
Bruce Momjian	86ef4796f5	build: pass EXTRA_REGRESS_OPTS to secondary regression tests Christoph Berg	2013-12-04 10:14:45 -05:00
Heikki Linnakangas	9e857436ef	Don't include unused space in LOG_NEWPAGE records. This is the same trick we use when taking a full page image of a buffer passed to XLogInsert.	2013-12-04 00:10:47 +02:00
Heikki Linnakangas	22122c83f1	Fix full-page writes of internal GIN pages. Insertion to a non-leaf GIN page didn't make a full-page image of the page, which is wrong. The code used to do it correctly, but was changed (commit `853d1c3103`) because the redo-routine didn't track incomplete splits correctly when the page was restored from a full page image. Of course, that was not right way to fix it, the redo routine should've been fixed instead. The redo-routine was surreptitiously fixed in 2010 (commit `4016bdef8a`), so all we need to do now is revert the code that creates the record to its original form. This doesn't change the format of the WAL record. Backpatch to all supported versions.	2013-12-03 23:16:01 +02:00
Bruce Momjian	4a8adfd4d0	C comment: again update comment for pg_fe_sendauth for error cases	2013-12-03 11:42:18 -05:00
Bruce Momjian	6a6b7bbb81	Update C comment for pg_fe_getauthname This function no longer takes an argument.	2013-12-03 11:33:46 -05:00
Bruce Momjian	9e0a97f1c8	libpq: change PQconndefaults() to ignore invalid service files Previously missing or invalid service files returned NULL. Also fix pg_upgrade to report "out of memory" for a null return from PQconndefaults(). Patch by Steve Singer, rewritten by me	2013-12-03 11:12:25 -05:00
Peter Eisentraut	fef88b3fda	Report exit code from external recovery commands properly When an external recovery command such as restore_command or archive_cleanup_command fails, report the exit code properly, distinguishing signals and normal exists, using the existing wait_result_to_str() facility, instead of just reporting the return value from system(). Reviewed-by: Peter Geoghegan <pg@heroku.com>	2013-12-02 22:31:05 -05:00
Tom Lane	7ab321404c	Fix crash in assign_collations_walker for EXISTS with empty SELECT list. We (I think I, actually) forgot about this corner case while coding collation resolution. Per bug #8648 from Arjen Nienhuis.	2013-12-02 20:28:45 -05:00
Tom Lane	7a1e34d371	Increase git_changelog's timestamp_slop from 10 min to 1 day. Many committers seem to now be using a work flow in which back-patched commits are timestamped minutes or even hours apart in different branches (most likely because they commit in one branch before starting work on the next one). git_changelog was failing to merge its reports in such cases, so increase the max time it's willing to merge commits across. I considered getting rid of the limit altogether, but that produces some odd results in terms of how the merged commit gets sorted relative to unrelated commits.	2013-12-02 11:33:49 -05:00
Robert Haas	c6d4b1dd3e	Flag mmap implemenation of dynamic shared memory as resize-capable. Error noted by Heikki Linnakangas	2013-12-02 11:18:54 -05:00
Robert Haas	a8656a3ab0	Make NUM_TOCHAR_prepare and NUM_TOCHAR_finish macros declare "len". Remove the variable from the enclosing scopes so that nothing can be relying on it. The net result of this refactoring is that we get rid of a few unnecessary strlen() calls. Original patch from Greg Jaskiewicz, substantially expanded by me.	2013-12-02 10:51:06 -05:00
Robert Haas	9d140f7be2	Avoid out-of-bounds read in errfinish if error_stack_depth < 0. If errordata_stack_depth < 0, we won't find that out and correct the problem until CHECK_STACK_DEPTH() is invoked. In the meantime, elevel will be set based on an invalid read. This is probably harmless in practice, but it seems cleaner this way. Xi Wang	2013-12-02 10:42:01 -05:00
Peter Eisentraut	3e3520cf7a	Translation updates	2013-12-02 00:17:07 -05:00
Tom Lane	335470251d	Update time zone data files to tzdata release 2013h. DST law changes in Argentina, Brazil, Jordan, Libya, Liechtenstein, Morocco, Palestine. New timezone abbreviations WIB, WIT, WITA for Indonesia.	2013-12-01 14:11:44 -05:00
Kevin Grittner	4bd371f6f8	Fix pg_dumpall to work for databases flagged as read-only. pg_dumpall's charter is to be able to recreate a database cluster's contents in a virgin installation, but it was failing to honor that contract if the cluster had any ALTER DATABASE SET default_transaction_read_only settings. By including a SET command for the connection for each connection opened by pg_dumpall output, errors are avoided and the source cluster is successfully recreated. There was discussion of whether to also set this for the connection applying pg_dump output, but it was felt that it was both less appropriate in that context, and far easier to work around. Backpatch to all supported branches.	2013-11-30 11:24:56 -06:00
Peter Eisentraut	34fa72ec9c	Remove use of obsolescent Autoconf macros Remove the use of the following macros, which are obsolescent according to the Autoconf documentation: - AC_C_CONST - AC_C_STRINGIZE - AC_C_VOLATILE - AC_FUNC_MEMCMP	2013-11-30 09:17:08 -05:00
Alvaro Herrera	2393c7d102	Fix a couple of bugs in MultiXactId freezing Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look into a multixact to check the members against cutoff_xid. This means that a very old Xid could survive hidden within a multi, possibly outliving its CLOG storage. In the distant future, this would cause clog lookup failures: ERROR: could not access status of transaction 3883960912 DETAIL: Could not open file "pg_clog/0E78": No such file or directory. This mostly was problematic when the updating transaction aborted, since in that case the row wouldn't get pruned away earlier in vacuum and the multixact could possibly survive for a long time. In many cases, data that is inaccessible for this reason way can be brought back heuristically. As a second bug, heap_freeze_tuple() didn't properly handle multixacts that need to be frozen according to cutoff_multi, but whose updater xid is still alive. Instead of preserving the update Xid, it just set Xmax invalid, which leads to both old and new tuple versions becoming visible. This is pretty rare in practice, but a real threat nonetheless. Existing corrupted rows, unfortunately, cannot be repaired in an automated fashion. Existing physical replicas might have already incorrectly frozen tuples because of different behavior than in master, which might only become apparent in the future once pg_multixact/ is truncated; it is recommended that all clones be rebuilt after upgrading. Following code analysis caused by bug report by J Smith in message CADFUPgc5bmtv-yg9znxV-vcfkb+JPRqs7m2OesQXaM_4Z1JpdQ@mail.gmail.com and privately by F-Secure. Backpatch to 9.3, where freezing of MultiXactIds was introduced. Analysis and patch by Andres Freund, with some tweaks by Álvaro.	2013-11-29 21:47:25 -03:00
Alvaro Herrera	1ce150b7bb	Don't TransactionIdDidAbort in HeapTupleGetUpdateXid It is dangerous to do so, because some code expects to be able to see what's the true Xmax even if it is aborted (particularly while traversing HOT chains). So don't do it, and instead rely on the callers to verify for abortedness, if necessary. Several race conditions and bugs fixed in the process. One isolation test changes the expected output due to these. This also reverts commit `c235a6a589`, which is no longer necessary. Backpatch to 9.3, where this function was introduced. Andres Freund	2013-11-29 21:47:21 -03:00
Alvaro Herrera	1df0122daa	Truncate pg_multixact/'s contents during crash recovery Commit `9dc842f08` of 8.2 era prevented MultiXact truncation during crash recovery, because there was no guarantee that enough state had been setup, and because it wasn't deemed to be a good idea to remove data during crash recovery anyway. Since then, due to Hot-Standby, streaming replication and PITR, the amount of time a cluster can spend doing crash recovery has increased significantly, to the point that a cluster may even never come out of it. This has made not truncating the content of pg_multixact/ not defensible anymore. To fix, take care to setup enough state for multixact truncation before crash recovery starts (easy since checkpoints contain the required information), and move the current end-of-recovery actions to a new TrimMultiXact() function, analogous to TrimCLOG(). At some later point, this should probably done similarly to the way clog.c is doing it, which is to just WAL log truncations, but we can't do that for the back branches. Back-patch to 9.0. 8.4 also has the problem, but since there's no hot standby there, it's much less pressing. In 9.2 and earlier, this patch is simpler than in newer branches, because multixact access during recovery isn't required. Add appropriate checks to make sure that's not happening. Andres Freund	2013-11-29 21:47:15 -03:00
Alvaro Herrera	f54106f77e	Fix full-table-vacuum request mechanism for MultiXactIds While autovacuum dutifully launched anti-multixact-wraparound vacuums when the multixact "age" was reached, the vacuum code was not aware that it needed to make them be full table vacuums. As the resulting partial-table vacuums aren't capable of actually increasing relminmxid, autovacuum continued to launch anti-wraparound vacuums that didn't have the intended effect, until age of relfrozenxid caused the vacuum to finally be a full table one via vacuum_freeze_table_age. To fix, introduce logic for multixacts similar to that for plain TransactionIds, using the same GUCs. Backpatch to 9.3, where permanent MultiXactIds were introduced. Andres Freund, some cleanup by Álvaro	2013-11-29 21:47:13 -03:00
Alvaro Herrera	76a31c689c	Replace hardcoded 200000000 with autovacuum_freeze_max_age Parts of the code used autovacuum_freeze_max_age to determine whether anti-multixact-wraparound vacuums are necessary, while others used a hardcoded 200000000 value. This leads to problems when autovacuum_freeze_max_age is set to a non-default value. Use the latter everywhere. Backpatch to 9.3, where vacuuming of multixacts was introduced. Andres Freund	2013-11-29 21:47:09 -03:00
Tom Lane	79193c75f8	Fix assorted issues in pg_ctl's pgwin32_CommandLine(). Ensure that the invocation command for postgres or pg_ctl runservice double-quotes the executable's pathname; failure to do this leads to trouble when the path contains spaces. Also, ensure that the path ends in ".exe" in both cases and uses backslashes rather than slashes as directory separators. The latter issue is reported to confuse some third-party tools such as Symantec Backup Exec. Also, rewrite the function to avoid buffer overrun issues by using a PQExpBuffer instead of a fixed-size static buffer. Combinations of very long executable pathnames and very long data directory pathnames could have caused trouble before, for example. Back-patch to all active branches, since this code has been like this for a long while. Naoya Anzai and Tom Lane, reviewed by Rajeev Rastogi	2013-11-29 18:34:07 -05:00
Tom Lane	8b151558c8	Be sure to release proc->backendLock after SetupLockInTable() failure. The various places that transferred fast-path locks to the main lock table neglected to release the PGPROC's backendLock if SetupLockInTable failed due to being out of shared memory. In most cases this is no big deal since ensuing error cleanup would release all held LWLocks anyway. But there are some hot-standby functions that don't consider failure of FastPathTransferRelationLocks to be a hard error, and in those cases this oversight could lead to system lockup. For consistency, make all of these places look the same as FastPathTransferRelationLocks. Noted while looking for the cause of Dan Wood's bugs --- this wasn't it, but it's a bug anyway.	2013-11-29 17:35:09 -05:00
Tom Lane	16e1b7a1b7	Fix assorted race conditions in the new timeout infrastructure. Prevent handle_sig_alarm from losing control partway through due to a query cancel (either an asynchronous SIGINT, or a cancel triggered by one of the timeout handler functions). That would at least result in failure to schedule any required future interrupt, and might result in actual corruption of timeout.c's data structures, if the interrupt happened while we were updating those. We could still lose control if an asynchronous SIGINT arrives just as the function is entered. This wouldn't break any data structures, but it would have the same effect as if the SIGALRM interrupt had been silently lost: we'd not fire any currently-due handlers, nor schedule any new interrupt. To forestall that scenario, forcibly reschedule any pending timer interrupt during AbortTransaction and AbortSubTransaction. We can avoid any extra kernel call in most cases by not doing that until we've allowed LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events. Another hazard is that some platforms (at least Linux and *BSD) block a signal before calling its handler and then unblock it on return. When we longjmp out of the handler, the unblock doesn't happen, and the signal is left blocked indefinitely. Again, we can fix that by forcibly unblocking signals during AbortTransaction and AbortSubTransaction. These latter two problems do not manifest when the longjmp reaches postgres.c, because the error recovery code there kills all pending timeout events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal mask is restored. So errors thrown outside any transaction should be OK already, and cleaning up in AbortTransaction and AbortSubTransaction should be enough to fix these issues. (We're assuming that any code that catches a query cancel error and doesn't re-throw it will do at least a subtransaction abort to clean up; but that was pretty much required already by other subsystems.) Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when disabling that event: if a lock timeout interrupt happened after the lock was granted, the ensuing query cancel is still going to happen at the next CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user cancel. Per reports from Dan Wood. Back-patch to 9.3 where the new timeout handling infrastructure was introduced. We may at some point decide to back-patch the signal unblocking changes further, but I'll desist from that until we hear actual field complaints about it.	2013-11-29 16:41:00 -05:00
Robert Haas	8e18d04d4d	Refine our definition of what constitutes a system relation. Although user-defined relations can't be directly created in pg_catalog, it's possible for them to end up there, because you can create them in some other schema and then use ALTER TABLE .. SET SCHEMA to move them there. Previously, such relations couldn't afterwards be manipulated, because IsSystemRelation()/IsSystemClass() rejected all attempts to modify objects in the pg_catalog schema, regardless of their origin. With this patch, they now reject only those objects in pg_catalog which were created at initdb-time, allowing most operations on user-created tables in pg_catalog to proceed normally. This patch also adds new functions IsCatalogRelation() and IsCatalogClass(), which is similar to IsSystemRelation() and IsSystemClass() but with a slightly narrower definition: only TOAST tables of system catalogs are included, rather than all TOAST tables. This is currently used only for making decisions about when invalidation messages need to be sent, but upcoming logical decoding patches will find other uses for this information. Andres Freund, with some modifications by me.	2013-11-28 20:57:20 -05:00
Heikki Linnakangas	2fe69cacff	Another gin_desc fix. The number of items inserted was incorrectly printed as if it was a boolean.	2013-11-28 23:35:50 +02:00
Heikki Linnakangas	97c19e6c38	Fix gin_desc routine to match the WAL format. In the GIN incomplete-splits patch, I used BlockIdDatas to store the block number of left and right children, when inserting a downlink after a split to an internal page posting list page. But gin_desc thought they were stored as BlockNumbers.	2013-11-28 21:57:42 +02:00
Tom Lane	da8a716089	Fix latent(?) race condition in LockReleaseAll. We have for a long time checked the head pointer of each of the backend's proclock lists and skipped acquiring the corresponding locktable partition lock if the head pointer was NULL. This was safe enough in the days when proclock lists were changed only by the owning backend, but it is pretty questionable now that the fast-path patch added cases where backends add entries to other backends' proclock lists. However, we don't really wish to revert to locking each partition lock every time, because in simple transactions that would add a lot of useless lock/unlock cycles on already-heavily-contended LWLocks. Fortunately, the only way that another backend could be modifying our proclock list at this point would be if it was promoting a formerly fast-path lock of ours; and any such lock must be one that we'd decided not to delete in the previous loop over the locallock table. So it's okay if we miss seeing it in this loop; we'd just decide not to delete it again. However, once we've detected a non-empty list, we'd better re-fetch the list head pointer after acquiring the partition lock. This guards against possibly fetching a corrupt-but-non-null pointer if pointer fetch/store isn't atomic. It's not clear if any practical architectures are like that, but we've never assumed that before and don't wish to start here. In any case, the situation certainly deserves a code comment. While at it, refactor the partition traversal loop to use a for() construct instead of a while() loop with goto's. Back-patch, just in case the risk is real and not hypothetical.	2013-11-28 12:17:46 -05:00
Alvaro Herrera	d51a8c52ba	Unbreak buildfarm I removed an intermediate commit before pushing and forgot to test the resulting tree :-(	2013-11-28 12:59:45 -03:00
Alvaro Herrera	247c76a989	Use a more granular approach to follow update chains Instead of simply checking the KEYS_UPDATED bit, we need to check whether each lock held on the future version of the tuple conflicts with the lock we're trying to acquire. Per bug report #8434 by Tomonari Katsumata	2013-11-28 12:00:12 -03:00
Alvaro Herrera	e4828e9ccb	Compare Xmin to previous Xmax when locking an update chain Not doing so causes us to traverse an update chain that has been broken by concurrent page pruning. All other code that traverses update chains uses this check as one of the cases in which to stop iterating, so replicate it here too. Failure to do so leads to erroneous CLOG, subtrans or multixact lookups. Per discussion following the bug report by J Smith in CADFUPgc5bmtv-yg9znxV-vcfkb+JPRqs7m2OesQXaM_4Z1JpdQ@mail.gmail.com as diagnosed by Andres Freund.	2013-11-28 12:00:12 -03:00
Alvaro Herrera	c235a6a589	Don't try to set InvalidXid as page pruning hint If a transaction updates/deletes a tuple just before aborting, and a concurrent transaction tries to prune the page concurrently, the pruner may see HeapTupleSatisfiesVacuum return HEAPTUPLE_DELETE_IN_PROGRESS, but a later call to HeapTupleGetUpdateXid() return InvalidXid. This would cause an assertion failure in development builds, but would be otherwise Mostly Harmless. Fix by checking whether the updater Xid is valid before trying to apply it as page prune point. Reported by Andres in 20131124000203.GA4403@alap2.anarazel.de	2013-11-28 12:00:12 -03:00
Alvaro Herrera	e518fa7adf	Cope with heap_fetch failure while locking an update chain The reason for the fetch failure is that the tuple was removed because it was dead; so the failure is innocuous and can be ignored. Moreover, there's no need for further work and we can return success to the caller immediately. EvalPlanQualFetch is doing something very similar to this already. Report and test case from Andres Freund in 20131124000203.GA4403@alap2.anarazel.de	2013-11-28 12:00:12 -03:00
Tom Lane	7db285afc9	Fix stale-pointer problem in fast-path locking logic. When acquiring a lock in fast-path mode, we must reset the locallock object's lock and proclock fields to NULL. They are not necessarily that way to start with, because the locallock could be left over from a failed lock acquisition attempt earlier in the transaction. Failure to do this led to all sorts of interesting misbehaviors when LockRelease tried to clean up no-longer-related lock and proclock objects in shared memory. Per report from Dan Wood. In passing, modify LockRelease to elog not just Assert if it doesn't find lock and proclock objects for a formerly fast-path lock, matching the code in FastPathGetRelationLockEntry and LockRefindAndRelease. This isn't a bug but it will help in diagnosing any future bugs in this area. Also, modify FastPathTransferRelationLocks and FastPathGetRelationLockEntry to break out of their loops over the fastpath array once they've found the sole matching entry. This was inconsistently done in some search loops and not others. Improve assorted related comments, too. Back-patch to 9.2 where the fast-path mechanism was introduced.	2013-11-27 18:10:00 -05:00
Tom Lane	8c84803e14	Minor corrections in lmgr/README. Correct an obsolete statement that no backend touches another backend's PROCLOCK lists. This was probably wrong even when written (the deadlock checker looks at everybody's lists), and it's certainly quite wrong now that fast-path locking can require creation of lock and proclock objects on behalf of another backend. Also improve some statements in the hot standby explanation, and do one or two other trivial bits of wordsmithing/ reformatting.	2013-11-27 15:07:13 -05:00
Heikki Linnakangas	631118fe1e	Get rid of the post-recovery cleanup step of GIN page splits. Replace it with an approach similar to what GiST uses: when a page is split, the left sibling is marked with a flag indicating that the parent hasn't been updated yet. When the parent is updated, the flag is cleared. If an insertion steps on a page with the flag set, it will finish split before proceeding with the insertion. The post-recovery cleanup mechanism was never totally reliable, as insertion to the parent could fail e.g because of running out of memory or disk space, leaving the tree in an inconsistent state. This also divides the responsibility of WAL-logging more clearly between the generic ginbtree.c code, and the parts specific to entry and posting trees. There is now a common WAL record format for insertions and deletions, which is written by ginbtree.c, followed by tree-specific payload, which is returned by the placetopage- and split- callbacks.	2013-11-27 19:21:23 +02:00
Heikki Linnakangas	ce5326eed3	More GIN refactoring. Separate the insertion payload from the more static portions of GinBtree. GinBtree now only contains information related to searching the tree, and the information of what to insert is passed separately. Add root block number to GinBtree, instead of passing it around all the functions as argument. Split off ginFinishSplit() from ginInsertValue(). ginFinishSplit is responsible for finding the parent and inserting the downlink to it.	2013-11-27 15:43:05 +02:00
Heikki Linnakangas	4118f7e8ed	Fix plpython3 expected output. I neglected this in the previous commit that updated the plpython2 output, which I forgot to "git add" earlier. As pointed out by Rodolfo Campero and Marko Kreen.	2013-11-27 14:25:13 +02:00
Heikki Linnakangas	82b43f7df2	Don't update relfrozenxid if any pages were skipped. Vacuum recognizes that it can update relfrozenxid by checking whether it has processed all pages of a relation. Unfortunately it performed that check after truncating the dead pages at the end of the relation, and used the new number of pages to decide whether all pages have been scanned. If the new number of pages happened to be smaller or equal to the number of pages scanned, it incorrectly decided that all pages were scanned. This can lead to relfrozenxid being updated, even though some pages were skipped that still contain old XIDs. That can lead to data loss due to xid wraparounds with some rows suddenly missing. This likely has escaped notice so far because it takes a large number (~2^31) of xids being used to see the effect, while a full-table vacuum before that would fix the issue. The incorrect logic was introduced by commit `b4b6923e03`. Backpatch this fix down to 8.4, like that commit. Andres Freund, with some modifications by me.	2013-11-27 13:43:27 +02:00
Michael Meskes	51867a0f9b	ECPG: Fix searching for quoted cursor names case-sensitively. Patch by Böszörményi Zoltán <zb@cybertec.at>	2013-11-27 11:02:13 +01:00
Fujii Masao	d1b88f6b36	Add --xlogdir option to pg_basebackup, for specifying the pg_xlog directory. Haribabu kommi, slightly modified by me.	2013-11-27 14:00:16 +09:00
Peter Eisentraut	85ed91ee7d	Implement information_schema.parameters.parameter_default column Reviewed-by: Ali Dar <ali.munir.dar@gmail.com> Reviewed-by: Amit Khandekar <amit.khandekar@enterprisedb.com> Reviewed-by: Rodolfo Campero <rodolfo.campero@anachronics.com>	2013-11-26 23:21:35 -05:00
Heikki Linnakangas	4c83e0353f	Oops, forgot to "git add" last minute changes to regression test.	2013-11-26 23:05:48 +02:00
Michael Meskes	d2542f9270	ECPG: Fix offset to NULL/size indicator array. Patch by Boszormenyi Zoltan <zb@cybertec.at>	2013-11-26 17:42:33 +01:00
Michael Meskes	f641fc86fb	ECPG: Simplify free_variable() Patch by Boszormenyi Zoltan <zb@cybertec.at>	2013-11-26 17:42:32 +01:00
Michael Meskes	1ec4c56e76	ECPG: Add EXEC SQL CLOSE C to the tests. Patch by Boszormenyi Zoltan <zb@cybertec.at>	2013-11-26 17:42:32 +01:00
Michael Meskes	db58e8ff7c	ECPG: Free the malloc()'ed variables in the test so it comes out clean on Valgrind runs. Patch by Boszormenyi Zoltan <zb@cybertec.at>	2013-11-26 17:42:32 +01:00
Michael Meskes	b46fa32100	ECPG: Make the preprocessor emit ';' if the variable type for a list of variables is varchar. This fixes this test case: int main(void) { exec sql begin declare section; varchar a[50], b[50]; exec sql end declare section; return 0; } Since varchars are internally turned into custom structs and the type name is emitted for these variable declarations, the preprocessed code previously had: struct varchar_1 { ... } a _,_ struct varchar_2 { ... } b ; The comma in the generated C file was a syntax error. There are no regression test changes since it's not exercised. Patch by Boszormenyi Zoltan <zb@cybertec.at>	2013-11-26 17:42:32 +01:00
Heikki Linnakangas	37364c6311	Handle domains over arrays like plain arrays in PL/python. Domains over arrays are now converted to/from python lists when passed as arguments or return values. Like regular arrays. This has some potential to break applications that rely on the old behavior that they are passed as strings, but in practice there probably aren't many such applications out there. Rodolfo Campero	2013-11-26 14:33:31 +02:00
Jeff Davis	7cc0ba9f17	Add missing entry for session_preload_libraries in sample config. The omission was apparently an oversight in the original patch.	2013-11-25 21:03:07 -08:00
Bruce Momjian	a6542a4b68	Change SET LOCAL/CONSTRAINTS/TRANSACTION and ABORT behavior Change SET LOCAL/CONSTRAINTS/TRANSACTION behavior outside of a transaction block from error (post-9.3) to warning. (Was nothing in <= 9.3.) Also change ABORT outside of a transaction block from notice to warning.	2013-11-25 19:19:40 -05:00
Michael Meskes	05b476c298	More improvement to comment parsing in ecpg. ECPG is not supposed to allow and output nested comments in C. These comments are only allowed in the SQL parts and must not be written into the C file. Also the different handling of different comments is documented.	2013-11-25 15:38:09 +01:00
Michael Meskes	ef8b3b00b5	Fix ecpg parsing of sizeof(). The last fix used the wrong non-terminal to define valid types.	2013-11-25 15:11:39 +01:00
Jeff Davis	559d535819	Lessen library-loading log level. Previously, messages were emitted at the LOG level every time a backend preloaded a library. That was acceptable (though unnecessary) for shared_preload_libraries; but it was excessive for local_preload_libraries and session_preload_libraries. Reduce to DEBUG1. Also, there was logic in the EXEC_BACKEND case to avoid repeated messages for shared_preload_libraries by demoting them to DEBUG2. DEBUG1 seems more appropriate there, as well, so eliminate that special case. Peter Geoghegan.	2013-11-24 10:50:54 -08:00
Tom Lane	36a3be6540	Fix new and latent bugs with errno handling in secure_read/secure_write. These functions must be careful that they return the intended value of errno to their callers. There were several scenarios where this might not happen: 1. The recent SSL renegotiation patch added a hunk of code that would execute after setting errno. In the first place, it's doubtful that we should consider renegotiation to be successfully completed after a failure, and in the second, there's no real guarantee that the called OpenSSL routines wouldn't clobber errno. Fix by not executing that hunk except during success exit. 2. errno was left in an unknown state in case of an unrecognized return code from SSL_get_error(). While this is a "can't happen" case, it seems like a good idea to be sure we know what would happen, so reset errno to ECONNRESET in such cases. (The corresponding code in libpq's fe-secure.c already did this.) 3. There was an (undocumented) assumption that client_read_ended() wouldn't change errno. While true in the current state of the code, this seems less than future-proof. Add explicit saving/restoring of errno to make sure that changes in the called functions won't break things. I see no need to back-patch, since #1 is new code and the other two issues are mostly hypothetical. Per discussion with Amit Kapila.	2013-11-24 13:09:38 -05:00
Michael Meskes	08d1b22b3b	Allow C array definitions to use sizeof(). When parsing C variable definitions ecpg should allow sizeof() operators as array dimensions.	2013-11-24 12:51:21 +01:00
Michael Meskes	8ac5e88f9f	Distinguish between C and SQL mode for C-style comments. SQL standard asks for allowing nested comments, while C does not. Therefore the two comments, while mostly similar, have to be parsed seperately.	2013-11-24 12:26:00 +01:00
Peter Eisentraut	a5036ca998	PL/Tcl: Add event trigger support From: Dimitri Fontaine <dimitri@2ndQuadrant.fr>	2013-11-23 21:32:00 -05:00
Tom Lane	45e02e3232	Fix array slicing of int2vector and oidvector values. The previous coding labeled expressions such as pg_index.indkey[1:3] as being of int2vector type; which is not right because the subscript bounds of such a result don't, in general, satisfy the restrictions of int2vector. To fix, implicitly promote the result of slicing int2vector to int2[], or oidvector to oid[]. This is similar to what we've done with domains over arrays, which is a good analogy because these types are very much like restricted domains of the corresponding regular-array types. A side-effect is that we now also forbid array-element updates on such columns, eg while "update pg_index set indkey[4] = 42" would have worked before if you were superuser (and corrupted your catalogs irretrievably, no doubt) it's now disallowed. This seems like a good thing since, again, some choices of subscripting would've led to results not satisfying the restrictions of int2vector. The case of an array-slice update was rejected before, though with a different error message than you get now. We could make these cases work in future if we added a cast from int2[] to int2vector (with a cast function checking the subscript restrictions) but it seems unlikely that there's any value in that. Per report from Ronan Dunklau. Back-patch to all supported branches because of the crash risks involved.	2013-11-23 20:03:56 -05:00
Tom Lane	f145454d57	Ensure _dosmaperr() actually sets errno correctly. If logging is enabled, either ereport() or fprintf() might stomp on errno internally, causing this function to return the wrong result. That might only end in a misleading error report, but in any code that's examining errno to decide what to do next, the consequences could be far graver. This has been broken since the very first version of this file in 2006 ... it's a bit astonishing that we didn't identify this long ago. Reported by Amit Kapila, though this isn't his proposed fix.	2013-11-23 18:24:26 -05:00
Peter Eisentraut	b7212c9726	Fix thinko in SPI_execute_plan() calls Two call sites were apparently thinking that the last argument of SPI_execute_plan() is the number of query parameters, but it is actually the row limit. Change the calls to 0, since we don't care about the limit there. The previous code didn't break anything, but it was still wrong.	2013-11-23 09:34:57 -05:00
Peter Eisentraut	4053189d59	Avoid potential buffer overflow crash A pointer to a C string was treated as a pointer to a "name" datum and passed to SPI_execute_plan(). This pointer would then end up being passed through datumCopy(), which would try to copy the entire 64 bytes of name data, thus running past the end of the C string. Fix by converting the string to a proper name structure. Found by LLVM AddressSanitizer.	2013-11-23 07:25:37 -05:00
Tom Lane	f19e92ed04	Flatten join alias Vars before pulling up targetlist items from a subquery. pullup_replace_vars()'s decisions about whether a pulled-up replacement expression needs to be wrapped in a PlaceHolderVar depend on the assumption that what looks like a Var behaves like a Var. However, if the Var is a join alias reference, later flattening of join aliases might replace the Var with something that's not a Var at all, and should have been wrapped. To fix, do a forcible pass of flatten_join_alias_vars() on the subquery targetlist before we start to copy items out of it. We'll re-run that processing on the pulled-up expressions later, but that's harmless. Per report from Ken Tanzer; the added regression test case is based on his example. This bug has been there since the PlaceHolderVar mechanism was invented, but has escaped detection because the circumstances that trigger it are fairly narrow. You need a flattenable query underneath an outer join, which contains another flattenable query inside a join of its own, with a dangerous expression (a constant or something else non-strict) in that one's targetlist. Having seen this, I'm wondering if it wouldn't be prudent to do all alias-variable flattening earlier, perhaps even in the rewriter. But that would probably not be a back-patchable change.	2013-11-22 14:37:21 -05:00
Heikki Linnakangas	98f58a30c1	Fix Hot-Standby initialization of clog and subtrans. These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read only queries, by marking committed transactions as uncommited. The likelihood of such corruptions is small unless the primary has a high transaction rate. `5a031a5556` fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superflous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unneccessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record old transactions commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.	2013-11-22 14:45:41 +02:00
Heikki Linnakangas	1a3d104475	Avoid acquiring spinlock when checking if recovery has finished, for speed. RecoveryIsInProgress() can be called very frequently. During normal operation, it just checks a backend-local variable and returns quickly, but during hot standby, it checks a spinlock-protected shared variable. Those spinlock acquisitions can become a point of contention on a busy hot standby system. Replace the spinlock acquisition with a memory barrier. Per discussion with Andres Freund, Ants Aasma and Merlin Moncure.	2013-11-22 13:07:23 +02:00
Peter Eisentraut	f4482a542c	Tweak streamutil.c further to avoid scan-build warning The previous change added a new scan-build warning about need_password assigned but not read.	2013-11-21 21:46:43 -05:00
Tom Lane	784e762e88	Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list. This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well. Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)). The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile. Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me	2013-11-21 19:37:20 -05:00
Fujii Masao	38f4328981	Fix pg_isready to handle -d option properly. Previously, -d option for pg_isready was broken. When the name of the database was specified by -d option, pg_isready failed with an error. When the conninfo specified by -d option contained the setting of the host name but not Numeric IP address (i.e., hostaddr), pg_isready displayed wrong connection message. -d option could not handle a valid URI prefix at all. This commit fixes these bugs of pg_isready. Backpatch to 9.3, where pg_isready was introduced. Per report from Josh Berkus and Robert Haas. Original patch by Fabrízio de Royes Mello, heavily modified by me.	2013-11-21 21:52:03 +09:00
Heikki Linnakangas	04eee1fa9e	More GIN refactoring. Split off the portion of ginInsertValue that inserts the tuple to current level into a separate function, ginPlaceToPage. ginInsertValue's charter is now to recurse up the tree to insert the downlink, when a page split is required. This is in preparation for a patch to change the way incomplete splits are handled, which will need to do these operations separately. And IMHO makes the code more readable anyway.	2013-11-20 17:01:33 +02:00
Heikki Linnakangas	501012631e	Refactor the internal GIN B-tree interface for forming a downlink. This creates a new gin-btree callback function for creating a downlink for a page. Previously, ginxlog.c duplicated the logic used during normal operation.	2013-11-20 16:57:41 +02:00
Heikki Linnakangas	04965ad40e	Further GIN refactoring. Merge some functions that were always called together. Makes the code little bit more readable.	2013-11-20 16:09:14 +02:00
Peter Eisentraut	b21de4e7b3	ecpg: Split off mmfatal() from mmerror() This allows decorating mmfatal() with noreturn compiler hints, leading to better diagnostics.	2013-11-19 21:56:54 -05:00
Fujii Masao	b1543cc8a8	Add tab completion for \pset in psql. Pavel Stehule, reviewed by Ian Lawrence Barwick	2013-11-19 23:44:14 +09:00
Heikki Linnakangas	fea437681d	Spell SQL keywords in uppercase in pg_dump's query. The server won't care, but let's be consistent. David Rowley.	2013-11-18 18:34:51 +02:00
Heikki Linnakangas	32ceba3ea7	Replace appendPQExpBuffer(..., <constant>) with appendPQExpBufferStr Arguably makes the code a bit more readable, and might give a small performance gain. David Rowley	2013-11-18 18:34:51 +02:00
Robert Haas	f1df4731ee	Use cstring_to_text_with_len when length is known. This avoids a potentially-expensive extra call to strlen(). David Rowley	2013-11-18 10:19:00 -05:00
Heikki Linnakangas	4c697d8f48	Count locked pages that don't need vacuuming as scanned. Previously, if VACUUM skipped vacuuming a page because it's pinned, it didn't count that page as scanned. However, that meant that relfrozenxid was not bumped up either, which prevented anti-wraparound vacuum from doing its job. Report by Миша Тюрин, analysis and patch by Sergey Burladyn and Jeff Janes. Backpatch to 9.2, where the skip-locked-pages behavior was introduced.	2013-11-18 09:51:09 +02:00
Tom Lane	f901bb50e3	Add make_date() and make_time() functions. Pavel Stehule, reviewed by Jeevan Chalke and Atri Sharma	2013-11-17 15:06:50 -05:00
Tom Lane	69c8fbac20	Improve performance of numeric sum(), avg(), stddev(), variance(), etc. This patch improves performance of most built-in aggregates that formerly used a NUMERIC or NUMERIC array as their transition type; this includes not only aggregates on numeric inputs, but some aggregates on integer inputs where overflow of an int8 value is a possibility. The code now uses a special-purpose data structure to avoid array construction and deconstruction overhead, as well as packing and unpacking overhead for numeric values. These aggregates' transition type is now declared as INTERNAL, since it doesn't correspond to any SQL data type. To keep the planner from thinking that that means a lot of storage will be used, we make use of the just-added pg_aggregate.aggtransspace feature. The space estimate is set to 128 bytes, which is at least in the right ballpark. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 18:46:34 -05:00
Tom Lane	6cb86143e8	Allow aggregates to provide estimates of their transition state data size. Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 16:03:40 -05:00
Tom Lane	f1f21b2d6f	Fix incorrect loop counts in tidbitmap.c. A couple of places that should have been iterating over WORDS_PER_CHUNK words were iterating over WORDS_PER_PAGE words instead. This thinko accidentally failed to fail, because (at least on common architectures with default BLCKSZ) WORDS_PER_CHUNK is a bit less than WORDS_PER_PAGE, and the extra words being looked at were always zero so nothing happened. Still, it's a bug waiting to happen if anybody ever fools with the parameters affecting TIDBitmap sizes, and it's a small waste of cycles too. So back-patch to all active branches. Etsuro Fujita	2013-11-15 18:34:14 -05:00
Tom Lane	97e1ec4670	Speed up printing of INSERT statements in pg_dump. In --inserts and especially --column-inserts mode, we can get a useful speedup by generating the common prefix of all a table's INSERT commands just once, and then printing the prebuilt string for each row. This avoids multiple invocations of fmtId() and other minor fooling around. David Rowley	2013-11-15 18:02:06 -05:00
Tom Lane	3172eea062	Clean up password prompting logic in streamutil.c. The previous coding was fairly unreadable and drew double-free warnings from clang. I believe the double free was actually not reachable, because PQconnectionNeedsPassword is coded to not return true if a password was provided, so that the loop can't iterate more than twice. Nonetheless it seems worth rewriting. No back-patch since this is just cosmetic.	2013-11-15 17:27:41 -05:00
Tom Lane	f3b3b8d5be	Compute correct em_nullable_relids in get_eclass_for_sort_expr(). Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr must be able to compute valid em_nullable_relids for any new equivalence class members it creates. I'd worried about this in the commit message for `db9f0e1d9a`, but claimed that it wasn't a problem because multi-member ECs should already exist when it runs. That is transparently wrong, though, because this function is also called by initialize_mergeclause_eclasses, which runs during deconstruct_jointree. The example given in the bug report (which the new regression test item is based upon) fails because the COALESCE() expression is first seen by initialize_mergeclause_eclasses rather than process_equivalence. Fixing this requires passing the appropriate nullable_relids set to get_eclass_for_sort_expr, and it requires new code to compute that set for top-level expressions such as ORDER BY, GROUP BY, etc. We store the top-level nullable_relids in a new field in PlannerInfo to avoid computing it many times. In the back branches, I've added the new field at the end of the struct to minimize ABI breakage for planner plugins. There doesn't seem to be a good alternative to changing get_eclass_for_sort_expr's API signature, though. There probably aren't any third-party extensions calling that function directly; moreover, if there are, they probably need to think about what to pass for nullable_relids anyway. Back-patch to 9.2, like the previous patch in this area.	2013-11-15 16:46:18 -05:00
Tom Lane	c7b849a896	Prevent leakage of cached plans and execution trees in plpgsql DO blocks. plpgsql likes to cache query plans and simple-expression execution state trees across calls. This is a considerable win for multiple executions of the same function. However, it's useless for DO blocks, since by definition those are executed only once and discarded. Nonetheless, we were allowing a DO block's expression execution trees to survive until end of transaction, resulting in a significant intra-transaction memory leak, as reported by Yeb Havinga. Worse, if the DO block exited with an error, the compiled form of the block's code was leaked till end of session --- along with subsidiary plancache entries. To fix, make DO blocks keep their expression execution trees in a private EState that's deleted at exit from the block, and add a PG_TRY block to plpgsql_inline_handler to make sure that memory cleanup happens even on error exits. Also add a regression test covering error handling in a DO block, because my first try at this broke that. (The test is not meant to prove that we don't leak memory anymore, though it could be used for that with a much larger loop count.) Ideally we'd back-patch this into all versions supporting DO blocks; but the patch needs to add a field to struct PLpgSQL_execstate, and that would break ABI compatibility for third-party plugins such as the plpgsql debugger. Given the small number of complaints so far, fixing this in HEAD only seems like an acceptable choice.	2013-11-15 13:52:03 -05:00
Tom Lane	80e3a470ba	Minor comment corrections for sequence hashtable patch. There were enough typos in the comments to annoy me ...	2013-11-15 12:17:12 -05:00
Kevin Grittner	7cb964acb7	Fix buffer overrun in isolation test program. Commit `061b88c732` saved argv0 to a global buffer without ensuring that it was zero terminated, allowing references to it to overrun the buffer and access other memory. This probably would not have presented any security risk, but could have resulted in very confusing failures if the path to the executable was very long. Reported by David Rowley	2013-11-15 08:27:42 -06:00
Heikki Linnakangas	5cb719beee	Fix bogus hash table creation. Andres Freund	2013-11-15 14:23:40 +02:00
Heikki Linnakangas	21025d4a53	Use a hash table to store current sequence values. This speeds up nextval() and currval(), when you touch a lot of different sequences in the same backend. David Rowley	2013-11-15 12:29:38 +02:00
Tom Lane	982b82d6b1	Add a regression test case for \d on an index. Previous commit shows the need for this. The coverage isn't really thorough, but it's better than nothing.	2013-11-14 10:35:15 -05:00
Tom Lane	e694cf25d7	Fix incorrect column name in psql \d code. pg_index.indisreplident had at one time in its development been called indisidentity. describe.c got missed when it was renamed. Bug introduced in commit `07cacba983`. Andres Freund	2013-11-14 10:27:24 -05:00
Peter Eisentraut	87d8378f60	Fix whitespace	2013-11-13 21:25:52 -05:00
Andrew Dunstan	869b1e4a67	Fix isolation check for MSVC to handle recent changes.	2013-11-13 12:59:48 -05:00
Robert Haas	c46c803f8a	Fix relfilenodemap.c's handling of cache invalidations. The old code entered a new hash table entry first, then scanned pg_class to determine what value to fill in, and then populated the entry. This fails to work properly if a cache invalidation happens as a result of opening pg_class. Repair. Along the way, get rid of the idea of blowing away the entire hash table as a method of processing invalidations. Instead, just delete all the entries one by one. This is probably not quite as cheap but it's simpler, and shouldn't happen often. Andres Freund	2013-11-13 10:52:59 -05:00
Kevin Grittner	fe67d25233	Free ignorelist after each regression test schedule. It's a trivial amount of RAM held until the end of the regression test run; but it's probably worth fixing to silence future warnings from code analyzers. This was the only memory leak pointed out by clang's static code analysis tool.	2013-11-13 09:01:06 -06:00
Heikki Linnakangas	07fca603b5	Fix bug in GIN posting tree root creation. The root page is filled with as many items as fit, and the rest are inserted using normal insertions. However, I fumbled the variable names, and the code actually memcpy'd all the items on the page, overflowing the buffer. While at it, rename the variable to make the distinction more clear. Reported by Teodor Sigaev. This bug was introduced by my recent refactorings, so no backpatching required.	2013-11-13 13:47:59 +02:00
Peter Eisentraut	aa04b323c3	Move variable closer to where it is used This avoids an unused variable warning on Windows when building without asserts From: David Rowley <dgrowleyml@gmail.com>	2013-11-13 06:26:27 -05:00
Robert Haas	061b88c732	Try again to make pg_isolation_regress work its build directory. We can't search for the isolationtester binary until after we've set up the environment, because otherwise when find_other_exec() tries to invoke it with the -V option, it might fail for inability to locate a working libpq. So postpone that step. Andres Freund	2013-11-12 11:23:47 -05:00
Peter Eisentraut	3626adf266	Remove leftovers of IRIX port This removes the remaining pieces of the IRIX port that was removed by `ea91a6be89`.	2013-11-12 06:39:36 -05:00
Tom Lane	ebefbb5fde	Fix failure with whole-row reference to a subquery. Simple oversight in commit `1cb108efb0` --- recursively examining a subquery output column is only sane if the original Var refers to a single output column. Found by Kevin Grittner.	2013-11-11 16:36:27 -05:00
Tom Lane	0b7e660d6c	Fix ruleutils pretty-printing to not generate trailing whitespace. The pretty-printing logic in ruleutils.c operates by inserting a newline and some indentation whitespace into strings that are already valid SQL. This naturally results in leaving some trailing whitespace before the newline in many cases; which can be annoying when processing the output with other tools, as complained of by Joe Abbate. We can fix that in a pretty localized fashion by deleting any trailing whitespace before we append a pretty-printing newline. In addition, we have to modify the code inserted by commit `2f582f76b1` so that we also delete trailing whitespace when transposing items from temporary buffers into the main result string, when a temporary item starts with a newline. This results in rather voluminous changes to the regression test results, but it's easily verified that they are only removal of trailing whitespace. Back-patch to 9.3, because the aforementioned commit resulted in many more cases of trailing whitespace than had occurred in earlier branches.	2013-11-11 13:36:38 -05:00
Tom Lane	648bd05b13	Re-allow duplicate aliases within aliased JOINs. Although the SQL spec forbids duplicate table aliases, historically we've allowed queries like SELECT ... FROM tab1 x CROSS JOIN (tab2 x CROSS JOIN tab3 y) z on the grounds that the aliased join (z) hides the aliases within it, therefore there is no conflict between the two RTEs named "x". The LATERAL patch broke this, on the misguided basis that "x" could be ambiguous if tab3 were a LATERAL subquery. To avoid breaking existing queries, it's better to allow this situation and complain only if tab3 actually does contain an ambiguous reference. We need only remove the check that was throwing an error, because the column lookup code is already prepared to handle ambiguous references. Per bug #8444.	2013-11-11 10:42:57 -05:00
Magnus Hagander	705556a631	Don't abort pg_basebackup when receiving empty WAL block This is a similar fix as `c6ec8793aa` 9.2. This should never happen in 9.3 and newer since the special case cannot happen there, but this patch synchronizes up the code so there is no confusion on why they're different. An empty block is as harmless in 9.3 as it was in 9.2, and can safely be ignored.	2013-11-11 14:59:55 +01:00
Peter Eisentraut	001e114b8d	Fix whitespace issues found by git diff --check, add gitattributes Set per file type attributes in .gitattributes to fine-tune whitespace checks. With the associated cleanups, the tree is now clean for git	2013-11-10 14:48:29 -05:00
Robert Haas	dca09ac533	Fix ECPG compiler warning. Commit `9b4d52f209` failed to notice that pg_regress_ecpg needed updating. This patch was independently submitted by both David Rowley and Andres Freund.	2013-11-09 18:53:57 -05:00
Heikki Linnakangas	ac4ab97ec0	Fix race condition in GIN posting tree page deletion. If a page is deleted, and reused for something else, just as a search is following a rightlink to it from its left sibling, the search would continue scanning whatever the new contents of the page are. That could lead to incorrect query results, or even something more curious if the page is reused for a different kind of a page. To fix, modify the search algorithm to lock the next page before releasing the previous one, and refrain from deleting pages from the leftmost branch of the tree. Add a new Concurrency section to the README, explaining why this works. There is a lot more one could say about concurrency in GIN, but that's for another patch. Backpatch to all supported versions.	2013-11-08 22:21:42 +02:00
Robert Haas	9b4d52f209	Fix pg_isolation_regress to work outside its build directory. This makes it possible to, for example, use the isolation tester to test a contrib module. Andres Freund	2013-11-08 14:40:41 -05:00
Robert Haas	07cacba983	Add the notion of REPLICA IDENTITY for a table. Pending patches for logical replication will use this to determine which columns of a tuple ought to be considered as its candidate key. Andres Freund, with minor, mostly cosmetic adjustments by me	2013-11-08 12:30:43 -05:00
Tom Lane	b97ee66cc1	Make contain_volatile_functions/contain_mutable_functions look into SubLinks. This change prevents us from doing inappropriate subquery flattening in cases such as dangerous functions hidden inside a sub-SELECT in the targetlist of another sub-SELECT. That could result in unexpected behavior due to multiple evaluations of a volatile function, as in a recent complaint from Etienne Dube. It's been questionable from the very beginning whether these functions should look into subqueries (as noted in their comments), and this case seems to provide proof that they should. Because the new code only descends into SubLinks, not SubPlans or InitPlans, the change only affects the planner's behavior during prepjointree processing and not later on --- for example, you can still get it to use a volatile function in an indexqual if you wrap the function in (SELECT ...). That's a historical behavior, for sure, but it's reasonable given that the executor's evaluation rules for subplans don't depend on whether there are volatile functions inside them. In any case, we need to constrain the behavioral change as narrowly as we can to make this reasonable to back-patch.	2013-11-08 11:36:57 -05:00
Tom Lane	060b22a99a	Fix subtly-wrong volatility checking in BeginCopyFrom(). contain_volatile_functions() is best applied to the output of expression_planner(), not its input, so that insertion of function default arguments and constant-folding have been done. (See comments at CheckMutability, for instance.) It's perhaps unlikely that anyone will notice a difference in practice, but still we should do it properly. In passing, change variable type from Node* to Expr* to reduce the net number of casts needed. Noted while perusing uses of contain_volatile_functions().	2013-11-08 08:59:39 -05:00
Tom Lane	20803d7881	Make LOCK_PRINT & PROCLOCK_PRINT expand to ((void) 0) when not in use. This avoids warnings from more-anal-than-average compilers, and might prevent hidden syntax problems in the future. Andres Freund	2013-11-07 19:07:48 -05:00
Kevin Grittner	b64b5ccb6a	Silence benign warnings from clang version 3.0-6ubuntu3.	2013-11-07 16:35:43 -06:00
Tom Lane	c28b289bf3	Prevent display of dropped columns in row constraint violation messages. ExecBuildSlotValueDescription() printed "null" for each dropped column in a row being complained of by ExecConstraints(). This has some sanity in terms of the underlying implementation, but is of course pretty surprising to users. To fix, we must pass the target relation's descriptor to ExecBuildSlotValueDescription(), because the slot descriptor it had been using doesn't get labeled with attisdropped markers. Per bug #8408 from Maxim Boguk. Back-patch to 9.2 where the feature of printing row values in NOT NULL and CHECK constraint violation messages was introduced. Michael Paquier and Tom Lane	2013-11-07 14:41:36 -05:00
Tom Lane	5e900bc00f	Fix generation of MergeAppend plans for optimized min/max on expressions. Before jamming a desired targetlist into a plan node, one really ought to make sure the plan node can handle projections, and insert a buffering Result plan node if not. planagg.c forgot to do this, which is a hangover from the days when it only dealt with IndexScan plan types. MergeAppend doesn't project though, not to mention that it gets unhappy if you remove its possibly-resjunk sort columns. The code accidentally failed to fail for cases in which the min/max argument was a simple Var, because the new targetlist would be equivalent to the original "flat" tlist anyway. For any more complex case, it's been broken since 9.1 where we introduced the ability to optimize min/max using MergeAppend, as reported by Raphael Bauduin. Fix by duplicating the logic from grouping_planner that decides whether we need a Result node. In 9.2 and 9.1, this requires back-porting the tlist_same_exprs() function introduced in commit `4387cf956b`, else we'd uselessly add a Result node in cases that worked before. It's rather tempting to back-patch that whole commit so that we can avoid extra Result nodes in mainline cases too; but I'll refrain, since that code hasn't really seen all that much field testing yet.	2013-11-07 13:14:14 -05:00
Heikki Linnakangas	fde7172d93	Fix setting of right bound at GIN page split. Broken by my refactoring.	2013-11-07 19:45:07 +02:00
Tom Lane	8dace66e07	Add #ifdef guards for some POSIX error symbols that Windows doesn't like. Per buildfarm results. It looks like the older the Windows version, the more errno codes it hasn't got ...	2013-11-06 20:22:42 -05:00
Tom Lane	8e68816cc2	Be more robust when strerror() doesn't give a useful result. glibc, at least, is capable of returning "???" instead of anything useful if it doesn't like the setting of LC_CTYPE. If this happens, or in the previously-known case of strerror() returning an empty string, try to print the C macro name for the error code ("EACCES" etc). Only if we don't have the error code in our compiled-in list of popular error codes (which covers most though not quite all of what's called out in the POSIX spec) will we fall back to printing a numeric error code. This should simplify debugging. Note that this functionality is currently only provided for %m in backend ereport/elog messages. That may be sufficient, since we don't fool with the locale environment in frontend clients, but it's foreseeable that we might want similar code in libpq for instance. There was some talk of back-patching this, but let's see how the buildfarm likes it first. It seems likely that at least some of the POSIX-defined error code symbols don't exist on all platforms. I don't want to clutter the entire list with #ifdefs, but we may need more than are here now. MauMau, edited by me	2013-11-06 15:50:17 -05:00
Tom Lane	bb45c64041	Support default arguments and named-argument notation for window functions. These things didn't work because the planner omitted to do the necessary preprocessing of a WindowFunc's argument list. Add the few dozen lines of code needed to handle that. Although this sounds like a feature addition, it's really a bug fix because the default-argument case was likely to crash previously, due to lack of checking of the number of supplied arguments in the built-in window functions. It's not a security issue because there's no way for a non-superuser to create a window function definition with defaults that refers to a built-in C function, but nonetheless people might be annoyed that it crashes rather than producing a useful error message. So back-patch as far as the patch applies easily, which turns out to be 9.2. I'll put a band-aid in earlier versions as a separate patch. (Note that these features still don't work for aggregates, and fixing that case will be harder since we represent aggregate arg lists as target lists not bare expression lists. There's no crash risk though because CREATE AGGREGATE doesn't accept defaults, and we reject named-argument notation when parsing an aggregate call.)	2013-11-06 13:33:09 -05:00
Kevin Grittner	5829082a57	Keep heap open until new heap generated in RMV. Early close became apparent when invalidation messages were processed in a new location under CLOBBER_CACHE_ALWAYS builds, due to additional locking. Back-patch to 9.3	2013-11-06 12:27:52 -06:00
Heikki Linnakangas	0ea53256a8	Fix missing argument and function prototypes. Not sure how I missed these in previous commit.	2013-11-06 11:22:58 +02:00
Heikki Linnakangas	ecaa4708e5	Misc GIN refactoring. Merge the isEnoughSpace and placeToPage functions in the b-tree interface into one function that tries to put a tuple on page, and returns false if it doesn't fit. Move createPostingTree function to gindatapage.c, and change its contract so that it can be passed more items than fit on the root page. It's in a better position than the callers to know how many items fit. Move ginMergeItemPointers out of gindatapage.c, into a separate file. These changes make no difference now, but reduce the footprint of Alexander Korotkov's upcoming patch to pack item pointers more tightly.	2013-11-06 10:32:09 +02:00
Tom Lane	920c8261d5	Improve the error message given for modifying a window with frame clause. For rather inscrutable reasons, SQL:2008 disallows copying-and-modifying a window definition that has any explicit framing clause. The error message we gave for this only made sense if the referencing window definition itself contains an explicit framing clause, which it might well not. Moreover, in the context of an OVER clause it's not exactly obvious that "OVER (windowname)" implies copy-and-modify while "OVER windowname" does not. This has led to multiple complaints, eg bug #5199 from Iliya Krapchatov. Change to a hopefully more intelligible error message, and in the case where we have just "OVER (windowname)", add a HINT suggesting that omitting the parentheses will fix it. Also improve the related documentation. Back-patch to all supported branches.	2013-11-05 21:58:08 -05:00
Tom Lane	d4e6133c68	Revert commit `0725065b37`. The previous commit was intended to make psql show the full path name when doing a \s (history save), but it was very badly implemented and would show confusing if not outright wrong information in many situations; for instance if the path name given to \s is absolute, or if \cd commands involving relative paths have been issued. Consensus seems to be that we don't especially need this functionality in \s, and certainly not in \s alone. So revert rather than trying to fix it up. Per gripe from Ian Barwick. Although the bogus behavior exists in all supported versions, I'm not back-patching, because the work created for translators (by change of a translatable message) would probably outweigh the value of what is after all a mostly-cosmetic change.	2013-11-05 17:52:09 -05:00
Kevin Grittner	2636ecf78b	Lock relation used to generate fresh data for RMV. The relation should not be accessible to any other process, but it should be locked for consistency. Since this is not known to cause any bug, it will not be back-patch, at least for now. Per report from Andres Freund	2013-11-05 15:36:33 -06:00
Tom Lane	6331de1d44	Fix some obsolete information in src/backend/optimizer/README. Constant quals aren't handled the same way they used to be. Also, add mention of a couple more major steps in grouping_planner. Per complaint a couple months back from Etsuro Fujita.	2013-11-05 11:31:35 -05:00
Kevin Grittner	732758db4c	Fix breakage of MV column name list usage. Per bug report from Tomonari Katsumata. Back-patch to 9.3.	2013-11-04 14:31:07 -06:00
Robert Haas	dddc34408a	Fix format code used to print dsm request sizes. Per report from Peter Eisentraut.	2013-11-04 11:22:03 -05:00
Heikki Linnakangas	2103430179	Fix parsing of xlog file name in pg_receivexlog. The parsing of WAL filenames of segments larger than > 255 was broken, making pg_receivexlog unable to restart streaming after stopping it. The bug was introduced by the changes in 9.3 to represent WAL segment number as a 64-bit integer instead of two ints, log and seg. To fix, replace the plain sscanf call with XLogFromFileName macro, which does the conversion from log+seg to a 64-bit integer correcly. Reported by Mika Eloranta.	2013-11-04 10:57:58 +02:00
Tom Lane	e36ce0c7f7	Get rid of more cases of the "must detoast before output function" meme. I missed that json.c was doing this too, because for some bizarre reason it wasn't doing it adjacent to the output function call.	2013-11-03 11:55:37 -05:00
Tom Lane	b006f4ddb9	Prevent memory leaks from accumulating across printtup() calls. Historically, printtup() has assumed that it could prevent memory leakage by pfree'ing the string result of each output function and manually managing detoasting of toasted values. This amounts to assuming that datatype output functions never leak any memory internally; an assumption we've already decided to be bogus elsewhere, for example in COPY OUT. range_out in particular is known to leak multiple kilobytes per call, as noted in bug #8573 from Godfried Vanluffelen. While we could go in and fix that leak, it wouldn't be very notationally convenient, and in any case there have been and undoubtedly will again be other leaks in other output functions. So what seems like the best solution is to run the output functions in a temporary memory context that can be reset after each row, as we're doing in COPY OUT. Some quick experimentation suggests this is actually a tad faster than the retail pfree's anyway. This patch fixes all the variants of printtup, except for debugtup() which is used in standalone mode. It doesn't seem worth worrying about query-lifespan leaks in standalone mode, and fixing that case would be a bit tedious since debugtup() doesn't currently have any startup or shutdown functions. While at it, remove manual detoast management from several other output-function call sites that had copied it from printtup(). This doesn't make a lot of difference right now, but in view of recent discussions about supporting "non-flattened" Datums, we're going to want that code gone eventually anyway. Back-patch to 9.2 where range_out was introduced. We might eventually decide to back-patch this further, but in the absence of known major leaks in older output functions, I'll refrain for now.	2013-11-03 11:33:05 -05:00
Michael Meskes	84a05d479e	Changed test case slightly so it doesn't have an unused typedef.	2013-11-03 15:37:34 +01:00
Kevin Grittner	2a781d57dc	Acquire appropriate locks when rewriting during RMV. Since the query has not been freshly parsed when executing REFRESH MATERIALIZED VIEW, locks must be explicitly taken before rewrite. Backpatch to 9.3. Andres Freund	2013-11-02 19:18:08 -05:00
Kevin Grittner	be420fa02e	Fix subquery reference to non-populated MV in CMV. A subquery reference to a matview should be allowed by CREATE MATERIALIZED VIEW WITH NO DATA, just like a direct reference is. Per bug report from Laurent Sartran. Backpatch to 9.3.	2013-11-02 18:38:17 -05:00
Tom Lane	24ace4053d	Retry after buffer locking failure during SPGiST index creation. The original coding thought this case was impossible, but it can happen if the bgwriter or checkpointer processes decide to write out an index page while creation is still proceeding, leading to a bogus "unexpected spgdoinsert() failure" error. Problem reported by Jonathan S. Katz. Teodor Sigaev	2013-11-02 16:45:42 -04:00
Tom Lane	bffd1ce92c	Ensure all files created for a single BufFile have the same resource owner. Callers expect that they only have to set the right resource owner when creating a BufFile, not during subsequent operations on it. While we could insist this be fixed at the caller level, it seems more sensible for the BufFile to take care of it. Without this, some temp files belonging to a BufFile can go away too soon, eg at the end of a subtransaction, leading to errors or crashes. Reported and fixed by Andres Freund. Back-patch to all active branches.	2013-11-01 16:09:48 -04:00
Tom Lane	45f64f1bbf	Remove CTimeZone/HasCTZSet, root and branch. These variables no longer have any useful purpose, since there's no reason to special-case brute force timezones now that we have a valid session_timezone setting for them. Remove the variables, and remove the SET/SHOW TIME ZONE code that deals with them. The user-visible impact of this is that SHOW TIME ZONE will now show a POSIX-style zone specification, in the form "<+-offset>-+offset", rather than an interval value when a brute-force zone has been set. While perhaps less intuitive, this is a better definition than before because it's actually possible to give that string back to SET TIME ZONE and get the same behavior, unlike what used to happen. We did not previously mention the angle-bracket syntax when describing POSIX timezone specifications; add some documentation so that people can figure out what these strings do. (There's still quite a lot of undocumented functionality there, but anybody who really cares can go read the POSIX spec to find out about it. In practice most people seem to prefer Olsen-style city names anyway.)	2013-11-01 13:57:31 -04:00
Tom Lane	1c8a7f617f	Remove internal uses of CTimeZone/HasCTZSet. The only remaining places where we actually look at CTimeZone/HasCTZSet are abstime2tm() and timestamp2tm(). Now that session_timezone is always valid, we can remove these special cases. The caller-visible impact of this is that these functions now always return a valid zone abbreviation if requested, whereas before they'd return a NULL pointer if a brute-force timezone was in use. In the existing code, the only place I can find that changes behavior is to_char(), whose TZ format code will now print something useful rather than nothing for such zones. (In the places where the returned zone abbreviation is passed to EncodeDateTime, the lack of visible change is because we've chosen the abbreviation used for these zones to match what EncodeTimezone would have printed.) It's likely that there is now a fair amount of removable dead code around the call sites, namely anything that's meant to cope with getting a NULL timezone abbreviation, but I've not made an effort to root that out. This could be back-patched if we decide we'd like to fix to_char()'s behavior in the back branches, but there doesn't seem to be much enthusiasm for that at present.	2013-11-01 12:51:27 -04:00
Tom Lane	631dc390f4	Fix some odd behaviors when using a SQL-style simple GMT offset timezone. Formerly, when using a SQL-spec timezone setting with a fixed GMT offset (called a "brute force" timezone in the code), the session_timezone variable was not updated to match the nominal timezone; rather, all code was expected to ignore session_timezone if HasCTZSet was true. This is of course obviously fragile, though a search of the code finds only timeofday() failing to honor the rule. A bigger problem was that DetermineTimeZoneOffset() supposed that if its pg_tz parameter was pointer-equal to session_timezone, then HasCTZSet should override the parameter. This would cause datetime input containing an explicit zone name to be treated as referencing the brute-force zone instead, if the zone name happened to match the session timezone that had prevailed before installing the brute-force zone setting (as reported in bug #8572). The same malady could affect AT TIME ZONE operators. To fix, set up session_timezone so that it matches the brute-force zone specification, which we can do using the POSIX timezone definition syntax "<abbrev>offset", and get rid of the bogus lookaside check in DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in datetime parsing and AT TIME ZONE, this will cause the timeofday() function to print its result in the user-requested time zone rather than some previously-set zone. It might also affect results in third-party extensions, if there are any that make use of session_timezone without considering HasCTZSet, but in all cases the new behavior should be saner than before. Back-patch to all supported branches.	2013-11-01 12:13:18 -04:00
Robert Haas	cacbdd7810	Use appendStringInfoString instead of appendStringInfo where possible. This shaves a few cycles, and generally seems like good programming practice. David Rowley	2013-10-31 10:55:59 -04:00
Robert Haas	343bb134ea	Avoid too-large shift on 32-bit Windows. Apparently, shifts greater than or equal to the width of the type are undefined, and can surprisingly produce a non-zero value. Amit Kapila, with a comment by me.	2013-10-30 09:14:56 -04:00
Tom Lane	6756c8ad30	Fix old typo in comment. NFAs have children, but their individual states don't.	2013-10-29 15:34:18 -04:00
Tom Lane	9a9473f3cc	Prevent using strncpy with src == dest in TupleDescInitEntry. The C and POSIX standards state that strncpy's behavior is undefined when source and destination areas overlap. While it remains dubious whether any implementations really misbehave when the pointers are exactly equal, some platforms are now starting to force the issue by complaining when an undefined call occurs. (In particular OS X 10.9 has been seen to dump core here, though the exact set of circumstances needed to trigger that remain elusive. Similar behavior can be expected to be optional on Linux and other platforms in the near future.) So tweak the code to explicitly do nothing when nothing need be done. Back-patch to all active branches. In HEAD, this also lets us get rid of an exception in valgrind.supp. Per discussion of a report from Matthias Schmitt.	2013-10-28 20:49:24 -04:00
Robert Haas	d2aecaea15	Modify dynamic shared memory code to use Size rather than uint64. This is more consistent with what we do elsewhere.	2013-10-28 12:12:06 -04:00
Tom Lane	c2b51cf190	Improve documentation about usage of FDW validator functions. SGML documentation, as well as code comments, failed to note that an FDW's validator will be applied to foreign-table options for foreign tables using the FDW. Etsuro Fujita	2013-10-28 10:28:35 -04:00
Noah Misch	c50b7c09d8	Add large object functions catering to SQL callers. With these, one need no longer manipulate large object descriptors and extract numeric constants from header files in order to read and write large object contents from SQL. Pavel Stehule, reviewed by Rushabh Lathia.	2013-10-27 22:56:54 -04:00
Tom Lane	9c339eb4f8	Use unaligned output in selected regression queries to reduce diff noise. The rules regression test prints all known views and rules, which is a set that changes regularly. Previously, a change in one rule would frequently lead to whitespace changes across the entire output of this query, which is painful to verify and causes undesirable conflicts between unrelated patch sets. Use \a mode to improve matters. Also use \t mode to suppress the total-rows count, which was also a source of unnecessary patch conflicts. Likewise modify the output mode for the list of indexed tables generated in sanity_check.sql. There might be other places where we should use this idea, but these are the ones that have caused the most problems. Andres Freund	2013-10-26 11:24:04 -04:00
Tom Lane	9f9d9b51f0	Improve pqexpbuffer.c to use modern vsnprintf implementations efficiently. When using a C99-compliant vsnprintf, we can use its report of the required buffer size to avoid making multiple loops through the formatting logic. This is similar to the changes recently made in stringinfo.c, but we can't use psprintf.c here because in libpq we don't want to exit() on error. (The behavior pqexpbuffer.c has historically used is to mark the PQExpBuffer as "broken", ie empty, if it runs into any fatal problem.) To avoid duplicating code more than necessary, I refactored printfPQExpBuffer and appendPQExpBuffer to share a subroutine that's very similar to psprintf.c's pvsnprintf in spirit.	2013-10-25 17:42:26 -04:00
Tom Lane	43fe90f66a	Suppress -0 in the C field of lines computed by line_construct_pts(). It's not entirely clear why some PPC machines are generating -0 here, since the underlying computation should be exactly 0 - 0. Perhaps there's some wider-than-nominal-precision calculations happening? Anyway, the best way to avoid platform-dependent results seems to be to explicitly reset -0 to regular zero.	2013-10-25 15:55:15 -04:00
Tom Lane	1f7a47912a	Revert "Tweak "line" test to avoid negative zeros on some platforms" This reverts commit `a0a546f0d9`. It seems better to tweak the code to suppress -0 results during line_construct_pts(), which I'll do in the next commit.	2013-10-25 15:50:31 -04:00
Peter Eisentraut	a0a546f0d9	Tweak "line" test to avoid negative zeros on some platforms	2013-10-25 07:08:40 -04:00
Tom Lane	5e1e47c7c0	Ignore SIGSYS during initdb. This prevents the recently-added probe for shm_open() from crashing on platforms that are impolite enough to deliver a signal rather than returning ENOSYS for an unimplemented kernel call. At least on the one known example (HPUX 10.20), ignoring SIGSYS does result in the desired behavior of getting an ENOSYS error return instead. Per discussion, we might later wish to do this in the backend as well, but for now it seems sufficient to do it in initdb.	2013-10-24 21:51:30 -04:00
Tom Lane	3147acd63e	Use improved vsnprintf calling logic in more places. When we are using a C99-compliant vsnprintf implementation (which should be most places, these days) it is worth the trouble to make use of its report of how large the buffer needs to be to succeed. This patch adjusts stringinfo.c and some miscellaneous usages in pg_dump to do that, relying on the logic recently added in libpgcommon's psprintf.c. Since these places want to know the number of bytes written once we succeed, modify the API of pvsnprintf() to report that. There remains near-duplicate logic in pqexpbuffer.c, but since that code is in libpq, psprintf.c's approach of exit()-on-error isn't appropriate for use there. Also note that I didn't bother touching the multitude of places that call (v)snprintf without any attempt to provide a resizable buffer. Release-note-worthy incompatibility: the API of appendStringInfoVA() changed. If there's any third-party code that's calling that directly, it will need tweaking along the same lines as in this patch. David Rowley and Tom Lane	2013-10-24 21:43:57 -04:00
Heikki Linnakangas	98c50656ca	Increase the number of different values used when seeding random(). When a backend process is forked, we initialize the system's random number generator with srandom(). The seed used is derived from the backend's pid and the timestamp. However, we only used the microseconds part of the timestamp, and it was XORed with the pid, so the total range of different seed values chosen was 0-999999. That's quite limited. Change the code to also use the seconds part of the timestamp in the seed, and shift the microseconds so that all 32 bits of the seed are used. Honza Horak	2013-10-24 17:00:18 +03:00
Heikki Linnakangas	138184adc5	Plug memory leak when reloading config file. The absolute path to config file was not pfreed. There are probably more small leaks here and there in the config file reload code and assign hooks, and in practice no-one reloads the config files frequently enough for it to be a problem, but this one is trivial enough that might as well fix it. Backpatch to 9.3 where the leak was introduced.	2013-10-24 15:27:40 +03:00
Heikki Linnakangas	bb598456dc	Fix memory leak when an empty ident file is reloaded. Hari Babu	2013-10-24 14:03:26 +03:00
Heikki Linnakangas	4d6d425ab8	Fix typos in comments.	2013-10-24 11:50:02 +03:00

... 2 3 4 5 6 ...

24930 Commits