postgresql

Commit Graph

Author	SHA1	Message	Date
Bruce Momjian	dba5a9dda9	Comment: COPY comment improvement Etsuro Fujita	2013-12-17 12:51:16 -05:00
Alvaro Herrera	3b97e6823b	Rework tuple freezing protocol Tuple freezing was broken in connection to MultiXactIds; commit `8e53ae025d` tried to fix it, but didn't go far enough. As noted by Noah Misch, freezing a tuple whose Xmax is a multi containing an aborted update might cause locks in the multi to go ignored by later transactions. This is because the code depended on a multixact above their cutoff point not having any lock-only member older than the cutoff point for Xids, which is easily defeated in READ COMMITTED transactions. The fix for this involves creating a new MultiXactId when necessary. But this cannot be done during WAL replay, and moreover multixact examination requires using CLOG access routines which are not supposed to be used during WAL replay either; so tuple freezing cannot be done with the old freeze WAL record. Therefore, separate the freezing computation from its execution, and change the WAL record to carry all necessary information. At WAL replay time, it's easy to re-execute freezing because we don't need to re-compute the new infomask/Xmax values but just take them from the WAL record. While at it, restructure the coding to ensure all page changes occur in a single critical section without much room for failures. The previous coding wasn't using a critical section, without any explanation as to why this was acceptable. In replication scenarios using the 9.3 branch, standby servers must be upgraded before their master, so that they are prepared to deal with the new WAL record once the master is upgraded; failure to do so will cause WAL replay to die with a PANIC message. Later upgrade of the standby will allow the process to continue where it left off, so there's no disruption of the data in the standby in any case. Standbys know how to deal with the old WAL record, so it's okay to keep the master running the old code for a while. In master, the old freeze WAL record is gone, for cleanliness' sake; there's no compatibility concern there. Backpatch to 9.3, where the original bug was introduced and where the previous fix was backpatched. Álvaro Herrera and Andres Freund	2013-12-16 11:29:50 -03:00
Heikki Linnakangas	30b96549ab	Mark variables 'static' where possible. Move GinFuzzySearchLimit to ginget.c Per "clang -Wmissing-variable-declarations" output, posted by Andres Freund. I didn't silence all those warnings, though, only the most obvious cases.	2013-12-16 11:41:17 +02:00
Tatsuo Ishii	1f0626ee40	Add "SHIFT_JIS" as an accepted encoding name for locale checking. When locale is "ja_JP.SJIS", nl_langinfo(CODESET) returns "SHIFT_JIS" on some platforms, at least on RedHat Linux. So the encoding/locale match table (encoding_match_list) needs the entry. Otherwise client encoding is set to SQL_ASCII. Back patch to all supported branches.	2013-12-15 11:09:05 +09:00
Tom Lane	1b4f7f93b4	Allow empty target list in SELECT. This fixes a problem noted as a followup to bug #8648: if a query has a semantically-empty target list, e.g. SELECT * FROM zero_column_table, ruleutils.c will dump it as a syntactically-empty target list, which was not allowed. There doesn't seem to be any reliable way to fix this by hacking ruleutils (note in particular that the originally zero-column table might since have had columns added to it); and even if we had such a fix, it would do nothing for existing dump files that might contain bad syntax. The best bet seems to be to relax the syntactic restriction. Also, add parse-analysis errors for SELECT DISTINCT with no columns (after *-expansion) and RETURNING with no columns. These cases previously produced unexpected behavior because the parsed Query looked like it had no DISTINCT or RETURNING clause, respectively. If anyone ever offers a plausible use-case for this, we could work a bit harder on making the situation distinguishable. Arguably this is a bug fix that should be back-patched, but I'm worried that there may be client apps or PLs that expect "SELECT ;" to throw a syntax error. The issue doesn't seem important enough to risk changing behavior in minor releases.	2013-12-14 20:23:26 -05:00
Tom Lane	c03ad5602f	Fix inherited UPDATE/DELETE with UNION ALL subqueries. Fix an oversight in commit b3aaf9081a1a95c245fd605dcf02c91b3a5c3a29: we do indeed need to process the planner's append_rel_list when copying RTE subqueries, because if any of them were flattenable UNION ALL subqueries, the append_rel_list shows which subquery RTEs were pulled up out of which other ones. Without this, UNION ALL subqueries aren't correctly inserted into the update plans for inheritance child tables after the first one, typically resulting in no update happening for those child table(s). Per report from Victor Yegorov. Experimentation with this case also exposed a fault in commit a7b965382cf0cb30aeacb112572718045e6d4be7: if an inherited UPDATE/DELETE was proven totally dummy by constraint exclusion, we might arrive at add_rtes_to_flat_rtable with root->simple_rel_array being NULL. This should be interpreted as not having any RelOptInfos. I chose to code the guard as a check against simple_rel_array_size, so as to also provide some protection against indexing off the end of the array. Back-patch to 9.2 where the faulty code was added.	2013-12-14 17:33:53 -05:00
Alvaro Herrera	60eea3780c	Fix typo	2013-12-13 17:27:16 -03:00
Alvaro Herrera	d881dd6233	Rework MultiXactId cache code The original performs too poorly; in some scenarios it shows way too high while profiling. Try to make it a bit smarter to avoid excessive cost. In particular, make it have a maximum size, and have entries be sorted in LRU order; once the max size is reached, evict the oldest entry to keep it from growing too large. Per complaint from Andres Freund in connection with new tuple freezing code.	2013-12-13 17:16:25 -03:00
Tom Lane	2efc6dc256	Add HOLD/RESUME_INTERRUPTS in HandleCatchupInterrupt/HandleNotifyInterrupt. This prevents a possible longjmp out of the signal handler if a timeout or SIGINT occurs while something within the handler has transiently set ImmediateInterruptOK. For safety we must hold off the timeout or cancel error until we're back in mainline, or at least till we reach the end of the signal handler when ImmediateInterruptOK was true at entry. This syncs these functions with the logic now present in handle_sig_alarm. AFAICT there is no live bug here in 9.0 and up, because I don't think we currently can wait for any heavyweight lock inside these functions, and there is no other code (except read-from-client) that will turn on ImmediateInterruptOK. However, that was not true pre-9.0: in older branches ProcessIncomingNotify might block trying to lock pg_listener, and then a SIGINT could lead to undesirable control flow. It might be all right anyway given the relatively narrow code ranges in which NOTIFY interrupts are enabled, but for safety's sake I'm back-patching this.	2013-12-13 14:05:51 -05:00
Heikki Linnakangas	dde6282500	Fix more instances of "the the" in comments. Plus one instance of "to to" in the docs.	2013-12-13 20:02:01 +02:00
Tom Lane	e8312b4f03	Don't let timeout interrupts happen unless ImmediateInterruptOK is set. Serious oversight in commit 16e1b7a1b7f7ffd8a18713e83c8cd72c9ce48e07: we should not allow an interrupt to take control away from mainline code except when ImmediateInterruptOK is set. Just to be safe, let's adopt the same save-clear-restore dance that's been used for many years in HandleCatchupInterrupt and HandleNotifyInterrupt, so that nothing bad happens if a timeout handler invokes code that tests or even manipulates ImmediateInterruptOK. Per report of "stuck spinlock" failures from Christophe Pettus, though many other symptoms are possible. Diagnosis by Andres Freund.	2013-12-13 11:50:15 -05:00
Heikki Linnakangas	50e547096c	Add GUC to enable WAL-logging of hint bits, even with checksums disabled. WAL records of hint bit updates are useful to tools that want to examine which pages have been modified. In particular, this is required to make the pg_rewind tool safe (without checksums). This can also be used to test how much extra WAL-logging would occur if you enabled checksums, without actually enabling them (which you can't currently do without re-initdb'ing). Sawada Masahiko, docs by Samrat Revagade. Reviewed by Dilip Kumar, with further changes by me.	2013-12-13 16:26:14 +02:00
Magnus Hagander	56afe8509e	Fix double "the" in the documentation Erik Rijkers	2013-12-13 15:01:56 +01:00
Heikki Linnakangas	a49633d8dc	Fix WAL-logging of setting the visibility map bit. The operation that removes the remaining dead tuples from the page must be WAL-logged before the setting of the VM bit. Otherwise, if you replay the WAL to between those two records, you end up with the VM bit set, but the dead tuples are still there. Backpatch to 9.3, where this bug was introduced.	2013-12-13 14:15:04 +02:00
Peter Eisentraut	46328916ee	configure: Allow adding a custom string to PG_VERSION This can be used to mark custom built binaries with an extra version string such as a git describe identifier or distribution package release version. From: Oskari Saarenmaa <os@ohmu.fi>	2013-12-12 22:01:27 -05:00
Tom Lane	ccca6f56f5	Fix ancient docs/comments thinko: XID comparison is mod 2^32, not 2^31. Pointed out by Gianni Ciolli.	2013-12-12 12:39:48 -05:00
Tom Lane	f26099057a	Improve EXPLAIN to print the grouping columns in Agg and Group nodes. Per request from Kevin Grittner.	2013-12-12 11:24:38 -05:00
Simon Riggs	8693559cac	New autovacuum_work_mem parameter If autovacuum_work_mem is set, autovacuum workers now use this parameter in preference to maintenance_work_mem. Peter Geoghegan	2013-12-12 11:42:39 +00:00
Simon Riggs	36da3cfb45	Allow time delayed standbys and recovery Set min_recovery_apply_delay to force a delay in recovery apply for commit and restore point WAL records. Other records are replayed immediately. Delay is measured between WAL record time and local standby time. Robert Haas, Fabrízio de Royes Mello and Simon Riggs Detailed review by Mitsumasa Kondo	2013-12-12 10:53:20 +00:00
Tatsuo Ishii	841a65482d	Fix progress logging when scale factor is large. Integer overflow showed minus percent and minus remaining time something like this. 239300000 of 3800000000 tuples (-48%) done (elapsed 226.86 s, remaining -696.10 s).	2013-12-12 19:10:35 +09:00
Heikki Linnakangas	108e3992cd	Display old and new values in pg_resetxlog -n output. For extra clarity. Rajeev Rastogi, reviewed by Amit Kapila	2013-12-12 11:57:18 +02:00
Tom Lane	22310b808d	Remove bogus executable permissions on xlog.c. Apparently fat-fingered in `1a3d104475`. Noted by Peter Geoghegan.	2013-12-11 22:12:25 -05:00
Tom Lane	6bff0e7d92	Add a regression test case for plpython function returning setof RECORD. We had coverage for functions returning setof a named composite type, but not for anonymous records, which is a somewhat different code path. In view of recent crash report from Sergey Konoplev, this seems worth testing, though I doubt there's any deterministic bug here today.	2013-12-11 17:22:55 -05:00
Simon Riggs	cf589c9c1f	Regression tests for SCHEMA commands Hari Babu Kommi reviewed by David Rowley	2013-12-11 20:45:15 +00:00
Simon Riggs	b921a26fb8	Regression tests for ALTER TABLESPACE RENAME,OWNER Hari Babu Kommi reviewed by David Rowley	2013-12-11 20:42:58 +00:00
Tom Lane	b5e0a2a384	Tweak placement of explicit ANALYZE commands in the regression tests. Make the COPY test, which loads most of the large static tables used in the tests, also explicitly ANALYZE those tables. This allows us to get rid of various ad-hoc, and rather redundant, ANALYZE commands that had gotten stuck into various test scripts over time to ensure we got consistent plan choices. (We could have done a database-wide ANALYZE, but that would cause stats to get attached to the small static tables too, which results in plan changes compared to the historical behavior. I'm not sure that's a good idea, so not going that far for now.) Back-patch to 9.0, since 9.0 and 9.1 are currently sometimes failing regression tests for lack of an "ANALYZE tenk1" in the subselect test. There's no need for this in 8.4 since we didn't print any plans back then.	2013-12-11 15:09:15 -05:00
Robert Haas	60dd40bbda	Under wal_level=logical, when saving old tuples, always save OID. There's no real point in not doing this. It doesn't cost anything in performance or space. So let's go wild. Andres Freund, with substantial editing as to style by me.	2013-12-11 13:19:31 -05:00
Kevin Grittner	09df854b8a	Add table name to VACUUM statement in matview.c. The test only needs the one table to be vacuumed. Vacuuming the database may affect other tests. Per gripe from Tom Lane. Back-patch to 9.3, where the test was added.	2013-12-11 08:53:03 -06:00
Peter Eisentraut	e5dc4cc24d	PL/Perl: Add event trigger support From: Dimitri Fontaine <dimitri@2ndQuadrant.fr>	2013-12-11 08:11:59 -05:00
Robert Haas	6bea96dd49	Add a new option, -g, to createuser, to add membership in a role. Chistopher Browne, reviewed by Sameer Thakur, Amit Kapila, and Peter Eisentraut.	2013-12-11 07:50:36 -05:00
Peter Eisentraut	a06af43695	doc: Fix DocBook table column count declaration This was broken in `d6464fdc0a`.	2013-12-10 21:47:45 -05:00
Robert Haas	66abc2608c	Add a new reloption, user_catalog_table. When this reloption is set and wal_level=logical is configured, we'll record the CIDs stamped by inserts, updates, and deletes to the table just as we would for an actual catalog table. This will allow logical decoding to use historical MVCC snapshots to access such tables just as they access ordinary catalog tables. Replication solutions built around the logical decoding machinery will likely need to set this operation for their configuration tables; it might also be needed by extensions which perform table access in their output functions. Andres Freund, reviewed by myself and others.	2013-12-10 19:17:34 -05:00
Robert Haas	e55704d8b2	Add new wal_level, logical, sufficient for logical decoding. When wal_level=logical, we'll log columns from the old tuple as configured by the REPLICA IDENTITY facility added in commit `07cacba983`. This makes it possible for a properly-configured logical replication solution to correctly follow table updates even if they change the chosen key columns, or, with REPLICA IDENTITY FULL, even if the table has no key at all. Note that updates which do not modify the replica identity column won't log anything extra, making the choice of a good key (i.e. one that will rarely be changed) important to performance when wal_level=logical is configured. Each insert, update, or delete to a catalog table will also log the CMIN and/or CMAX values stamped by the current transaction. This is necessary because logical decoding will require access to historical snapshots of the catalog in order to decode some data types, and the CMIN/CMAX values that we may need in order to judge row visibility may have been overwritten by the time we need them. Andres Freund, reviewed in various versions by myself, Heikki Linnakangas, KONDO Mitsumasa, and many others.	2013-12-10 19:01:40 -05:00
Tom Lane	9ec6199d18	Fix possible crash with nested SubLinks. An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...) could produce an invalid plan that results in a crash at execution time, if the planner attempts to flatten the outer IN into a semi-join. This happens because convert_testexpr() was not expecting any nested SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging to the inner SubLink. (I think the comment denying that this case could happen was wrong when written; it's certainly been wrong for quite a long time, since very early versions of the semijoin flattening logic.) Per report from Teodor Sigaev. Back-patch to all supported branches.	2013-12-10 16:10:17 -05:00
Noah Misch	53685d7981	Rename TABLE() to ROWS FROM(). SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and other collection types. This feature, however, deals with set-returning functions. Use a different syntax for this feature to keep open the possibility of implementing the standard TABLE().	2013-12-10 09:34:37 -05:00
Bruce Momjian	01cc1fecfd	pgcrypto docs: update cpu type used in duration testing	2013-12-09 16:12:24 -05:00
Bruce Momjian	d6464fdc0a	pgcrypto docs: update encryption timings and add relative times Miles Elam	2013-12-09 16:10:47 -05:00
Robert Haas	d9250da032	Fixups for dsm.c's file descriptor handling. Per complaint from Tom Lane.	2013-12-09 11:15:19 -05:00
Magnus Hagander	33d3f5594a	Fix pg_stat_statements build on 32-bit systems Peter Geoghegan	2013-12-08 11:59:07 +01:00
Joe Conway	d6ca510d9d	Fix performance regression in dblink connection speed. Previous commit `e5de601267` modified dblink to ensure client encoding matched the server. However the added PQsetClientEncoding() call added significant overhead. Restore original performance in the common case where client encoding already matches server encoding by doing nothing in that case. Applies to all active branches. Issue reported and work sponsored by Zonar Systems.	2013-12-07 17:00:26 -08:00
Magnus Hagander	54aa5ef7f2	Fix a couple of typos Noted by Peter Geoghegan	2013-12-07 23:08:17 +01:00
Peter Eisentraut	3164721462	SSL: Support ECDH key exchange This sets up ECDH key exchange, when compiling against OpenSSL that supports EC. Then the ECDHE-RSA and ECDHE-ECDSA cipher suites can be used for SSL connections. The latter one means that EC keys are now usable. The reason for EC key exchange is that it's faster than DHE and it allows to go to higher security levels where RSA will be horribly slow. There is also new GUC option ssl_ecdh_curve that specifies the curve name used for ECDH. It defaults to "prime256v1", which is the most common curve in use in HTTPS. From: Marko Kreen <markokr@gmail.com> Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>	2013-12-07 15:11:44 -05:00
Fujii Masao	91484409bd	Expose query ID in pg_stat_statements view. The query ID is the internal hash identifier of the statement, and was not available in pg_stat_statements view so far. Daniel Farina, Sameer Thakur and Peter Geoghegan, reviewed by me.	2013-12-08 02:06:02 +09:00
Peter Eisentraut	ef3267523d	SSL: Add configuration option to prefer server cipher order By default, OpenSSL (and SSL/TLS in general) lets the client cipher order take priority. This is OK for browsers where the ciphers were tuned, but few PostgreSQL client libraries make the cipher order configurable. So it makes sense to have the cipher order in postgresql.conf take priority over client defaults. This patch adds the setting "ssl_prefer_server_ciphers" that can be turned on so that server cipher order is preferred. Per discussion, this now defaults to on. From: Marko Kreen <markokr@gmail.com> Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>	2013-12-07 08:13:50 -05:00
Bruce Momjian	8fe3d90d34	docs: update partition encryption options Text from Adam Vande More	2013-12-06 09:47:39 -05:00
Bruce Momjian	fa4add50c4	docs: clarify SSL certificate authority chain docs Previously, the requirements of how intermediate certificates were handled and their chain to root certificates was unclear.	2013-12-06 09:42:08 -05:00
Alvaro Herrera	312bde3d40	Fix improper abort during update chain locking In `247c76a989`, I added some code to do fine-grained checking of MultiXact status of locking/updating transactions when traversing an update chain. There was a thinko in that patch which would have the traversing abort, that is return HeapTupleUpdated, when the other transaction is a committed lock-only. In this case we should ignore it and return success instead. Of course, in the case where there is a committed update, HeapTupleUpdated is the correct return value. A user-visible symptom of this bug is that in REPEATABLE READ and SERIALIZABLE transaction isolation modes spurious serializability errors can occur: ERROR: could not serialize access due to concurrent update In order for this to happen, there needs to be a tuple that's key-share- locked and also updated, and the update must abort; a subsequent transaction trying to acquire a new lock on that tuple would abort with the above error. The reason is that the initial FOR KEY SHARE is seen as committed by the new locking transaction, which triggers this bug. (If the UPDATE commits, then the serialization error is correctly reported.) When running a query in READ COMMITTED mode, what happens is that the locking is aborted by the HeapTupleUpdated return value, then EvalPlanQual fetches the newest version of the tuple, which is then the only version that gets locked. (The second time the tuple is checked there is no misbehavior on the committed lock-only, because it's not checked by the code that traverses update chains; so no bug.) Only the newest version of the tuple is locked, not older ones, but this is harmless. The isolation test added by this commit illustrates the desired behavior, including the proper serialization errors that get thrown. Backpatch to 9.3.	2013-12-05 17:47:51 -03:00
Tom Lane	74242c23c1	Clear retry flags properly in replacement OpenSSL sock_write function. Current OpenSSL code includes a BIO_clear_retry_flags() step in the sock_write() function. Either we failed to copy the code correctly, or they added this since we copied it. In any case, lack of the clear step appears to be the cause of the server lockup after connection loss reported in bug #8647 from Valentine Gogichashvili. Assume that this is correct coding for all OpenSSL versions, and hence back-patch to all supported branches. Diagnosis and patch by Alexander Kukushkin.	2013-12-05 12:48:28 -05:00
Alvaro Herrera	07aeb1fec5	Avoid resetting Xmax when it's a multi with an aborted update HeapTupleSatisfiesUpdate can very easily "forget" tuple locks while checking the contents of a multixact and finding it contains an aborted update, by setting the HEAP_XMAX_INVALID bit. This would lead to concurrent transactions not noticing any previous locks held by transactions that might still be running, and thus being able to acquire subsequent locks they wouldn't be normally able to acquire. This bug was introduced in commit 1ce150b7bb; backpatch this fix to 9.3, like that commit. This change reverts the change to the delete-abort-savept isolation test in `1ce150b7bb`, because that behavior change was caused by this bug. Noticed by Andres Freund while investigating a different issue reported by Noah Misch.	2013-12-05 12:21:55 -03:00
Bruce Momjian	86ef4796f5	build: pass EXTRA_REGRESS_OPTS to secondary regression tests Christoph Berg	2013-12-04 10:14:45 -05:00
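
Two entries in the log above describe arithmetic that can be reproduced in a few lines: ccca6f56f5 (XID comparison is mod 2^32) and 841a65482d (negative progress figures caused by integer overflow). The sketch below is illustrative only and is not PostgreSQL or pgbench source. It uses Python to imitate 32-bit C arithmetic; every function name is hypothetical, special transaction IDs are ignored, and the split of 3800000000 rows into 100000 rows per scale unit times a scale factor of 38000 is an assumption chosen to match the figures quoted in the commit message.

```python
def to_int32(value: int) -> int:
    """Wrap an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def xid_precedes(xid_a: int, xid_b: int) -> bool:
    """Circular comparison: subtract mod 2^32, then read the result as signed.

    Hypothetical helper; special XIDs are not handled here.
    """
    return to_int32(xid_a - xid_b) < 0


def c_div(numerator: int, denominator: int) -> int:
    """Integer division that truncates toward zero, as C does."""
    quotient = abs(numerator) // abs(denominator)
    return quotient if (numerator >= 0) == (denominator >= 0) else -quotient


def percent_done_32bit_total(done: int, rows_per_scale: int, scale: int) -> int:
    """Assumed buggy form: the row total is computed in 32 bits and wraps."""
    total = to_int32(rows_per_scale * scale)
    return c_div(done * 100, total)


def percent_done_64bit_total(done: int, rows_per_scale: int, scale: int) -> int:
    """Assumed fixed form: the row total is computed without wrapping."""
    return c_div(done * 100, rows_per_scale * scale)


if __name__ == "__main__":
    # An XID just below 2^32 still precedes a small XID after wraparound.
    print(xid_precedes(4294967295, 100))
    # Figures from the commit message: 239300000 of 3800000000 tuples.
    print(percent_done_32bit_total(239300000, 100000, 38000))
    print(percent_done_64bit_total(239300000, 100000, 38000))
```

With the wrapped total, the percentage comes out as -48, matching the "(-48%)" in the quoted log line; with a total that does not wrap, it is 6.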

35890 Commits
