postgresql

Commit Graph

Author	SHA1	Message	Date
Robert Haas	240067b3b0	Merge synchronous_replication setting into synchronous_commit. This means one less thing to configure when setting up synchronous replication, and also avoids some ambiguity around what the behavior should be when the settings of these variables conflict. Fujii Masao, with additional hacking by me.	2011-04-04 16:25:52 -04:00
Heikki Linnakangas	647f8b3dba	Reword the phrase on zero replication_timeout in the docs.	2011-03-31 10:19:29 +03:00
Heikki Linnakangas	754baa21f7	Automatically terminate replication connections that are idle for more than replication_timeout (a new GUC) milliseconds. The TCP timeout is often too long, you want the master to notice a dead connection much sooner. People complained about that in 9.0 too, but with synchronous replication it's even more important to notice dead connections promptly. Fujii Masao and Heikki Linnakangas	2011-03-30 10:20:37 +03:00
Robert Haas	9a56dc3389	Fix various possible problems with synchronous replication. 1. Don't ignore query cancel interrupts. Instead, if the user asks to cancel the query after we've already committed it, but before it's on the standby, just emit a warning and let the COMMIT finish. 2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown). Instead, emit a warning message and close the connection without acknowledging the commit. Other backends will still see the effect of the commit, but there's no getting around that; it's too late to abort at this point, and ignoring die interrupts altogether doesn't seem like a good idea. 3. If synchronous_standby_names becomes empty, wake up all backends waiting for synchronous replication to complete. Without this, someone attempting to shut synchronous replication off could easily wedge the entire system instead. 4. Avoid depending on the assumption that if a walsender updates MyProc->syncRepState, we'll see the change even if we read it without holding the lock. The window for this appears to be quite narrow (and probably doesn't exist at all on machines with strong memory ordering) but protecting against it is practically free, so do that. 5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and doesn't actually do anything. There's still some further work needed here to make the behavior of fast shutdown plausible, but that looks complex, so I'm leaving it for a separate commit. Review by Fujii Masao.	2011-03-17 13:12:21 -04:00
Bruce Momjian	e148443ddd	Document guc context values, and reference them from the config doc section. Tom Lane	2011-03-17 00:27:01 -04:00
Bruce Momjian	df4a9595c2	Wording adjustment for restart_after_crash entry Specifically, mention that "restart" is disabled by this parameter.	2011-03-15 12:43:39 -04:00
Robert Haas	f0f3617135	Minor sync rep documentation improvements. - Make the name of the ID tag for the GUC entry match the GUC name. - Clarify that synchronous_replication waits for xlog flush, not receipt. - Mention that synchronous_replication won't wait if max_wal_senders=0.	2011-03-15 11:25:04 -04:00
Bruce Momjian	7a8f43968a	In docs, rename "backwards compatibility" to "backward compatibility" for consistency.	2011-03-11 14:33:10 -05:00
Bruce Momjian	a1bb5a480d	Document how listen_addresses can do only IPv4 or IPv6.	2011-03-11 10:31:43 -05:00
Itagaki Takahiro	1144726d07	synchronous_standby_names is a string parameter.	2011-03-09 19:49:16 +09:00
Robert Haas	c74d3aceb9	Synchronous replication doc corrections. Thom Brown	2011-03-07 11:59:58 -05:00
Simon Riggs	a8a8a3e096	Efficient transaction-controlled synchronous replication. If a standby is broadcasting reply messages and we have named one or more standbys in synchronous_standby_names then allow users who set synchronous_replication to wait for commit, which then provides strict data integrity guarantees. Design avoids sending and receiving transaction state information so minimises bookkeeping overheads. We synchronize with the highest priority standby that is connected and ready to synchronize. Other standbys can be defined to takeover in case of standby failure. This version has very strict behaviour; more relaxed options may be added at a later date. Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime Casanova, Heikki Linnakangas and Robert Haas, plus the assistance of many other design reviewers.	2011-03-06 22:49:16 +00:00
Robert Haas	59d6a75942	Avoid excessive Hot Standby feedback messages. Without this patch, when wal_receiver_status_interval=0, indicating that no status messages should be sent, Hot Standby feedback messages are instead sent extremely frequently. Fujii Masao, with documentation changes by me.	2011-03-01 11:34:25 -05:00
Heikki Linnakangas	be6668d6ef	Increase the default for wal_sender_delay from 200ms to 1s. Now that WAL sender is immediately woken up by transaction commit, there's no need to wake up so aggressively.	2011-02-26 23:38:25 +02:00
Simon Riggs	bca8b7f16a	Hot Standby feedback for avoidance of cleanup conflicts on standby. Standby optionally sends back information about oldestXmin of queries which is then checked and applied to the WALSender's proc->xmin. GetOldestXmin() is modified slightly to agree with GetSnapshotData(), so that all backends on primary include WALSender within their snapshots. Note this does nothing to change the snapshot xmin on either master or standby. Feedback piggybacks on the standby reply message. vacuum_defer_cleanup_age is no longer used on standby, though parameter still exists on primary, since some use cases still exist. Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas	2011-02-16 19:29:37 +00:00
Robert Haas	883a9659fa	Assorted corrections to the patch to add WAL receiver replies. Per reports from Fujii Masao.	2011-02-15 12:05:00 -05:00
Robert Haas	6a77e9385e	Rename max_predicate_locks_per_transaction. The new name, max_pred_locks_per_transaction, is shorter. Kevin Grittner, per discussion.	2011-02-15 08:04:55 -05:00
Heikki Linnakangas	b186523fd9	Send status updates back from standby server to master, indicating how far the standby has written, flushed, and applied the WAL. At the moment, this is for informational purposes only, the values are only shown in pg_stat_replication system view, but in the future they will also be needed for synchronous replication. Extracted from Simon riggs' synchronous replication patch by Robert Haas, with some tweaking by me.	2011-02-10 21:04:02 +02:00
Robert Haas	32896c40ca	Avoid having autovacuum workers wait for relation locks. Waiting for relation locks can lead to starvation - it pins down an autovacuum worker for as long as the lock is held. But if we're doing an anti-wraparound vacuum, then we still wait; maintenance can no longer be put off. To assist with troubleshooting, if log_autovacuum_min_duration >= 0, we log whenever an autovacuum or autoanalyze is skipped for this reason. Per a gripe by Josh Berkus, and ensuing discussion.	2011-02-07 22:04:29 -05:00
Heikki Linnakangas	dafaa3efb7	Implement genuine serializable isolation level. Until now, our Serializable mode has in fact been what's called Snapshot Isolation, which allows some anomalies that could not occur in any serialized ordering of the transactions. This patch fixes that using a method called Serializable Snapshot Isolation, based on research papers by Michael J. Cahill (see README-SSI for full references). In Serializable Snapshot Isolation, transactions run like they do in Snapshot Isolation, but a predicate lock manager observes the reads and writes performed and aborts transactions if it detects that an anomaly might occur. This method produces some false positives, ie. it sometimes aborts transactions even though there is no anomaly. To track reads we implement predicate locking, see storage/lmgr/predicate.c. Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared memory is finite, so when a transaction takes many tuple-level locks on a page, the locks are promoted to a single page-level lock, and further to a single relation level lock if necessary. To lock key values with no matching tuple, a sequential scan always takes a relation-level lock, and an index scan acquires a page-level lock that covers the search key, whether or not there are any matching keys at the moment. A predicate lock doesn't conflict with any regular locks or with another predicate locks in the normal sense. They're only used by the predicate lock manager to detect the danger of anomalies. Only serializable transactions participate in predicate locking, so there should be no extra overhead for for other transactions. Predicate locks can't be released at commit, but must be remembered until all the transactions that overlapped with it have completed. That means that we need to remember an unbounded amount of predicate locks, so we apply a lossy but conservative method of tracking locks for committed transactions. If we run short of shared memory, we overflow to a new "pg_serial" SLRU pool. We don't currently allow Serializable transactions in Hot Standby mode. That would be hard, because even read-only transactions can cause anomalies that wouldn't otherwise occur. Serializable isolation mode now means the new fully serializable level. Repeatable Read gives you the old Snapshot Isolation level that we have always had. Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and Anssi Kääriäinen	2011-02-08 00:09:08 +02:00
Bruce Momjian	ad76242633	remove tags.	2011-02-06 18:44:43 -05:00
Robert Haas	0af695fd43	Log restartpoints in the same fashion as checkpoints. Prior to 9.0, restartpoints never created, deleted, or recycled WAL files, but now they can. This code makes log_checkpoints treat checkpoints and restartpoints symmetrically. It also adjusts up the documentation of the parameter to mention restartpoints. Fujii Masao. Docs by me, as suggested by Itagaki Takahiro.	2011-02-02 21:08:53 -05:00
Bruce Momjian	d56d246e70	Properly capitalize hyphenated words in documentation titles.	2011-02-01 17:00:26 -05:00
Bruce Momjian	7106f74e2a	Clarify documentation to state that "zero_damaged_pages" does not force data to disk, so the table or index should be recreated before the parameter is turned off again.	2011-02-01 16:44:22 -05:00
Bruce Momjian	6c6e6f7fd3	Document that effective cache size does not assume data remains in the cache between queries.	2011-02-01 15:23:35 -05:00
Itagaki Takahiro	03282bfa89	Add a link from client_encoding parameter to the list of character sets in documentation. Thom Brown	2011-02-01 14:26:17 +09:00
Bruce Momjian	5d5678d7c3	Properly capitalize documentation headings; some only had initial-word capitalization.	2011-01-29 13:01:48 -05:00
Magnus Hagander	048d148fe6	Add pg_basebackup tool for streaming base backups This tool makes it possible to do the pg_start_backup/ copy files/pg_stop_backup step in a single command. There are still some steps to be done before this is a complete backup solution, such as the ability to stream the required WAL logs, but it's still usable, and could do with some buildfarm coverage. In passing, make the checkpoint request optionally fast instead of hardcoding it. Magnus Hagander, reviewed by Fujii Masao and Dimitri Fontaine	2011-01-23 12:21:23 +01:00
Tom Lane	0f73aae13d	Allow the wal_buffers setting to be auto-tuned to a reasonable value. If wal_buffers is initially set to -1 (which is now the default), it's replaced by 1/32nd of shared_buffers, with a minimum of 8 (the old default) and a maximum of the XLOG segment size. The allowed range for manual settings is still from 4 up to whatever will fit in shared memory. Greg Smith, with implementation correction by me.	2011-01-22 20:31:24 -05:00
Robert Haas	7a32ff9732	Revert patch adding support for logging the current role. This reverts commit `a8a8867912`, committed by me earlier today (2011-01-12). This isn't safe inside an aborted transaction. Noted by Tom Lane.	2011-01-12 11:59:21 -05:00
Robert Haas	a8a8867912	Add support for logging the current role. Stephen Frost, with some editorialization by me.	2011-01-12 11:34:53 -05:00
Tom Lane	576477e73c	Force default wal_sync_method to be fdatasync on Linux. Recent versions of the Linux system header files cause xlogdefs.h to believe that open_datasync should be the default sync method, whereas formerly fdatasync was the default on Linux. open_datasync is a bad choice, first because it doesn't actually outperform fdatasync (in fact the reverse), and second because we try to use O_DIRECT with it, causing failures on certain filesystems (e.g., ext4 with data=journal option). This part of the patch is largely per a proposal from Marti Raudsepp. More extensive changes are likely to follow in HEAD, but this is as much change as we want to back-patch. Also clean up confusing code and incorrect documentation surrounding the fsync_writethrough option. Those changes shouldn't result in any actual behavioral change, but I chose to back-patch them anyway to keep the branches looking similar in this area. In 9.0 and HEAD, also do some copy-editing on the WAL Reliability documentation section. Back-patch to all supported branches, since any of them might get used on modern Linux versions.	2010-12-08 20:01:09 -05:00
Simon Riggs	e620ee35b2	Optimize commit_siblings in two ways to improve group commit. First, avoid scanning the whole ProcArray once we know there are at least commit_siblings active; second, skip the check altogether if commit_siblings = 0. Greg Smith	2010-12-08 18:48:03 +00:00
Tom Lane	c623365ff9	Point out in default_tablespace's description that CREATE DATABASE ignores it. Per gripe from Andreas Scherbaum.	2010-11-27 16:08:32 -05:00
Peter Eisentraut	fc946c39ae	Remove useless whitespace at end of lines	2010-11-23 22:34:55 +02:00
Robert Haas	5a12c808cf	Note that effective_io_concurrency only affects bitmap heap scans. Josh Kupershmidt	2010-10-26 21:44:14 -04:00
Bruce Momjian	f75d6a1b19	Add mention of using tools/fsync to test fsync methods. Restructure recent wal_sync_method doc paragraph to be clearer.	2010-10-19 14:56:53 +00:00
Robert Haas	694c56af2b	Improve WAL reliability documentation, and add more cross-references to it. In particular, we are now more explicit about the fact that you may need wal_sync_method=fsync_writethrough for crash-safety on some platforms, including MaxOS X. There's also now an explicit caution against assuming that the default setting of wal_sync_method is either crash-safe or best for performance.	2010-10-07 12:22:00 -04:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	73b3bd5574	Document the existence of the socket lock file under unix_socket_directory, which is perhaps not a terribly good spot for it but there doesn't seem to be a better place. Also add a source-code comment pointing out a couple reasons for having a separate lock file. Per suggestion from Greg Smith.	2010-08-26 22:00:19 +00:00
Bruce Momjian	c107c35df3	Update autovacuum_freeze_max_age documentation to mention that the default is low because of pg_clog file removal. Backpatch to 9.0.X.	2010-08-24 13:32:25 +00:00
Tom Lane	005e427a22	Make an editorial pass over the 9.0 release notes. This is mostly about grammar, style, and presentation, though I did find a few small factual errors.	2010-08-23 02:43:25 +00:00
Bruce Momjian	d8986332cb	Document that autovacuum_freeze_max_age is used for pg_clog recycling. We already mentioned xid wraparound.	2010-08-22 02:37:32 +00:00
Tom Lane	79dc97a401	Bring some sanity to the trace_recovery_messages code and docs. Per gripe from Fujii Masao, though this is not exactly his proposed patch. Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii, but set the default to LOG because higher values aren't really sensible (see the code for trace_recovery()). Fix the documentation to agree with the code and to try to explain what the variable actually does. Get rid of no-op calls trace_recovery(LOG), which accomplish nothing except to demonstrate that this option confuses even its author.	2010-08-19 22:55:01 +00:00
Peter Eisentraut	5194b9d049	Spell and markup checking	2010-08-17 04:37:21 +00:00
Tom Lane	e20df55cca	Fix mangled grammar.	2010-08-03 19:02:21 +00:00
Peter Eisentraut	66424a2848	Fix indentation of verbatim block elements Block elements with verbatim formatting (literallayout, programlisting, screen, synopsis) should be aligned at column 0 independent of the surrounding SGML, because whitespace is significant, and indenting them creates erratic whitespace in the output. The CSS stylesheets already take care of indenting the output. Assorted markup improvements to go along with it.	2010-07-29 19:34:41 +00:00
Peter Eisentraut	d33cfbd2e0	Spelling fixes	2010-07-27 19:01:16 +00:00
Peter Eisentraut	f581215660	Remove tab from SGML file	2010-07-24 12:16:20 +00:00
Robert Haas	ce68df468a	Add options to force quoting of all identifiers. I've added a quote_all_identifiers GUC which affects the behavior of the backend, and a --quote-all-identifiers argument to pg_dump and pg_dumpall which sets the GUC and also affects the quoting done internally by those applications. Design by Tom Lane; review by Alex Hunsaker; in response to bug #5488 filed by Hartmut Goebel.	2010-07-22 01:22:35 +00:00

1 2 3 4 5 ...

348 Commits