postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	46825d4978	Clean up some sloppy coding in repl_gram.y. Remove unused copy-and-pasted macro definitions, and improve formatting of recently-added productions. I got interested in this because buildfarm member protosciurus has been crashing in "bison repl_gram.y" since commit `858ec11`. It's a long shot that this will fix that, though maybe the missing trailing semicolon has something to do with it? In any case, there's no need to approve of dead code, nor of code whose formatting isn't even self-consistent let alone consistent with what's around it.	2014-02-02 12:51:14 -05:00
Fujii Masao	63be3b78f6	Fix typos in docs and comments. Thom Brown	2014-02-02 10:28:18 +09:00
Tom Lane	214c7a4f0b	Fix some more bugs in signal handlers and process shutdown logic. WalSndKill was doing things exactly backwards: it should first clear MyWalSnd (to stop signal handlers from touching MyWalSnd->latch), then disown the latch, and only then mark the WalSnd struct unused by clearing its pid field. Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve errno, which is surely a requirement for any signal handler. Per discussion of recent buildfarm failures. Back-patch as far as the relevant code exists.	2014-02-01 16:21:23 -05:00
Robert Haas	858ec11858	Introduce replication slots. Replication slots are a crash-safe data structure which can be created on either a master or a standby to prevent premature removal of write-ahead log segments needed by a standby, as well as (with hot_standby_feedback=on) pruning of tuples whose removal would cause replication conflicts. Slots have some advantages over existing techniques, as explained in the documentation. In a few places, we refer to the type of replication slots introduced by this patch as "physical" slots, because forthcoming patches for logical decoding will also have slots, but with somewhat different properties. Andres Freund and Robert Haas	2014-01-31 22:45:36 -05:00
Fujii Masao	dd515d4082	Change the suffix of auto conf temporary file from "temp" to "tmp". Michael Paquier	2014-01-27 12:39:11 +09:00
Heikki Linnakangas	a472ae1e4e	Fix Hot Standby feedback sending when streaming busily. Commit `6f60fdd701` accidentally removed a call to XLogWalRcvSendHSFeedback() after flushing received WAL to disk. The consequence is that when walsender is busy streaming WAL, it doesn't send HS feedback messages. One is sent if nothing is received from the master for 100ms, but if there's a steady stream of WAL, it never happens. Backpatch to 9.3. Andres Freund and Amit Kapila	2014-01-16 23:15:41 +02:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Magnus Hagander	b168c5ef27	Avoid including tablespaces inside PGDATA twice in base backups If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander	2014-01-07 17:11:32 +01:00
Tatsuo Ishii	65d6e4cb5c	Add ALTER SYSTEM command to edit the server configuration file. Patch contributed by Amit Kapila. Reviewed by Hari Babu, Masao Fujii, Boszormenyi Zoltan, Andres Freund, Greg Smith and others.	2013-12-18 23:42:44 +09:00
Heikki Linnakangas	dde6282500	Fix more instances of "the the" in comments. Plus one instance of "to to" in the docs.	2013-12-13 20:02:01 +02:00
Heikki Linnakangas	a93bdfc711	Fix typo in comment. Also line-wrap an over-wide line in a comment that's ignored by pgindent.	2013-09-03 13:17:09 +03:00
Magnus Hagander	db4ef73760	Don't crash when pg_xlog is empty and pg_basebackup -x is used The backup will not work (without a logarchive, and that's the whole point of -x) in this case, this patch just changes it to throw an error instead of crashing when this happens. Noticed and diagnosed by TAKATSUKA Haruka	2013-08-24 17:13:49 +02:00
Peter Eisentraut	229fb58d4f	Treat timeline IDs as unsigned in replication parser Timeline IDs are unsigned ints everywhere, except the replication parser treated them as signed ints.	2013-08-14 23:18:49 -04:00
Peter Eisentraut	626092a2e1	Message style improvements	2013-07-28 07:01:13 -04:00
Fujii Masao	985bd7d497	Support clean switchover. In replication, when we shutdown the master, walsender tries to send all the outstanding WAL records to the standby, and then to exit. This basically means that all the WAL records are fully synced between two servers after the clean shutdown of the master. So, after promoting the standby to new master, we can restart the stopped master as new standby without the need for a fresh backup from new master. But there was one problem so far: though walsender tries to send all the outstanding WAL records, it doesn't wait for them to be replicated to the standby. Then, before receiving all the WAL records, walreceiver can detect the closure of connection and exit. We cannot guarantee that there is no missing WAL in the standby after clean shutdown of the master. In this case, backup from new master is required when restarting the stopped master as new standby. This patch fixes this problem. It just changes walsender so that it waits for all the outstanding WAL records to be replicated to the standby before closing the replication connection. Per discussion, this is a fix that needs to get backpatched rather than new feature. So, back-patch to 9.1 where enough infrastructure for this exists. Patch by me, reviewed by Andres Freund.	2013-06-26 02:14:37 +09:00
Peter Eisentraut	d7eb6f46de	Minor spell checking	2013-05-30 20:56:58 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Heikki Linnakangas	2ffa66f497	Fix walsender failure at promotion. If a standby server has a cascading standby server connected to it, it's possible that WAL has already been sent up to the next WAL page boundary, splitting a WAL record in the middle, when the first standby server is promoted. Don't throw an assertion failure or error in walsender if that happens. Also, fix a variant of the same bug in pg_receivexlog: if it had already received WAL on previous timeline up to a segment boundary, when the upstream standby server is promoted so that the timeline switch record falls on the previous segment, pg_receivexlog would miss the segment containing the timeline switch. To fix that, have walsender send the position of the timeline switch at end-of-streaming, in addition to the next timeline's ID. It was previously assumed that the switch happened exactly where the streaming stopped. Note: this is an incompatible change in the streaming protocol. You might get an error if you try to stream over timeline switches, if the client is running 9.3beta1 and the server is more recent. It should be fine after a reconnect, however. Reported by Fujii Masao.	2013-05-08 20:30:17 +03:00
Heikki Linnakangas	28ba260906	In base backup, only include our own tablespace version directory. If you have clusters of different versions pointing to the same tablespace location, we would incorrectly include all the data belonging to the other versions, too. Fixes bug #7986, reported by Sergey Burladyan.	2013-03-25 20:19:22 +02:00
Tom Lane	da5aeccf64	Move pqsignal() to libpgport. We had two copies of this function in the backend and libpq, which was already pretty bogus, but it turns out that we need it in some other programs that don't use libpq (such as pg_test_fsync). So put it where it probably should have been all along. The signal-mask-initialization support in src/backend/libpq/pqsignal.c stays where it is, though, since we only need that in the backend.	2013-03-17 12:06:42 -04:00
Heikki Linnakangas	3a9e64aa0d	Cannot use WL_SOCKET_WRITEABLE without WL_SOCKET_READABLE. In copy-out mode, the frontend should not send any messages until the backend has finished streaming, by sending a CopyDone message. I'm not sure if it would be legal for the client to send a new query before receiving the CopyDone message from the backend, but trying to support that would require bigger changes to the backend code structure. Fixes an assertion failure reported by Fujii Masao.	2013-02-27 19:28:51 +02:00
Peter Eisentraut	4f36292669	Add quotes to messages	2013-02-22 23:33:07 -05:00
Simon Riggs	c2f79ba269	Force archive_status of .done for xlogs created by dearchival/replication. This is a forward-patch of commit `6f4b8a4f4f`, applied to 9.2 back in August. The plan was to do something else in master, but it looks like it's not going to happen, so let's just apply the 9.2 solution to master as well. Fujii Masao	2013-02-15 19:28:06 +02:00
Peter Eisentraut	0cb1fac3b1	Add noreturn attributes to some error reporting functions	2013-02-12 07:13:22 -05:00
Simon Riggs	bd56e74127	Reset master xmin when hot_standby_feedback disabled. If walsender has xmin of standby then ensure we reset the value to 0 when we change from hot_standby_feedback=on to hot_standby_feedback=off.	2013-02-04 10:29:22 +00:00
Heikki Linnakangas	990fe3c4ed	Fix more issues with cascading replication and timeline switches. When a standby server follows the master using WAL archive, and it chooses a new timeline (recovery_target_timeline='latest'), it only fetches the timeline history file for the chosen target timeline, not any other history files that might be missing from pg_xlog. For example, if the current timeline is 2, and we choose 4 as the new recovery target timeline, the history file for timeline 3 is not fetched, even if it's part of this server's history. That's enough for the standby itself - the history file for timeline 4 includes timeline 3 as well - but if a cascading standby server wants to recover to timeline 3, it needs the history file. To fix, when a new recovery target timeline is chosen, try to copy any missing history files from the archive to pg_xlog between the old and new target timeline. A second similar issue was with the WAL files. When a standby recovers from archive, and it reaches a segment that contains a switch to a new timeline, recovery fetches only the WAL file labelled with the new timeline's ID. The file from the new timeline contains a copy of the WAL from the old timeline up to the point where the switch happened, and recovery recovers it from the new file. But in streaming replication, walsender only tries to read it from the old timeline's file. To fix, change walsender to read it from the new file, so that it behaves the same as recovery in that sense, and doesn't try to open the possibly nonexistent file with the old timeline's ID.	2013-01-23 10:19:20 +02:00
Heikki Linnakangas	6f7cddc7ae	Now that START_REPLICATION returns the next timeline's ID after reaching end of timeline, take advantage of that in walreceiver. Startup process is still in control of choosign the target timeline, by scanning the timeline history files present in pg_xlog, but walreceiver now uses the next timeline's ID to fetch its history file immediately after it has finished streaming the old timeline. Before, the standby would first try to restart streaming on the old timeline, which fetches the missing timeline history file as a side-effect, and only then restart from the new timeline. This patch eliminates the extra iteration, which speeds up the timeline switch and reduces the noise in the log caused by the extra restart on the old timeline.	2013-01-18 11:59:34 +02:00
Heikki Linnakangas	3684a534ef	I added a result set to START_STREAMING command, but neglected walreceiver. The patch to allow pg_receivexlog to switch timeline added a result set after copy has ended in START_STREAMING command, to return the next timeline's ID to the client. But walreceived didn't get the memo, and threw an error on the unexpected result set. Fix.	2013-01-17 23:45:45 +02:00
Heikki Linnakangas	0b6329130e	Make pg_receivexlog and pg_basebackup -X stream work across timeline switches. This mirrors the changes done earlier to the server in standby mode. When receivelog reaches the end of a timeline, as reported by the server, it fetches the timeline history file of the next timeline, and restarts streaming from the new timeline by issuing a new START_STREAMING command. When pg_receivexlog crosses a timeline, it leaves the .partial suffix on the last segment on the old timeline. This helps you to tell apart a partial segment left in the directory because of a timeline switch, and a completed segment. If you just follow a single server, it won't make a difference, but it can be significant in more complicated scenarios where new WAL is still generated on the old timeline. This includes two small changes to the streaming replication protocol: First, when you reach the end of timeline while streaming, the server now sends the TLI of the next timeline in the server's history to the client. pg_receivexlog uses that as the next timeline, so that it doesn't need to parse the timeline history file like a standby server does. Second, when BASE_BACKUP command sends the begin and end WAL positions, it now also sends the timeline IDs corresponding the positions.	2013-01-17 20:23:00 +02:00
Heikki Linnakangas	3f4b1749a8	Return value of lseek() can be negative on failure. Because the return value of lseek() was assigned to an unsigned size_t variable, we'd fail to notice an error return code -1. Compiler gave a warning about this. Andres Freund	2013-01-15 00:42:37 +02:00
Tom Lane	b853eb9718	Improve handling of ereport(ERROR) and elog(ERROR). In commit `71450d7fd6`, we added code to inform suitably-intelligent compilers that ereport() doesn't return if the elevel is ERROR or higher. This patch extends that to elog(), and also fixes a double-evaluation hazard that the previous commit created in ereport(), as well as reducing the emitted code size. The elog() improvement requires the compiler to support __VA_ARGS__, which should be available in just about anything nowadays since it's required by C99. But our minimum language baseline is still C89, so add a configure test for that. The previous commit assumed that ereport's elevel could be evaluated twice, which isn't terribly safe --- there are already counterexamples in xlog.c. On compilers that have __builtin_constant_p, we can use that to protect the second test, since there's no possible optimization gain if the compiler doesn't know the value of elevel. Otherwise, use a local variable inside the macros to prevent double evaluation. The local-variable solution is inferior because (a) it leads to useless code being emitted when elevel isn't constant, and (b) it increases the optimization level needed for the compiler to recognize that subsequent code is unreachable. But it seems better than not teaching non-gcc compilers about unreachability at all. Lastly, if the compiler has __builtin_unreachable(), we can use that instead of abort(), resulting in a noticeable code savings since no function call is actually emitted. However, it seems wise to do this only in non-assert builds. In an assert build, continue to use abort(), so that the behavior will be predictable and debuggable if the "impossible" happens. These changes involve making the ereport and elog macros emit do-while statement blocks not just expressions, which forces small changes in a few call sites. Andres Freund, Tom Lane, Heikki Linnakangas	2013-01-13 18:40:09 -05:00
Heikki Linnakangas	b0daba57bb	Tolerate timeline switches while "pg_basebackup -X fetch" is running. If you take a base backup from a standby server with "pg_basebackup -X fetch", and the timeline switches while the backup is being taken, the backup used to fail with an error "requested WAL segment %s has already been removed". This is because the server-side code that sends over the required WAL files would not construct the WAL filename with the correct timeline after a switch. Fix that by using readdir() to scan pg_xlog for all the WAL segments in the range, regardless of timeline. Also, include all timeline history files in the backup, if taken with "-X fetch". That fixes another related bug: If a timeline switch happened just before the backup was initiated in a standby, the WAL segment containing the initial checkpoint record contains WAL from the older timeline too. Recovery will not accept that without a timeline history file that lists the older timeline. Backpatch to 9.2. Versions prior to that were not affected as you could not take a base backup from a standby before 9.2.	2013-01-03 19:51:00 +02:00
Heikki Linnakangas	ee994272ca	Delay reading timeline history file until it's fetched from master. Streaming replication can fetch any missing timeline history files from the master, but recovery would read the timeline history file for the target timeline before reading the checkpoint record, and before walreceiver has had a chance to fetch it from the master. Delay reading it, and the sanity checks involving timeline history, until after reading the checkpoint record. There is at least one scenario where this makes a difference: if you take a base backup from a standby server right after a timeline switch, the WAL segment containing the initial checkpoint record will begin with an older timeline ID. Without the timeline history file, recovering that file will fail as the older timeline ID is not recognized to be an ancestor of the target timeline. If you try to recover from such a backup, using only streaming replication to fetch the WAL, this patch is required for that to work.	2013-01-03 10:41:58 +02:00
Magnus Hagander	794397ae1d	Move tar function headers to pgtar.h This makes it possible to include them only where they are used, so we can avoid the conflict of the uid_t and gid_t datatypes that happened in plperl (since plperl doesn't need the tar functions)	2013-01-02 20:34:08 +01:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Magnus Hagander	f5d4bdd3a5	Unify some tar functionality across different parts Move some of the tar functionality that existed mostly duplicated in both pg_dump and the walsender basebackup functionality into port/tar.c instead, so it can be used from both. It will also be used by pg_basebackup in the future, which would've caused a third copy of it around. Zoltan Boszormenyi and Magnus Hagander	2013-01-01 18:15:57 +01:00
Alvaro Herrera	5ab3af46dd	Remove obsolete XLogRecPtr macros This gets rid of XLByteLT, XLByteLE, XLByteEQ and XLByteAdvance. These were useful for brevity when XLogRecPtrs were split in xlogid/xrecoff; but now that they are simple uint64's, they are just clutter. The only downside to making this change would be ease of backporting patches, but that has been negated by other substantive changes to the involved code anyway. The clarity of simpler expressions makes the change worthwhile. Most of the changes are mechanical, but in a couple of places, the patch author chose to invert the operator sense, making the code flow more logical (and more in line with preceding comments). Author: Andres Freund Eyeballed by Dimitri Fontaine and Alvaro Herrera	2012-12-28 13:06:15 -03:00
Alvaro Herrera	24eca7977e	Assign InvalidXLogRecPtr instead of MemSet(0) For consistency. Author: Andres Freund	2012-12-27 18:33:03 -03:00
Heikki Linnakangas	1ff92eea14	Fix sloppiness in the timeline switch over streaming replication patch. Here's another attempt at fixing the logic that decides how far the WAL can be streamed, which was still broken if the timeline changed while streaming. You would get an assertion failure. The way the logic is now written is more readable, too. Thom Brown reported the assertion failure.	2012-12-21 20:08:12 +02:00
Heikki Linnakangas	36e4456d78	Fix race condition if a file is removed while pg_basebackup is running. If a relation file was removed when the server-side counterpart of pg_basebackup was just about to open it to send it to the client, you'd get a "could not open file" error. Fix that. Backpatch to 9.1, this goes back to when pg_basebackup was introduced.	2012-12-21 15:34:15 +02:00
Heikki Linnakangas	af275a12df	Follow TLI of last replayed record, not recovery target TLI, in walsenders. Most of the time, the last replayed record comes from the recovery target timeline, but there is a corner case where it makes a difference. When the startup process scans for a new timeline, and decides to change recovery target timeline, there is a window where the recovery target TLI has already been bumped, but there are no WAL segments from the new timeline in pg_xlog yet. For example, if we have just replayed up to point 0/30002D8, on timeline 1, there is a WAL file called 000000010000000000000003 in pg_xlog that contains the WAL up to that point. When recovery switches recovery target timeline to 2, a walsender can immediately try to read WAL from 0/30002D8, from timeline 2, so it will try to open WAL file 000000020000000000000003. However, that doesn't exist yet - the startup process hasn't copied that file from the archive yet nor has the walreceiver streamed it yet, so walsender fails with error "requested WAL segment 000000020000000000000003 has already been removed". That's harmless, in that the standby will try to reconnect later and by that time the segment is already created, but error messages that should be ignored are not good. To fix that, have walsender track the TLI of the last replayed record, instead of the recovery target timeline. That way walsender will not try to read anything from timeline 2, until the WAL segment has been created and at least one record has been replayed from it. The recovery target timeline is now xlog.c's internal affair, it doesn't need to be exposed in shared memory anymore. This fixes the error reported by Thom Brown. depesz the same error message, but I'm not sure if this fixes his scenario.	2012-12-20 14:39:04 +02:00
Heikki Linnakangas	abfd192b1b	Allow a streaming replication standby to follow a timeline switch. Before this patch, streaming replication would refuse to start replicating if the timeline in the primary doesn't exactly match the standby. The situation where it doesn't match is when you have a master, and two standbys, and you promote one of the standbys to become new master. Promoting bumps up the timeline ID, and after that bump, the other standby would refuse to continue. There's significantly more timeline related logic in streaming replication now. First of all, when a standby connects to primary, it will ask the primary for any timeline history files that are missing from the standby. The missing files are sent using a new replication command TIMELINE_HISTORY, and stored in standby's pg_xlog directory. Using the timeline history files, the standby can follow the latest timeline present in the primary (recovery_target_timeline='latest'), just as it can follow new timelines appearing in an archive directory. START_REPLICATION now takes a TIMELINE parameter, to specify exactly which timeline to stream WAL from. This allows the standby to request the primary to send over WAL that precedes the promotion. The replication protocol is changed slightly (in a backwards-compatible way although there's little hope of streaming replication working across major versions anyway), to allow replication to stop when the end of timeline reached, putting the walsender back into accepting a replication command. Many thanks to Amit Kapila for testing and reviewing various versions of this patch.	2012-12-13 19:17:32 +02:00
Heikki Linnakangas	add6c3179a	Make the streaming replication protocol messages architecture-independent. We used to send structs wrapped in CopyData messages, which works as long as the client and server agree on things like endianess, timestamp format and alignment. That's good enough for running a standby server, which has to run on the same platform anyway, but it's useful for tools like pg_receivexlog to work across platforms. This breaks protocol compatibility of streaming replication, but we never promised that to be compatible across versions, anyway.	2012-11-07 19:09:13 +02:00
Heikki Linnakangas	7d3ed5ae78	Fix typo in comment. Fujii Masao	2012-10-15 13:01:31 +03:00
Heikki Linnakangas	6f60fdd701	Improve replication connection timeouts. Rename replication_timeout to wal_sender_timeout, and add a new setting called wal_receiver_timeout that does the same at the walreceiver side. There was previously no timeout in walreceiver, so if the network went down, for example, the walreceiver could take a long time to notice that the connection was lost. Now with the two settings, both sides of a replication connection will detect a broken connection similarly. It is no longer necessary to manually set wal_receiver_status_interval to a value smaller than the timeout. Both wal sender and receiver now automatically send a "ping" message if more than 1/2 of the configured timeout has elapsed, and it hasn't received any messages from the other end. Amit Kapila, heavily edited by me.	2012-10-11 17:48:08 +03:00
Peter Eisentraut	8521d13194	Refactor flex and bison make rules Numerous flex and bison make rules have appeared in the source tree over time, and they are all virtually identical, so we can replace them by pattern rules with some variables for customization. Users of pgxs will also be able to benefit from this.	2012-10-11 06:57:04 -04:00
Heikki Linnakangas	0b77aebabf	Remove stray newline in comment.	2012-10-09 13:06:48 +03:00
Peter Eisentraut	b6d4522296	Remove generation of repl_gram.h It was apparently never necessary.	2012-10-08 20:36:46 -04:00
Heikki Linnakangas	9c0e2b9182	Fix walsender handling of postmaster shutdown, to not go into endless loop. This bug was introduced by my patch to use the regular die/quickdie signal handlers in walsender processes. I tried to make walsender exit at next CHECK_FOR_INTERRUPTS() by setting ProcDiePending, but that's not enough, you need to set InterruptPending too. On second thoght, it was not a very good way to make walsender exit anyway, so use proc_exit(0) instead. Also, send a CommandComplete message before exiting; that's what we did before, and you get a nicer error message in the standby that way. Reported by Thom Brown.	2012-10-08 13:32:14 +03:00
Heikki Linnakangas	fd5942c18f	Use the regular main processing loop also in walsenders. The regular backend's main loop handles signal handling and error recovery better than the current WAL sender command loop does. For example, if the client hangs and a SIGTERM is received before starting streaming, the walsender will now terminate immediately, rather than hang until the connection times out.	2012-10-05 17:21:12 +03:00

1 2 3 4 5

228 Commits