Commit Graph

1284 Commits

Author SHA1 Message Date
Tom Lane a2c3931a24 Fix pg_hba.conf matching so that replication connections only match records
with database = replication.  The previous coding would allow them to match
ordinary records too, but that seems like a recipe for security breaches.
Improve the messages associated with no-such-pg_hba.conf entry to report
replication connections as such, since that's now a critical aspect of
whether the connection matches.  Make some cursory improvements in the related
documentation, too.
2010-04-21 03:32:53 +00:00
Tom Lane a3c6d10575 Move the check for whether walreceiver has authenticated as a superuser
from walsender.c, where it didn't really belong, to postinit.c where it does
belong (and is essentially free, too).
2010-04-21 00:51:57 +00:00
Tom Lane 7de2dfccc5 Fix code that doesn't work on machines with strict alignment requirements:
must use memcpy here rather than struct assignment.

In passing, rearrange some randomly-ordered declarations to be a tad less
random.
2010-04-20 22:55:03 +00:00
Magnus Hagander 03a571a4cf Add wrapper function libpqrcv_PQexec() in the walreceiver that uses async
libpq to send queries, making the waiting for responses interruptible on
platforms where PQexec() can't normally be interrupted by signals, such
as win32.

Fujii Masao and Magnus Hagander
2010-04-19 14:10:45 +00:00
Magnus Hagander a95d15ff5d Only try to do a graceful disconnect if we've successfully loaded the
shared library with the disconnect function in it. Fixes segmentation
fault reported by Jeff Davis.

Fujii Masao
2010-04-13 08:16:09 +00:00
Heikki Linnakangas 258174b462 Need to use the start pointer of a block we read from WAL segment in
the calculation, not the end pointer, as pointed out by Fujii Masao.
2010-04-12 10:18:50 +00:00
Heikki Linnakangas e57cd7f0a1 Change the logic to decide when to delete old WAL segments, so that it
doesn't take into account how far the WAL senders are. This way a hung
WAL sender doesn't prevent old WAL segments from being recycled/removed
in the primary, ultimately causing the disk to fill up. Instead add
standby_keep_segments setting to control how many old WAL segments are
kept in the primary. This also makes it more reliable to use streaming
replication without WAL archiving, assuming that you set
standby_keep_segments high enough.
2010-04-12 09:52:29 +00:00
Robert Haas 54943734f8 Refer to max_wal_senders in a more consistent fashion.
The error message now makes explicit reference to the GUC that must be changed
to fix the problem, using wording suggested by Tom Lane.  Along the way,
rename the GUC from MaxWalSenders to max_wal_senders for consistency and
grep-ability.
2010-04-01 00:43:29 +00:00
Heikki Linnakangas 59292f28ca Flush CopyOutResponse when starting streaming in walsender, so that it's
not delayed until the first WAL record is sent.

Fujii Masao
2010-03-26 12:23:34 +00:00
Simon Riggs 92fc0db99f Additional thoughts on WALSender cpu reduction. Use long type
and alter a comment to reduce confusion.
2010-03-24 21:41:57 +00:00
Simon Riggs 08882ce74c Reduce CPU utilisation of WALSender process. Process was using 10% CPU
doing nothing, caused by naptime specified in milliseconds yet units of
pg_usleep() parameter is microseconds. Correctly specifying units
reduces call frequency by 1000. Reduction in CPU consumption verified.
2010-03-24 20:11:12 +00:00
Heikki Linnakangas de3483acfa Update description of walrcv_receive() function to match reality. 2010-03-24 06:25:39 +00:00
Peter Eisentraut c248d17120 Message tuning 2010-03-21 00:17:59 +00:00
Simon Riggs 6a771d1d36 Add connection messages for streaming replication. log_connections
was broken for a replication connection and no messages were
displayed on either standby or primary, at any debug level.
Connection messages needed to diagnose session drop/reconnect
events. Use LOG mode for now, discuss lowering in later releases.
2010-03-19 19:19:38 +00:00
Simon Riggs 75867c528d Minor tweaks on libpqrcv_connect(): ensure conninfo_repl[] is
correctly sized and expand comment to explain otherwise
undocumented use of replication connection parameter.
2010-03-19 17:51:42 +00:00
Heikki Linnakangas a383c55a1d Throw a nicer error message if a standby server attempts to connect while
the master is still in recovery. We don't support cascading slaves yet.

Patch by Fujii Masao, with slightly changed wording.
2010-03-16 09:09:55 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Heikki Linnakangas cd2b7d3c4d Fix streaming replication starting at the very first WAL segment.
Per complaint from Greg Stark.
2010-02-25 07:31:40 +00:00
Heikki Linnakangas ad458cfe81 Don't use O_DIRECT when writing WAL files if archiving or streaming is
enabled. Bypassing the kernel cache is counter-productive in that case,
because the archiver/walsender process will read from the WAL file
soon after it's written, and if it's not cached the read will cause
a physical read, eating I/O bandwidth available on the WAL drive.

Also, walreceiver process does unaligned writes, so disable O_DIRECT
in walreceiver process for that reason too.
2010-02-19 10:51:04 +00:00
Heikki Linnakangas 3e87ba6ef7 Fix pq_getbyte_if_available() function. It was confused on what it
returns if no data is immediately available. Patch by me with numerous
fixes from Fujii Masao and Magnus Hagander.
2010-02-18 11:13:46 +00:00
Tom Lane 50a90fac40 Stamp HEAD as 9.0devel, and update various places that were referring to 8.5
(hope I got 'em all).  Per discussion, this release will be 9.0 not 8.5.
2010-02-17 04:19:41 +00:00
Heikki Linnakangas 808969d0e7 Add a message type header to the CopyData messages sent from primary
to standby in streaming replication. While we only have one message type
at the moment, adding a message type header makes this easier to extend.
2010-02-03 09:47:19 +00:00
Heikki Linnakangas 83cb7da7dc Fix bug in wasender's xlogid boundary handling, reported by Erik Rijkers.
LogwrtRqst.Write can be set to non-existent FF log segment, we mustn't
try to send that in XLogSend().

Also fix similar bug in ReadRecord(), which I just introduced in the
ReadRecord() refactoring patch.
2010-01-27 16:41:09 +00:00
Heikki Linnakangas 1bb2558046 Make standby server continuously retry restoring the next WAL segment with
restore_command, if the connection to the primary server is lost. This
ensures that the standby can recover automatically, if the connection is
lost for a long time and standby falls behind so much that the required
WAL segments have been archived and deleted in the master.

This also makes standby_mode useful without streaming replication; the
server will keep retrying restore_command every few seconds until the
trigger file is found. That's the same basic functionality pg_standby
offers, but without the bells and whistles.

To implement that, refactor the ReadRecord/FetchRecord functions. The
FetchRecord() function introduced in the original streaming replication
patch is removed, and all the retry logic is now in a new function called
XLogReadPage(). XLogReadPage() is now responsible for executing
restore_command, launching walreceiver, and waiting for new WAL to arrive
from primary, as required.

This also changes the life cycle of walreceiver. When launched, it now only
tries to connect to the master once, and exits if the connection fails, or
is lost during streaming for any reason. The startup process detects the
death, and re-launches walreceiver if necessary.
2010-01-27 15:27:51 +00:00
Heikki Linnakangas 2d29f5f59a Fix bogus comments. 2010-01-21 08:19:57 +00:00
Heikki Linnakangas e8aae273d4 Fix bogus subdir setting. Again. I must've unfixed it by accident while
moving files around.
2010-01-20 20:34:51 +00:00
Heikki Linnakangas b3a1ef53c3 Add missing "!= NULL", for the sake of consistency.
Fujii Masao
2010-01-20 11:58:44 +00:00
Heikki Linnakangas 32bc08b1d4 Rethink the way walreceiver is linked into the backend. Instead than shoving
walreceiver as whole into a dynamically loaded module, split the
libpq-specific parts of it into dynamically loaded module and keep the rest
in the main backend binary.

Although Tom fixed the Windows compilation problems with the old walreceiver
module already, this is a cleaner division of labour and makes the code
more readable. There's also the prospect of adding new transport methods
as pluggable modules in the future, which this patch makes easier, though for
now the API between libpqwalreceiver and walreceiver process should be
considered private.

The libpq-specific module is now in src/backend/replication/libpqwalreceiver,
and the part linked with postgres binary is in
src/backend/replication/walreceiver.c.
2010-01-20 09:16:24 +00:00
Bruce Momjian a736540958 Add #include <sys/time.h> for struct timeval definition on BSD/OS. 2010-01-16 01:55:28 +00:00
Tom Lane cf7af28068 No, scratch that, it was getting added twice. 2010-01-15 21:06:26 +00:00
Tom Lane 4335281604 Actually, I'll bet the mingw problem is lack of $(BE_DLLLIBS) ... 2010-01-15 20:45:42 +00:00
Tom Lane 798fe1d513 Fix bogus subdir setting ... wonder just what that affects ... 2010-01-15 20:34:11 +00:00
Heikki Linnakangas d3be71a208 Remove unused (in non-assertion-enabled build) variable. 2010-01-15 11:47:15 +00:00
Heikki Linnakangas 40f908bdcd Introduce Streaming Replication.
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me
2010-01-15 09:19:10 +00:00