Incorporate README.fsync into main documentation body

Peter Eisentraut 2000-07-16 14:47:57 +00:00
parent b4c315ba9e
commit 81fd7532a9
2 changed files with 34 additions and 40 deletions

README.fsync (deleted)

@@ -1,34 +0,0 @@
-Fsync() patch (backend -F option)
-=================================
-
-Normally, the Postgres'95 backend makes sure that updates are actually
-committed to disk by calling the standard function fsync() in
-several places. Fsync() should guarantee that every modification to
-a certain file is actually written to disk and will not hang around
-in write caches anymore. This increases the chance that a database
-will still be usable after a system crash by a large amount.
-
-However, this operation severely slows down Postgres'95, because at all
-those points it has to wait for the OS to flush the buffers. Especially
-in one-shot operations, like creating a new database or loading lots
-of data, you'll have a clear restart point if something goes wrong. That's
-where the -F option kicks in: it simply disables the calls to fsync().
-
-Without fsync(), the OS is allowed to do its best in buffering, sorting
-and delaying writes, so this can be a _very_ big performance increase. However,
-if the system crashes, large parts of the latest transactions will still hang
-around in memory without having been committed to disk - lossage of data
-is therefore almost certain to occur.
-
-So it's a tradeoff between data integrity and speed. When initializing a
-database, I'd use it - if the machine crashes, you simply remove the files
-created and redo the operation. The same goes for bulk-loading data: on
-a crash, you remove the database and restore the backup you made before
-starting the bulk-load (you always make backups before bulk-loading,
-don't you?).
-
-Whether you want to use it in production is up to you. If you trust your
-operating system, your utility company, and your hardware, you might enable
-it; however, keep in mind that you're running in an insecure mode and that
-performance gains will very much depend on access patterns (because it won't
-help on reading data). I'd recommend against it.

doc/src/sgml/runtime.sgml

@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.14 2000/07/15 21:35:47 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.15 2000/07/16 14:47:57 petere Exp $
 -->
 <Chapter Id="runtime">
@@ -846,11 +846,39 @@ env PGOPTIONS='--geqo=off' psql
 <term>FSYNC (<type>boolean</type>)</term>
 <listitem>
 <para>
-When this is on (default), an <function>fsync()</function>
-call is done after each transaction. Turning this off
-increases performance but an operating system crash or power
-outage might cause data corruption. (Note that a crash of
-<productname>Postgres</productname> itself is not affected.)
+If this option is on, the <productname>Postgres</> backend
+will use the <function>fsync()</> system call in several
+places to make sure that updates are physically written to
+disk and will not hang around in the write caches. This
+increases the chance that a database installation will still
+be usable after an operating system or hardware crash by a
+large amount. (Crashes of the database server itself do
+<emphasis>not</> affect this consideration.)
+</para>
+
+<para>
+However, this operation severely slows down
+<productname>Postgres</>, because at all those points it has
+to block and wait for the operating system to flush the
+buffers. Without <function>fsync</>, the operating system is
+allowed to do its best in buffering, sorting, and delaying
+writes, so this can be a <emphasis>very</> big performance
+increase. However, if the system crashes, parts of the data of
+a transaction that has already been committed -- according to
+the information on disk -- will still hang around in memory.
+Inconsistent data (i.e., data corruption) is therefore likely
+to occur.
+</para>
+
+<para>
+This option is the subject of an eternal debate in the
+<productname>Postgres</> user and developer communities. Some
+always leave it off, some turn it off only for bulk loads,
+where there is a clear restart point if something goes wrong,
+some leave it on just to be on the safe side. Because it is
+the safe side, on is also the default. If you trust your
+operating system, your utility company, and your hardware, you
+might want to disable it.
 </para>
 </listitem>
 </varlistentry>
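
Note for readers unfamiliar with the call this change documents: the following is a minimal Python sketch of what "using fsync() after an update" means at the system-call level. It is not taken from the Postgres source. The function name `durable_write` and its `use_fsync` parameter are hypothetical, chosen only for illustration; `use_fsync=False` loosely stands in for running with the option off, where the operating system decides on its own when the data reaches disk.

```python
import os
import tempfile


def durable_write(path, data, use_fsync=True):
    """Write bytes to path and, if requested, block until the OS reports
    that they have been flushed from its write cache to the storage device.

    Hypothetical helper for illustration only; not Postgres code.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        if use_fsync:
            # This is the step that costs time: the caller waits here
            # until the operating system has flushed its buffers.
            os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        count = durable_write(target, b"committed row\n")
        print(count, "bytes written and flushed")
```

Without the `os.fsync` call the function returns as soon as the data is in the operating system's cache, which is faster but leaves a window in which a system crash or power failure loses the write. That is the tradeoff the new documentation text describes.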