Documentation on the fsync() patch from OpenLink

Submitted by:  Cees de Groot <C.deGroot@inter.nl.net>
This commit is contained in:
Marc G. Fournier 1996-09-19 20:22:23 +00:00
parent 715c6b6d25
commit 5995953a02
1 changed files with 34 additions and 0 deletions

34
doc/README.fsync Normal file
View File

@ -0,0 +1,34 @@
Fsync() patch (backend -F option)
=================================
Normally, the Postgres'95 backend makes sure that updates are actually
committed to disk by calling the standard function fsync() in
several places. Fsync() should guarantee that every modification to
a certain file is actually written to disk and will not hang around
in write caches anymore. This increases the chance that a database
will still be usable after a system crash by a large amount.
However, this operation severely slows down Postgres'95, because at all
those points it has to wait for the OS to flush the buffers. Especially
in one-shot operations, like creating a new database or loading lots
of data, you'll have a clear restart point if something goes wrong. That's
where the -F option kicks in: it simply disables the calls to fsync().
Without fsync(), the OS is allowed to do its best in buffering, sorting
and delaying writes, so this can be a _very_ big perfomance increase. However,
if the system crashes, large parts of the latest transactions will still hang
around in memory without having been committed to disk - lossage of data
is therefore almost certain to occur.
So it's a tradeoff between data integrity and speed. When initializing a
database, I'd use it - if the machine crashes, you simply remove the files
created and redo the operation. The same goes for bulk-loading data: on
a crash, you remove the database and restore the backup you made before
starting the bulk-load (you always make backups before bulk-loading,
don't you?).
Whether you want to use it in production, is up to you. If you trust your
operating system, your utility company, and your hardware, you might enable
it; however, keep in mind that you're running in an unsecure mode and that
performance gains will very much depend on access patterns (because it won't
help on reading data). I'd recommend against it.