mirror of https://git.postgresql.org/git/postgresql.git (synced 2024-10-02 15:41:17 +02:00)
Document the interaction of write-barrier-enabled file systems, and BBU
caches, per June email thread.
parent 20be0d480a
commit e3243488b0
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.66 2010/04/13 14:15:25 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.67 2010/07/07 14:42:09 momjian Exp $ -->
 
 <chapter id="wal">
 <title>Reliability and the Write-Ahead Log</title>
@ -48,21 +48,27 @@
 some later time. Such caches can be a reliability hazard because the
 memory in the disk controller cache is volatile, and will lose its
 contents in a power failure. Better controller cards have
-<firstterm>battery-backed</> caches, meaning the card has a battery that
+<firstterm>battery-backed unit</> (<acronym>BBU</>) caches, meaning
+the card has a battery that
 maintains power to the cache in case of system power loss. After power
 is restored the data will be written to the disk drives.
 </para>
 
 <para>
 And finally, most disk drives have caches. Some are write-through
-while some are write-back, and the
-same concerns about data loss exist for write-back drive caches as
-exist for disk controller caches. Consumer-grade IDE and SATA drives are
-particularly likely to have write-back caches that will not survive a
-power failure, though <acronym>ATAPI-6</> introduced a drive cache
-flush command (FLUSH CACHE EXT) that some file systems use, e.g. <acronym>ZFS</>.
-Many solid-state drives (SSD) also have volatile write-back
-caches, and many do not honor cache flush commands by default.
+while some are write-back, and the same concerns about data loss
+exist for write-back drive caches as exist for disk controller
+caches. Consumer-grade IDE and SATA drives are particularly likely
+to have write-back caches that will not survive a power failure,
+though <acronym>ATAPI-6</> introduced a drive cache flush command
+(<command>FLUSH CACHE EXT</>) that some file systems use, e.g.
+<acronym>ZFS</>, <acronym>ext4</>. (The SCSI command
+<command>SYNCHRONIZE CACHE</> has long been available.) Many
+solid-state drives (SSD) also have volatile write-back caches, and
+many do not honor cache flush commands by default.
+</para>
+
+<para>
 To check write caching on <productname>Linux</> use
 <command>hdparm -I</>; it is enabled if there is a <literal>*</> next
 to <literal>Write cache</>; <command>hdparm -W</> to turn off
@ -82,6 +88,25 @@
 <literal>fsync_writethrough</> never do write caching.
 </para>
 
+<para>
+Many file systems that use write barriers (e.g. <acronym>ZFS</>,
+<acronym>ext4</>) internally use <command>FLUSH CACHE EXT</> or
+<command>SYNCHRONIZE CACHE</> commands to flush data to the platters on
+write-back-enabled drives. Unfortunately, such write barrier file
+systems behave suboptimally when combined with battery-backed unit
+(<acronym>BBU</>) disk controllers. In such setups, the synchronize
+command forces all data from the BBU to the disks, eliminating much
+of the benefit of the BBU. You can run the utility
+<filename>src/tools/fsync</> in the PostgreSQL source tree to see
+if you are affected. If you are affected, the performance benefits
+of the BBU cache can be regained by turning off write barriers in
+the file system or reconfiguring the disk controller, if that is
+an option. If write barriers are turned off, make sure the battery
+remains active; a faulty battery can potentially lead to data loss.
+Hopefully file system and disk controller designers will eventually
+address this suboptimal behavior.
+</para>
+
 <para>
 When the operating system sends a write request to the storage hardware,
 there is little it can do to make sure the data has arrived at a truly