From 28b11bd1069ed35f45125b4057780cc55b9d716a Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Tue, 28 Jul 2015 18:42:59 -0400
Subject: [PATCH] Update our documentation concerning where to create data directories.

Although initdb has long discouraged use of a filesystem mount-point
directory as a PG data directory, this point was covered nowhere in the
user-facing documentation. Also, with the popularity of pg_upgrade,
we really need to recommend that the PG user own not only the data
directory but its parent directory too. (Without a writable parent
directory, operations such as "mv data data.old" fail immediately.
pg_upgrade itself doesn't do that, but wrapper scripts for it often do.)

Hence, adjust the "Creating a Database Cluster" section to address
these points. I also took the liberty of wordsmithing the discussion
of NFS a bit.

These considerations aren't by any means new, so back-patch to all
supported branches.
---
 doc/src/sgml/runtime.sgml | 79 ++++++++++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 22 deletions(-)

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 547567e9ca..6d5b1082d2 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -49,7 +49,7 @@
 Before you can do anything, you must initialize a database storage
 area on disk. We call this a database cluster.
- (SQL uses the term catalog cluster.) A
+ (The SQL standard uses the term catalog cluster.) A
 database cluster is a collection of databases that is managed by a
 single instance of a running database server. After initialization, a
 database cluster will contain a database named postgres,
@@ -65,7 +65,7 @@
- In file system terms, a database cluster will be a single directory
+ In file system terms, a database cluster is a single directory
 under which all data will be stored. We call this the data
 directory or data area. It is
 completely up to you where you choose to store your data. There is no
@@ -109,15 +109,18 @@
 initdb will attempt to create the directory you
- specify if it does not already exist. It is likely that it will not
- have the permission to do so (if you followed our advice and created
- an unprivileged account). In that case you should create the
- directory yourself (as root) and change the owner to be the
- PostgreSQL user. Here is how this might
- be done:
+ specify if it does not already exist. Of course, this will fail if
+ initdb does not have permissions to write in the
+ parent directory. It's generally recommendable that the
+ PostgreSQL user own not just the data
+ directory but its parent directory as well, so that this should not
+ be a problem. If the desired parent directory doesn't exist either,
+ you will need to create it first, using root privileges if the
+ grandparent directory isn't writable. So the process might look
+ like this:
-root# mkdir /usr/local/pgsql/data
-root# chown postgres /usr/local/pgsql/data
+root# mkdir /usr/local/pgsql
+root# chown postgres /usr/local/pgsql
 root# su postgres
 postgres$ initdb -D /usr/local/pgsql/data
@@ -125,7 +128,9 @@ postgres$ initdb -D /usr/local/pgsql/data
 initdb will refuse to run if the data directory
- looks like it has already been initialized.
+ exists and already contains files; this is to prevent accidentally
+ overwriting an existing installation.
+
 Because the data directory contains all the data stored in the
@@ -178,8 +183,30 @@ postgres$ initdb -D /usr/local/pgsql/data
 locale setting. For details see .
+
+ Use of Secondary File Systems
+
+
+ file system mount points
+
+
+
+ Many installations create their database clusters on file systems
+ (volumes) other than the machine's root volume. If you
+ choose to do this, it is not advisable to try to use the secondary
+ volume's topmost directory (mount point) as the data directory.
+ Best practice is to create a directory within the mount-point
+ directory that is owned by the PostgreSQL
+ user, and then create the data directory within that. This avoids
+ permissions problems, particularly for operations such
+ as pg_upgrade, and it also ensures clean failures if
+ the secondary volume is taken offline.
+
+
+
+
- Network File Systems
+ Use of Network File Systems
 Network File Systems
@@ -188,22 +215,30 @@ postgres$ initdb -D /usr/local/pgsql/data
 Network Attached Storage (NAS)Network File Systems
- Many installations create database clusters on network file systems.
- Sometimes this is done directly via NFS, or by using a
+ Many installations create their database clusters on network file
+ systems. Sometimes this is done via NFS, or by using a
 Network Attached Storage (NAS) device that uses
 NFS internally. PostgreSQL does nothing
 special for NFS file systems, meaning it assumes
- NFS behaves exactly like locally-connected drives
- (DAS, Direct Attached Storage). If client and server
- NFS implementations have non-standard semantics, this can
+ NFS behaves exactly like locally-connected drives.
+ If the client or server NFS implementation does not
+ provide standard file system semantics, this can
 cause reliability problems (see ).
 Specifically, delayed (asynchronous) writes to the NFS
- server can cause reliability problems; if possible, mount
- NFS file systems synchronously (without caching) to avoid
- this. Also, soft-mounting NFS is not recommended.
- (Storage Area Networks (SAN) use a low-level
- communication protocol rather than NFS.)
+ server can cause data corruption problems. If possible, mount the
+ NFS file system synchronously (without caching) to avoid
+ this hazard. Also, soft-mounting the NFS file system is
+ not recommended.
+
+
+
+ Storage Area Networks (SAN) typically use communication
+ protocols other than NFS, and may or may not be subject
+ to hazards of this sort. It's advisable to consult the vendor's
+ documentation concerning data consistency guarantees.
+ PostgreSQL cannot be more reliable than
+ the file system it's using.
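
Reviewer note (not part of the patch): the layout recommended in the new
"Use of Secondary File Systems" text can be sketched as a small POSIX shell
helper. This is only an illustration under stated assumptions: the function
name prepare_parent, the PGOWNER variable, and the default subdirectory name
"pgsql" are hypothetical and do not come from the patch or from PostgreSQL.
The helper creates the intermediate directory inside the mount point and
prints the path to hand to initdb; it does not run initdb itself.

```shell
#!/bin/sh
# Sketch only: prepare_parent, PGOWNER, and the "pgsql" default are
# hypothetical names chosen for illustration.

# prepare_parent MOUNT_POINT [SUBDIR]
# Creates MOUNT_POINT/SUBDIR and prints MOUNT_POINT/SUBDIR/data, the path
# intended for "initdb -D". The data directory itself is NOT created, so
# initdb can create it inside the parent directory.
# Note: plain POSIX sh has no "local", so these variables are global.
prepare_parent() {
    # The mount-point directory must already exist.
    [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }

    mount_point=${1%/}        # drop one trailing slash, if present
    subdir=${2:-pgsql}
    parent="$mount_point/$subdir"

    mkdir -p "$parent" || return 1

    # Changing ownership normally requires root; skipped if PGOWNER is unset.
    if [ -n "${PGOWNER:-}" ]; then
        chown "$PGOWNER" "$parent" || return 1
    fi

    printf '%s\n' "$parent/data"
}
```

Assuming a hypothetical volume mounted at /mnt/pgvol, root might run
"PGOWNER=postgres prepare_parent /mnt/pgvol", after which the postgres user
can run "initdb -D /mnt/pgvol/pgsql/data". Because the data directory sits
two levels below the mount point, it simply does not exist while the volume
is offline, which is the clean-failure behavior the patch text describes.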