Managing Databases database Every instance of a running PostgreSQL server manages one or more databases. Databases are therefore the topmost hierarchical level for organizing SQL objects (database objects). This chapter describes the properties of databases, and how to create, manage, and destroy them. Overview A database is a named collection of SQL objects (database objects). Generally, every database object (tables, functions, etc.) belongs to one and only one database. (But there are a few system catalogs, for example pg_database, that belong to a whole cluster and are accessible from each database within the cluster.) More accurately, a database is a collection of schemas and the schemas contain the tables, functions, etc. So the full hierarchy is: server, database, schema, table (or something else instead of a table). An application that connects to the database server specifies in its connection request the name of the database it wants to connect to. It is not possible to access more than one database per connection. (But an application is not restricted in the number of connections it opens to the same or other databases.) It is possible, however, to access more than one schema from the same connection. Schemas are a purely logical structure and who can access what is managed by the privilege system. Databases are physically separated and access control is managed at the connection level. If one PostgreSQL server instance is to house projects or users that should be separate and for the most part unaware of each other, it is therefore recommendable to put them into separate databases. If the projects or users are interrelated and should be able to use each other's resources they should be put in the same databases but possibly into separate schemas. More information about managing schemas is in the &cite-user;. SQL calls databases catalogs, but there is no difference in practice. Creating a Database In order to create a databases, the PostgreSQL server must be up and running (see ). Databases are created with the SQL command CREATE DATABASE: CREATE DATABASE name; where name follows the usual rules for SQL identifiers. The current user automatically becomes the owner of the new database. It is the privilege of the owner of a database to remove it later on (which also removes all the objects in it, even if they have a different owner). The creation of databases is a restricted operation. See for how to grant permission. Since you need to be connected to the database server in order to execute the CREATE DATABASE command, the question remains how the first database at any given site can be created. The first database is always created by the initdb command when the data storage area is initialized. (See .) This database is called template1. So to create the first real database you can connect to template1. The name template1 is no accident: When a new database is created, the template database is essentially cloned. This means that any changes you make in template1 are propagated to all subsequently created databases. This implies that you should not use the template database for real work, but when used judiciously this feature can be convenient. More details appear in . As an extra convenience, there is also a program that you can execute from the shell to create new databases, createdb. createdb dbname createdb does no magic. It connects to the template1 database and issues the CREATE DATABASE command, exactly as described above. It uses the psql program internally. The reference page on createdb contains the invocation details. Note that createdb without any arguments will create a database with the current user name, which may or may not be what you want. contains information about how to restrict who can connect to a given database. Sometimes you want to create a database for someone else. That user should become the owner of the new database, so he can configure and manage it himself. To achieve that, use one of the following commands: CREATE DATABASE dbname OWNER username; from the SQL environment, or createdb -O username dbname You must be a superuser to be allowed to create a database for someone else. Template Databases CREATE DATABASE actually works by copying an existing database. By default, it copies the standard system database named template1. Thus that database is the template from which new databases are made. If you add objects to template1, these objects will be copied into subsequently created user databases. This behavior allows site-local modifications to the standard set of objects in databases. For example, if you install the procedural language PL/pgSQL in template1, it will automatically be available in user databases without any extra action being taken when those databases are made. There is a second standard system database named template0. This database contains the same data as the initial contents of template1, that is, only the standard objects predefined by your version of PostgreSQL. template0 should never be changed after initdb. By instructing CREATE DATABASE to copy template0 instead of template1, you can create a virgin user database that contains none of the site-local additions in template1. This is particularly handy when restoring a pg_dump dump: the dump script should be restored in a virgin database to ensure that one recreates the correct contents of the dumped database, without any conflicts with additions that may now be present in template1. To create a database by copying template0, use CREATE DATABASE dbname TEMPLATE template0; from the SQL environment, or createdb -T template0 dbname from the shell. It is possible to create additional template databases, and indeed one might copy any database in a cluster by specifying its name as the template for CREATE DATABASE. It is important to understand, however, that this is not (yet) intended as a general-purpose COPY DATABASE facility. In particular, it is essential that the source database be idle (no data-altering transactions in progress) for the duration of the copying operation. CREATE DATABASE will check that no session (other than itself) is connected to the source database at the start of the operation, but this does not guarantee that changes cannot be made while the copy proceeds, which would result in an inconsistent copied database. Therefore, we recommend that databases used as templates be treated as read-only. Two useful flags exist in pg_database for each database: the columns datistemplate and datallowconn. datistemplate may be set to indicate that a database is intended as a template for CREATE DATABASE. If this flag is set, the database may be cloned by any user with CREATEDB privileges; if it is not set, only superusers and the owner of the database may clone it. If datallowconn is false, then no new connections to that database will be allowed (but existing sessions are not killed simply by setting the flag false). The template0 database is normally marked datallowconn = false to prevent modification of it. Both template0 and template1 should always be marked with datistemplate = true. After preparing a template database, or making any changes to one, it is a good idea to perform VACUUM FREEZE or VACUUM FULL FREEZE in that database. If this is done when there are no other open transactions in the same database, then it is guaranteed that all rows in the database are frozen and will not be subject to transaction ID wraparound problems. This is particularly important for a database that will have datallowconn set to false, since it will be impossible to do routine maintenance VACUUM in such a database. See for more information. template1 and template0 do not have any special status beyond the fact that the name template1 is the default source database name for CREATE DATABASE and the default database-to-connect-to for various programs such as createdb. For example, one could drop template1 and recreate it from template0 without any ill effects. This course of action might be advisable if one has carelessly added a bunch of junk in template1. Database Configuration Recall from that the PostgreSQL server provides a large number of run-time configuration variables. You can set database-specific default values for many of these settings. For example, if for some reason you want to disable the GEQO optimizer for a given database, you'd ordinarily have to either disable it for all databases or make sure that every connecting client is careful to issue SET geqo TO off;. To make this setting the default you can execute the command ALTER DATABASE mydb SET geqo TO off; This will save the setting (but not set it immediately) and in subsequent connections it will appear as though SET geqo TO off; had been called right before the session started. Note that users can still alter this setting during the session; it will only be the default. To undo any such setting, use ALTER DATABASE dbname RESET varname;. Alternative Locations It is possible to create a database in a location other than the default location for the installation. But remember that all database access occurs through the database server, so any location specified must be accessible by the server. Alternative database locations are referenced by an environment variable which gives the absolute path to the intended storage location. This environment variable must be present in the server's environment, so it must have been defined before the server was started. (Thus, the set of available alternative locations is under the site administrator's control; ordinary users can't change it.) Any valid environment variable name may be used to reference an alternative location, although using variable names with a prefix of PGDATA is recommended to avoid confusion and conflict with other variables. To create the variable in the environment of the server process you must first shut down the server, define the variable, initialize the data area, and finally restart the server. (See also and .) To set an environment variable, type PGDATA2=/home/postgres/data export PGDATA2 in Bourne shells, or setenv PGDATA2 /home/postgres/data in csh or tcsh. You have to make sure that this environment variable is always defined in the server environment, otherwise you won't be able to access that database. Therefore you probably want to set it in some sort of shell start-up file or server start-up script. initlocation To create a data storage area in PGDATA2, ensure that the containing directory (here, /home/postgres) already exists and is writable by the user account that runs the server (see ). Then from the command line, type initlocation PGDATA2 (not initlocation $PGDATA2). Then you can restart the server. To create a database within the new location, use the command CREATE DATABASE name WITH LOCATION 'location'; where location is the environment variable you used, PGDATA2 in this example. The createdb command has the option Databases created in alternative locations can be accessed and dropped like any other database. It can also be possible to specify absolute paths directly to the CREATE DATABASE command without defining environment variables. This is disallowed by default because it is a security risk. To allow it, you must compile PostgreSQL with the C preprocessor macro ALLOW_ABSOLUTE_DBPATHS defined. One way to do this is to run the compilation step like this: gmake CPPFLAGS=-DALLOW_ABSOLUTE_DBPATHS all Destroying a Database Databases are destroyed with the command DROP DATABASE: DROP DATABASE name; Only the owner of the database (i.e., the user that created it) or a superuser, can drop a database. Dropping a database removes all objects that were contained within the database. The destruction of a database cannot be undone. You cannot execute the DROP DATABASE command while connected to the victim database. You can, however, be connected to any other database, including the template1 database. template1 would be the only option for dropping the last user database of a given cluster. For convenience, there is also a shell program to drop databases: dropdb dbname (Unlike createdb, it is not the default action to drop the database with the current user name.)