postgresql/src/bin/pg_upgrade/IMPLEMENTATION

------------------------------------------------------------------------------
PG_UPGRADE: IN-PLACE UPGRADES FOR POSTGRESQL
------------------------------------------------------------------------------

Upgrading a PostgreSQL database from one major release to another can be
an expensive process. For minor upgrades, you can simply install new
executables and forget about upgrading existing data. But for major
upgrades, you have to export all of your data using pg_dump, install the
new release, run initdb to create a new cluster, and then import your
old data. If you have a lot of data, that can take a considerable amount
of time. If you have too much data, you may have to buy more storage
since you need enough room to hold the original data plus the exported
data.  pg_upgrade can reduce the amount of time and disk space required
for many upgrades.

The URL http://momjian.us/main/writings/pgsql/pg_upgrade.pdf contains a
presentation about pg_upgrade internals that mirrors the text
description below.

------------------------------------------------------------------------------
WHAT IT DOES
------------------------------------------------------------------------------

pg_upgrade is a tool that performs an in-place upgrade of existing
data. Some upgrades change the on-disk representation of data;
pg_upgrade cannot help in those upgrades.  However, many upgrades do
not change the on-disk representation of a user-defined table.  In those
cases, pg_upgrade can move existing user-defined tables from the old
database cluster into the new cluster.

There are two factors that determine whether an in-place upgrade is
practical.

First, every table in a cluster shares the same on-disk representation
of the table headers and trailers and the on-disk representation of
tuple headers. If this changes between the old version of PostgreSQL
and the new version, pg_upgrade cannot move existing tables to the new
cluster; you will have to pg_dump the old data and then import that
data into the new cluster.

Second, all data types should have the same binary representation
between the two major PostgreSQL versions.
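
The two factors can be read as a simple compatibility test. The sketch
below is illustrative only and is not how pg_upgrade implements its
checks: ClusterFormat, page_layout_version, type_formats, and
in_place_upgrade_blockers are hypothetical names invented for this
example.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterFormat:
    """Hypothetical summary of one cluster's on-disk format."""
    # Stands in for the layout of table headers/trailers and tuple headers.
    page_layout_version: int
    # Maps a data type name to a tag describing its binary representation.
    type_formats: dict = field(default_factory=dict)


def in_place_upgrade_blockers(old, new):
    """Return the reasons an in-place upgrade is impractical (empty if none)."""
    blockers = []
    # Factor 1: header layout must be identical in both versions.
    if old.page_layout_version != new.page_layout_version:
        blockers.append("table/tuple header layout changed")
    # Factor 2: every data type must keep its binary representation.
    for name, fmt in sorted(old.type_formats.items()):
        if new.type_formats.get(name) != fmt:
            blockers.append(f"binary format of type {name} changed")
    return blockers
```

An empty list means both factors hold; any entry means the data would
have to go through pg_dump and a reload instead.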

------------------------------------------------------------------------------
HOW IT WORKS
------------------------------------------------------------------------------

To use pg_upgrade during an upgrade, start by installing a fresh
cluster using the newest version in a new directory. When you've
finished installation, the new cluster will contain the new executables
and the usual template0, template1, and postgres databases, but no
user-defined tables. At this point, you can shut down the old and new
postmasters and invoke pg_upgrade.
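
As a rough illustration of the invocation step, the sketch below builds
a pg_upgrade command line from Python. The helper name
pg_upgrade_command and all paths are assumptions made for this example;
the options shown (--old-bindir, --new-bindir, --old-datadir,
--new-datadir, --link, --check) are pg_upgrade command-line options,
but consult the pg_upgrade reference page for the release you are
using before relying on them.

```python
def pg_upgrade_command(old_bindir, new_bindir, old_datadir, new_datadir,
                       link=False, check_only=False):
    """Build an argument list for pg_upgrade; nothing is executed here."""
    argv = [
        "pg_upgrade",
        f"--old-bindir={old_bindir}",
        f"--new-bindir={new_bindir}",
        f"--old-datadir={old_datadir}",
        f"--new-datadir={new_datadir}",
    ]
    if link:
        # Use hard links instead of copying files into the new cluster.
        argv.append("--link")
    if check_only:
        # Only run the checks; do not change any data.
        argv.append("--check")
    return argv


if __name__ == "__main__":
    # Hypothetical installation paths, for illustration only.
    print(" ".join(pg_upgrade_command(
        "/usr/local/pgsql.old/bin", "/usr/local/pgsql.new/bin",
        "/srv/pgdata.old", "/srv/pgdata.new", check_only=True)))
```

The resulting list could be passed to subprocess.run() once both
postmasters have been shut down.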

When pg_upgrade starts, it ensures that all required executables are
present and report the expected version numbers. The verification
process also checks the old and new $PGDATA directories to ensure that
the expected files and subdirectories are in place.  If the verification
process succeeds, pg_upgrade starts the old postmaster and runs
pg_dumpall --schema-only to capture the metadata contained in the old
cluster. The script produced by pg_dumpall will be used in a later step
to recreate all user-defined objects in the new cluster.
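
The kind of check described above might be sketched as follows. This is
a simplified illustration, not pg_upgrade's actual verification code:
the two tuples list only a small, assumed subset of what a real check
would cover, and the helper names are hypothetical. The PG_VERSION file
and the base and global subdirectories are part of a standard data
directory.

```python
import os

# Illustrative subsets only; a real check covers more than this.
REQUIRED_EXECUTABLES = ("postgres", "pg_ctl", "pg_dumpall")
REQUIRED_DATA_ENTRIES = ("PG_VERSION", "base", "global")


def missing_entries(directory, names):
    """Return the names from `names` that do not exist under `directory`."""
    return [n for n in names
            if not os.path.exists(os.path.join(directory, n))]


def read_major_version(datadir):
    """Return the version string stored in the cluster's PG_VERSION file."""
    with open(os.path.join(datadir, "PG_VERSION")) as f:
        return f.read().strip()
```

A caller would run missing_entries() against both bin directories and
both data directories and stop before touching anything if any list is
non-empty.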

Note that the script produced by pg_dumpall will only recreate
user-defined objects, not system-defined objects.  The new cluster will
contain the system-defined objects created by the latest version of
PostgreSQL.

Once pg_upgrade has extracted the metadata from the old cluster, it
performs a number of bookkeeping tasks required to 'sync up' the new
cluster with the existing data.

First, pg_upgrade copies the commit status information and 'next
transaction ID' from the old cluster to the new cluster. This step
ensures that the proper tuples are visible from the new cluster.
Remember, pg_upgrade does not export/import the content of user-defined
tables so the transaction IDs in the new cluster must match the
transaction IDs in the old data. pg_upgrade also copies the starting
address for write-ahead logs from the old cluster to the new cluster.
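
A toy model may help show why the commit status information and the
next transaction ID have to travel with the data. The rule below is a
deliberate simplification invented for this example (it ignores
snapshots, deleted tuples, and transaction ID wraparound, among other
things) and is not PostgreSQL's real visibility logic; tuple_visible and
its arguments are hypothetical names.

```python
COMMITTED, ABORTED, IN_PROGRESS = "committed", "aborted", "in progress"


def tuple_visible(xmin, commit_status, next_xid):
    """Toy rule: a tuple is visible if the transaction that inserted it
    (xmin) is older than next_xid and is recorded as committed."""
    if xmin >= next_xid:
        # The cluster believes this transaction has not happened yet.
        return False
    return commit_status.get(xmin, IN_PROGRESS) == COMMITTED


if __name__ == "__main__":
    old_status = {100: COMMITTED, 101: ABORTED}
    # With the old cluster's status and next transaction ID copied over,
    # the existing tuples are judged the same way as before.
    print(tuple_visible(100, old_status, next_xid=102))
    print(tuple_visible(101, old_status, next_xid=102))
    # A fresh cluster with an empty status map and a low next transaction
    # ID would not recognize the tuple inserted by transaction 100.
    print(tuple_visible(100, {}, next_xid=3))
```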

Now pg_upgrade begins reconstructing the metadata obtained from the old
cluster using the first part of the pg_dumpall output.

Next, pg_upgrade executes the remainder of the script produced earlier
by pg_dumpall --- this script effectively recreates the complete
user-defined metadata from the old cluster in the new cluster.  It
preserves the relfilenode numbers so that TOAST and other references
to relfilenodes in user data are preserved.  (See binary-upgrade usage
in pg_dump.)  We choose to preserve tablespace and database OIDs as well.
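
One consequence of preserving database OIDs and relfilenode numbers is
that a relation's file has the same relative path in both clusters. The
sketch below assumes the default tablespace layout
(base/<database OID>/<relfilenode>) and ignores additional segment and
fork files; relation_file and transfer_pair are hypothetical helper
names, and the OIDs in the usage example are made up.

```python
import os


def relation_file(datadir, db_oid, relfilenode):
    """Path of a relation's first file in the default tablespace."""
    return os.path.join(datadir, "base", str(db_oid), str(relfilenode))


def transfer_pair(old_datadir, new_datadir, db_oid, relfilenode):
    """Return (source, destination) paths for one relation file.

    Because the database OID and relfilenode are preserved, the same two
    numbers identify the file in both the old and the new cluster.
    """
    return (relation_file(old_datadir, db_oid, relfilenode),
            relation_file(new_datadir, db_oid, relfilenode))


if __name__ == "__main__":
    # Hypothetical data directories and OIDs, for illustration only.
    print(transfer_pair("/srv/pgdata.old", "/srv/pgdata.new", 16384, 16390))
```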

Finally, pg_upgrade links or copies each user-defined table and its
supporting indexes and toast tables from the old cluster to the new
cluster.
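
The difference between linking and copying can be sketched with the
Python standard library. This is only an illustration of the two
transfer styles, not pg_upgrade's own file-handling code; transfer_file
is a hypothetical helper. Note that hard links require the source and
destination to be on the same file system.

```python
import os
import shutil


def transfer_file(src, dst, use_link=False):
    """Make `src` available at `dst` by hard link or by copy."""
    if use_link:
        # Both names now refer to the same data on disk; no extra space
        # is used, but later writes through either name affect both.
        os.link(src, dst)
    else:
        # An independent copy; needs as much free space as the original.
        shutil.copy2(src, dst)
```

Copying leaves the old files completely independent of the new cluster,
while linking is faster and uses no additional disk space.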

An important feature of the pg_upgrade design is that it leaves the
original cluster intact --- if a problem occurs during the upgrade, you
can still run the previous version, after renaming the tablespaces back
to the original names.