postgresql/src/bin/pg_upgrade/pg_upgrade.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

503 lines
14 KiB
C
Raw Normal View History

/*
* pg_upgrade.h
*
* Copyright (c) 2010-2024, PostgreSQL Global Development Group
* src/bin/pg_upgrade/pg_upgrade.h
*/
#include <unistd.h>
#include <assert.h>
#include <sys/stat.h>
#include <sys/time.h>
#include "common/relpath.h"
#include "libpq-fe.h"
/* For now, pg_upgrade does not use common/logging.c; use our own pg_fatal */
#undef pg_fatal
/* Use port in the private/dynamic port number range */
#define DEF_PGUPORT 50432
#define MAX_STRING 1024
#define QUERY_ALLOC 8192
2010-07-06 21:19:02 +02:00
#define MESSAGE_WIDTH 62
#define GET_MAJOR_VERSION(v) ((v) / 100)
/* contains both global db information and CREATE DATABASE commands */
#define GLOBALS_DUMP_FILE "pg_upgrade_dump_globals.sql"
#define DB_DUMP_FILE_MASK "pg_upgrade_dump_%u.custom"
pg_upgrade: Move all the files generated internally to a subdirectory Historically, the location of any files generated by pg_upgrade, as of the per-database logs and internal dumps, has been the current working directory, leaving all those files behind when using --retain or on a failure. Putting all those contents in a targeted subdirectory makes the whole easier to debug, and simplifies the code in charge of cleaning up the logs. Note that another reason is that this facilitates the move of pg_upgrade to TAP with a fixed location for all the logs to grab if the test fails repeatedly. Initially, we thought about being able to specify the output directory with a new option, but we have settled on using a subdirectory located at the root of the new cluster's data folder, "pg_upgrade_output.d", instead, as at the end the new data directory is the location of all the data generated by pg_upgrade. There is a take with group permissions here though: if the new data folder has been initialized with this option, we need to create all the files and paths with the correct permissions or a base backup taken after a pg_upgrade --retain would fail, meaning that GetDataDirectoryCreatePerm() has to be called before creating the log paths, before a couple of sanity checks on the clusters and before getting the socket directory for the cluster's host settings. The idea of the new location is based on a suggestion from Peter Eisentraut. Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom Lane and Bruce Momjian for the discussion (in alphabetical order). Author: Justin Pryzby Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 04:27:29 +01:00
/*
Restructure pg_upgrade output directories for better idempotence 38bfae3 has moved the contents written to files by pg_upgrade under a new directory called pg_upgrade_output.d/ located in the new cluster's data folder, and it used a simple structure made of two subdirectories leading to a fixed structure: log/ and dump/. This design has made weaker pg_upgrade on repeated calls, as we could get failures when creating one or more of those directories, while potentially losing the logs of a previous run (logs are retained automatically on failure, and cleaned up on success unless --retain is specified). So a user would need to clean up pg_upgrade_output.d/ as an extra step for any repeated calls of pg_upgrade. The most common scenario here is --check followed by the actual upgrade, but one could see a failure when specifying an incorrect input argument value. Removing entirely the logs would have the disadvantage of removing all the past information, even if --retain was specified at some past step. This result is annoying for a lot of users and automated upgrade flows. So, rather than requiring a manual removal of pg_upgrade_output.d/, this redesigns the set of output directories in a more dynamic way, based on a suggestion from Tom Lane and Daniel Gustafsson. pg_upgrade_output.d/ is still the base path, but a second directory level is added, mostly named after an ISO-8601-formatted timestamp (in short human-readable, with milliseconds appended to the name to avoid any conflicts). The logs and dumps are saved within the same subdirectories as previously, as of log/ and dump/, but these are located inside the subdirectory named after the timestamp. The logs of a given run are removed only after a successful run if --retain is not used, and pg_upgrade_output.d/ is kept if there are any logs from a previous run. Note that previously, pg_upgrade would have kept the logs even after a successful --check but that was inconsistent compared to the case without --check when using --retain. The code in charge of the removal of the output directories is now refactored into a single routine. Two TAP tests are added with some --check commands (one failure case and one success case), to look after the issue fixed here. Note that the tests had to be tweaked a bit to fit with the new directory structure so as it can find any logs generated on failure. This is still going to require a change in the buildfarm client for the case where pg_upgrade is tested without the TAP test, though, but I'll tackle that with a separate patch where needed. Reported-by: Tushar Ahuja Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Justin Pryzby Discussion: https://postgr.es/m/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com
2022-06-08 03:53:01 +02:00
* Base directories that include all the files generated internally, from the
* root path of the new cluster. The paths are dynamically built as of
* BASE_OUTPUTDIR/$timestamp/{LOG_OUTPUTDIR,DUMP_OUTPUTDIR} to ensure their
* uniqueness in each run.
pg_upgrade: Move all the files generated internally to a subdirectory Historically, the location of any files generated by pg_upgrade, as of the per-database logs and internal dumps, has been the current working directory, leaving all those files behind when using --retain or on a failure. Putting all those contents in a targeted subdirectory makes the whole easier to debug, and simplifies the code in charge of cleaning up the logs. Note that another reason is that this facilitates the move of pg_upgrade to TAP with a fixed location for all the logs to grab if the test fails repeatedly. Initially, we thought about being able to specify the output directory with a new option, but we have settled on using a subdirectory located at the root of the new cluster's data folder, "pg_upgrade_output.d", instead, as at the end the new data directory is the location of all the data generated by pg_upgrade. There is a take with group permissions here though: if the new data folder has been initialized with this option, we need to create all the files and paths with the correct permissions or a base backup taken after a pg_upgrade --retain would fail, meaning that GetDataDirectoryCreatePerm() has to be called before creating the log paths, before a couple of sanity checks on the clusters and before getting the socket directory for the cluster's host settings. The idea of the new location is based on a suggestion from Peter Eisentraut. Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom Lane and Bruce Momjian for the discussion (in alphabetical order). Author: Justin Pryzby Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 04:27:29 +01:00
*/
#define BASE_OUTPUTDIR "pg_upgrade_output.d"
Restructure pg_upgrade output directories for better idempotence 38bfae3 has moved the contents written to files by pg_upgrade under a new directory called pg_upgrade_output.d/ located in the new cluster's data folder, and it used a simple structure made of two subdirectories leading to a fixed structure: log/ and dump/. This design has made weaker pg_upgrade on repeated calls, as we could get failures when creating one or more of those directories, while potentially losing the logs of a previous run (logs are retained automatically on failure, and cleaned up on success unless --retain is specified). So a user would need to clean up pg_upgrade_output.d/ as an extra step for any repeated calls of pg_upgrade. The most common scenario here is --check followed by the actual upgrade, but one could see a failure when specifying an incorrect input argument value. Removing entirely the logs would have the disadvantage of removing all the past information, even if --retain was specified at some past step. This result is annoying for a lot of users and automated upgrade flows. So, rather than requiring a manual removal of pg_upgrade_output.d/, this redesigns the set of output directories in a more dynamic way, based on a suggestion from Tom Lane and Daniel Gustafsson. pg_upgrade_output.d/ is still the base path, but a second directory level is added, mostly named after an ISO-8601-formatted timestamp (in short human-readable, with milliseconds appended to the name to avoid any conflicts). The logs and dumps are saved within the same subdirectories as previously, as of log/ and dump/, but these are located inside the subdirectory named after the timestamp. The logs of a given run are removed only after a successful run if --retain is not used, and pg_upgrade_output.d/ is kept if there are any logs from a previous run. Note that previously, pg_upgrade would have kept the logs even after a successful --check but that was inconsistent compared to the case without --check when using --retain. The code in charge of the removal of the output directories is now refactored into a single routine. Two TAP tests are added with some --check commands (one failure case and one success case), to look after the issue fixed here. Note that the tests had to be tweaked a bit to fit with the new directory structure so as it can find any logs generated on failure. This is still going to require a change in the buildfarm client for the case where pg_upgrade is tested without the TAP test, though, but I'll tackle that with a separate patch where needed. Reported-by: Tushar Ahuja Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Justin Pryzby Discussion: https://postgr.es/m/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com
2022-06-08 03:53:01 +02:00
#define LOG_OUTPUTDIR "log"
#define DUMP_OUTPUTDIR "dump"
pg_upgrade: Move all the files generated internally to a subdirectory Historically, the location of any files generated by pg_upgrade, as of the per-database logs and internal dumps, has been the current working directory, leaving all those files behind when using --retain or on a failure. Putting all those contents in a targeted subdirectory makes the whole easier to debug, and simplifies the code in charge of cleaning up the logs. Note that another reason is that this facilitates the move of pg_upgrade to TAP with a fixed location for all the logs to grab if the test fails repeatedly. Initially, we thought about being able to specify the output directory with a new option, but we have settled on using a subdirectory located at the root of the new cluster's data folder, "pg_upgrade_output.d", instead, as at the end the new data directory is the location of all the data generated by pg_upgrade. There is a take with group permissions here though: if the new data folder has been initialized with this option, we need to create all the files and paths with the correct permissions or a base backup taken after a pg_upgrade --retain would fail, meaning that GetDataDirectoryCreatePerm() has to be called before creating the log paths, before a couple of sanity checks on the clusters and before getting the socket directory for the cluster's host settings. The idea of the new location is based on a suggestion from Peter Eisentraut. Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom Lane and Bruce Momjian for the discussion (in alphabetical order). Author: Justin Pryzby Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 04:27:29 +01:00
#define DB_DUMP_LOG_FILE_MASK "pg_upgrade_dump_%u.log"
#define SERVER_LOG_FILE "pg_upgrade_server.log"
#define UTILITY_LOG_FILE "pg_upgrade_utility.log"
#define INTERNAL_LOG_FILE "pg_upgrade_internal.log"
extern char *output_files[];
/*
* WIN32 files do not accept writes from multiple processes
*
* On Win32, we can't send both pg_upgrade output and command output to the
* same file because we get the error: "The process cannot access the file
* because it is being used by another process." so send the pg_ctl
* command-line output to a new file, rather than into the server log file.
* Ideally we could use UTILITY_LOG_FILE for this, but some Windows platforms
* keep the pg_ctl output file open by the running postmaster, even after
* pg_ctl exits.
*
* We could use the Windows pgwin32_open() flags to allow shared file
* writes but is unclear how all other tools would use those flags, so
* we just avoid it and log a little differently on Windows; we adjust
* the error message appropriately.
*/
#ifndef WIN32
#define SERVER_START_LOG_FILE SERVER_LOG_FILE
#define SERVER_STOP_LOG_FILE SERVER_LOG_FILE
#else
#define SERVER_START_LOG_FILE "pg_upgrade_server_start.log"
/*
* "pg_ctl start" keeps SERVER_START_LOG_FILE and SERVER_LOG_FILE open
* while the server is running, so we use UTILITY_LOG_FILE for "pg_ctl
* stop".
*/
#define SERVER_STOP_LOG_FILE UTILITY_LOG_FILE
#endif
#ifndef WIN32
#define pg_mv_file rename
#define PATH_SEPARATOR '/'
#define PATH_QUOTE '\''
#define RM_CMD "rm -f"
#define RMDIR_CMD "rm -rf"
#define SCRIPT_PREFIX "./"
#define SCRIPT_EXT "sh"
#define ECHO_QUOTE "'"
#define ECHO_BLANK ""
#else
#define pg_mv_file pgrename
#define PATH_SEPARATOR '\\'
#define PATH_QUOTE '"'
/* @ prefix disables command echo in .bat files */
#define RM_CMD "@DEL /q"
#define RMDIR_CMD "@RMDIR /s/q"
#define SCRIPT_PREFIX ""
#define SCRIPT_EXT "bat"
#define EXE_EXT ".exe"
#define ECHO_QUOTE ""
#define ECHO_BLANK "."
#endif
/*
* The format of visibility map was changed with this 9.6 commit.
*/
#define VISIBILITY_MAP_FROZEN_BIT_CAT_VER 201603011
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
/*
* pg_multixact format changed in 9.3 commit 0ac5ad5134f2769ccbaefec73844f85,
* ("Improve concurrency of foreign key locking") which also updated catalog
* version to this value. pg_upgrade behavior depends on whether old and new
* server versions are both newer than this, or only the new one is.
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
*/
#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231
/*
* large object chunk size added to pg_controldata,
* commit 5f93c37805e7485488480916b4585e098d3cc883
*/
#define LARGE_OBJECT_SIZE_PG_CONTROL_VER 942
/*
* change in JSONB format during 9.4 beta
*/
#define JSONB_FORMAT_CHANGE_CAT_VER 201409291
/*
* Each relation is represented by a relinfo structure.
*/
typedef struct
{
/* Can't use NAMEDATALEN; not guaranteed to be same on client */
char *nspname; /* namespace name */
char *relname; /* relation name */
Oid reloid; /* relation OID */
Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
RelFileNumber relfilenumber; /* relation file number */
Oid indtable; /* if index, OID of its table, else 0 */
Oid toastheap; /* if toast table, OID of base table, else 0 */
char *tablespace; /* tablespace path; "" for cluster default */
bool nsp_alloc; /* should nspname be freed? */
bool tblsp_alloc; /* should tablespace be freed? */
} RelInfo;
typedef struct
{
RelInfo *rels;
int nrels;
} RelInfoArr;
/*
* Structure to store logical replication slot information.
*/
typedef struct
{
char *slotname; /* slot name */
char *plugin; /* plugin */
bool two_phase; /* can the slot decode 2PC? */
bool caught_up; /* has the slot caught up to latest changes? */
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
} LogicalSlotInfo;
typedef struct
{
int nslots; /* number of logical slot infos */
LogicalSlotInfo *slots; /* array of logical slot infos */
} LogicalSlotInfoArr;
/*
* The following structure represents a relation mapping.
*/
typedef struct
{
const char *old_tablespace;
const char *new_tablespace;
const char *old_tablespace_suffix;
const char *new_tablespace_suffix;
pg_upgrade: Preserve database OIDs. Commit 9a974cbcba005256a19991203583a94b4f9a21a9 arranged to preserve relfilenodes and tablespace OIDs. For similar reasons, also arrange to preserve database OIDs. One problem is that, up until now, the OIDs assigned to the template0 and postgres databases have not been fixed. This could be a problem when upgrading, because pg_upgrade might try to migrate a database from the old cluster to the new cluster while keeping the OID and find a different database with that OID, resulting in a failure. If it finds a database with the same name and the same OID that's OK: it will be dropped and recreated. But the same OID and a different name is a problem. To prevent that, fix the OIDs for postgres and template0 to specific values less than 16384. To avoid running afoul of this rule, these values should not be changed in future releases. It's not a problem that these OIDs aren't fixed in existing releases, because the OIDs that we're assigning here weren't used for either of these databases in any previous release. Thus, there's no chance that an upgrade of a cluster from any previous release will collide with the OIDs we're assigning here. And going forward, the OIDs will always be fixed, so the only potential collision is with a system database having the same name and the same OID, which is OK. This patch lets users assign a specific OID to a database as well, provided however that it can't be less than 16384. I (rhaas) thought it might be better not to expose this capability to users, but the consensus was otherwise, so the syntax is documented. Letting users assign OIDs below 16384 would not be OK, though, because a user-created database with a low-numbered OID might collide with a system-created database in a future release. We therefore prohibit that. Shruthi KC, based on an earlier patch from Antonin Houska, reviewed and with some adjustments by me. Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com Discussion: http://postgr.es/m/CAASxf_Mnwm1Dh2vd5FAhVX6S1nwNSZUB1z12VddYtM++H2+p7w@mail.gmail.com
2022-01-24 20:23:15 +01:00
Oid db_oid;
Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
RelFileNumber relfilenumber;
/* the rest are used only for logging and error reporting */
char *nspname; /* namespaces */
char *relname;
} FileNameMap;
/*
* Structure to store database information
*/
typedef struct
{
Oid db_oid; /* oid of the database */
char *db_name; /* database name */
char db_tablespace[MAXPGPATH]; /* database default tablespace
* path */
RelInfoArr rel_arr; /* array of all user relinfos */
LogicalSlotInfoArr slot_arr; /* array of all LogicalSlotInfo */
Allow upgrades to preserve the full subscription's state. This feature will allow us to replicate the changes on subscriber nodes after the upgrade. Previously, only the subscription metadata information was preserved. Without the list of relations and their state, it's not possible to re-enable the subscriptions without missing some records as the list of relations can only be refreshed after enabling the subscription (and therefore starting the apply worker). Even if we added a way to refresh the subscription while enabling a publication, we still wouldn't know which relations are new on the publication side, and therefore should be fully synced, and which shouldn't. To preserve the subscription relations, this patch teaches pg_dump to restore the content of pg_subscription_rel from the old cluster by using binary_upgrade_add_sub_rel_state SQL function. This is supported only in binary upgrade mode. The subscription's replication origin is needed to ensure that we don't replicate anything twice. To preserve the replication origins, this patch teaches pg_dump to update the replication origin along with creating a subscription by using binary_upgrade_replorigin_advance SQL function to restore the underlying replication origin remote LSN. This is supported only in binary upgrade mode. pg_upgrade will check that all the subscription relations are in 'i' (init) or in 'r' (ready) state and will error out if that's not the case, logging the reason for the failure. This helps to avoid the risk of any dangling slot or origin after the upgrade. Author: Vignesh C, Julien Rouhaud, Shlok Kyal Reviewed-by: Peter Smith, Masahiko Sawada, Michael Paquier, Amit Kapila, Hayato Kuroda Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
2024-01-02 03:38:46 +01:00
int nsubs; /* number of subscriptions */
} DbInfo;
/*
* Locale information about a database.
*/
typedef struct
{
char *db_collate;
char *db_ctype;
char db_collprovider;
char *db_iculocale;
int db_encoding;
} DbLocaleInfo;
typedef struct
{
DbInfo *dbs; /* array of db infos */
int ndbs; /* number of db infos */
} DbInfoArr;
/*
* The following structure is used to hold pg_control information.
* Rather than using the backend's control structure we use our own
* structure to avoid pg_control version issues between releases.
*/
typedef struct
{
uint32 ctrl_ver;
uint32 cat_ver;
char nextxlogfile[25];
uint32 chkpnt_nxtxid;
uint32 chkpnt_nxtepoch;
uint32 chkpnt_nxtoid;
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
uint32 chkpnt_nxtmulti;
uint32 chkpnt_nxtmxoff;
uint32 chkpnt_oldstMulti;
uint32 chkpnt_oldstxid;
uint32 align;
uint32 blocksz;
uint32 largesz;
uint32 walsz;
uint32 walseg;
uint32 ident;
uint32 index;
uint32 toast;
uint32 large_object;
bool date_is_int;
bool float8_pass_by_value;
uint32 data_checksum_version;
} ControlData;
/*
* Enumeration to denote transfer modes
*/
typedef enum
{
TRANSFER_MODE_CLONE,
TRANSFER_MODE_COPY,
TRANSFER_MODE_COPY_FILE_RANGE,
TRANSFER_MODE_LINK,
} transferMode;
/*
* Enumeration to denote pg_log modes
*/
typedef enum
{
PG_VERBOSE,
PG_STATUS, /* these messages do not get a newline added */
PG_REPORT_NONL, /* these too */
PG_REPORT,
PG_WARNING,
PG_FATAL,
} eLogType;
/*
* cluster
*
* information about each cluster
*/
typedef struct
{
ControlData controldata; /* pg_control information */
DbLocaleInfo *template0; /* template0 locale info */
DbInfoArr dbarr; /* dbinfos array */
char *pgdata; /* pathname for cluster's $PGDATA directory */
char *pgconfig; /* pathname for cluster's config file
* directory */
char *bindir; /* pathname for cluster's executable directory */
char *pgopts; /* options to pass to the server, like pg_ctl
* -o */
char *sockdir; /* directory for Unix Domain socket, if any */
unsigned short port; /* port number where postmaster is waiting */
uint32 major_version; /* PG_VERSION of cluster */
char major_version_str[64]; /* string PG_VERSION of cluster */
uint32 bin_version; /* version returned from pg_ctl */
const char *tablespace_suffix; /* directory specification */
} ClusterInfo;
/*
* LogOpts
*/
typedef struct
{
FILE *internal; /* internal log FILE */
bool verbose; /* true -> be verbose in messages */
bool retain; /* retain log files on success */
pg_upgrade: Move all the files generated internally to a subdirectory Historically, the location of any files generated by pg_upgrade, as of the per-database logs and internal dumps, has been the current working directory, leaving all those files behind when using --retain or on a failure. Putting all those contents in a targeted subdirectory makes the whole easier to debug, and simplifies the code in charge of cleaning up the logs. Note that another reason is that this facilitates the move of pg_upgrade to TAP with a fixed location for all the logs to grab if the test fails repeatedly. Initially, we thought about being able to specify the output directory with a new option, but we have settled on using a subdirectory located at the root of the new cluster's data folder, "pg_upgrade_output.d", instead, as at the end the new data directory is the location of all the data generated by pg_upgrade. There is a take with group permissions here though: if the new data folder has been initialized with this option, we need to create all the files and paths with the correct permissions or a base backup taken after a pg_upgrade --retain would fail, meaning that GetDataDirectoryCreatePerm() has to be called before creating the log paths, before a couple of sanity checks on the clusters and before getting the socket directory for the cluster's host settings. The idea of the new location is based on a suggestion from Peter Eisentraut. Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom Lane and Bruce Momjian for the discussion (in alphabetical order). Author: Justin Pryzby Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 04:27:29 +01:00
/* Set of internal directories for output files */
Restructure pg_upgrade output directories for better idempotence 38bfae3 has moved the contents written to files by pg_upgrade under a new directory called pg_upgrade_output.d/ located in the new cluster's data folder, and it used a simple structure made of two subdirectories leading to a fixed structure: log/ and dump/. This design has made weaker pg_upgrade on repeated calls, as we could get failures when creating one or more of those directories, while potentially losing the logs of a previous run (logs are retained automatically on failure, and cleaned up on success unless --retain is specified). So a user would need to clean up pg_upgrade_output.d/ as an extra step for any repeated calls of pg_upgrade. The most common scenario here is --check followed by the actual upgrade, but one could see a failure when specifying an incorrect input argument value. Removing entirely the logs would have the disadvantage of removing all the past information, even if --retain was specified at some past step. This result is annoying for a lot of users and automated upgrade flows. So, rather than requiring a manual removal of pg_upgrade_output.d/, this redesigns the set of output directories in a more dynamic way, based on a suggestion from Tom Lane and Daniel Gustafsson. pg_upgrade_output.d/ is still the base path, but a second directory level is added, mostly named after an ISO-8601-formatted timestamp (in short human-readable, with milliseconds appended to the name to avoid any conflicts). The logs and dumps are saved within the same subdirectories as previously, as of log/ and dump/, but these are located inside the subdirectory named after the timestamp. The logs of a given run are removed only after a successful run if --retain is not used, and pg_upgrade_output.d/ is kept if there are any logs from a previous run. Note that previously, pg_upgrade would have kept the logs even after a successful --check but that was inconsistent compared to the case without --check when using --retain. The code in charge of the removal of the output directories is now refactored into a single routine. Two TAP tests are added with some --check commands (one failure case and one success case), to look after the issue fixed here. Note that the tests had to be tweaked a bit to fit with the new directory structure so as it can find any logs generated on failure. This is still going to require a change in the buildfarm client for the case where pg_upgrade is tested without the TAP test, though, but I'll tackle that with a separate patch where needed. Reported-by: Tushar Ahuja Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Justin Pryzby Discussion: https://postgr.es/m/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com
2022-06-08 03:53:01 +02:00
char *rootdir; /* Root directory, aka pg_upgrade_output.d */
char *basedir; /* Base output directory, with timestamp */
pg_upgrade: Move all the files generated internally to a subdirectory Historically, the location of any files generated by pg_upgrade, as of the per-database logs and internal dumps, has been the current working directory, leaving all those files behind when using --retain or on a failure. Putting all those contents in a targeted subdirectory makes the whole easier to debug, and simplifies the code in charge of cleaning up the logs. Note that another reason is that this facilitates the move of pg_upgrade to TAP with a fixed location for all the logs to grab if the test fails repeatedly. Initially, we thought about being able to specify the output directory with a new option, but we have settled on using a subdirectory located at the root of the new cluster's data folder, "pg_upgrade_output.d", instead, as at the end the new data directory is the location of all the data generated by pg_upgrade. There is a take with group permissions here though: if the new data folder has been initialized with this option, we need to create all the files and paths with the correct permissions or a base backup taken after a pg_upgrade --retain would fail, meaning that GetDataDirectoryCreatePerm() has to be called before creating the log paths, before a couple of sanity checks on the clusters and before getting the socket directory for the cluster's host settings. The idea of the new location is based on a suggestion from Peter Eisentraut. Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom Lane and Bruce Momjian for the discussion (in alphabetical order). Author: Justin Pryzby Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 04:27:29 +01:00
char *dumpdir; /* Dumps */
char *logdir; /* Log files */
bool isatty; /* is stdout a tty */
} LogOpts;
/*
* UserOpts
*/
typedef struct
{
bool check; /* true -> ask user for permission to make
* changes */
bool do_sync; /* flush changes to disk */
transferMode transfer_mode; /* copy files or link them? */
int jobs; /* number of processes/threads to use */
char *socketdir; /* directory to use for Unix sockets */
char *sync_method;
} UserOpts;
typedef struct
{
char *name;
int dbnum;
} LibraryInfo;
/*
* OSInfo
*/
typedef struct
{
const char *progname; /* complete pathname for this program */
char *user; /* username for clusters */
bool user_specified; /* user specified on command-line */
char **old_tablespaces; /* tablespaces */
int num_old_tablespaces;
LibraryInfo *libraries; /* loadable libraries */
int num_libraries;
ClusterInfo *running_cluster;
} OSInfo;
/*
* Global variables
*/
extern LogOpts log_opts;
extern UserOpts user_opts;
extern ClusterInfo old_cluster,
new_cluster;
extern OSInfo os_info;
/* check.c */
void output_check_banner(bool live_check);
void check_and_dump_old_cluster(bool live_check);
void check_new_cluster(void);
void report_clusters_compatible(void);
void issue_warnings_and_set_wal_level(void);
void output_completion_banner(char *deletion_script_file_name);
void check_cluster_versions(void);
void check_cluster_compatibility(bool live_check);
void create_script_for_old_cluster_deletion(char **deletion_script_file_name);
/* controldata.c */
void get_control_data(ClusterInfo *cluster, bool live_check);
void check_control_data(ControlData *oldctrl, ControlData *newctrl);
void disable_old_cluster(void);
/* dump.c */
void generate_old_dump(void);
/* exec.c */
#define EXEC_PSQL_ARGS "--echo-queries --set ON_ERROR_STOP=on --no-psqlrc --dbname=template1"
bool exec_prog(const char *log_filename, const char *opt_log_file,
bool report_error, bool exit_on_error, const char *fmt,...) pg_attribute_printf(5, 6);
void verify_directories(void);
bool pid_lock_file_exists(const char *datadir);
/* file.c */
void cloneFile(const char *src, const char *dst,
const char *schemaName, const char *relName);
Improve error reporting in pg_upgrade's file copying/linking/rewriting. The previous design for this had copyFile(), linkFile(), and rewriteVisibilityMap() returning strerror strings, with the caller producing one-size-fits-all error messages based on that. This made it impossible to produce messages that described the failures with any degree of precision, especially not short-read problems since those don't set errno at all. Since pg_upgrade has no intention of continuing after any error in this area, let's fix this by just letting these functions call pg_fatal() for themselves, making it easy for each point of failure to have a suitable error message. Taking this approach also allows dropping cleanup code that was unnecessary and was often rather sloppy about preserving errno. To not lose relevant info that was reported before, pass in the schema name and table name of the current table so that they can be included in the error reports. An additional problem was the use of getErrorText(), which was flat out wrong for all but a couple of call sites, because it unconditionally did "_dosmaperr(GetLastError())" on Windows. That's only appropriate when reporting an error from a Windows-native API, which only a couple of the callers were actually doing. Thus, even the reported strerror string would be unrelated to the actual failure in many cases on Windows. To fix, get rid of getErrorText() altogether, and just have call sites do strerror(errno) instead, since that's the way all the rest of our frontend programs do it. Add back the _dosmaperr() calls in the two places where that's actually appropriate. In passing, make assorted messages hew more closely to project style guidelines, notably by removing initial capitals in not-complete-sentence primary error messages. (I didn't make any effort to clean up places I didn't have another reason to touch, though.) Per discussion of a report from Thomas Kellerer. Back-patch to 9.6, but no further; given the relative infrequency of reports of problems here, it's not clear it's worth adapting the patch to older branches. Patch by me, but with credit to Alvaro Herrera for spotting the issue with getErrorText's misuse of _dosmaperr(). Discussion: <nsjrbh$8li$1@blaine.gmane.org>
2016-10-01 02:40:27 +02:00
void copyFile(const char *src, const char *dst,
const char *schemaName, const char *relName);
void copyFileByRange(const char *src, const char *dst,
const char *schemaName, const char *relName);
Improve error reporting in pg_upgrade's file copying/linking/rewriting. The previous design for this had copyFile(), linkFile(), and rewriteVisibilityMap() returning strerror strings, with the caller producing one-size-fits-all error messages based on that. This made it impossible to produce messages that described the failures with any degree of precision, especially not short-read problems since those don't set errno at all. Since pg_upgrade has no intention of continuing after any error in this area, let's fix this by just letting these functions call pg_fatal() for themselves, making it easy for each point of failure to have a suitable error message. Taking this approach also allows dropping cleanup code that was unnecessary and was often rather sloppy about preserving errno. To not lose relevant info that was reported before, pass in the schema name and table name of the current table so that they can be included in the error reports. An additional problem was the use of getErrorText(), which was flat out wrong for all but a couple of call sites, because it unconditionally did "_dosmaperr(GetLastError())" on Windows. That's only appropriate when reporting an error from a Windows-native API, which only a couple of the callers were actually doing. Thus, even the reported strerror string would be unrelated to the actual failure in many cases on Windows. To fix, get rid of getErrorText() altogether, and just have call sites do strerror(errno) instead, since that's the way all the rest of our frontend programs do it. Add back the _dosmaperr() calls in the two places where that's actually appropriate. In passing, make assorted messages hew more closely to project style guidelines, notably by removing initial capitals in not-complete-sentence primary error messages. (I didn't make any effort to clean up places I didn't have another reason to touch, though.) Per discussion of a report from Thomas Kellerer. Back-patch to 9.6, but no further; given the relative infrequency of reports of problems here, it's not clear it's worth adapting the patch to older branches. Patch by me, but with credit to Alvaro Herrera for spotting the issue with getErrorText's misuse of _dosmaperr(). Discussion: <nsjrbh$8li$1@blaine.gmane.org>
2016-10-01 02:40:27 +02:00
void linkFile(const char *src, const char *dst,
const char *schemaName, const char *relName);
void rewriteVisibilityMap(const char *fromfile, const char *tofile,
const char *schemaName, const char *relName);
void check_file_clone(void);
void check_copy_file_range(void);
void check_hard_link(void);
/* fopen_priv() is no longer different from fopen() */
#define fopen_priv(path, mode) fopen(path, mode)
/* function.c */
void get_loadable_libraries(void);
void check_loadable_libraries(void);
/* info.c */
FileNameMap *gen_db_file_maps(DbInfo *old_db,
DbInfo *new_db, int *nmaps, const char *old_pgdata,
const char *new_pgdata);
void get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check);
int count_old_cluster_logical_slots(void);
Allow upgrades to preserve the full subscription's state. This feature will allow us to replicate the changes on subscriber nodes after the upgrade. Previously, only the subscription metadata information was preserved. Without the list of relations and their state, it's not possible to re-enable the subscriptions without missing some records as the list of relations can only be refreshed after enabling the subscription (and therefore starting the apply worker). Even if we added a way to refresh the subscription while enabling a publication, we still wouldn't know which relations are new on the publication side, and therefore should be fully synced, and which shouldn't. To preserve the subscription relations, this patch teaches pg_dump to restore the content of pg_subscription_rel from the old cluster by using binary_upgrade_add_sub_rel_state SQL function. This is supported only in binary upgrade mode. The subscription's replication origin is needed to ensure that we don't replicate anything twice. To preserve the replication origins, this patch teaches pg_dump to update the replication origin along with creating a subscription by using binary_upgrade_replorigin_advance SQL function to restore the underlying replication origin remote LSN. This is supported only in binary upgrade mode. pg_upgrade will check that all the subscription relations are in 'i' (init) or in 'r' (ready) state and will error out if that's not the case, logging the reason for the failure. This helps to avoid the risk of any dangling slot or origin after the upgrade. Author: Vignesh C, Julien Rouhaud, Shlok Kyal Reviewed-by: Peter Smith, Masahiko Sawada, Michael Paquier, Amit Kapila, Hayato Kuroda Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
2024-01-02 03:38:46 +01:00
int count_old_cluster_subscriptions(void);
/* option.c */
void parseCommandLine(int argc, char *argv[]);
void adjust_data_dir(ClusterInfo *cluster);
void get_sock_dir(ClusterInfo *cluster, bool live_check);
Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
/* relfilenumber.c */
void transfer_all_new_tablespaces(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata);
void transfer_all_new_dbs(DbInfoArr *old_db_arr,
DbInfoArr *new_db_arr, char *old_pgdata, char *new_pgdata,
char *old_tablespace);
/* tablespace.c */
void init_tablespaces(void);
/* server.c */
PGconn *connectToServer(ClusterInfo *cluster, const char *db_name);
PGresult *executeQueryOrDie(PGconn *conn, const char *fmt,...) pg_attribute_printf(2, 3);
char *cluster_conn_opts(ClusterInfo *cluster);
bool start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error);
void stop_postmaster(bool in_atexit);
uint32 get_major_server_version(ClusterInfo *cluster);
void check_pghost_envvar(void);
/* util.c */
char *quote_identifier(const char *s);
int get_user_info(char **user_name_p);
void check_ok(void);
void report_status(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
void pg_log(eLogType type, const char *fmt,...) pg_attribute_printf(2, 3);
void pg_fatal(const char *fmt,...) pg_attribute_printf(1, 2) pg_attribute_noreturn();
void end_progress_output(void);
Restructure pg_upgrade output directories for better idempotence 38bfae3 has moved the contents written to files by pg_upgrade under a new directory called pg_upgrade_output.d/ located in the new cluster's data folder, and it used a simple structure made of two subdirectories leading to a fixed structure: log/ and dump/. This design has made weaker pg_upgrade on repeated calls, as we could get failures when creating one or more of those directories, while potentially losing the logs of a previous run (logs are retained automatically on failure, and cleaned up on success unless --retain is specified). So a user would need to clean up pg_upgrade_output.d/ as an extra step for any repeated calls of pg_upgrade. The most common scenario here is --check followed by the actual upgrade, but one could see a failure when specifying an incorrect input argument value. Removing entirely the logs would have the disadvantage of removing all the past information, even if --retain was specified at some past step. This result is annoying for a lot of users and automated upgrade flows. So, rather than requiring a manual removal of pg_upgrade_output.d/, this redesigns the set of output directories in a more dynamic way, based on a suggestion from Tom Lane and Daniel Gustafsson. pg_upgrade_output.d/ is still the base path, but a second directory level is added, mostly named after an ISO-8601-formatted timestamp (in short human-readable, with milliseconds appended to the name to avoid any conflicts). The logs and dumps are saved within the same subdirectories as previously, as of log/ and dump/, but these are located inside the subdirectory named after the timestamp. The logs of a given run are removed only after a successful run if --retain is not used, and pg_upgrade_output.d/ is kept if there are any logs from a previous run. Note that previously, pg_upgrade would have kept the logs even after a successful --check but that was inconsistent compared to the case without --check when using --retain. The code in charge of the removal of the output directories is now refactored into a single routine. Two TAP tests are added with some --check commands (one failure case and one success case), to look after the issue fixed here. Note that the tests had to be tweaked a bit to fit with the new directory structure so as it can find any logs generated on failure. This is still going to require a change in the buildfarm client for the case where pg_upgrade is tested without the TAP test, though, but I'll tackle that with a separate patch where needed. Reported-by: Tushar Ahuja Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Justin Pryzby Discussion: https://postgr.es/m/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com
2022-06-08 03:53:01 +02:00
void cleanup_output_dirs(void);
void prep_status(const char *fmt,...) pg_attribute_printf(1, 2);
void prep_status_progress(const char *fmt,...) pg_attribute_printf(1, 2);
unsigned int str2uint(const char *str);
/* version.c */
bool check_for_data_types_usage(ClusterInfo *cluster,
const char *base_query,
const char *output_path);
bool check_for_data_type_usage(ClusterInfo *cluster,
const char *type_name,
const char *output_path);
void old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster);
Change unknown-type literals to type text in SELECT and RETURNING lists. Previously, we left such literals alone if the query or subquery had no properties forcing a type decision to be made (such as an ORDER BY or DISTINCT clause using that output column). This meant that "unknown" could be an exposed output column type, which has never been a great idea because it could result in strange failures later on. For example, an outer query that tried to do any operations on an unknown-type subquery output would generally fail with some weird error like "failed to find conversion function from unknown to text" or "could not determine which collation to use for string comparison". Also, if the case occurred in a CREATE VIEW's query then the view would have an unknown-type column, causing similar failures in queries trying to use the view. To fix, at the tail end of parse analysis of a query, forcibly convert any remaining "unknown" literals in its SELECT or RETURNING list to type text. However, provide a switch to suppress that, and use it in the cases of SELECT inside a set operation or INSERT command. In those cases we already had type resolution rules that make use of context information from outside the subquery proper, and we don't want to change that behavior. Also, change creation of an unknown-type column in a relation from a warning to a hard error. The error should be unreachable now in CREATE VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown" in CREATE TABLE or CREATE (composite) TYPE. We want to forbid that because it's nothing but a foot-gun. This change creates a pg_upgrade failure case: a matview that contains an unknown-type column can't be pg_upgraded, because reparsing the matview's defining query will now decide that the column is of type text, which doesn't match the cstring-like storage that the old materialized column would actually have. Add a checking pass to detect that. While at it, we can detect tables or composite types that would fail, essentially for free. Those would fail safely anyway later on, but we might as well fail earlier. This patch is by me, but it owes something to previous investigations by Rahila Syed. Also thanks to Ashutosh Bapat and Michael Paquier for review. Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
2017-01-25 15:17:18 +01:00
void old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster);
void old_9_6_invalidate_hash_indexes(ClusterInfo *cluster,
bool check_mode);
void old_11_check_for_sql_identifier_data_type_usage(ClusterInfo *cluster);
void report_extension_updates(ClusterInfo *cluster);
/* parallel.c */
void parallel_exec_prog(const char *log_file, const char *opt_log_file,
const char *fmt,...) pg_attribute_printf(3, 4);
void parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr,
char *old_pgdata, char *new_pgdata,
char *old_tablespace);
bool reap_child(bool wait_for_child);