postgresql/src/bin/pg_dump
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open(), which ignores the
region and so treats it the same as the locale "fr", opening the
wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().
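
The "fr_CA.UTF-8" case above can be sketched as a plain string transformation. This is only an illustration under stated assumptions: toy_language_tag() is a hypothetical helper, not ICU's conversion and not code from this commit. It drops the ".codeset" or "@modifier" suffix and maps '_' to '-', whereas real canonicalization also translates modifiers such as "@colBackwards=yes" into tag extensions instead of discarding them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT ICU's algorithm: turn a POSIX-style locale
 * name such as "fr_CA.UTF-8" into a BCP 47-style tag such as "fr-CA".
 * It keeps the text before the first '.' or '@' and maps '_' to '-'.
 * Returns 0 on success, -1 if dst is too small for the result.
 */
static int
toy_language_tag(const char *locale, char *dst, size_t dstlen)
{
    size_t len;
    size_t i;

    assert(locale != NULL && dst != NULL && dstlen > 0);

    /* Length of the prefix that precedes any codeset or modifier. */
    len = strcspn(locale, ".@");
    if (len >= dstlen)
        return -1;

    for (i = 0; i < len; i++)
        dst[i] = (locale[i] == '_') ? '-' : locale[i];
    dst[len] = '\0';
    return 0;
}
```

Calling toy_language_tag("fr_CA.UTF-8", buf, sizeof(buf)) with a 32-byte buffer leaves "fr-CA" in buf, which is the form the message says ucol_open() can then understand.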

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.
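
The rule for when canonicalization applies can be summarized as a small predicate. This is a minimal sketch, not the commit's code: the LocaleSource enum and should_canonicalize() are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: where the ICU locale string for a new object came from. */
typedef enum LocaleSource
{
    LOCALE_FROM_USER,               /* given explicitly for a new object */
    LOCALE_FROM_EXISTING_COLLATION, /* copied from an existing collation */
    LOCALE_FROM_TEMPLATE_DB         /* inherited from a template database */
} LocaleSource;

/*
 * Return true only when it is safe to rewrite the locale string:
 * not during binary upgrade, and not when the string was copied from
 * an object whose indexes may already depend on the stored value.
 */
static bool
should_canonicalize(bool binary_upgrade, LocaleSource source)
{
    assert(source == LOCALE_FROM_USER ||
           source == LOCALE_FROM_EXISTING_COLLATION ||
           source == LOCALE_FROM_TEMPLATE_DB);

    if (binary_upgrade)
        return false;
    return source == LOCALE_FROM_USER;
}
```

Under this sketch, only a locale supplied for a newly created collation or database outside binary upgrade is rewritten; every other path stores the string unchanged, so an existing collator and the indexes built with it are left alone.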

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
po meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
t Canonicalize ICU locale names to language tags. 2023-04-04 10:38:58 -07:00
.gitignore Clean up after pg_dump test runs. 2016-05-06 22:28:01 -04:00
Makefile Add LZ4 compression to pg_dump 2023-02-23 21:19:26 +01:00
common.c Simplify and speed up pg_dump's creation of parent-table links. 2023-03-17 13:43:10 -04:00
compress_gzip.c pg_dump: Fix gzip compression of empty data 2023-03-29 02:34:48 +02:00
compress_gzip.h Introduce a generic pg_dump compression API 2023-02-23 18:33:40 +01:00
compress_io.c Improve type handling in pg_dump's compress file API 2023-03-23 17:55:17 +01:00
compress_io.h Unify buffer sizes in pg_dump compression API 2023-03-23 17:55:52 +01:00
compress_lz4.c pg_dump: Use only LZ4 frame format for compression 2023-04-01 00:54:50 +02:00
compress_lz4.h Add LZ4 compression to pg_dump 2023-02-23 21:19:26 +01:00
compress_none.c Unify buffer sizes in pg_dump compression API 2023-03-23 17:55:52 +01:00
compress_none.h Introduce a generic pg_dump compression API 2023-02-23 18:33:40 +01:00
dumputils.c Fix outdated references to guc.c 2023-03-02 13:49:39 +01:00
dumputils.h Update copyright for 2023 2023-01-02 15:00:37 -05:00
meson.build Fix pg_dump for hash partitioning on enum columns. 2023-03-17 13:31:40 -04:00
nls.mk Break up long GETTEXT_FILES lists 2023-03-08 15:05:43 +01:00
parallel.c Remove useless casts to (void *) in arguments of some system functions 2023-02-07 06:57:59 +01:00
parallel.h Update copyright for 2023 2023-01-02 15:00:37 -05:00
pg_backup.h pg_dump: Remove "blob" terminology 2022-12-05 08:52:55 +01:00
pg_backup_archiver.c Improve type handling in pg_dump's compress file API 2023-03-23 17:55:17 +01:00
pg_backup_archiver.h Introduce a generic pg_dump compression API 2023-02-23 18:33:40 +01:00
pg_backup_custom.c Introduce a generic pg_dump compression API 2023-02-23 18:33:40 +01:00
pg_backup_db.c pg_dump: Remove "blob" terminology 2022-12-05 08:52:55 +01:00
pg_backup_db.h Revert "pg_dump: Lock all relations, not just plain tables". 2020-11-06 15:48:04 -05:00
pg_backup_directory.c Unify buffer sizes in pg_dump compression API 2023-03-23 17:55:52 +01:00
pg_backup_null.c pg_dump: Remove "blob" terminology 2022-12-05 08:52:55 +01:00
pg_backup_tar.c pg_dump: Remove "blob" terminology 2022-12-05 08:52:55 +01:00
pg_backup_tar.h Fix tar files emitted by pg_dump and pg_basebackup to be POSIX conformant. 2012-09-28 15:19:15 -04:00
pg_backup_utils.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
pg_backup_utils.h Update copyright for 2023 2023-01-02 15:00:37 -05:00
pg_dump.c Add new predefined role pg_create_subscription. 2023-03-30 11:37:19 -04:00
pg_dump.h Add new predefined role pg_create_subscription. 2023-03-30 11:37:19 -04:00
pg_dump_sort.c Remove useless casts to (void *) in arguments of some system functions 2023-02-07 06:57:59 +01:00
pg_dumpall.c Fix various typos in code and tests 2023-02-09 14:43:53 +09:00
pg_restore.c Improve frontend error logging style. 2022-04-08 14:55:14 -04:00