postgresql/src/bin
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored, effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().
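
As a rough illustration of the "fr_CA.UTF-8" to "fr-CA" case above, the
sketch below is a deliberately simplified stand-in, not the code from this
commit and not ICU's algorithm. The function name toy_locale_to_tag is
hypothetical. It assumes a plain language_REGION.codeset input, drops
everything from the first "." or "@" onward, and replaces "_" with "-".
Unlike real canonicalization, it does not translate keywords such as
"@colBackwards=yes" into "-u-" extensions, and it does not validate subtags.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL or ICU): turn a simple
 * POSIX-style locale name such as "fr_CA.UTF-8" into a language-tag-like
 * string such as "fr-CA".
 *
 * Simplifying assumptions: the codeset (".UTF-8") and any "@modifier"
 * are discarded rather than interpreted, and subtags are not validated.
 *
 * Returns 0 on success, or -1 if the output buffer is too small.
 */
static int
toy_locale_to_tag(const char *locale, char *tag, size_t tagsize)
{
    /* Length of the part before the codeset or modifier, if any. */
    size_t      len = strcspn(locale, ".@");
    size_t      i;

    /* Need room for the copied characters plus the terminating NUL. */
    if (len + 1 > tagsize)
        return -1;

    /* Copy, replacing the POSIX separator with the BCP 47 separator. */
    for (i = 0; i < len; i++)
        tag[i] = (locale[i] == '_') ? '-' : locale[i];
    tag[len] = '\0';

    return 0;
}
```

Called with "fr_CA.UTF-8" and a sufficiently large buffer, this helper
fills the buffer with "fr-CA", which keeps the region that would otherwise
be lost when the raw string is handed to ucol_open().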

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
initdb Canonicalize ICU locale names to language tags. 2023-04-04 10:38:58 -07:00
pg_amcheck amcheck: Generalize one of the recently-added update chain checks. 2023-03-27 13:37:16 -04:00
pg_archivecleanup meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_basebackup pg_basebackup: Correct type of WalSegSz 2023-04-03 07:21:06 +02:00
pg_checksums meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_config meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_controldata meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_ctl meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_dump Canonicalize ICU locale names to language tags. 2023-04-04 10:38:58 -07:00
pg_resetwal meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_rewind meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_test_fsync meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_test_timing meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_upgrade meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_verifybackup meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pg_waldump meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
pgbench pgbench: Prepare commands in pipelines in advance 2023-02-21 10:56:37 +01:00
pgevent Update copyright for 2023 2023-01-02 15:00:37 -05:00
psql Add a run_as_owner option to subscriptions. 2023-04-04 12:03:03 -04:00
scripts meson: add install-{quiet, world} targets 2023-03-23 21:20:18 -07:00
Makefile Update copyright for 2023 2023-01-02 15:00:37 -05:00
meson.build Update copyright for 2023 2023-01-02 15:00:37 -05:00