postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	18cbed13d5	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 647792ce18e56f51614f7559106ad15362c5d1cc	2024-05-20 12:04:11 +02:00
Peter Eisentraut	17974ec259	Revise GUC names quoting in messages again After further review, we want to move in the direction of always quoting GUC names in error messages, rather than the previous (PG16) wildly mixed practice or the intermittent (mid-PG17) idea of doing this depending on how possibly confusing the GUC name is. This commit applies appropriate quotes to (almost?) all mentions of GUC names in error messages. It partially supersedes `a243569bf6` and `8d9978a717`, which had moved things a bit in the opposite direction but which then were abandoned in a partial state. Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com	2024-05-17 11:44:26 +02:00
Daniel Gustafsson	83ae824c87	Remove auth-options support from initdb When --auth was added to initdb in commit `e7029b2127` it had support for auth options separated by space from the auth type, like: --auth pam <servicename> --auth ident sameuser Passing an option to the ident auth type was removed in `01c1a12a5b` which left the pam auth-options support in place. `8a02339e9b` broke this by inverting a calculation in the strncmp arguments, which went unnoticed for a long time. The ability to pass options to the auth type was never documented. Rather than fixing the support for an undocumented feature which has been broken for all supported versions, and which only supports one out of many auth types which can take options, it is removed. Reported-by: Jingxian Li <aqktjcm@qq.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/tencent_29731C7C7E6A2F9FB807C3A1DC3D81293C06@qq.com	2024-05-14 10:50:26 +02:00
Peter Eisentraut	7a31eb2aaa	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: be182cc55e6f72c66215fd9b38851969e3ce5480	2024-05-06 12:06:31 +02:00
Jeff Davis	e2a2357671	Fix test failures when language environment is not UTF-8. For tests that depend on UTF-8 encoding, force LC_COLLATE=C and LC_CTYPE=C to avoid an encoding mismatch. Reported-by: Thomas Munro Discussion: https://postgr.es/m/CA+hUKGK-ZqV1njkG_=xcCqXh2fcMkz85FTMnhS2opm4ZerH=xw@mail.gmail.com	2024-04-04 16:10:12 -07:00
Jeff Davis	f69319f2f1	Support C.UTF-8 locale in the new builtin collation provider. The builtin C.UTF-8 locale has similar semantics to the libc locale of the same name. That is, code point sort order (fast, memcmp-based) combined with Unicode semantics for character operations such as pattern matching, regular expressions, and LOWER()/INITCAP()/UPPER(). The character semantics are based on Unicode simple case mappings. The builtin provider's C.UTF-8 offers several important advantages over libc: * faster sorting -- benefits from additional optimizations such as abbreviated keys and varstrfastcmp_c * faster case conversion, e.g. LOWER(), at least compared with some libc implementations * available on all platforms with identical semantics, and the semantics are stable, testable, and documentable within a given Postgres major version Being based on memcmp, the builtin C.UTF-8 locale does not offer natural language sort order. But it is an improvement for most use cases that might otherwise use libc's "C.UTF-8" locale, as well as many use cases that use libc's "C" locale. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider	2024-03-19 15:24:41 -07:00
Jeff Davis	846311051e	Address more review comments on commit `2d819a08a1`. Based on comments from Peter Eisentraut. * Document CREATE DATABASE ... BUILTIN_LOCALE. * Determine required encoding based on locale name for CREATE COLLATION. Use -1 for "C" (requires catversion bump). * initdb output fixups. * Make ctype_is_c a constant true for now. * Fixups to ICU 010_create_database.pl test. Discussion: https://postgr.es/m/4135cf11-206d-40ed-96c0-9363c1232379@eisentraut.org	2024-03-18 11:58:13 -07:00
Peter Eisentraut	5162f663fc	Add missing source files to nls.mk	2024-03-18 16:46:22 +01:00
Jeff Davis	2d819a08a1	Introduce "builtin" collation provider. New provider for collations, like "libc" or "icu", but without any external dependency. Initially, the only locale supported by the builtin provider is "C", which is identical to the libc provider's "C" locale. The libc provider's "C" locale has always been treated as a special case that uses an internal implementation, without using libc at all -- so the new builtin provider uses the same implementation. The builtin provider's locale is independent of the server environment variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the database collation locale can be "C" while LC_COLLATE and LC_CTYPE are set to "en_US", which is impossible with the libc provider. By offering a new builtin provider, it clarifies that the semantics of a collation using this provider will never depend on libc, and makes it easier to document the behavior. Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider	2024-03-13 23:33:44 -07:00
Michael Paquier	2c8118ee5d	Use printf's %m format instead of strerror(errno) in more places Most callers of strerror() are removed from the backend code. The remaining callers require special handling with a saved errno from a previous system call. The frontend code still needs strerror() where error states need to be handled outside of fprintf. Note that pg_regress is not changed to use %m as the TAP output may clobber errno, since those functions call fprintf() and friends before evaluating the format string. Support for %m in src/port/snprintf.c has been added in `d6c55de1f9`, hence all the stable branches currently supported include it. Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/87sf13jhuw.fsf@wibble.ilmari.org	2024-03-12 10:02:54 +09:00
Jeff Davis	f696c0cd5f	Catalog changes preparing for builtin collation provider. Rename pg_collation.colliculocale to colllocale, and pg_database.daticulocale to datlocale. These names reflects that the fields will be useful for the upcoming builtin provider as well, not just for ICU. This is purely a rename; no changes to the meaning of the fields. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Peter Eisentraut	2024-03-09 14:48:18 -08:00
Tom Lane	fce2ce797c	Fix initdb's -c option to treat the GUC name case-insensitively. The backend treats GUC names case-insensitively, so this code should too. This avoids ending up with a confusing set of redundant entries in the generated postgresql.conf file. Per report from Kyotaro Horiguchi. Back-patch to v16 where this feature was added (in commit `3e51b278d`). Discussion: https://postgr.es/m/20230928.164904.2153358973162534034.horikyota.ntt@gmail.com	2024-03-04 12:00:48 -05:00
Tom Lane	3d185cfc09	Restore initdb's old behavior of always setting the lc_xxx GUCs. In commit `3e51b278d` I (tgl) caused initdb to leave lc_messages and other lc_xxx GUCs commented-out in the installed postgresql.conf file if they were going to be set to 'C'. This was a hack for cosmetic purposes, and it was buggy because lc_messages' wired-in default is not 'C' but '' (empty string). That led to --no-locale not having the expected effect, since the postmaster would then obtain lc_messages from its startup environment. Let's just revert to the prior behavior of always de-commenting the lc_xxx entries; the argument for changing that longstanding behavior was weak in the first place. Also, fix postgresql.conf.sample's erroneous claim that the default value of lc_messages is 'C'. I suspect that was what misled me into making this mistake in the first place. Report and patch by Kyotaro Horiguchi. Back-patch to v16 where the problem was introduced. Discussion: https://postgr.es/m/20231122.162700.1995154567625541112.horikyota.ntt@gmail.com	2024-01-10 18:09:29 -05:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Peter Eisentraut	c538592959	Make all Perl warnings fatal There are a lot of Perl scripts in the tree, mostly code generation and TAP tests. Occasionally, these scripts produce warnings. These are probably always mistakes on the developer side (true positives). Typical examples are warnings from genbki.pl or related when you make a mess in the catalog files during development, or warnings from tests when they massage a config file that looks different on different hosts, or mistakes during merges (e.g., duplicate subroutine definitions), or just mistakes that weren't noticed because there is a lot of output in a verbose build. This changes all warnings into fatal errors, by replacing use warnings; by use warnings FATAL => 'all'; in all Perl files. Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org	2023-12-29 18:20:00 +01:00
Robert Haas	174c480508	Add a new WAL summarizer process. When active, this process writes WAL summary files to $PGDATA/pg_wal/summaries. Each summary file contains information for a certain range of LSNs on a certain TLI. For each relation, it stores a "limit block" which is 0 if a relation is created or destroyed within a certain range of WAL records, or otherwise the shortest length to which the relation was truncated during that range of WAL records, or otherwise InvalidBlockNumber. In addition, it stores a list of blocks which have been modified during that range of WAL records, but excluding blocks which were removed by truncation after they were modified and never subsequently modified again. In other words, it tells us which blocks need to copied in case of an incremental backup covering that range of WAL records. But this doesn't yet add the capability to actually perform an incremental backup; the next patch will do that. A new parameter summarize_wal enables or disables this new background process. The background process also automatically deletes summary files that are older than wal_summarize_keep_time, if that parameter has a non-zero value and the summarizer is configured to run. Patch by me, with some design help from Dilip Kumar and Andres Freund. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter Eisentraut, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com	2023-12-20 08:42:28 -05:00
Peter Eisentraut	721856ff24	Remove distprep A PostgreSQL release tarball contains a number of prebuilt files, in particular files produced by bison, flex, perl, and well as html and man documentation. We have done this consistent with established practice at the time to not require these tools for building from a tarball. Some of these tools were hard to get, or get the right version of, from time to time, and shipping the prebuilt output was a convenience to users. Now this has at least two problems: One, we have to make the build system(s) work in two modes: Building from a git checkout and building from a tarball. This is pretty complicated, but it works so far for autoconf/make. It does not currently work for meson; you can currently only build with meson from a git checkout. Making meson builds work from a tarball seems very difficult or impossible. One particular problem is that since meson requires a separate build directory, we cannot make the build update files like gram.h in the source tree. So if you were to build from a tarball and update gram.y, you will have a gram.h in the source tree and one in the build tree, but the way things work is that the compiler will always use the one in the source tree. So you cannot, for example, make any gram.y changes when building from a tarball. This seems impossible to fix in a non-horrible way. Second, there is increased interest nowadays in precisely tracking the origin of software. We can reasonably track contributions into the git tree, and users can reasonably track the path from a tarball to packages and downloads and installs. But what happens between the git tree and the tarball is obscure and in some cases non-reproducible. The solution for both of these issues is to get rid of the step that adds prebuilt files to the tarball. The tarball now only contains what is in the git tree (). Getting the additional build dependencies is no longer a problem nowadays, and the complications to keep these dual build modes working are significant. And of course we want to get the meson build system working universally. This commit removes the make distprep target altogether. The make dist target continues to do its job, it just doesn't call distprep anymore. () - The tarball also contains the INSTALL file that is built at make dist time, but not by distprep. This is unchanged for now. The make maintainer-clean target, whose job it is to remove the prebuilt files in addition to what make distclean does, is now just an alias to make distprep. (In practice, it is probably obsolete given that git clean is available.) The following programs are now hard build requirements in configure (they were already required by meson.build): - bison - flex - perl Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org	2023-11-06 15:18:04 +01:00
Tom Lane	b6c7cfac88	Restore proper linkage of pg_char_to_encoding() and friends. Back in the 8.3 era we discovered that it was problematic if libpq.so had encoding ID assignments different from the backend, which is possible because on some platforms libpq.so might be of a different major version from the calling programs. psql should use libpq's assignments, but initdb has to use the backend's, else it will put wrong values into pg_database. The solution devised in commit `8468146b0` relied on giving initdb its own copy of encnames.c rather than relying on the functions exported by libpq. Later, that metamorphosed into ensuring that libpgcommon got linked before libpq -- which made things OK for initdb but broke psql. We didn't notice for lack of any changes in enum pg_enc since then. Commit `06843df4a` reversed that, fixing the latent bug in psql but adding one in initdb. The meson build infrastructure is also not being sufficiently careful about link order, and trying to make it so would be equally fragile. Hence, let's use a new scheme based on giving the libpq-exported symbols different real names than the same functions exported from libpgcommon.a or libpgcommon_srv.a. (We could distinguish those two cases as well, but there seems no need to.) libpq gets the official names to avoid an ABI break for libpq clients, while the other cases use #define's to make the real names "xxx_private" rather than "xxx". By controlling where the #define's are applied, we can force any particular client program to use one set or the other of the encnames.c functions. We cannot back-patch this, since it'd be an ABI break for backend loadable modules, but there seems little need to. We're just trying to ensure that the world is safe for hypothetical future additions to enum pg_enc. In passing this should fix "duplicate symbol" linker warnings that we've been seeing on AIX buildfarm members since commit `06843df4a`. It's not very clear why that linker is complaining now, when there were strictly more duplicates visible before, but in any case this should remove the reason for complaint. Patch by me; thanks to Andres Freund for review. Discussion: https://postgr.es/m/2385119.1696354473@sss.pgh.pa.us	2023-10-07 12:08:10 -04:00
Peter Eisentraut	64b787656d	Add some const qualifiers There was a mismatch between the const qualifiers for excludeDirContents in src/backend/backup/basebackup.c and src/bin/pg_rewind/filemap.c, which led to a quick search for similar cases. We should make excludeDirContents match, but the rest of the changes seem like a good idea as well. Author: David Steele <david@pgmasters.net> Discussion: https://www.postgresql.org/message-id/flat/669a035c-d23d-2f38-7ff0-0cb93e01d610@pgmasters.net	2023-09-26 11:28:57 +01:00
Nathan Bossart	8c16ad3b43	Allow using syncfs() in frontend utilities. This commit allows specifying a --sync-method in several frontend utilities that must synchronize many files to disk (initdb, pg_basebackup, pg_checksums, pg_dump, pg_rewind, and pg_upgrade). On Linux, users can specify "syncfs" to synchronize the relevant file systems instead of calling fsync() for every single file. In many cases, using syncfs() is much faster. As with recovery_init_sync_method, this new option comes with some caveats. The descriptions of these caveats have been moved to a new appendix section in the documentation. Co-authored-by: Justin Pryzby Reviewed-by: Michael Paquier, Thomas Munro, Robert Haas, Justin Pryzby Discussion: https://postgr.es/m/20210930004340.GM831%40telsasoft.com	2023-09-06 16:27:16 -07:00
Nathan Bossart	cccc6cdeb3	Add support for syncfs() in frontend support functions. This commit adds support for using syncfs() in fsync_pgdata() and fsync_dir_recurse() (which have been renamed to sync_pgdata() and sync_dir_recurse()). Like recovery_init_sync_method, sync_pgdata() calls syncfs() for the data directory, each tablespace, and pg_wal (if it is a symlink). For now, all of the frontend utilities that use these support functions are hard-coded to use fsync(), but a follow-up commit will allow specifying syncfs(). Co-authored-by: Justin Pryzby Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/20210930004340.GM831%40telsasoft.com	2023-09-06 16:27:00 -07:00
Daniel Gustafsson	95fff2abee	Reword user-facing message for "power of two" While there are numerous instances of using "power of 2" in the code, translated user-facing messages use "power of two". Fix two instances which used "power of 2" instead. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20230829.175615.682972421946735863.horikyota.ntt@gmail.com	2023-08-29 11:21:10 +02:00
Peter Eisentraut	36e4419d1f	Make error messages about WAL segment size more consistent Make the primary messages more compact and make the detail messages uniform. In initdb.c and pg_resetwal.c, use the newish option_parse_int() to simplify some of the option parsing. For the backend GUC wal_segment_size, add a GUC check hook to do the verification instead of coding it in bootstrap.c. This might be overkill, but that way the check is in the right place and it becomes more self-documenting. In passing, make pg_controldata use the logging API for warning messages. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/9939aa8a-d7be-da2c-7715-0a0b5535a1f7@eisentraut.org	2023-08-28 15:17:04 +02:00
Peter Eisentraut	2bdd7b262f	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 97398d714ace69f0c919984e160f429b6fd2300e	2023-08-07 12:39:30 +02:00
Peter Eisentraut	cccdbc5d95	Clean up command argument assembly Several commands internally assemble command lines to call other commands. This includes initdb, pg_dumpall, and pg_regress. (Also pg_ctl, but that is different enough that I didn't consider it here.) This has all evolved a bit organically, with fixed-size buffers, and various optional command-line arguments being injected with confusing-looking code, and the spacing between options handled in inconsistent ways. Clean all this up a bit to look clearer and be more easily extensible with new arguments and options. We start each command with printfPQExpBuffer(), and then append arguments as necessary with appendPQExpBuffer(). Also standardize on using initPQExpBuffer() over createPQExpBuffer() where possible. pg_regress uses StringInfo instead of PQExpBuffer, but many of the same ideas apply. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/16d0beac-a141-e5d3-60e9-323da75f49bf@eisentraut.org	2023-07-05 07:15:23 +02:00
Peter Eisentraut	3f0199da5a	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ab77975e9d2cde44da796c18af3ec1a66f0df7ae	2023-06-26 12:02:02 +02:00
Jeff Davis	f3a01af29b	ICU: do not convert locale 'C' to 'en-US-u-va-posix'. Older versions of ICU canonicalize "C" to "en-US-u-va-posix"; but starting in ICU version 64, the "C" locale is considered obsolete. Postgres commit `ea1db8ae70` introduced code to always canonicalize "C" to "en-US-u-va-posix" for consistency and convenience, but it was deemed too confusing. This commit removes that code, so that "C" is treated like other ICU locale names: canonicalization is attempted, and if it fails, the behavior is controlled by icu_validation_level. A similar change was previously committed as `f7faa9976c`, then reverted due to an ICU-version-dependent test failure. This commit un-reverts it, omitting the test because we now expect the behavior to depend on the version of ICU being used. Discussion: https://postgr.es/m/3a200aca-4672-4b37-fc91-5d198a323503%40eisentraut.org Discussion: https://postgr.es/m/f83f089ee1e9acd5dbbbf3353294d24e1f196e95.camel@j-davis.com Discussion: https://postgr.es/m/37520ec1ae9591f83132f82dbd625f3fc2d69c16.camel@j-davis.com	2023-06-21 13:18:25 -07:00
Jeff Davis	2535c74b1a	initdb: change default --locale-provider back to libc. Reverts `27b62377b4`. Discussion: https://postgr.es/m/eff031036baa07f325de29215371a4c9e69d61f3.camel@j-davis.com Discussion: https://postgr.es/m/3353947.1682092131@sss.pgh.pa.us	2023-06-21 11:10:03 -07:00
Tom Lane	b334612b8a	Pre-beta2 mechanical code beautification. Run pgindent and pgperltidy. It seems we're still some ways away from all committers doing this automatically. Now that we have a buildfarm animal that will whine about poorly-indented code, we'll try to keep the tree more tidy. Discussion: https://postgr.es/m/3156045.1687208823@sss.pgh.pa.us	2023-06-20 09:50:43 -04:00
Jeff Davis	a14e75eb0b	CREATE DATABASE: make LOCALE apply to all collation providers. For CREATE DATABASE, make LOCALE parameter apply regardless of the provider used. Also affects initdb and createdb --locale arguments. Previously, LOCALE (and --locale) only affected the database default collation when using the libc provider. Discussion: https://postgr.es/m/1a63084d-221e-4075-619e-6b3e590f673e@enterprisedb.com Reviewed-by: Peter Eisentraut	2023-06-16 10:27:32 -07:00
Jeff Davis	ec1264f01e	ICU: use uloc_getDefault() for initdb. Simpler, and better preserves the locale name as read from the environment. Author: Daniel Verite Discussion: https://postgr.es/m/a6204a46-c077-451b-8f9d-8965d95bb57c@manitou-mail.org	2023-05-26 11:26:11 -07:00
Peter Eisentraut	473e02f6f9	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 642d41265b1ea68ae71a66ade5c5440ba366a890	2023-05-22 12:44:31 +02:00
Tom Lane	0245f8db36	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql	2023-05-19 17:24:48 -04:00
Jeff Davis	1c634f6647	ICU: check for U_STRING_NOT_TERMINATED_WARNING. Fixes memory error in cases where the length of the language name returned by uloc_getLanguage() is exactly ULOC_LANG_CAPACITY, in which case the status is set to U_STRING_NOT_TERMINATED_WARNING. Also check in call sites for other ICU functions that are expected to return a C string to be safe (no bug is known at these other call sites). Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/2098874d-c111-41e4-9063-30bcf135226b@gmail.com	2023-05-17 14:18:45 -07:00
Peter Eisentraut	e32701b8d2	initdb: Set collversion for standard collation UNICODE Since the behavior of the UNICODE collation can change with new ICU/Unicode versions, we need to apply the versioning mechanism to it. We do this with an UPDATE command in initdb; this is similar to how we put the collation version into pg_database already. Reported-by: Daniel Verite <daniel@manitou-mail.org> Discussion: https://www.postgresql.org/message-id/49417853-7bdd-4b23-a4e9-04c7aff33821@manitou-mail.org	2023-05-12 10:03:05 +02:00
Jeff Davis	455f948b0d	Revert "ICU: do not convert locale 'C' to 'en-US-u-va-posix'." This reverts commit `f7faa9976c`. Discussion: https://postgr.es/m/483826.1683582475@sss.pgh.pa.us	2023-05-08 20:50:51 -07:00
Jeff Davis	f7faa9976c	ICU: do not convert locale 'C' to 'en-US-u-va-posix'. The conversion was intended to be for convenience, but it's more likely to be confusing than useful. The user can still directly specify 'en-US-u-va-posix' if desired. Discussion: https://postgr.es/m/f83f089ee1e9acd5dbbbf3353294d24e1f196e95.camel@j-davis.com Discussion: https://postgr.es/m/37520ec1ae9591f83132f82dbd625f3fc2d69c16.camel@j-davis.com	2023-05-08 10:34:51 -07:00
Jeff Davis	5cd1a5af4d	Fix initdb --no-locale. Discussion: https://postgr.es/m/878relf7cb.fsf@news-spur.riddles.org.uk Reported-by: Andrew Gierth	2023-04-21 13:11:18 -07:00
Jeff Davis	36320cbc16	Fix MSVC warning introduced in `ea1db8ae70`. Discussion: https://postgr.es/m/CA+hUKGJR1BhCORa5WdvwxztD3arhENcwaN1zEQ1Upg20BwjKWA@mail.gmail.com Reported-by: Thomas Munro	2023-04-04 15:43:18 -07:00
Jeff Davis	ea1db8ae70	Canonicalize ICU locale names to language tags. Convert to BCP47 language tags before storing in the catalog, except during binary upgrade or when the locale comes from an existing collation or template database. The resulting language tags can vary slightly between ICU versions. For instance, "@colBackwards=yes" is converted to "und-u-kb-true" in older versions of ICU, and to the simpler (but equivalent) "und-u-kb" in newer versions. The process of canonicalizing to a language tag also understands more input locale string formats than ucol_open(). For instance, "fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is ignored; effectively treating it the same as the locale "fr" and opening the wrong collator. Canonicalization properly interprets the language and region, resulting in the language tag "fr-CA", which can then be understood by ucol_open(). This commit fixes a problem in prior versions due to ucol_open() misinterpreting locale strings as described above. For instance, creating an ICU collation with locale "fr_CA.UTF-8" would store that string directly in the catalog, which would later be passed to (and misinterpreted by) ucol_open(). After this commit, the locale string will be canonicalized to language tag "fr-CA" in the catalog, which will be properly understood by ucol_open(). Because this fix affects the resulting collator, we cannot change the locale string stored in the catalog for existing databases or collations; otherwise we'd risk corrupting indexes. Therefore, only canonicalize locales for newly-created (not upgraded) collations/databases. For similar reasons, do not backport. Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com Reviewed-by: Peter Eisentraut	2023-04-04 10:38:58 -07:00
Peter Eisentraut	563f21cda8	Move definition of standard collations from initdb to pg_collation.dat The standard collations "ucs_basic" and "unicode" were defined in initdb, even though pg_collation.dat seems like the correct place for them. It seems this was just forgotten during various reorganizations of initdb and pg_collation.dat/.h over time. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/08b58ecd-0d50-9395-ed51-dc8294e3fd2b%40enterprisedb.com	2023-03-29 09:45:21 +02:00
Jeff Davis	1671f990dd	Validate ICU locales. For ICU collations, ensure that the locale's language exists in ICU, and that the locale can be opened. Basic validation helps avoid minor mistakes and misspellings, which often fall back to the root locale instead of the intended locale. It's even more important to avoid such mistakes in ICU versions 54 and earlier, where the same (misspelled) locale string could fall back to different locales depending on the environment. Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com Reviewed-by: Peter Eisentraut	2023-03-28 16:34:29 -07:00
Jeff Davis	c1f1c1f87f	initdb: emit message when using default ICU locale. Helpful to determine from test logs whether the locale came from the environment or a command-line option. Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com Reviewed-by: Peter Eisentraut	2023-03-28 08:24:43 -07:00
Jeff Davis	f8ca22295e	initdb: replace check_icu_locale() with default_icu_locale(). The extra checks done in check_icu_locale() are not necessary. An existing comment already pointed out that the checks would be done during post-bootstrap initialization, when the locale is opened by the backend. This was a mistake in commit `27b62377b4`. This commit creates a simpler function default_icu_locale() to just return the locale of the default collator. Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com Reviewed-by: Peter Eisentraut	2023-03-28 08:24:21 -07:00
Andres Freund	e522049f23	meson: add install-{quiet, world} targets To define our own install target, we need dependencies on the i18n targets, which we did not collect so far. Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com	2023-03-23 21:20:18 -07:00
Jeff Davis	3b50275b12	Handle the "und" locale in ICU versions 54 and older. The "und" locale is an alternative spelling of the root locale, but it was not recognized until ICU 55. To maintain common behavior across all supported ICU versions, check for "und" and replace with "root" before opening. Previously, the lack of support for "und" was dangerous, because versions 54 and older fall back to the environment when a locale is not found. If the user specified "und" for the language (which is expected and documented), it could not only resolve to the wrong collator, but it could unexpectedly change (which could lead to corrupt indexes). This effectively reverts commit `d72900bded`, which worked around the problem for the built-in "unicode" collation, and is no longer necessary. Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1.camel@j-davis.com Reviewed-by: Peter Eisentraut	2023-03-23 10:08:27 -07:00
Tom Lane	b48af6d174	Fix initdb's handling of min_wal_size and max_wal_size. In commit `3e51b278d`, I misinterpreted the coding in setup_config() as setting min_wal_size and max_wal_size to compile-time-constant values. But it's not: there's a hidden dependency on --wal-segsize. Therefore leaving these variables commented out is the wrong thing. Per report from Andres Freund. Discussion: https://postgr.es/m/20230322200751.jvfvsuuhd3hgm6vv@awork3.anarazel.de	2023-03-22 16:37:41 -04:00
Tom Lane	4fe2aa7656	Reduce memory leakage in initdb. While testing commit `3e51b278d`, I noted that initdb leaks about a megabyte worth of data due to the sloppy bookkeeping in its string-manipulating code. That's not a huge amount on modern machines, but it's still kind of annoying, and it's easy to fix by recognizing that we might as well treat these arrays of strings as modifiable-in-place. There's no caller that cares about preserving the old state of the array after replace_token or replace_guc_value. With this fix, valgrind sees only a few hundred bytes leaked during an initdb run. Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us	2023-03-22 14:28:45 -04:00
Tom Lane	3e51b278db	Add "-c name=value" switch to initdb. This option, or its long form --set, sets the GUC "name" to "value". The setting applies in the bootstrap and standalone servers run by initdb, and is also written into the generated postgresql.conf. This can save an extra editing step when creating a new cluster, but the real use-case is for coping with situations where the bootstrap server fails to start due to environmental issues; for example, if it's necessary to force huge_pages to off. Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us	2023-03-22 13:49:05 -04:00
Peter Eisentraut	d72900bded	Improve support for UNICODE collation on older ICU The recently added standard collation UNICODE (`0d21d4b9bc`) doesn't give consistent results on some build farm members with old ICU versions. Apparently, the ICU locale specification 'und' (language tag style) misbehaves on some older ICU versions. Replacing it with '' (ICU locale ID style) fixes it at least on some OS versions. Let's see what the build farm says.	2023-03-13 09:08:58 +01:00

1 2 3 4 5 ...

930 Commits