postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-06 19:26:53 +02:00

Author	SHA1	Message	Date
Michael Paquier	44e73a498c	Fix regression tests of unaccent to work without UTF8 support The tests of unaccent rely on UTF8 characters, and unlike any other test suite in the tree (fuzzystrmatch, citext, hstore, etc.), they would fail if run on a database that does not support UTF8 encoding. This commit fixes the tests of unaccent so that these are skipped when run on a database without UTF8 support, using the same method as the other test suites based on \if, getdatabaseencoding() and an alternate output file. This has been broken for a long time, but nobody has complained about that either, so no backpatch is done. This can be reproduced with something like REGRESS_OPTS="--no-locale --encoding=sql_ascii", for instance. To defend against that, this module's Makefile and meson.build enforced a UTF8 encoding without locales, but it did not offer protection for options given by REGRESS_OPTS. This switch makes this regression test suite more consistent with all the others, as well. Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/ZIq1HUnIV2ksW85x@paquier.xyz	2023-07-04 08:05:00 +09:00
Jeff Davis	f413941f41	Fix t_isspace(), etc., when datlocprovider=i and datctype=C. Check whether the datctype is C to determine whether t_isspace() and related functions use isspace() or iswspace(). Previously, t_isspace() checked whether the database default collation was C; which is incorrect when the default collation uses the ICU provider. Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com Reviewed-by: Peter Eisentraut Backpatch-through: 15	2023-03-17 12:08:46 -07:00
Jeff Davis	27b62377b4	Use ICU by default at initdb time. If the ICU locale is not specified, initialize the default collator and retrieve the locale name from that. Discussion: https://postgr.es/m/510d284759f6e943ce15096167760b2edcb2e700.camel@j-davis.com Reviewed-by: Peter Eisentraut	2023-03-09 10:52:41 -08:00
Michael Paquier	e3dd7c06e6	Simplify a bit the special rules generating unaccent.rules As noted by Thomas Munro, CLDR 36 has added SOUND RECORDING COPYRIGHT (U+2117), and we use CLDR 41, so this can be removed from the set of special cases. The set of regression tests is expanded for degree signs, which are two of the special cases, and a fancy case with U+210C in Latin-ASCII.xml that we discovered when diving into what could be done for Cyrillic characters (this last part is material for a future patch, not tackled yet). While on it, some of the assertions of generate_unaccent_rules.py are expanded to report the codepoint on which a failure is found, something useful for debugging. Extracted from a larger patch by the same author. Author: Przemysław Sztoch Discussion: https://postgr.es/m/8478da0d-3b61-d24f-80b4-ce2f5e971c60@sztoch.pl	2022-07-05 16:17:51 +09:00
Thomas Munro	456e3718e7	Add combining characters to unaccent.rules. Strip certain classes of combining characters, so that accents encoded this way are removed. Author: Hugh Ranalli Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f%40postgresql.org	2019-02-01 15:23:01 +01:00
Michael Paquier	e1c1d5444e	Update unaccent rules with release 34 of CLDR for Latin-ASCII.xml This has required an update of the python script generating the rules, as its format has changed in release 29. This release has also added new punctuation and symbols, and a new set of rules has been generated to include them. The way to find newest versions of Latin-ASCII gets also more clearly documented. Author: Hugh Ranalli, Michael Paquier Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f@postgresql.org	2019-01-10 14:10:21 +09:00
Peter Eisentraut	b6f3649bba	Convert unaccent tests to UTF-8 This makes it easier to add new tests that are specific to Unicode features. The files were previously in KOI8-R. Discussion: https://www.postgresql.org/message-id/8506.1545111362@sss.pgh.pa.us	2019-01-02 18:36:05 +01:00
Tom Lane	629b3af27d	Convert contrib modules to use the extension facility. This isn't fully tested as yet, in particular I'm not sure that the "foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some buildfarm cycles on it. sepgsql is not converted to an extension, mainly because it seems to require a very nonstandard installation process. Dimitri Fontaine and Tom Lane	2011-02-13 22:54:49 -05:00
Tom Lane	4b98b613f6	Print the actual DB encoding in the unaccent regression test. This is to help make it more obvious what the problem is, if the encoding isn't what the test expects.	2009-08-18 16:00:50 +00:00
Teodor Sigaev	92e05bc6a5	Unaccent dictionary.	2009-08-18 10:34:39 +00:00

10 Commits
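
Commit 456e3718e7 above describes stripping combining characters so that accents encoded as separate code points are removed. The sketch below illustrates that general idea only. It is not the code from generate_unaccent_rules.py or the contents of unaccent.rules: the function name strip_combining_marks is hypothetical, and it assumes that "combining character" means any code point in a Unicode Mark category (Mn, Mc, Me), whereas the commit says only "certain classes" are stripped.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop every code point whose Unicode general category is a Mark.

    Assumption: all Mark categories (Mn, Mc, Me) are removed. The actual
    unaccent rules cover only certain classes of combining characters.
    """
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )


if __name__ == "__main__":
    # "e" followed by U+0301 COMBINING ACUTE ACCENT: the mark is removed.
    print(strip_combining_marks("cafe\u0301"))
    # Precomposed U+00E9 is a single letter, not a mark, so it is kept.
    print(strip_combining_marks("caf\u00e9"))
```

Precomposed letters such as U+00E9 pass through unchanged here because no normalization is applied; in this sketch only accents that arrive as separate combining code points are affected.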