The previous update (2e0e066679) apparently used a stale
input file and missed a few additions made shortly before
the CLDR release. Update this now so that the next update really only
changes things new in that version.
This has required an update of the Python script generating the rules,
as the format of its input changed in release 29. This release has also
added new punctuation and symbols, and a new set of rules has been
generated to include them. The way to find the newest versions of
Latin-ASCII is also documented more clearly.
Author: Hugh Ranalli, Michael Paquier
Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f@postgresql.org
Improve generate_unaccent_rules.py to handle composed characters whose base
is another composed character rather than a plain letter. The net effect
of this is to add a bunch of multi-accented Vietnamese characters to
unaccent.rules.
Original complaint from Kha Nguyen, diagnosis of the script's shortcoming
by Thomas Munro.
Dang Minh Huong and Michael Paquier
Discussion: https://postgr.es/m/CALo3sF6EC8cy1F2JUz=GRf5h4LMUJTaG3qpdoiLrNbWEXL-tRg@mail.gmail.com
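For illustration only, the recursive base lookup described above could be
sketched roughly as follows. This is not the code in
generate_unaccent_rules.py; the helper names is_mark() and base_letter()
are hypothetical, and the sketch assumes that Python's unicodedata module
is an acceptable stand-in for the Unicode data file the script reads.

```python
import unicodedata


def is_mark(ch):
    # Combining marks have a general category starting with "M".
    return unicodedata.category(ch).startswith("M")


def base_letter(ch):
    """Return the plain ASCII letter underlying ch, or None if there is none.

    Hypothetical helper: follows canonical decompositions recursively, so a
    character whose base is itself a composed character is still resolved.
    """
    decomp = unicodedata.decomposition(ch)
    if not decomp:
        # No decomposition: accept only a plain ASCII letter.
        return ch if ch.isascii() and ch.isalpha() else None
    if decomp.startswith("<"):
        # Compatibility decompositions (e.g. "<fraction> ...") are skipped.
        return None
    parts = [chr(int(cp, 16)) for cp in decomp.split()]
    # Everything after the first code point must be a combining mark.
    if not all(is_mark(p) for p in parts[1:]):
        return None
    # The first code point may itself be composed, so recurse on it.
    return base_letter(parts[0])


if __name__ == "__main__":
    # U+1EBF decomposes to U+00EA U+0301, and U+00EA to U+0065 U+0302.
    for ch in "\u00e9\u1ebf\u1ed9":
        # Assumed output layout: source character, a tab, then the target.
        print(ch + "\t" + str(base_letter(ch)))
```

A non-recursive lookup would stop at U+00EA for U+1EBF and find no plain
letter, which is the case the multi-accented Vietnamese characters hit.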
Add Python script for building unaccent.rules from Unicode data. Don't
backpatch because unaccent changes may require tsvector/index
rebuild.
Thomas Munro <thomas.munro@enterprisedb.com>