Commit Graph

4 Commits

Author SHA1 Message Date
Tom Lane fd90b5d574 Fix ancient encoding error in hungarian.stop.
When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2.  This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös.  We'd have messed up u-double-acute too,
but there aren't any of those in the file.  Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.

The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian.  But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch.  Fixing the stopword file seems reasonably safe to back-patch
however.
2014-06-10 22:48:16 -04:00
Peter Eisentraut 3f11971916 Remove extra newlines at end and beginning of files, add missing newlines
at end of files.
2010-08-19 05:57:36 +00:00
Teodor Sigaev da1248401d Add turkish stopword list. Thanks to Devrim GUNDUZ <devrim@CommandPrompt.com> 2007-09-07 14:46:43 +00:00
Tom Lane 140d4ebcb4 Tsearch2 functionality migrates to core. The bulk of this work is by
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.

Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
2007-08-21 01:11:32 +00:00