Commit Graph

47 Commits

Author SHA1 Message Date
Teodor Sigaev 6cd9a58480 Fix core dump of ispell for case of non-successfull initialization.
Previous versions aren't affected.

Fix synonym dictionary init: string should be malloc'ed, not palloc'ed. Bug
introduced recently while fixing lowerstr().
2006-12-04 09:26:57 +00:00
Teodor Sigaev 3de2682a1e Fix lowercasing while parse OO dictionary 2006-11-23 17:35:14 +00:00
Teodor Sigaev dd92a8c33f Fix type in return value 2006-11-21 18:31:28 +00:00
Teodor Sigaev 419fe7cd1b Fix bug http://archives.postgresql.org/pgsql-bugs/2006-10/msg00258.php.
Fix string's length calculation for recoding, fix strlower() to avoid wrong
assumption about length of recoded string (was: recoded string is no greater
that source, it may not true for multibyte encodings)
Thanks to Thomas H. <me@alternize.com> and Magnus Hagander <mha@sollentuna.net>
2006-11-20 14:03:30 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane ae643747b1 Fix a passel of recently-committed violations of the rule 'thou shalt
have no other gods before c.h'.  Also remove some demonstrably redundant
#include lines, mostly of <errno.h> which was added to c.h years ago.
2006-07-14 05:28:29 +00:00
Teodor Sigaev 04e9704b9e Now ispell dictionary can eat dictionaries in MySpell format,
used by OpenOffice. Dictionaries are placed at
http://lingucomponent.openoffice.org/spell_dic.html
Dictionary automatically recognizes format of files.

Warning. MySpell's format has limitation with compound
word support: it's impossible to mark affix as
compound-only affix. So for norwegian, german etc
languages it's recommended to use original ispell format.
For that reason I don't want to remove my2ispell
scripts, it's has workaround at least for norwegian language.
2006-06-09 13:25:59 +00:00
Neil Conway 8e5a10d46c This patch makes the error message strings throughout the backend
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
2006-03-01 06:30:32 +00:00
Teodor Sigaev dde9457294 Fixing and improve compound word support. This changes cannot be applied to
previous version iwthout recreating tsvector fields...

Thanks to Alexander Presber <aljoscha@weisshuhn.de> to discover a problem.
2006-02-20 17:51:05 +00:00
Tom Lane b35fdaaa1a Clean up some signedness warnings. 2006-02-10 15:57:58 +00:00
Teodor Sigaev 01f2172ec1 Allow "'" symbol in affixes ("'s" affix in english): it was diallowed during
multibyte support work.
Add line number to error output during affix file parsing.
2006-02-10 12:56:14 +00:00
Teodor Sigaev 46a25ce6a9 1 Fix bug with very short word: prefix and suffix might be overlapped,
sorry but fix can't be applyed to previous version: it's require
  refill tsvector...
2 Small optimize of load time for huge dictionaries
3 use palloc instead of malloc during load dict file
2006-02-09 18:04:20 +00:00
Teodor Sigaev a6fefc866c Check number of affixes to prevent core dump with zero number of affixes 2006-02-06 15:45:34 +00:00
Teodor Sigaev 7ac8a4be89 Multibyte encodings support for ISpell dictionary 2005-12-21 13:05:49 +00:00
Teodor Sigaev cb4ea994c6 Improve support of multibyte encoding:
- tsvector_(in|out)
- tsquery_(in|out)
- to_tsvector
- to_tsquery, plainto_tsquery
- 'simple' dictionary
2005-12-12 11:10:12 +00:00
Teodor Sigaev 6812abb673 Fix incorrect header size macros 2005-11-03 18:16:31 +00:00
Tom Lane c62b29a603 Fix several contrib makefiles that failed in VPATH builds, particularly
when not using gcc (which has slightly nonstandard inclusion rules).
2005-10-18 01:30:49 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 0b36cb83dc PGXS should be set with := not =, as specified in the documentation,
to avoid useless multiple executions of pg_config.
2005-09-27 17:13:14 +00:00
Tom Lane 8a65b820e2 Suppress signed-vs-unsigned-char warnings in contrib. 2005-09-24 19:14:05 +00:00
Teodor Sigaev f82b853b47 1 Update Snowball sources
2 Makefile fixes
2005-09-15 11:14:18 +00:00
Bruce Momjian 21634e513f Add extra argument for new pg_regexec API. 2005-07-10 18:31:59 +00:00
Tom Lane c0e0d3e2e9 Avoid unnecessary dependence on u_int16_t, per buildfarm failure.
(It doesn't compile on HPUX either...)
2005-01-26 18:49:39 +00:00
Teodor Sigaev 324300bc7c improve support of agglutinative languages (query with compound words).
regression=# select to_tsquery( '\'fotballklubber\'');
                   to_tsquery
------------------------------------------------
 'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
(1 row)

So, changed interface to dictionaries, lexize method of dictionary shoud return
pointer to aray of TSLexeme structs instead of char**. Last element should
have TSLexeme->lexeme == NULL.

typedef struct {
        /* number of variant of split word , for example
                Word 'fotballklubber' (norwegian) has two varian to split:
                ( fotball, klubb ) and ( fot, ball, klubb ). So, dictionary
                should return:
                nvariant        lexeme
                1               fotball
                1               klubb
                2               fot
                2               ball
                2               klubb

        */
        uint16  nvariant;

        /* currently unused */
        uint16  flags;

        /* C-string */
        char    *lexeme;
} TSLexeme;
2005-01-25 15:24:38 +00:00
Teodor Sigaev 5b354d2c7e Fixes:
1 Report error message instead of do nothing in case of error in regex
2 Malloced storage for mask, find and repl part of Affix. This parts may be
  large enough in real life (for example in czech, thanks to moje <moje@kalhotky.net>)
2005-01-11 16:07:55 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Teodor Sigaev df9d87f608 Previous commit wasnt full... 2004-06-23 11:29:58 +00:00
Teodor Sigaev de55c0cef6 1 Fix affixes with void replacement (AFAIK, it's only russian)
2 Optimize regex execution
2004-06-23 11:06:11 +00:00
Teodor Sigaev 7cb55d21ed Fix memory leak with pg_regexec 2004-05-31 13:55:19 +00:00
Teodor Sigaev d222bb4d5e Fix memory leak with pg_regcomp 2004-05-31 13:52:57 +00:00
Teodor Sigaev 11864ab657 Win32 related patch by Darko Prenosil. Small correct by teodor 2004-05-31 13:29:43 +00:00
Tom Lane a90b2a035f Suppress 'uninitialized variable' warning emitted by some (not all)
versions of gcc.  The code is correct AFAICS, but it requires slightly
more analysis than usual to see that the variable can't be used uninitialized.
2004-05-07 13:09:12 +00:00
Tom Lane 0bd61548ab Solve the 'Turkish problem' with undesirable locale behavior for case
conversion of basic ASCII letters.  Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower.  These functions use the same notions of
case folding already developed for identifier case conversion.  I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent.  Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
2004-05-07 00:24:59 +00:00
Tom Lane 47fe0517fc Fix some portability issues (reliance on gcc-isms). 2004-04-01 23:44:38 +00:00
Teodor Sigaev 125d69cd9b Fix signed char in comparison and check memory allocation 2003-12-18 19:27:53 +00:00
Teodor Sigaev 565dc5d1ae Fix integer types to use definition from c.h. Per bug report by Patrick Boulay <patrick.boulay@medrium.com> 2003-12-10 15:54:58 +00:00
Teodor Sigaev 6de3fe3c0d Avoid conflict strndup with glibc 2003-12-04 12:21:11 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Teodor Sigaev cabdf460d3 Fix free instead of pfree 2003-11-28 12:09:02 +00:00
Teodor Sigaev c63c1946a2 Optimize. Improve ispell support for compound words. This work was sponsored by ABC Startsiden AS. 2003-11-17 17:34:35 +00:00
Peter Eisentraut 3d0d78ce2f Bring the makefiles up to our conventions. 2003-08-23 04:25:29 +00:00
Tom Lane 3b29525a79 Sub-Makefiles need to explicitly add CFLAGS_SL to CFLAGS, else their
object files do not get built with -fpic.
2003-08-04 20:34:26 +00:00
Tom Lane f237a80d8a Fix to build correctly outside source tree. 2003-08-04 19:52:37 +00:00
Teodor Sigaev d6f0f44b55 make sub-Makefiles in the sub-directories 2003-08-04 14:54:47 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 8fd5b3ed67 Error message editing in contrib (mostly by Joe Conway --- thanks Joe!) 2003-07-24 17:52:50 +00:00
Teodor Sigaev b88605337e tsearch2 module 2003-07-21 10:27:44 +00:00