postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-09 04:27:04 +02:00

Author	SHA1	Message	Date
Tom Lane	bdf46af748	Post-feature-freeze pgindent run. Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us	2018-04-26 14:47:16 -04:00
Teodor Sigaev	be8a7a6866	Add strict_word_similarity to pg_trgm module strict_word_similarity is similar to existing word_similarity function but it takes into account word boundaries to compute similarity. Author: Alexander Korotkov Review by: David Steele, Liudmila Mantrova, me Discussion: https://www.postgresql.org/message-id/flat/CY4PR17MB13207ED8310F847CF117EED0D85A0@CY4PR17MB1320.namprd17.prod.outlook.com	2018-03-21 14:57:42 +03:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Tom Lane	9c852566a3	Fix comparison of similarity to threshold in GIST trigram searches. There was some very strange code here, dating to commit `b525bf77`, that purported to work around an ancient gcc bug by forcing a float4 comparison to be done as int instead. Commit `5871b8848` broke that when it changed one side of the comparison to "double" but left the comparison code alone. Commit `f576b17cd` doubled down on the weirdness by introducing a "volatile" marker, which had nothing to do with the actual problem. Guess that the gcc bug, even if it's still present in the wild, was triggered by comparison of float4's and can be avoided if we store the result of cnt_sml() into a double before comparing to the double "nlimit". This will at least work correctly on non-broken compilers, and it's way more readable. Per bug #14202 from Greg Navis. Add a regression test based on his example. Report: <20160620115321.5792.10766@wrigleys.postgresql.org>	2016-06-20 10:49:19 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Teodor Sigaev	f576b17cd6	Add word_similarity to pg_trgm contrib module. Patch introduces a concept of similarity over string and just a word from another string. Version of extension is not changed because 1.2 was already introduced in 9.6 release cycle, so, there wasn't a public version. Author: Alexander Korotkov, Artur Zakirov	2016-03-16 18:59:21 +03:00
Teodor Sigaev	5871b88487	GUC variable pg_trgm.similarity_threshold instead of set_limit() Use GUC variable pg_trgm.similarity_threshold instead of set_limit()/show_limit(), which were introduced when modules could not yet define GUC variables. Author: Artur Zakirov	2016-03-16 17:44:58 +03:00
Alvaro Herrera	26df7066cc	Move strategy numbers to include/access/stratnum.h For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new #include. A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)	2015-05-15 17:03:16 -03:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not affect backpatching.	2014-05-06 12:12:18 -04:00
Peter Eisentraut	e7128e8dbb	Create function prototype as part of PG_FUNCTION_INFO_V1 macro Because of gcc -Wmissing-prototypes, all functions in dynamically loadable modules must have a separate prototype declaration. This is meant to detect global functions that are not declared in header files, but in cases where the function is called via dfmgr, this is redundant. Besides filling up space with boilerplate, this is a frequent source of compiler warnings in extension modules. We can fix that by creating the function prototype as part of the PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That makes the code of modules cleaner, because there is one less place where the entry points have to be listed, and creates an additional check that functions have the right prototype. Remove now redundant prototypes from contrib and other modules.	2014-04-18 00:03:19 -04:00
Tom Lane	410bed2ab8	Improve GiST index search performance for trigram regex queries. The initial coding just descended the index if any of the target trigrams were possibly present at the next level down. But actually we can apply trigramsMatchGraph() so as to take advantage of AND requirements when there are some. The input data might contain false positive matches, but that can only result in a false positive result, not false negative, so it's safe to do it this way. Alexander Korotkov	2013-04-15 12:49:29 -04:00
Tom Lane	6f5b8beb64	Make contrib/pg_trgm also support regex searches with GiST indexes. This wasn't addressed in the original patch, but it doesn't take very much additional code to cover the case, so let's get it done. Since pg_trgm 1.1 hasn't been released yet, I just changed the definition of what's in it, rather than inventing a 1.2.	2013-04-10 13:31:02 -04:00
Peter Eisentraut	b8b2e3b2de	Replace int2/int4 in C code with int16/int32 The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.	2012-06-25 01:51:46 +03:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	0a5d5a49d9	Cache the result of makesign() across calls of gtrgm_penalty(). Since gtrgm_penalty() is usually called many times in a row with the same "newval" (to determine which item on an index page newval fits into best), the makesign() calculation is repetitious. It's expensive enough to make it worth caching the result, so do so. On my machine this is good for more than a 40% savings in the time needed to build a trigram index on /usr/share/dict/words. This is all per a suggestion of Heikki's. In passing, make some mostly-cosmetic improvements in the caching logic in the other functions in this file that rely on caching info in fn_extra.	2011-09-30 23:54:27 -04:00
Peter Eisentraut	1b81c2fe6e	Remove many -Wcast-qual warnings This addresses only those cases that are easy to fix by adding or moving a const qualifier or removing an unnecessary cast. There are many more complicated cases remaining.	2011-09-11 21:54:32 +03:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	6e2f3ae884	Support LIKE and ILIKE index searches via contrib/pg_trgm indexes. Unlike Btree-based LIKE optimization, this works for non-left-anchored search patterns. The effectiveness of the search depends on how many trigrams can be extracted from the pattern. (The worst case, with no trigrams, degrades to a full-table scan, so this isn't a panacea. But it can be very useful.) Alexander Korotkov, reviewed by Jan Urbanski	2011-01-31 21:34:49 -05:00
Tom Lane	b525bf771e	Add KNNGIST support to contrib/pg_trgm. Teodor Sigaev, with some revision by Tom	2010-12-04 00:16:21 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Teodor Sigaev	2d6599f471	Add caching of query to GIN/GiST consistent function. Per performance gripe from nomao.com	2008-07-11 11:56:48 +00:00
Andrew Dunstan	53972b460c	Add $PostgreSQL$ markers to a lot of files that were missing them. This particular batch was just for .c and .h file. The changes were made with the following 2 commands: find . $ \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' $ -prune \) -o $ -name '.[ch]' $ $ -exec grep -q '\$PostgreSQL' {} \; -o -print $ \| while read file ; do head -n 1 < $file \| grep -q '^/\' && echo $file; done \| xargs -l sed -i -e '1s/^\// /' -e '1i/\n $PostgreSQL:$ \n ' find . $ \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' $ -prune \) -o $ -name '.[ch]' $ $ -exec grep -q '\$PostgreSQL' {} \; -o -print $ \| xargs -l sed -i -e '1i/\n $PostgreSQL:$ \n */'	2008-05-17 01:28:26 +00:00
Tom Lane	9b5c8d45f6	Push index operator lossiness determination down to GIST/GIN opclass "consistent" functions, and remove pg_amop.opreqcheck, as per recent discussion. The main immediate benefit of this is that we no longer need 8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery searches on GIN indexes. In future it should be possible to optimize some other queries better than is done now, by detecting at runtime whether the index match is exact or not. Tom Lane, after an idea of Heikki's, and with some help from Teodor.	2008-04-14 17:05:34 +00:00
Bruce Momjian	5f0bf6cb0d	Run pgindent on remaining files now that LOOPBYTE is a usable macro.	2007-11-16 01:12:24 +00:00
Bruce Momjian	224f91f66d	Modify LOOPBYTE/LOOPBIT macros to be more logical; rather than have the for() body passed as a parameter, make the macros act as simple headers to code blocks. This allows pgindent to be run on these files.	2007-11-16 00:13:02 +00:00
Tom Lane	3e23b68dac	Support varlena fields with single-byte headers and unaligned storage. This commit breaks any code that assumes that the mere act of forming a tuple (without writing it to disk) does not "toast" any fields. While all available regression tests pass, I'm not totally sure that we've fixed every nook and cranny, especially in contrib. Greg Stark with some help from Tom Lane	2007-04-06 04:21:44 +00:00
Tom Lane	9f652d430f	Fix up several contrib modules that were using varlena datatypes in not-so-obvious ways. I'm not totally sure that I caught everything, but at least now they pass their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.	2007-02-28 22:44:38 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Teodor Sigaev	1f7ef548ec	Changes * new split algorithm (as proposed in http://archives.postgresql.org/pgsql-hackers/2006-06/msg00254.php) * possible call pickSplit() for second and below columns * add spl_(l|r)datum_exists to GIST_SPLITVEC - pickSplit should check its values to use already defined spl_(l|r)datum for splitting. pickSplit should set spl_(l|r)datum_exists to 'false' (if they were 'true') to signal to caller about using spl_(l|r)datum. * support for old pickSplit(): not very optimal but correct split * remove 'bytes' field from GISTENTRY: in any case size of value is defined by its type. * split GIST_SPLITVEC to two structures: one for using in picksplit and second - for internal use. * some code refactoring * support of subsplit to rtree opclasses TODO: add support of subsplit to contrib modules	2006-06-28 12:00:14 +00:00
Neil Conway	8e5a10d46c	This patch makes the error message strings throughout the backend more compliant with the error message style guide. In particular, errdetail should begin with a capital letter and end with a period, whereas errmsg should not. I also fixed a few related issues in passing, such as fixing the repeated misspelling of "lexeme" in contrib/tsearch2 (per Tom's suggestion).	2006-03-01 06:30:32 +00:00
Tom Lane	33feb55c47	Replace bitwise looping with bytewise looping in hemdistsign and sizebitvec of tsearch2, as well as identical code in several other contrib modules. This provided about a 20X speedup in building a large tsearch2 index ... didn't try to measure its effects for other operations. Thanks to Stephan Vollmer for providing a test case.	2006-01-20 22:46:16 +00:00
Tom Lane	2a8d3d83ef	R-tree is dead ... long live GiST.	2005-11-07 17:36:47 +00:00
Neil Conway	36ab600511	Cleanup of GiST extensions in contrib/: now that we always invoke GiST methods in a short-lived memory context, there is no need for GiST methods to do their own manual (and error-prone) memory management.	2005-05-21 12:08:06 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Teodor Sigaev	cbfa4092bb	trgm - Trigram matching for PostgreSQL. The pg_trgm contrib module provides functions and index classes for determining the similarity of text based on trigram matching.	2004-05-31 17:18:12 +00:00

38 Commits
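
The earliest commit in this history describes pg_trgm as determining the similarity of text based on trigram matching. As a rough illustration of that idea only, here is a minimal Python sketch. It is not the module's C code: the function names `trigrams` and `similarity` are hypothetical, and the word splitting (lowercase ASCII letters and digits only) is a simplifying assumption. The padding of each word with two leading spaces and one trailing space, and the shared-over-union ratio, follow the behaviour described in the pg_trgm documentation.

```python
import re


def trigrams(text):
    """Return the set of three-character substrings of each word in text.

    Assumption for this sketch: a word is a run of ASCII letters or digits,
    lowercased, padded with two leading spaces and one trailing space.
    """
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Ratio of shared trigrams to all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    shared = len(ta & tb)
    return shared / (len(ta) + len(tb) - shared)
```

Under these assumptions, `similarity("word", "words")` shares four of seven distinct trigrams. A threshold such as the `pg_trgm.similarity_threshold` setting mentioned in commit 5871b88487 would then decide whether two strings count as similar.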