Tom Lane 8413789477 Fix default text search parser's ts_headline code for phrase queries.
This code could produce very poor results when asked to highlight a
string based on a query using phrase-match operators.  The root cause
is that hlCover(), which is supposed to find a minimal substring that
matches the query, was written assuming that word position is not
significant.  I'm only 95% convinced that its algorithm was correct even
for plain AND/OR queries; but it definitely fails completely for phrase
matches, causing it to possibly not identify a cover string at all.

Hence, rewrite hlCover() with a less-tense algorithm that just tries
all the possible substrings, earlier and shorter ones first.  (This is
not as bad as it sounds performance-wise, because all of the string
matching has been done already: the repeated tsquery match checks
boil down to pointer comparisons.)
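
As a rough illustration of the "try all the possible substrings" idea, here is a minimal sketch in C. It is not the PostgreSQL source: the Word struct, the MatchFn callback, find_cover() and phrase_0_then_1() are hypothetical names invented for this example, and reading "earlier and shorter ones first" as an outer loop over start positions with an inner loop over end positions is an assumption. Integer term indexes stand in for the already-completed string matching.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a parsed document word. */
typedef struct
{
	int			term;	/* index of the query term this word matched, or -1 */
} Word;

/* Hypothetical callback: does words[lo..hi] (inclusive) satisfy the query? */
typedef bool (*MatchFn) (const Word *words, int lo, int hi);

/*
 * Search for a cover at or after position "start".  Substrings are tried
 * with earlier start positions first and, for each start, shorter lengths
 * first.  On success, store the inclusive bounds in *lo and *hi.
 */
bool
find_cover(const Word *words, int nwords, int start,
		   MatchFn matches, int *lo, int *hi)
{
	for (int p = start; p < nwords; p++)
	{
		for (int q = p; q < nwords; q++)
		{
			if (matches(words, p, q))
			{
				*lo = p;
				*hi = q;
				return true;
			}
		}
	}
	return false;
}

/*
 * Example predicate for a two-term phrase query: term 0 must be
 * immediately followed by term 1 somewhere inside words[lo..hi].
 */
bool
phrase_0_then_1(const Word *words, int lo, int hi)
{
	for (int i = lo; i < hi; i++)
	{
		if (words[i].term == 0 && words[i + 1].term == 1)
			return true;
	}
	return false;
}
```

With words whose term values are 0, -1, 1, 0, 1, a search from position 0 reports the cover 0..4, because no shorter substring starting at position 0 contains the phrase. A search from position 3 reports 3..4. Word position matters here: terms 0 and 1 both occur within 0..2, yet that substring does not match because the two words are not adjacent.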

Unfortunately, since that approach produces more candidate cover
strings than before, it also exposes that there were bugs in the
heuristics in mark_hl_words() for selecting a best cover string.
Fixes there include:
* Do not apply the ShortWord filter to words that appear in the query.
* Remove a misguided optimization for quickly rejecting a cover.
* Fix order-of-operation bug that could cause computation of a
  wrong figure of merit (poslen) when shortening a cover.
* Change the preference rule so that candidate headlines that do not
  include their whole cover string (after MaxWords trimming) are lowest
  priority, since they may not actually satisfy the user's query.

This results in some changes in existing regression test cases,
but they all seem reasonable.  Note in particular that the tests
involving strings like "1 2 3" were previously being affected by
the ShortWord filter, masking the normal matching behavior.

Per bug #16345 from Augustinas Jokubauskas; the new test cases are
based on that example.  Back-patch to 9.6 where phrase search was
added to tsquery.

Discussion: https://postgr.es/m/16345-2e0cf5cddbdcd3b4@postgresql.org
2020-04-09 13:19:23 -04:00
config Update config.guess and config.sub 2019-04-27 14:25:00 +02:00
contrib Fix bogus CALLED_AS_TRIGGER() defenses. 2020-04-03 11:24:56 -04:00
doc doc: remove unnecessary INNER keyword 2020-04-02 17:42:09 -04:00
src Fix default text search parser's ts_headline code for phrase queries. 2020-04-09 13:19:23 -04:00
.dir-locals.el Make Emacs perl-mode indent more like perltidy. 2019-01-13 11:32:31 -08:00
.gitattributes Add XSL stylesheet to fix up SVG files 2019-06-19 21:26:42 +02:00
.gitignore Support for optimizing and emitting code in LLVM JIT provider. 2018-03-22 11:05:22 -07:00
aclocal.m4 Fix configure's AC_CHECK_DECLS tests to work correctly with clang. 2018-11-19 12:01:47 -05:00
configure Use pkg-config, if available, to locate libxml2 during configure. 2020-03-17 12:09:26 -04:00
configure.in Use pkg-config, if available, to locate libxml2 during configure. 2020-03-17 12:09:26 -04:00
COPYRIGHT Update copyrights for 2020 2020-01-01 12:21:45 -05:00
GNUmakefile.in Integrate cpluspluscheck into build system. 2019-05-31 12:36:17 -07:00
HISTORY Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00
Makefile Don't unset MAKEFLAGS in non-GNU Makefile. 2019-06-25 09:36:21 +12:00
README Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00
README.git Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00

PostgreSQL Database Management System
=====================================

This directory contains the source code distribution of the PostgreSQL
database management system.

PostgreSQL is an advanced object-relational database management system
that supports an extended subset of the SQL standard, including
transactions, foreign keys, subqueries, triggers, user-defined types
and functions.  This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

	https://www.postgresql.org/download

See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.

The latest version of this software may be obtained at
https://www.postgresql.org/download/.  For more information, see our
website at https://www.postgresql.org/.