Go to file
Tom Lane b4c6d31c0b Fix serious performance problems in json(b) to_tsvector().
In an off-list followup to bug #14745, Bob Jones complained that
to_tsvector() on a 2MB jsonb value took an unreasonable amount of
time and space --- enough to draw the wrath of the OOM killer on
his machine.  On my machine, his example proved to require upwards
of 18 seconds and 4GB, which seemed pretty bogus considering that
to_tsvector() on the same data treated as text took just a couple
hundred msec and 10 or so MB.

On investigation, the problem is that the implementation scans each
string element of the json(b) and converts it to tsvector separately,
then applies tsvector_concat() to join those separate tsvectors.
The unreasonable memory usage came from leaking every single one of
the transient tsvectors --- but even without that mistake, this is an
O(N^2) or worse algorithm, because tsvector_concat() has to repeatedly
process the words coming from earlier elements.

We can fix it by accumulating all the lexeme data and applying
make_tsvector() just once.  As a side benefit, that also makes the
desired adjustment of lexeme positions far cheaper, because we can
just tweak the running "pos" counter between JSON elements.

In passing, try to make the explanation of that tweak more intelligible.
(I didn't think that a barely-readable comment far removed from the
actual code was helpful.)  And do some minor other code beautification.
2017-07-18 12:45:51 -04:00
config Make configure check for IPC::Run when --enable-tap-tests is specified. 2017-06-15 15:56:12 -04:00
contrib Code review for NextValueExpr expression node type. 2017-07-14 15:25:43 -04:00
doc Doc: fix thinko in v10 release notes. 2017-07-18 10:17:15 -04:00
src Fix serious performance problems in json(b) to_tsvector(). 2017-07-18 12:45:51 -04:00
.dir-locals.el emacs: Set indent-tabs-mode in perl-mode 2015-04-12 23:53:23 -04:00
.gitattributes Remove contrib/tsearch2. 2017-02-13 11:06:11 -05:00
.gitignore Allow .so minor version numbers above 9 in .gitignore. 2016-08-15 17:35:35 -04:00
aclocal.m4 Make configure check for IPC::Run when --enable-tap-tests is specified. 2017-06-15 15:56:12 -04:00
configure Stamp 10beta2. 2017-07-10 16:26:20 -04:00
configure.in Stamp 10beta2. 2017-07-10 16:26:20 -04:00
COPYRIGHT Update copyright for 2017 2017-01-03 12:37:53 -05:00
GNUmakefile.in Have "make coverage" recurse into contrib as well 2016-09-05 18:44:36 -03:00
HISTORY Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00
Makefile Allow make check in PL directories 2011-02-15 06:52:12 +02:00
README Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00
README.git Change documentation references to PG website to use https: not http: 2017-05-20 21:50:47 -04:00

PostgreSQL Database Management System
=====================================

This directory contains the source code distribution of the PostgreSQL
database management system.

PostgreSQL is an advanced object-relational database management system
that supports an extended subset of the SQL standard, including
transactions, foreign keys, subqueries, triggers, user-defined types
and functions.  This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

	https://www.postgresql.org/download

See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.

The latest version of this software may be obtained at
https://www.postgresql.org/download/.  For more information look at our
web site located at https://www.postgresql.org/.