Commit Graph

85 Commits

Author SHA1 Message Date
Bruce Momjian ca9cb940d2 doc: more replacement of <literal> with something better
Reported-by: Alexander Law

Author: Alexander Law

Backpatch-through: 9.6
2016-08-24 21:11:44 -04:00
Peter Eisentraut 5676da2d01 Documentation spell checking and markup improvements 2016-07-28 22:46:15 -04:00
Tom Lane 4242a715c3 Adjust text search documentation for recent commits.
Fix some now-obsolete statements that were overlooked in commits
6734a1cac, 3dbbd0f02, 028350f61.  Document the behavior of <0>.
Also do a little bit of rearranging and copy-editing for clarity.
2016-06-29 15:00:33 -04:00
Teodor Sigaev 73e6bea603 Document precedence of FTS operators in tsquery
Oleg Bartunov
2016-06-29 17:59:36 +03:00
Teodor Sigaev 028350f619 Make exact distance match for FTS phrase operator
Phrase operator now requires exact distance betweens lexems instead of
less-or-equal.

Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com
2016-06-27 20:41:00 +03:00
Tom Lane 6581e930a8 Polish the documentation concerning phrase text search.
Fix grammar, improve examples, etc.

I did not attempt to document the current behavior concerning distance-zero
matches, because I think that's broken and needs to change, so I'm not
going to use up brain cells figuring out how to explain how it works now.
One way or the other, there's still more to write here.
2016-06-09 00:30:59 -04:00
Tom Lane 0b9a234432 Rename tsvector delete() to ts_delete(), and filter() to ts_filter().
The similarity of the original names to SQL keywords seems like a bad
idea.  Rename them before we're stuck with 'em forever.

In passing, minor code and docs cleanup.

Discussion: <4875.1462210058@sss.pgh.pa.us>
2016-05-05 19:43:32 -04:00
Teodor Sigaev f1e3c76066 Fix tsearch docs
Remove mention of setweight(tsquery) which wasn't included in 9.6. Also
replace old forgotten phrase operator to new one.

Dmitry Ivanov
2016-04-26 20:26:26 +03:00
Teodor Sigaev bb140506df Phrase full text search.
Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery.
On-disk and binary in/out format of tsquery are backward compatible.
It has two side effect:
- change order for tsquery, so, users, who has a btree index over tsquery,
  should reindex it
- less number of parenthesis in tsquery output, and tsquery becomes more
  readable

Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov
Reviewers: Alexander Korotkov, Artur Zakirov
2016-04-07 18:44:18 +03:00
Teodor Sigaev 6943a946c7 Tsvector editing functions
Adds several tsvector editting function: convert tsvector to/from text array,
set weight for given lexemes, delete lexeme(s), unnest, filter lexemes
with given weights

Author: Stas Kelvich with some editorization by me
Reviewers: Tomas Vondram, Teodor Sigaev
2016-03-11 19:22:36 +03:00
Teodor Sigaev d78a7d9c7f Improve support of Hunspell in ispell dictionary.
Now it's possible to load recent version of Hunspell for several languages.
To handle these dictionaries Hunspell patch adds support for:
* FLAG long - sets the double extended ASCII character flag type
* FLAG num - sets the decimal number flag type (from 1 to 65535)
* AF parameter - alias for flag's set

Also it moves test dictionaries into separate directory.

Author: Artur Zakirov with editorization by me
2016-03-04 20:08:47 +03:00
Bruce Momjian 6d8b2aa83a docs: update guidelines on when to use GIN and GiST indexes
Report by Tomas Vondra

Backpatch through 9.5
2015-10-05 13:38:36 -04:00
Teodor Sigaev a1c44e1af6 Update site address of Snowball project 2015-09-07 15:20:45 +03:00
Bruce Momjian f6d65f0c70 docs: consistently uppercase index method and add spacing
Consistently uppercase index method names, e.g. GIN, and add space after
the index method name and the parentheses enclosing the column names.
2015-05-15 11:42:34 -04:00
Kevin Grittner 05258761bf doc: Various typo/grammar fixes
Errors detected using Topy (https://github.com/intgr/topy), all
changes verified by hand and some manual tweaks added.

Marti Raudsepp

Individual changes backpatched, where applicable, as far as 9.0.
2014-08-30 10:52:36 -05:00
Peter Eisentraut 3a9d430af5 doc: Fix DocBook XML validity
The main problem is that DocBook SGML allows indexterm elements just
about everywhere, but DocBook XML is stricter.  For example, this common
pattern

    <varlistentry>
     <indexterm>...</indexterm>
     <term>...</term>
     ...
    </varlistentry>

needs to be changed to something like

    <varlistentry>
     <term>...<indexterm>...</indexterm></term>
     ...
    </varlistentry>

See also bb4eefe7bf.

There is currently nothing in the build system that enforces that things
stay valid, because that requires additional tools and will receive
separate consideration.
2014-05-06 21:28:58 -04:00
Bruce Momjian 0b5c0f3bc7 docs: Add short "cover density" description
Also, previous commit 1420f3a982 to fix
ts_rank_cd() for stripped lexemes was from a patch created by Alex Hill.
2014-03-24 15:46:59 -04:00
Bruce Momjian 1420f3a982 Fix ts_rank_cd() to ignore stripped lexemes
Previously, stripped lexemes got a default location and could be
considered if mixed with non-stripped lexemes.

BACKWARD INCOMPATIBILITY CHANGE
2014-03-24 14:37:16 -04:00
Peter Eisentraut c99d5d1bcc doc: Fix <synopsis> in <term> markup
Although the DTD technically allows this, the resulting HTML is invalid
because it puts block elements inside inline elements.  DocBook 5.0 also
doesn't allow it anymore, so it's fair to assume that this was never
really intended to work.  Replace <synopsis> with <literal>, which is
the markup used elsewhere in the documentation in similar cases.
2013-06-07 22:00:59 -04:00
Kevin Grittner 4bc0d2e2cf Fix typo: lexemes misspelled in full text search docs.
Dan Scott
2012-09-11 19:46:17 -05:00
Peter Eisentraut 5baf6da717 Documentation spell and markup checking 2012-06-08 00:06:20 +03:00
Bruce Momjian fb4340c5ea Update documentation about ts_rank(). 2011-10-13 14:17:20 -04:00
Peter Eisentraut aeabbccea0 Some markup cleanup to deconfuse the find_gt_lt tool
Josh Kupershmidt
2011-08-30 20:32:49 +03:00
Peter Eisentraut a3b681f0bc Link some tables into the surrounding text by their id 2011-05-04 20:24:07 +03:00
Bruce Momjian 159e3d8629 Update contrib documention mentions to point to actual documentation
sections, rather than just calling it "/contrib/module_name".

Also update pg_test_fsync build instructions now that it is in /contrib.
2011-01-26 09:22:21 -05:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Tom Lane 9389ac8928 Document filtering dictionaries in textsearch.sgml.
While at it, copy-edit the description of prefix-match marker support in
synonym dictionaries, and clarify the description of the default unaccent
dictionary a bit more.
2010-08-25 21:42:55 +00:00
Tom Lane 5344945810 Avoid saying "random" when randomness is not actually meant.
Per Thom Brown.
2010-08-20 13:59:45 +00:00
Peter Eisentraut 66424a2848 Fix indentation of verbatim block elements
Block elements with verbatim formatting (literallayout, programlisting,
screen, synopsis) should be aligned at column 0 independent of the surrounding
SGML, because whitespace is significant, and indenting them creates erratic
whitespace in the output.  The CSS stylesheets already take care of indenting
the output.

Assorted markup improvements to go along with it.
2010-07-29 19:34:41 +00:00
Peter Eisentraut 6dcce3985b Remove unnecessary xref endterm attributes and title ids
The endterm attribute is mainly useful when the toolchain does not support
automatic link target text generation for a particular situation.  In  the
past, this was required by the man page tools for all reference page links,
but that is no longer the case, and it now actually gets in the way of
proper automatic link text generation.  The only remaining use cases are
currently xrefs to refsects.
2010-04-03 07:23:02 +00:00
Peter Eisentraut a95e51962d Update broken and permanently moved links 2010-03-17 17:12:31 +00:00
Bruce Momjian 5473df9eb7 Document what user name email symbols are supported by tsearch. 2010-03-13 03:09:04 +00:00
Teodor Sigaev abd8c94ff9 Add prefix support for synonym dictionary 2009-08-14 14:53:20 +00:00
Tom Lane c30446b9c9 Proofreading for Bruce's recent round of documentation proofreading.
Most of those changes were good, but some not so good ...
2009-06-17 21:58:49 +00:00
Bruce Momjian ba36c48e39 Proofreading adjustments for first two parts of documentation (Tutorial
and SQL).
2009-04-27 16:27:36 +00:00
Tom Lane c1c40e580a Fix textsearch documentation examples to not recommend concatenating separate
fields without putting a space between.  Per gripe from Rick Schumeyer.
2009-04-19 20:36:06 +00:00
Tom Lane bda9dc7e19 Do some copy-editing on description of ts_headline(). 2009-04-14 00:49:56 +00:00
Tom Lane ff301d6e69 Implement "fastupdate" support for GIN indexes, in which we try to accumulate
multiple index entries in a holding area before adding them to the main index
structure.  This helps because bulk insert is (usually) significantly faster
than retail insert for GIN.

This patch also removes GIN support for amgettuple-style index scans.  The
API defined for amgettuple is difficult to support with fastupdate, and
the previously committed partial-match feature didn't really work with
it either.  We might eventually figure a way to put back amgettuple
support, but it won't happen for 8.4.

catversion bumped because of change in GIN's pg_am entry, and because
the format of GIN indexes changed on-disk (there's a metapage now,
and possibly a pending list).

Teodor Sigaev
2009-03-24 20:17:18 +00:00
Tom Lane 445ce15702 Create a third option named "partition" for constraint_exclusion, and make it
the default.  This setting enables constraint exclusion checks only for
appendrel members (ie, inheritance children and UNION ALL arms), which are
the cases in which constraint exclusion is most likely to be useful.  Avoiding
the overhead for simple queries that are unlikely to benefit should bring
the cost down to the point where this is a reasonable default setting.
Per today's discussion.
2009-01-07 22:40:49 +00:00
Teodor Sigaev 2a0083ede8 Improve headeline generation. Now headline can contain
several fragments a-la Google.

Sushant Sinha <sushant354@gmail.com>
2008-10-17 18:05:19 +00:00
Heikki Linnakangas 61d9674988 Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
2008-09-23 09:20:39 +00:00
Tom Lane e6dbcb72fa Extend GIN to support partial-match searches, and extend tsquery to support
prefix matching using this facility.

Teodor Sigaev and Oleg Bartunov
2008-05-16 16:31:02 +00:00
Tom Lane 9b5c8d45f6 Push index operator lossiness determination down to GIST/GIN opclass
"consistent" functions, and remove pg_amop.opreqcheck, as per recent
discussion.  The main immediate benefit of this is that we no longer need
8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery
searches on GIN indexes.  In future it should be possible to optimize some
other queries better than is done now, by detecting at runtime whether the
index match is exact or not.

Tom Lane, after an idea of Heikki's, and with some help from Teodor.
2008-04-14 17:05:34 +00:00
Tom Lane 7953fdcd9e Add a CaseSensitive parameter to synonym dictionaries.
Simon Riggs
2008-03-10 03:01:28 +00:00
Bruce Momjian 271205223a Show example of ts_headline() using a configuration name. 2008-03-04 03:17:18 +00:00
Tom Lane e211470b19 Change a couple of examples to say ALTER MAPPING instead of ADD MAPPING,
per Oleg.
2007-12-13 06:32:47 +00:00
Peter Eisentraut 9293425819 spell checker run 2007-11-28 15:42:31 +00:00
Tom Lane e66d0c6299 Fix some missed usages of 'HTML tag' and 'HTML entity'. 2007-11-20 15:58:52 +00:00
Andrew Dunstan 1157f3cc81 Change descriptions of entity and tag objects to "XML entity" and "XML tag".
Allow tag and entity names that follow XML rules. Provide for hexadecimal
as well as decimal numeric entities. Adjust code names to coincide with
new descriptions.
2007-11-20 02:25:22 +00:00
Tom Lane fb8b38e4bf Add a couple of notes pointing out that GIN index build time is very
sensitive to maintenance_work_mem (something I just learned the hard
way).
2007-11-16 03:23:07 +00:00