postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-04 22:06:50 +02:00

Author	SHA1	Message	Date
Tom Lane	7d08ce286c	Distinguish selectivity of < from <= and > from >=. Historically, the selectivity functions have simply not distinguished < from <=, or > from >=, arguing that the fraction of the population that satisfies the "=" aspect can be considered to be vanishingly small, if the comparison value isn't any of the most-common-values for the variable. (If it is, the code path that executes the operator against each MCV will take care of things properly.) But that isn't really true unless we're dealing with a continuum of variable values, and in practice we seldom are. If "x = const" would estimate a nonzero number of rows for a given const value, then it follows that we ought to estimate different numbers of rows for "x < const" and "x <= const", even if the const is not one of the MCVs. Handling this more honestly makes a significant difference in edge cases, such as the estimate for a tight range (x BETWEEN y AND z where y and z are close together). Hence, split scalarltsel into scalarltsel/scalarlesel, and similarly split scalargtsel into scalargtsel/scalargesel. Adjust <= and >= operator definitions to reference the new selectivity functions. Improve the core ineq_histogram_selectivity() function to make a correction for equality. (Along the way, I learned quite a bit about exactly why that function gives good answers, which I tried to memorialize in improved comments.) The corresponding join selectivity functions were, and remain, just stubs. But I chose to split them similarly, to avoid confusion and to prevent the need for doing this exercise again if someone ever makes them less stubby. In passing, change ineq_histogram_selectivity's clamp for extreme probability estimates so that it varies depending on the histogram size, instead of being hardwired at 0.0001. With the default histogram size of 100 entries, you still get the old clamp value, but bigger histograms should allow us to put more faith in edge values. 
Tom Lane, reviewed by Aleksander Alekseev and Kuntal Ghosh Discussion: https://postgr.es/m/12232.1499140410@sss.pgh.pa.us	2017-09-13 11:12:39 -04:00
Robert Haas	81c5e46c49	Introduce 64-bit hash functions with a 64-bit seed. This will be useful for hash partitioning, which needs a way to seed the hash functions to avoid problems such as a hash index on a hash partitioned table clumping all values into a small portion of the bucket space; it's also useful for anything that wants a 64-bit hash value rather than a 32-bit hash value. Just in case somebody wants a 64-bit hash value that is compatible with the existing 32-bit hash values, make the low 32-bits of the 64-bit hash value match the 32-bit hash value when the seed is 0. Robert Haas and Amul Sul Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com	2017-08-31 22:21:21 -04:00
Tom Lane	fdc9186f7e	Replace the built-in GIN array opclasses with a single polymorphic opclass. We had thirty different GIN array opclasses sharing the same operators and support functions. That still didn't cover all the built-in types, nor did it cover arrays of extension-added types. What we want is a single polymorphic opclass for "anyarray". There were two missing features needed to make this possible: 1. We have to be able to declare the index storage type as ANYELEMENT when the opclass is declared to index ANYARRAY. This just takes a few more lines in index_create(). Although this currently seems of use only for GIN, there's no reason to make index_create() restrict it to that. 2. We have to be able to identify the proper GIN compare function for the index storage type. This patch proceeds by making the compare function optional in GIN opclass definitions, and specifying that the default btree comparison function for the index storage type will be looked up when the opclass omits it. Again, that seems pretty generically useful. Since the comparison function lookup is done in initGinState(), making use of the second feature adds an additional cache lookup to GIN index access setup. It seems unlikely that that would be very noticeable given the other costs involved, but maybe at some point we should consider making GinState data persist longer than it now does --- we could keep it in the index relcache entry, perhaps. Rather fortuitously, we don't seem to need to do anything to get this change to play nice with dump/reload or pg_upgrade scenarios: the new opclass definition is automatically selected to replace existing index definitions, and the on-disk data remains compatible. Also, if a user has created a custom opclass definition for a non-builtin type, this doesn't break that, since CREATE INDEX will prefer an exact match to opcintype over a match to ANYARRAY. 
However, if there's anyone out there with handwritten DDL that explicitly specifies _bool_ops or one of the other replaced opclass names, they'll need to adjust that. Tom Lane, reviewed by Enrique Meneses Discussion: <14436.1470940379@sss.pgh.pa.us>	2016-09-26 14:52:44 -04:00
Peter Eisentraut	a956bf4395	doc: Fix typo From: Guillaume Lelarge <guillaume@lelarge.info>	2016-05-01 21:37:43 -04:00
Alvaro Herrera	a3a8309d45	Document BRIN a bit more thoroughly The chapter "Interfacing Extensions To Indexes" and CREATE OPERATOR CLASS reference page were missed when BRIN was added. We document all our other index access methods there, so make sure BRIN complies. Author: Álvaro Herrera Reported-By: Julien Rouhaud, Tom Lane Reviewed-By: Emre Hasegeli Discussion: https://www.postgresql.org/message-id/56CF604E.9000303%40dalibo.com Backpatch: 9.5, where BRIN was introduced	2016-03-10 13:15:08 -03:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Tom Lane	b0cfb02cec	Update xindex.sgml for recent additions to GIST opclass API. Commit `d04c8ed904` added another support function to the GIST API, but overlooked mentioning it in xindex.sgml's summary of index support functions. Anastasia Lubennikova	2015-12-06 12:42:32 -05:00
Peter Eisentraut	aa68872561	doc: Spell checking	2014-07-16 22:48:11 -04:00
Heikki Linnakangas	c5608ea26a	Allow opclasses to provide tri-valued GIN consistent functions. With the GIN "fast scan" feature, GIN can skip items without fetching all the keys for them, if it can prove that they don't match regardless of those keys. So far, it has done the proving by calling the boolean consistent function with all combinations of TRUE/FALSE for the unfetched keys, but since that's O(n^2), it becomes unfeasible with more than a few keys. We can avoid calling consistent with all the combinations, if we can tell the operator class implementation directly which keys are unknown. This commit includes a triConsistent function for the built-in array and tsvector opclasses. Alexander Korotkov, with some changes by me.	2014-03-12 17:51:30 +02:00
Tom Lane	8daeb5ddd6	Add SP-GiST (space-partitioned GiST) index access method. SP-GiST is comparable to GiST in flexibility, but supports non-balanced partitioned search structures rather than balanced trees. As described at PGCon 2011, this new indexing structure can beat GiST in both index build time and query speed for search problems that it is well matched to. There are a number of areas that could still use improvement, but at this point the code seems committable. Teodor Sigaev and Oleg Bartunov, with considerable revisions by Tom Lane	2011-12-17 16:42:30 -05:00
Tom Lane	c6e3ac11b6	Create a "sort support" interface API for faster sorting. This patch creates an API whereby a btree index opclass can optionally provide non-SQL-callable support functions for sorting. In the initial patch, we only use this to provide a directly-callable comparator function, which can be invoked with a bit less overhead than the traditional SQL-callable comparator. While that should be of value in itself, the real reason for doing this is to provide a datatype-extensible framework for more aggressive optimizations, as in Peter Geoghegan's recent work. Robert Haas and Tom Lane	2011-12-07 00:19:39 -05:00
Robert Haas	367bc426a1	Avoid index rebuild for no-rewrite ALTER TABLE .. ALTER TYPE. Noah Misch. Review and minor cosmetic changes by me.	2011-07-18 11:04:43 -04:00
Tom Lane	b576757d7e	Add external documentation for KNNGIST.	2010-12-03 23:49:06 -05:00
Tom Lane	725d52d0c2	Create the system catalog infrastructure needed for KNNGIST. This commit adds columns amoppurpose and amopsortfamily to pg_amop, and column amcanorderbyop to pg_am. For the moment all the entries in amcanorderbyop are "false", since the underlying support isn't there yet. Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with [ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new columns of pg_amop to be populated, and create pg_dump support for dumping that information. I also added some documentation, although it's perhaps a bit premature given that the feature doesn't do anything useful yet. Teodor Sigaev, Robert Haas, Tom Lane	2010-11-24 14:22:17 -05:00
Peter Eisentraut	fc946c39ae	Remove useless whitespace at end of lines	2010-11-23 22:34:55 +02:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Peter Eisentraut	5194b9d049	Spell and markup checking	2010-08-17 04:37:21 +00:00
Bruce Momjian	99ef515280	Revert removal of pre-7.4 documentation behavior mentions.	2010-02-24 15:54:31 +00:00
Bruce Momjian	7bfd95a4a2	Remove pre-7.4 documentation mentions, now that 8.0 is the oldest supported release.	2010-02-24 03:33:49 +00:00
Alvaro Herrera	aa7f00464d	Desultorily enclose programlisting tags in CDATA, to get rid of some obnoxious SGML-escaping.	2008-12-07 23:46:39 +00:00
Tom Lane	e6dbcb72fa	Extend GIN to support partial-match searches, and extend tsquery to support prefix matching using this facility. Teodor Sigaev and Oleg Bartunov	2008-05-16 16:31:02 +00:00
Tom Lane	9b5c8d45f6	Push index operator lossiness determination down to GIST/GIN opclass "consistent" functions, and remove pg_amop.opreqcheck, as per recent discussion. The main immediate benefit of this is that we no longer need 8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery searches on GIN indexes. In future it should be possible to optimize some other queries better than is done now, by detecting at runtime whether the index match is exact or not. Tom Lane, after an idea of Heikki's, and with some help from Teodor.	2008-04-14 17:05:34 +00:00
Tom Lane	8ee076325f	Mention hash opclasses in 'System Dependencies on Operator Classes', which previously only talked about btree opclasses.	2007-12-02 04:36:40 +00:00
Neil Conway	85904e0d36	Minor consistency tweak for SGML docs.	2007-04-25 19:48:27 +00:00
Tom Lane	91e18dbbcc	Docs updates for cross-type hashing.	2007-02-06 04:38:31 +00:00
Bruce Momjian	09a9f10e7f	Consistently use colons before '<programlisting>' blocks, where appropriate.	2007-02-01 00:28:19 +00:00
Bruce Momjian	a134ee3379	Update documentation on may/can/might: Standard English uses "may", "can", and "might" in different ways: may - permission, "You may borrow my rake." can - ability, "I can lift that log." might - possibility, "It might rain today." Unfortunately, in conversational English, their use is often mixed, as in, "You may use this variable to do X", when in fact, "can" is a better choice. Similarly, "It may crash" is better stated, "It might crash". Also update two error messages mentioned in the documentation to match.	2007-01-31 20:56:20 +00:00
Tom Lane	a56c5fb0f5	Update xindex.sgml to discuss operator families.	2007-01-23 20:45:28 +00:00
Tom Lane	fcf4b146c6	Simplify pg_am representation of ordering-capable access methods: provide just a boolean 'amcanorder', instead of fields that specify the sort operator strategy numbers. We have decided to require ordering-capable AMs to use btree-compatible strategy numbers, so the old fields are overkill (and indeed misleading about what's allowed).	2007-01-20 23:13:01 +00:00
Tom Lane	4431758229	Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.	2007-01-09 02:14:16 +00:00
Tom Lane	08fa6a6851	Editorial improvements for GIN documentation.	2006-12-01 23:46:46 +00:00
Peter Eisentraut	0f763503ff	Spellchecking and such	2006-10-23 18:10:32 +00:00
Teodor Sigaev	b0d64a090b	Add comments about STORAGE option for GIN	2006-09-21 15:09:38 +00:00
Teodor Sigaev	823ffd88e3	Fix table's caption	2006-09-21 15:03:53 +00:00
Teodor Sigaev	bcbb402e31	Improve wordings by David Fuhry <dfuhry@cs.kent.edu>	2006-09-18 12:11:36 +00:00
Bruce Momjian	32cebaecff	Remove emacs info from footer of SGML files.	2006-09-16 00:30:20 +00:00
Teodor Sigaev	e25c3e84b6	Fix SGML markup	2006-09-14 13:40:28 +00:00
Teodor Sigaev	0ca9907ce4	GIN documentation and slightly improving GiST docs. Thanks to Christopher Kings-Lynne <chris.kingslynne@gmail.com> for initial version and Jeff Davis <pgsql@j-davis.com> for inspection	2006-09-14 11:16:27 +00:00
Bruce Momjian	10964008c9	Remove GIN documentation Christopher Kings-Lynne	2006-09-05 03:09:56 +00:00
Bruce Momjian	19dd2fbf7e	Add GIN documentation. Christopher Kings-Lynne	2006-09-04 20:10:53 +00:00
Bruce Momjian	497b5ad928	Make $PostgreSQL CVS tags consistent for SGML files.	2006-03-10 19:10:50 +00:00
Tom Lane	2a8d3d83ef	R-tree is dead ... long live GiST.	2005-11-07 17:36:47 +00:00
Neil Conway	ca76df425b	Documentation tweak: make <command>CREATE OPERATOR CLASS</command> into an <xref/>.	2005-07-19 01:27:59 +00:00
Tom Lane	b90f8f20f0	Extend r-tree operator classes to handle Y-direction tests equivalent to the existing X-direction tests. An rtree class now includes 4 actual 2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests. This involved adding four new Y-direction test operators for each of box and polygon; I followed the PostGIS project's lead as to the names of these operators. NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright (&>) operators now have semantics comparable to box_overleft and box_overright. This is necessary to make r-tree indexes work correctly on polygons. Also, I changed circle_left and circle_right to agree with box_left and box_right --- formerly they allowed the boundaries to touch. This isn't actually essential given the lack of any r-tree opclass for circles, but it seems best to sync all the definitions while we are at it.	2005-06-24 20:53:34 +00:00
Tom Lane	c6521b1b93	Write some real documentation about the index access method API.	2005-02-13 03:04:15 +00:00
Bruce Momjian	d08889aa8b	Add tools/find_gt_lt to find < and > in SGML source. Lowercase some uppercase tags so the tool is more reliable at finding problems.	2005-01-23 00:30:59 +00:00
Neil Conway	ec7a6bd9a2	Replace "--" and "---" with "—" as appropriate, for better-looking output.	2004-11-15 06:32:15 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Tom Lane	fa5c8a055a	Cross-data-type comparisons are now indexable by btrees, pursuant to my pghackers proposal of 8-Nov. All the existing cross-type comparison operators (int2/int4/int8 and float4/float8) have appropriate support. The original proposal of storing the right-hand-side datatype as part of the primary key for pg_amop and pg_amproc got modified a bit in the event; it is easier to store zero as the 'default' case and only store a nonzero when the operator is actually cross-type. Along the way, remove the long-since-defunct bigbox_ops operator class.	2003-11-12 21:15:59 +00:00
Peter Eisentraut	8442a92e5a	Spell checking, consistent terminology.	2003-11-01 01:56:29 +00:00

83 Commits