2008-05-17 03:28:26 +02:00
/*
2010-09-20 22:08:53 +02:00
 * contrib/pg_trgm/trgm_gist.c
2008-05-17 03:28:26 +02:00
 */
2010-12-04 06:16:21 +01:00
#include "postgres.h"

Implement operator class parameters
PostgreSQL provides a set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. There, opclasses define the representation of keys, the
operations on them and the supported search strategies. So, it's natural that
opclasses may face some tradeoffs, which require a user-side decision. This
commit implements opclass parameters allowing users to set some values, which
tell the opclass how to index the particular dataset.
This commit doesn't introduce new storage in the system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to avoid changing the signature of each opclass support function, we
implement a unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize the maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviewed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
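As a rough illustration of what a parametrized signature length means, here is a minimal standalone sketch in C. It does not use the PostgreSQL headers: DemoOptions, DEMO_SIGLEN_DEFAULT and the demo_* functions are hypothetical names made up for this example, and the plain modulo mapping is a simplification, not the hashing the opclasses actually perform. It only shows the general pattern of falling back to a default when no options are present and sizing a bit signature from a length given in bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SIGLEN_DEFAULT 12	/* illustrative default, in bytes */

/* Hypothetical stand-in for a parsed opclass options struct. */
typedef struct
{
	int			siglen;			/* signature length in bytes */
} DemoOptions;

/* Use the user-supplied length if options exist, else the default. */
static int
demo_get_siglen(const DemoOptions *opts)
{
	return opts ? opts->siglen : DEMO_SIGLEN_DEFAULT;
}

/* Clear all bits of a signature that is siglen bytes long. */
static void
demo_sign_init(unsigned char *sign, int siglen)
{
	assert(siglen > 0);
	memset(sign, 0, (size_t) siglen);
}

/* Map a value to one bit position and set that bit in the signature. */
static void
demo_sign_add(unsigned char *sign, int siglen, uint32_t value)
{
	uint32_t	bit;

	assert(siglen > 0);
	bit = value % (uint32_t) (siglen * 8);
	sign[bit / 8] |= (unsigned char) (1u << (bit % 8));
}

/* Return 1 if the bit for value is set; false positives are possible. */
static int
demo_sign_maybe_contains(const unsigned char *sign, int siglen, uint32_t value)
{
	uint32_t	bit;

	assert(siglen > 0);
	bit = value % (uint32_t) (siglen * 8);
	return (sign[bit / 8] >> (bit % 8)) & 1;
}
```

In this sketch a longer signature spreads values over more bits, so two different values are less likely to share a bit, at the cost of larger keys. That is the kind of tradeoff the commit message leaves to the user.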
2020-03-30 18:17:11 +02:00
#include "access/reloptions.h"
2015-05-15 22:03:16 +02:00
#include "access/stratnum.h"
#include "fmgr.h"
Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available. Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.
On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet. Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.
I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.
While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far. Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.
David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
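The builtin-or-fallback idea described above can be sketched in a few lines of standalone C. This is only an assumption-laden illustration: the demo_* names are hypothetical and are not the functions provided by src/port, the compiler check uses predefined macros instead of a configure test, and the run-time CPUID dispatch through a function pointer mentioned in the message is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static int
demo_popcount32_slow(uint32_t word)
{
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;
		count++;
	}
	return count;
}

/* Prefer the compiler builtin when the compiler is known to provide it. */
static int
demo_popcount32(uint32_t word)
{
#if defined(__GNUC__) || defined(__clang__)
	return __builtin_popcount(word);
#else
	return demo_popcount32_slow(word);
#endif
}

/* Count the set bits in a byte buffer, for example a signature bitmap. */
static int
demo_popcount_bytes(const unsigned char *buf, int nbytes)
{
	int			count = 0;
	int			i;

	assert(nbytes >= 0);
	for (i = 0; i < nbytes; i++)
		count += demo_popcount32(buf[i]);
	return count;
}
```

Keeping the choice inside one small function is what lets callers stay unaware of which implementation is in use.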
2019-02-16 05:22:27 +01:00
#include "port/pg_bitutils.h"
2019-10-23 05:56:22 +02:00
#include "trgm.h"
2010-12-04 06:16:21 +01:00

2020-03-30 18:17:11 +02:00
/* gist_trgm_ops opclass options */
typedef struct
{
	int32 vl_len_; /* varlena header (do not touch directly!) */
	int siglen; /* signature length in bytes */
} TrgmGistOptions;

2020-11-12 04:19:16 +01:00
#define GET_SIGLEN() (PG_HAS_OPCLASS_OPTIONS() ? \
2020-03-30 18:17:11 +02:00
	((TrgmGistOptions *) PG_GET_OPCLASS_OPTIONS())->siglen : \
	SIGLEN_DEFAULT)

2013-04-10 19:30:14 +02:00
typedef struct
{
	/* most recent inputs to gtrgm_consistent */
	StrategyNumber strategy;
	text *query;
	/* extracted trigrams for query */
	TRGM *trigrams;
	/* if a regex operator, the extracted graph */
	TrgmPackedGraph *graph;

	/*
	 * The "query" and "trigrams" are stored in the same palloc block as this
	 * cache struct, at MAXALIGN'ed offsets. The graph however isn't.
	 */
} gtrgm_consistent_cache;

#define GETENTRY(vec,pos) ((TRGM *) DatumGetPointer((vec)->vector[(pos)].key))


2004-05-31 19:18:12 +02:00
PG_FUNCTION_INFO_V1(gtrgm_in);
PG_FUNCTION_INFO_V1(gtrgm_out);
PG_FUNCTION_INFO_V1(gtrgm_compress);
PG_FUNCTION_INFO_V1(gtrgm_decompress);
PG_FUNCTION_INFO_V1(gtrgm_consistent);
2010-12-04 06:16:21 +01:00
PG_FUNCTION_INFO_V1(gtrgm_distance);
2004-05-31 19:18:12 +02:00
PG_FUNCTION_INFO_V1(gtrgm_union);
PG_FUNCTION_INFO_V1(gtrgm_same);
PG_FUNCTION_INFO_V1(gtrgm_penalty);
PG_FUNCTION_INFO_V1(gtrgm_picksplit);
2020-03-30 18:17:11 +02:00
PG_FUNCTION_INFO_V1(gtrgm_options);
2004-05-31 19:18:12 +02:00


Datum
gtrgm_in(PG_FUNCTION_ARGS)
{
2006-03-01 07:30:32 +01:00
	elog(ERROR, "not implemented");
2004-05-31 19:18:12 +02:00
	PG_RETURN_DATUM(0);
}

Datum
gtrgm_out(PG_FUNCTION_ARGS)
{
2006-03-01 07:30:32 +01:00
	elog(ERROR, "not implemented");
2004-05-31 19:18:12 +02:00
	PG_RETURN_DATUM(0);
}

2020-03-30 18:17:11 +02:00
static TRGM *
gtrgm_alloc(bool isalltrue, int siglen, BITVECP sign)
{
	int flag = SIGNKEY | (isalltrue ? ALLISTRUE : 0);
	int size = CALCGTSIZE(flag, siglen);
	TRGM *res = palloc(size);

	SET_VARSIZE(res, size);
	res->flag = flag;

	if (!isalltrue)
	{
		if (sign)
			memcpy(GETSIGN(res), sign, siglen);
		else
			memset(GETSIGN(res), 0, siglen);
	}

	return res;
}

2004-05-31 19:18:12 +02:00
static void
2020-03-30 18:17:11 +02:00
makesign(BITVECP sign, TRGM *a, int siglen)
2004-05-31 19:18:12 +02:00
{
2012-06-25 00:51:46 +02:00
	int32 k,
2004-05-31 19:18:12 +02:00
		len = ARRNELEM(a);
	trgm *ptr = GETARR(a);
2012-06-25 00:51:46 +02:00
	int32 tmp = 0;
2004-05-31 19:18:12 +02:00

2020-03-30 18:17:11 +02:00
	MemSet((void *) sign, 0, siglen);
	SETBIT(sign, SIGLENBIT(siglen)); /* set last unused bit */
2004-08-29 07:07:03 +02:00
	for (k = 0; k < len; k++)
	{
		CPTRGM(((char *) &tmp), ptr + k);
2020-03-30 18:17:11 +02:00
		HASH(sign, tmp, siglen);
2004-05-31 19:18:12 +02:00
	}
}

Datum
gtrgm_compress(PG_FUNCTION_ARGS)
{
	GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
2020-11-12 04:19:16 +01:00
	int siglen = GET_SIGLEN();
2004-05-31 19:18:12 +02:00
	GISTENTRY *retval = entry;

	if (entry->leafkey)
	{ /* trgm */
2004-08-29 07:07:03 +02:00
		TRGM *res;
2017-03-13 00:35:34 +01:00
		text *val = DatumGetTextPP(entry->key);
2004-05-31 19:18:12 +02:00

2017-03-13 00:35:34 +01:00
		res = generate_trgm(VARDATA_ANY(val), VARSIZE_ANY_EXHDR(val));
2004-05-31 19:18:12 +02:00
		retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
		gistentryinit(*retval, PointerGetDatum(res),
					  entry->rel, entry->page,
2017-08-16 06:22:32 +02:00
					  entry->offset, false);
2004-05-31 19:18:12 +02:00
	}
	else if (ISSIGNKEY(DatumGetPointer(entry->key)) &&
			 !ISALLTRUE(DatumGetPointer(entry->key)))
	{
2020-03-30 18:17:11 +02:00
		int32 i;
2004-08-29 07:07:03 +02:00
		TRGM *res;
2004-05-31 19:18:12 +02:00
		BITVECP sign = GETSIGN(DatumGetPointer(entry->key));

2020-03-30 18:17:11 +02:00
		LOOPBYTE(siglen)
2007-11-16 01:13:02 +01:00
		{
2007-11-16 02:12:24 +01:00
			if ((sign[i] & 0xff) != 0xff)
				PG_RETURN_POINTER(retval);
2007-11-16 01:13:02 +01:00
		}
2004-05-31 19:18:12 +02:00

2020-03-30 18:17:11 +02:00
		res = gtrgm_alloc(true, siglen, sign);
2004-05-31 19:18:12 +02:00
		retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
		gistentryinit(*retval, PointerGetDatum(res),
					  entry->rel, entry->page,
2017-08-16 06:22:32 +02:00
					  entry->offset, false);
2004-05-31 19:18:12 +02:00
	}
	PG_RETURN_POINTER(retval);
}

Datum
gtrgm_decompress(PG_FUNCTION_ARGS)
{
2007-04-06 06:21:44 +02:00
	GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
	GISTENTRY *retval;
2007-11-16 02:12:24 +01:00
	text *key;
2007-04-06 06:21:44 +02:00

2017-03-13 00:35:34 +01:00
	key = DatumGetTextPP(entry->key);
2007-04-06 06:21:44 +02:00

	if (key != (text *) DatumGetPointer(entry->key))
	{
		/* need to pass back the decompressed item */
		retval = palloc(sizeof(GISTENTRY));
		gistentryinit(*retval, PointerGetDatum(key),
					  entry->rel, entry->page, entry->offset, entry->leafkey);
		PG_RETURN_POINTER(retval);
	}
	else
	{
		/* we can return the entry as-is */
		PG_RETURN_POINTER(entry);
	}
2004-05-31 19:18:12 +02:00
}

2012-06-25 00:51:46 +02:00
static int32
2020-03-30 18:17:11 +02:00
cnt_sml_sign_common(TRGM *qtrg, BITVECP sign, int siglen)
2010-12-04 06:16:21 +01:00
{
2012-06-25 00:51:46 +02:00
	int32 count = 0;
	int32 k,
2010-12-04 06:16:21 +01:00
		len = ARRNELEM(qtrg);
	trgm *ptr = GETARR(qtrg);
2012-06-25 00:51:46 +02:00
	int32 tmp = 0;
2010-12-04 06:16:21 +01:00

	for (k = 0; k < len; k++)
	{
		CPTRGM(((char *) &tmp), ptr + k);
2020-03-30 18:17:11 +02:00
		count += GETBIT(sign, HASHVAL(tmp, siglen));
2010-12-04 06:16:21 +01:00
	}

	return count;
}

2004-05-31 19:18:12 +02:00
Datum
gtrgm_consistent(PG_FUNCTION_ARGS)
{
2008-04-14 19:05:34 +02:00
|
|
|
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
|
|
|
|
text *query = PG_GETARG_TEXT_P(1);
|
2010-12-04 06:16:21 +01:00
|
|
|
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
|
2011-04-10 17:42:00 +02:00
|
|
|
|
2008-04-14 19:05:34 +02:00
|
|
|
/* Oid subtype = PG_GETARG_OID(3); */
|
|
|
|
bool *recheck = (bool *) PG_GETARG_POINTER(4);
|
2020-11-12 04:19:16 +01:00
|
|
|
int siglen = GET_SIGLEN();
|
2008-04-14 19:05:34 +02:00
|
|
|
TRGM *key = (TRGM *) DatumGetPointer(entry->key);
|
2008-07-11 13:56:48 +02:00
|
|
|
TRGM *qtrg;
|
2010-12-04 06:16:21 +01:00
|
|
|
bool res;
|
2011-10-01 05:54:27 +02:00
|
|
|
Size querysize = VARSIZE(query);
|
2013-04-10 19:30:14 +02:00
|
|
|
gtrgm_consistent_cache *cache;
|
2016-03-16 16:59:21 +01:00
|
|
|
double nlimit;
|
2011-02-01 03:33:55 +01:00
|
|
|
|
|
|
|
	/*
	 * We keep the extracted trigrams in cache, because trigram extraction is
	 * relatively CPU-expensive. When trying to reuse a cached value, check
	 * strategy number not just query itself, because trigram extraction
	 * depends on strategy.
	 *
	 * The cached structure is a single palloc chunk containing the
	 * gtrgm_consistent_cache header, then the input query (4-byte length
	 * word, uncompressed, starting at a MAXALIGN boundary), then the TRGM
	 * value (also starting at a MAXALIGN boundary). However we don't try to
	 * include the regex graph (if any) in that struct. (XXX currently, this
	 * approach can leak regex graphs across index rescans. Not clear if
	 * that's worth fixing.)
	 */
	cache = (gtrgm_consistent_cache *) fcinfo->flinfo->fn_extra;
	if (cache == NULL ||
		cache->strategy != strategy ||
		VARSIZE(cache->query) != querysize ||
		memcmp((char *) cache->query, (char *) query, querysize) != 0)
	{
		gtrgm_consistent_cache *newcache;
		TrgmPackedGraph *graph = NULL;
		Size qtrgsize;

		switch (strategy)
		{
			case SimilarityStrategyNumber:
			case WordSimilarityStrategyNumber:
			case StrictWordSimilarityStrategyNumber:
			case EqualStrategyNumber:
				qtrg = generate_trgm(VARDATA(query),
									 querysize - VARHDRSZ);
				break;
			case ILikeStrategyNumber:
#ifndef IGNORECASE
				elog(ERROR, "cannot handle ~~* with case-sensitive trigrams");
#endif
				/* FALL THRU */
			case LikeStrategyNumber:
				qtrg = generate_wildcard_trgm(VARDATA(query),
											  querysize - VARHDRSZ);
				break;
			case RegExpICaseStrategyNumber:
#ifndef IGNORECASE
				elog(ERROR, "cannot handle ~* with case-sensitive trigrams");
#endif
				/* FALL THRU */
			case RegExpStrategyNumber:
				qtrg = createTrgmNFA(query, PG_GET_COLLATION(),
									 &graph, fcinfo->flinfo->fn_mcxt);
				/* just in case an empty array is returned ... */
				if (qtrg && ARRNELEM(qtrg) <= 0)
				{
					pfree(qtrg);
					qtrg = NULL;
				}
				break;
			default:
				elog(ERROR, "unrecognized strategy number: %d", strategy);
				qtrg = NULL; /* keep compiler quiet */
				break;
		}

		qtrgsize = qtrg ? VARSIZE(qtrg) : 0;

		newcache = (gtrgm_consistent_cache *)
			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
							   MAXALIGN(sizeof(gtrgm_consistent_cache)) +
							   MAXALIGN(querysize) +
							   qtrgsize);

		newcache->strategy = strategy;
		newcache->query = (text *)
			((char *) newcache + MAXALIGN(sizeof(gtrgm_consistent_cache)));
		memcpy((char *) newcache->query, (char *) query, querysize);
		if (qtrg)
		{
			newcache->trigrams = (TRGM *)
				((char *) newcache->query + MAXALIGN(querysize));
			memcpy((char *) newcache->trigrams, (char *) qtrg, qtrgsize);
			/* release qtrg in case it was made in fn_mcxt */
			pfree(qtrg);
		}
		else
			newcache->trigrams = NULL;
		newcache->graph = graph;

		if (cache)
			pfree(cache);
		fcinfo->flinfo->fn_extra = (void *) newcache;
		cache = newcache;
	}

	qtrg = cache->trigrams;

	switch (strategy)
	{
		case SimilarityStrategyNumber:
		case WordSimilarityStrategyNumber:
		case StrictWordSimilarityStrategyNumber:

			/*
			 * Similarity search is exact. (Strict) word similarity search is
			 * inexact.
			 */
			*recheck = (strategy != SimilarityStrategyNumber);

			nlimit = index_strategy_get_limit(strategy);

			if (GIST_LEAF(entry))
			{ /* all leaves contain the original trigrams */
				double tmpsml = cnt_sml(qtrg, key, *recheck);

				res = (tmpsml >= nlimit);
			}
			else if (ISALLTRUE(key))
			{ /* non-leaf contains signature */
				res = true;
			}
			else
			{ /* non-leaf contains signature */
				int32 count = cnt_sml_sign_common(qtrg, GETSIGN(key), siglen);
				int32 len = ARRNELEM(qtrg);

				if (len == 0)
					res = false;
				else
					res = (((((float8) count) / ((float8) len))) >= nlimit);
			}
			break;
		case ILikeStrategyNumber:
#ifndef IGNORECASE
			elog(ERROR, "cannot handle ~~* with case-sensitive trigrams");
#endif
			/* FALL THRU */
		case LikeStrategyNumber:
		case EqualStrategyNumber:
			/* Wildcard and equal search are inexact */
			*recheck = true;

			/*
			 * Check if all the extracted trigrams can be present in child
			 * nodes.
			 */
			if (GIST_LEAF(entry))
			{ /* all leaves contain the original trigrams */
				res = trgm_contained_by(qtrg, key);
			}
			else if (ISALLTRUE(key))
			{ /* non-leaf contains signature */
				res = true;
			}
			else
			{ /* non-leaf contains signature */
				int32 k,
					tmp = 0,
					len = ARRNELEM(qtrg);
				trgm *ptr = GETARR(qtrg);
				BITVECP sign = GETSIGN(key);

				res = true;
				for (k = 0; k < len; k++)
				{
					CPTRGM(((char *) &tmp), ptr + k);
					if (!GETBIT(sign, HASHVAL(tmp, siglen)))
					{
						res = false;
						break;
					}
				}
			}
			break;
		case RegExpICaseStrategyNumber:
#ifndef IGNORECASE
			elog(ERROR, "cannot handle ~* with case-sensitive trigrams");
#endif
			/* FALL THRU */
		case RegExpStrategyNumber:
			/* Regexp search is inexact */
			*recheck = true;

			/* Check regex match as much as we can with available info */
			if (qtrg)
			{
				if (GIST_LEAF(entry))
				{ /* all leaves contain the original trigrams */
					bool *check;

					check = trgm_presence_map(qtrg, key);
					res = trigramsMatchGraph(cache->graph, check);
					pfree(check);
				}
				else if (ISALLTRUE(key))
				{ /* non-leaf contains signature */
					res = true;
				}
				else
				{ /* non-leaf contains signature */
					int32 k,
						tmp = 0,
						len = ARRNELEM(qtrg);
					trgm *ptr = GETARR(qtrg);
					BITVECP sign = GETSIGN(key);
					bool *check;

					/*
					 * GETBIT() tests may give false positives, due to limited
					 * size of the sign array. But since trigramsMatchGraph()
					 * implements a monotone boolean function, false positives
					 * in the check array can't lead to a false negative answer.
					 * So we can apply trigramsMatchGraph despite uncertainty,
					 * and that usefully improves the quality of the search.
					 */
					check = (bool *) palloc(len * sizeof(bool));
					for (k = 0; k < len; k++)
					{
						CPTRGM(((char *) &tmp), ptr + k);
						check[k] = GETBIT(sign, HASHVAL(tmp, siglen));
					}
					res = trigramsMatchGraph(cache->graph, check);
					pfree(check);
				}
			}
			else
			{
				/* trigram-free query must be rechecked everywhere */
				res = true;
			}
			break;
		default:
			elog(ERROR, "unrecognized strategy number: %d", strategy);
			res = false; /* keep compiler quiet */
			break;
	}

	PG_RETURN_BOOL(res);
}
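
The inner-node branches above test hashed signature bits rather than real trigrams, so they can report a match that is not there but should never miss one. The standalone sketch below illustrates that property outside PostgreSQL. The names `Signature`, `sig_hash`, `sig_add`, `sig_test` and `sig_may_contain_all`, the 8-byte signature and the modulo hash are all assumptions made for illustration; they are not the pg_trgm macros (`GETBIT`, `HASHVAL`) or their definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SIGLEN  8               /* signature length in bytes (illustrative) */
#define SIGBITS (SIGLEN * 8)

typedef struct
{
    uint8_t bits[SIGLEN];
} Signature;

/* Map a packed trigram value to a bit position (hypothetical hash). */
static unsigned
sig_hash(uint32_t trg)
{
    return trg % SIGBITS;
}

/* Record a trigram in the signature by setting its hashed bit. */
static void
sig_add(Signature *s, uint32_t trg)
{
    unsigned    b = sig_hash(trg);

    s->bits[b / 8] |= (uint8_t) (1u << (b % 8));
}

/* Report whether the hashed bit for a trigram is set. */
static bool
sig_test(const Signature *s, uint32_t trg)
{
    unsigned    b = sig_hash(trg);

    return ((s->bits[b / 8] >> (b % 8)) & 1u) != 0;
}

/*
 * Return true if every query trigram may be present below this node.
 * A true result can be a false positive (hash collision); a false
 * result is definite, because a stored trigram always sets its bit.
 */
static bool
sig_may_contain_all(const Signature *s, const uint32_t *q, int n)
{
    for (int i = 0; i < n; i++)
    {
        if (!sig_test(s, q[i]))
            return false;
    }
    return true;
}
```

With trigram values 5 and 10 stored, a query for {5, 11} is rejected outright, while a query for the value 69 is accepted because it hashes to the same bit as 5. That collision is the kind of false positive the recheck flag exists to filter out.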

Datum
gtrgm_distance(PG_FUNCTION_ARGS)
{
	GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
	text *query = PG_GETARG_TEXT_P(1);
	StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

	/* Oid subtype = PG_GETARG_OID(3); */
	bool *recheck = (bool *) PG_GETARG_POINTER(4);
	int siglen = GET_SIGLEN();
	TRGM *key = (TRGM *) DatumGetPointer(entry->key);
	TRGM *qtrg;
	float8 res;
	Size querysize = VARSIZE(query);
	char *cache = (char *) fcinfo->flinfo->fn_extra;

	/*
	 * Cache the generated trigrams across multiple calls with the same query.
	 */
	if (cache == NULL ||
		VARSIZE(cache) != querysize ||
		memcmp(cache, query, querysize) != 0)
	{
		char *newcache;

		qtrg = generate_trgm(VARDATA(query), querysize - VARHDRSZ);

		newcache = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
									  MAXALIGN(querysize) +
									  VARSIZE(qtrg));

		memcpy(newcache, query, querysize);
		memcpy(newcache + MAXALIGN(querysize), qtrg, VARSIZE(qtrg));

		if (cache)
			pfree(cache);
		fcinfo->flinfo->fn_extra = newcache;
		cache = newcache;
	}

	qtrg = (TRGM *) (cache + MAXALIGN(querysize));

	switch (strategy)
	{
		case DistanceStrategyNumber:
		case WordDistanceStrategyNumber:
		case StrictWordDistanceStrategyNumber:
			/* Only plain trigram distance is exact */
			*recheck = (strategy != DistanceStrategyNumber);
			if (GIST_LEAF(entry))
			{ /* all leaves contain the original trigrams */

				/*
				 * Declare sml volatile to keep gcc from optimizing it.
				 * Otherwise res can differ from the result of the
				 * word_similarity_dist_op() function.
				 */
				float4 volatile sml = cnt_sml(qtrg, key, *recheck);

				res = 1.0 - sml;
			}
			else if (ISALLTRUE(key))
			{ /* non-leaf contains an all-true signature */
				res = 0.0;
			}
			else
			{ /* non-leaf contains signature */
				int32 count = cnt_sml_sign_common(qtrg, GETSIGN(key), siglen);
				int32 len = ARRNELEM(qtrg);

				res = (len == 0) ? -1.0 : 1.0 - ((float8) count) / ((float8) len);
			}
			break;
		default:
			elog(ERROR, "unrecognized strategy number: %d", strategy);
			res = 0; /* keep compiler quiet */
			break;
	}

	PG_RETURN_FLOAT8(res);
}
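
The cache used by gtrgm_distance above stores the query and the trigrams derived from it in one allocation, with the derived value starting at an aligned offset past the query. A minimal standalone sketch of that layout follows. It uses `malloc` instead of `MemoryContextAlloc`, and the `ALIGN8` macro and the helpers `cache_build` and `cache_value` are hypothetical stand-ins; `ALIGN8` assumes 8-byte maximum alignment, which the real `MAXALIGN` takes from the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round up to a multiple of 8; a stand-in for MAXALIGN (assumption). */
#define ALIGN8(len) (((size_t) (len) + 7) & ~(size_t) 7)

/*
 * Copy a key and the value derived from it into one chunk laid out as
 * [key][padding][value]. Returns NULL if the allocation fails.
 */
static char *
cache_build(const void *key, size_t keylen, const void *val, size_t vallen)
{
    char       *chunk = malloc(ALIGN8(keylen) + vallen);

    if (chunk == NULL)
        return NULL;
    memcpy(chunk, key, keylen);
    memcpy(chunk + ALIGN8(keylen), val, vallen);
    return chunk;
}

/* Locate the derived value: it starts at the aligned end of the key. */
static const char *
cache_value(const char *chunk, size_t keylen)
{
    return chunk + ALIGN8(keylen);
}
```

Keeping both parts in a single chunk means one `free` (or `pfree`) releases the whole entry, and a later call only needs the key length to find the value again.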

static int32
unionkey(BITVECP sbase, TRGM *add, int siglen)
{
	int32 i;

	if (ISSIGNKEY(add))
	{
		BITVECP sadd = GETSIGN(add);

		if (ISALLTRUE(add))
			return 1;

		LOOPBYTE(siglen)
			sbase[i] |= sadd[i];
	}
	else
	{
		trgm *ptr = GETARR(add);
		int32 tmp = 0;

		for (i = 0; i < ARRNELEM(add); i++)
		{
			CPTRGM(((char *) &tmp), ptr + i);
			HASH(sbase, tmp, siglen);
		}
	}
	return 0;
}
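
unionkey above merges a key into a base signature either by OR-ing signature bytes or by hashing each trigram into a bit. The standalone sketch below shows both operations on plain byte arrays. The helper names `sig_union` and `sig_set` and the modulo hash are assumptions for illustration only; they are not the pg_trgm `LOOPBYTE` or `HASH` macros.

```c
#include <assert.h>
#include <stdint.h>

/* OR the bytes of add into base; both are siglen bytes long. */
static void
sig_union(uint8_t *base, const uint8_t *add, int siglen)
{
    for (int i = 0; i < siglen; i++)
        base[i] |= add[i];
}

/* Set the bit for one value; the modulo hash is a hypothetical stand-in. */
static void
sig_set(uint8_t *sig, uint32_t val, int siglen)
{
    unsigned    bit = val % (unsigned) (siglen * 8);

    sig[bit / 8] |= (uint8_t) (1u << (bit % 8));
}
```

Because OR only ever sets bits, the merged signature covers everything either input covered, which is what lets a parent key stand in for all of its children during a search.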


Datum
gtrgm_union(PG_FUNCTION_ARGS)
{
	GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
	int32 len = entryvec->n;
	int *size = (int *) PG_GETARG_POINTER(1);
	int siglen = GET_SIGLEN();
	int32 i;
	TRGM *result = gtrgm_alloc(false, siglen, NULL);
	BITVECP base = GETSIGN(result);

	for (i = 0; i < len; i++)
	{
		if (unionkey(base, GETENTRY(entryvec, i), siglen))
		{
result->flag = ALLISTRUE;
|
|
|
|
SET_VARSIZE(result, CALCGTSIZE(ALLISTRUE, siglen));
|
2004-05-31 19:18:12 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-03-30 18:17:11 +02:00
|
|
|
*size = VARSIZE(result);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
|
|
|
PG_RETURN_POINTER(result);
|
|
|
|
}
|
|
|
|
|
|
|
|
Datum
|
|
|
|
gtrgm_same(PG_FUNCTION_ARGS)
|
|
|
|
{
|
2004-08-29 07:07:03 +02:00
|
|
|
TRGM *a = (TRGM *) PG_GETARG_POINTER(0);
|
|
|
|
TRGM *b = (TRGM *) PG_GETARG_POINTER(1);
|
2004-05-31 19:18:12 +02:00
|
|
|
bool *result = (bool *) PG_GETARG_POINTER(2);
|
2020-11-12 04:19:16 +01:00
|
|
|
int siglen = GET_SIGLEN();
|
2004-05-31 19:18:12 +02:00
|
|
|
|
|
|
|
if (ISSIGNKEY(a))
|
|
|
|
{ /* then b also ISSIGNKEY */
|
|
|
|
if (ISALLTRUE(a) && ISALLTRUE(b))
|
|
|
|
*result = true;
|
|
|
|
else if (ISALLTRUE(a))
|
|
|
|
*result = false;
|
|
|
|
else if (ISALLTRUE(b))
|
|
|
|
*result = false;
|
|
|
|
else
|
|
|
|
{
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 i;
|
2004-05-31 19:18:12 +02:00
|
|
|
BITVECP sa = GETSIGN(a),
|
|
|
|
sb = GETSIGN(b);
|
|
|
|
|
|
|
|
*result = true;
|
2020-03-30 18:17:11 +02:00
|
|
|
LOOPBYTE(siglen)
|
2007-11-16 01:13:02 +01:00
|
|
|
{
|
|
|
|
if (sa[i] != sb[i])
|
|
|
|
{
|
|
|
|
*result = false;
|
|
|
|
break;
|
|
|
|
}
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{ /* a and b ISARRKEY */
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 lena = ARRNELEM(a),
|
2004-05-31 19:18:12 +02:00
|
|
|
lenb = ARRNELEM(b);
|
|
|
|
|
|
|
|
if (lena != lenb)
|
|
|
|
*result = false;
|
|
|
|
else
|
|
|
|
{
|
|
|
|
trgm *ptra = GETARR(a),
|
|
|
|
*ptrb = GETARR(b);
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 i;
|
2004-05-31 19:18:12 +02:00
|
|
|
|
|
|
|
*result = true;
|
|
|
|
for (i = 0; i < lena; i++)
|
2004-08-29 07:07:03 +02:00
|
|
|
if (CMPTRGM(ptra + i, ptrb + i))
|
2004-05-31 19:18:12 +02:00
|
|
|
{
|
|
|
|
*result = false;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
PG_RETURN_POINTER(result);
|
|
|
|
}
|
|
|
|
|
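The equality test in gtrgm_same for signature keys can be sketched outside PostgreSQL as follows. This is a minimal illustration, not the extension's code: `SigKey` and `sigkey_same` are hypothetical names, and the `all_true` field is an assumed stand-in for the ALLISTRUE flag. Two all-true keys are equal, an all-true key never equals an ordinary one, and otherwise the two bitmaps are compared byte for byte over `siglen` bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a signature key; all_true mirrors ALLISTRUE. */
typedef struct
{
	bool		all_true;
	const unsigned char *bits;	/* siglen bytes; ignored when all_true */
} SigKey;

/* Returns true when both keys describe the same signature. */
static bool
sigkey_same(const SigKey *a, const SigKey *b, int siglen)
{
	assert(siglen > 0);

	/* If either key is all-true, they match only when both are. */
	if (a->all_true || b->all_true)
		return a->all_true && b->all_true;

	/* Same result as the byte-by-byte loop with an early exit. */
	return memcmp(a->bits, b->bits, (size_t) siglen) == 0;
}
```

The sketch uses memcmp in place of the explicit loop; both stop at the first differing byte and give the same answer.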
2012-06-25 00:51:46 +02:00
|
|
|
static int32
|
2020-03-30 18:17:11 +02:00
|
|
|
sizebitvec(BITVECP sign, int siglen)
|
2004-05-31 19:18:12 +02:00
|
|
|
{
|
2020-03-30 18:17:11 +02:00
|
|
|
return pg_popcount(sign, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
|
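sizebitvec delegates to pg_popcount, which counts the set bits in a byte buffer. A portable sketch of the same idea, assuming no compiler builtins, is shown below. `count_set_bits` is a hypothetical name used only for illustration; it is not how pg_popcount is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical portable stand-in for a popcount over a byte buffer. */
static int
count_set_bits(const unsigned char *buf, int nbytes)
{
	int			count = 0;

	assert(nbytes >= 0);

	for (int i = 0; i < nbytes; i++)
	{
		unsigned char b = buf[i];

		/* Each iteration clears the lowest set bit of b. */
		while (b)
		{
			b &= (unsigned char) (b - 1);
			count++;
		}
	}
	return count;
}
```

The inner loop runs once per set bit, so sparse signatures are cheap to count.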
|
|
|
static int
|
2020-03-30 18:17:11 +02:00
|
|
|
hemdistsign(BITVECP a, BITVECP b, int siglen)
|
2004-08-29 07:07:03 +02:00
|
|
|
{
|
|
|
|
int i,
|
2006-01-20 23:46:16 +01:00
|
|
|
diff,
|
2004-08-29 07:07:03 +02:00
|
|
|
dist = 0;
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2020-03-30 18:17:11 +02:00
|
|
|
LOOPBYTE(siglen)
|
2007-11-16 01:13:02 +01:00
|
|
|
{
|
|
|
|
diff = (unsigned char) (a[i] ^ b[i]);
|
Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available. Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.
On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet. Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.
I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.
While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far. Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.
David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-16 05:22:27 +01:00
|
|
|
/* Using the popcount functions here isn't likely to win */
|
|
|
|
dist += pg_number_of_ones[diff];
|
2007-11-16 01:13:02 +01:00
|
|
|
}
|
2004-05-31 19:18:12 +02:00
|
|
|
return dist;
|
|
|
|
}
|
|
|
|
|
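hemdistsign computes a Hamming distance: it XORs the two signatures byte by byte and sums the number of one bits in each result using a 256-entry lookup table. The sketch below reproduces that technique in standalone C. The names `ones_table`, `init_ones_table`, and `hamming_distance` are hypothetical, and the table is built at run time here only to keep the example short; it stands in for pg_number_of_ones.

```c
#include <assert.h>

/* Hypothetical per-byte bit-count table, filled lazily on first use. */
static unsigned char ones_table[256];
static int	ones_table_ready = 0;

static void
init_ones_table(void)
{
	/* bits(v) = lowest bit of v + bits(v >> 1); ones_table[0] starts at 0. */
	for (int v = 1; v < 256; v++)
		ones_table[v] = (unsigned char) ((v & 1) + ones_table[v >> 1]);
	ones_table_ready = 1;
}

/* Number of bit positions in which the two siglen-byte signatures differ. */
static int
hamming_distance(const unsigned char *a, const unsigned char *b, int siglen)
{
	int			dist = 0;

	assert(siglen >= 0);
	if (!ones_table_ready)
		init_ones_table();

	for (int i = 0; i < siglen; i++)
		dist += ones_table[a[i] ^ b[i]];	/* XOR marks differing bits */

	return dist;
}
```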
|
|
|
static int
|
2020-03-30 18:17:11 +02:00
|
|
|
hemdist(TRGM *a, TRGM *b, int siglen)
|
2004-08-29 07:07:03 +02:00
|
|
|
{
|
|
|
|
if (ISALLTRUE(a))
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
if (ISALLTRUE(b))
|
|
|
|
return 0;
|
|
|
|
else
|
2020-03-30 18:17:11 +02:00
|
|
|
return SIGLENBIT(siglen) - sizebitvec(GETSIGN(b), siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else if (ISALLTRUE(b))
|
2020-03-30 18:17:11 +02:00
|
|
|
return SIGLENBIT(siglen) - sizebitvec(GETSIGN(a), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2020-03-30 18:17:11 +02:00
|
|
|
return hemdistsign(GETSIGN(a), GETSIGN(b), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
|
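hemdist extends the Hamming distance to all-true keys, which store no bitmap: the distance from an all-true key to an ordinary signature is the number of bits that are not set in that signature. A standalone sketch of that case analysis follows. It rests on several assumptions: a NULL pointer represents an all-true key, `nbits` is a caller-supplied stand-in for SIGLENBIT(siglen) (whose definition is not shown in this file), and `popcount_bytes` and `sig_distance` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of set bits in n bytes. */
static int
popcount_bytes(const unsigned char *buf, int n)
{
	int			count = 0;

	for (int i = 0; i < n; i++)
		for (unsigned char b = buf[i]; b; b &= (unsigned char) (b - 1))
			count++;
	return count;
}

/*
 * Distance between two signatures. A NULL pointer means "all bits true"
 * (an assumption of this sketch, standing in for ISALLTRUE).
 */
static int
sig_distance(const unsigned char *a, const unsigned char *b,
			 int siglen, int nbits)
{
	int			dist = 0;

	assert(siglen > 0 && nbits > 0);

	if (a == NULL)
		return (b == NULL) ? 0 : nbits - popcount_bytes(b, siglen);
	if (b == NULL)
		return nbits - popcount_bytes(a, siglen);

	/* Both ordinary: count the bits that differ. */
	for (int i = 0; i < siglen; i++)
	{
		unsigned char diff = (unsigned char) (a[i] ^ b[i]);

		dist += popcount_bytes(&diff, 1);
	}
	return dist;
}
```

Treating the all-true key as a bitmap of ones makes the three branches consistent: XOR against all ones flips every bit, so the count of differing bits equals the count of clear bits in the other signature.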
|
|
|
Datum
|
|
|
|
gtrgm_penalty(PG_FUNCTION_ARGS)
|
|
|
|
{
|
|
|
|
GISTENTRY *origentry = (GISTENTRY *) PG_GETARG_POINTER(0); /* always ISSIGNKEY */
|
|
|
|
GISTENTRY *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
|
|
|
|
float *penalty = (float *) PG_GETARG_POINTER(2);
|
2020-11-12 04:19:16 +01:00
|
|
|
int siglen = GET_SIGLEN();
|
2004-08-29 07:07:03 +02:00
|
|
|
TRGM *origval = (TRGM *) DatumGetPointer(origentry->key);
|
|
|
|
TRGM *newval = (TRGM *) DatumGetPointer(newentry->key);
|
2004-05-31 19:18:12 +02:00
|
|
|
BITVECP orig = GETSIGN(origval);
|
|
|
|
|
|
|
|
*penalty = 0.0;
|
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
if (ISARRKEY(newval))
|
|
|
|
{
|
2011-10-01 05:54:27 +02:00
|
|
|
char *cache = (char *) fcinfo->flinfo->fn_extra;
|
2020-03-30 18:17:11 +02:00
|
|
|
TRGM *cachedVal = (TRGM *) (cache + MAXALIGN(siglen));
|
2011-10-01 05:54:27 +02:00
|
|
|
Size newvalsize = VARSIZE(newval);
|
|
|
|
BITVECP sign;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Cache the sign data across multiple calls with the same newval.
|
|
|
|
*/
|
|
|
|
if (cache == NULL ||
|
|
|
|
VARSIZE(cachedVal) != newvalsize ||
|
|
|
|
memcmp(cachedVal, newval, newvalsize) != 0)
|
|
|
|
{
|
|
|
|
char *newcache;
|
|
|
|
|
|
|
|
newcache = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
|
2020-03-30 18:17:11 +02:00
|
|
|
MAXALIGN(siglen) +
|
2011-10-01 05:54:27 +02:00
|
|
|
newvalsize);
|
|
|
|
|
2020-03-30 18:17:11 +02:00
|
|
|
makesign((BITVECP) newcache, newval, siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
|
2020-03-30 18:17:11 +02:00
|
|
|
cachedVal = (TRGM *) (newcache + MAXALIGN(siglen));
|
2011-10-01 05:54:27 +02:00
|
|
|
memcpy(cachedVal, newval, newvalsize);
|
|
|
|
|
|
|
|
if (cache)
|
|
|
|
pfree(cache);
|
|
|
|
fcinfo->flinfo->fn_extra = newcache;
|
|
|
|
cache = newcache;
|
|
|
|
}
|
|
|
|
|
|
|
|
sign = (BITVECP) cache;
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
if (ISALLTRUE(origval))
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
*penalty = ((float) (SIGLENBIT(siglen) - sizebitvec(sign, siglen))) / (float) (SIGLENBIT(siglen) + 1);
|
2004-08-29 07:07:03 +02:00
|
|
|
else
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
*penalty = hemdistsign(sign, orig, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
2004-08-29 07:07:03 +02:00
|
|
|
else
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
*penalty = hemdist(origval, newval, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
PG_RETURN_POINTER(penalty);
|
|
|
|
}
|
|
|
|
|
|
|
|
typedef struct
|
|
|
|
{
|
|
|
|
bool allistrue;
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
BITVECP sign;
|
2007-11-16 02:12:24 +01:00
|
|
|
} CACHESIGN;
|
2004-05-31 19:18:12 +02:00
|
|
|
|
|
|
|
static void
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
fillcache(CACHESIGN *item, TRGM *key, BITVECP sign, int siglen)
|
2004-05-31 19:18:12 +02:00
|
|
|
{
|
|
|
|
item->allistrue = false;
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
item->sign = sign;
|
2004-05-31 19:18:12 +02:00
|
|
|
if (ISARRKEY(key))
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
makesign(item->sign, key, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
else if (ISALLTRUE(key))
|
|
|
|
item->allistrue = true;
|
|
|
|
else
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
memcpy((void *) item->sign, (void *) GETSIGN(key), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
#define WISH_F(a,b,c) (double)( -(double)(((a)-(b))*((a)-(b))*((a)-(b)))*(c) )
|
|
|
|
typedef struct
|
|
|
|
{
|
|
|
|
OffsetNumber pos;
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 cost;
|
2007-11-16 02:12:24 +01:00
|
|
|
} SPLITCOST;
|
2004-05-31 19:18:12 +02:00
|
|
|
|
|
|
|
static int
|
|
|
|
comparecost(const void *a, const void *b)
|
|
|
|
{
|
2011-09-11 20:54:32 +02:00
|
|
|
if (((const SPLITCOST *) a)->cost == ((const SPLITCOST *) b)->cost)
|
2004-05-31 19:18:12 +02:00
|
|
|
return 0;
|
|
|
|
else
|
2011-09-11 20:54:32 +02:00
|
|
|
return (((const SPLITCOST *) a)->cost > ((const SPLITCOST *) b)->cost) ? 1 : -1;
|
2004-05-31 19:18:12 +02:00
|
|
|
}
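A comparator of this shape is what the C library's qsort() expects: it returns a negative value, zero, or a positive value, and compares explicitly rather than subtracting the two costs, which avoids integer overflow. The following is a minimal standalone sketch, not code from this file. SplitCostSketch, compare_cost_sketch and sort_costs_sketch are hypothetical names, and uint16_t is only assumed here as a stand-in for OffsetNumber.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SPLITCOST; uint16_t approximates OffsetNumber. */
typedef struct
{
	uint16_t	pos;
	int32_t		cost;
} SplitCostSketch;

/* qsort() comparator: ascending by cost, no subtraction so no overflow. */
static int
compare_cost_sketch(const void *a, const void *b)
{
	int32_t		ca = ((const SplitCostSketch *) a)->cost;
	int32_t		cb = ((const SplitCostSketch *) b)->cost;

	if (ca == cb)
		return 0;
	return (ca > cb) ? 1 : -1;
}

/* Sort n items in place by ascending cost. */
static void
sort_costs_sketch(SplitCostSketch *items, size_t n)
{
	assert(items != NULL || n == 0);
	if (n > 1)
		qsort(items, n, sizeof(SplitCostSketch), compare_cost_sketch);
}
```

Note that qsort() is not a stable sort, so entries with equal cost may come out in any relative order.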
|
|
|
|
|
|
|
|
|
|
|
|
static int
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
hemdistcache(CACHESIGN *a, CACHESIGN *b, int siglen)
|
2004-08-29 07:07:03 +02:00
|
|
|
{
|
|
|
|
if (a->allistrue)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
if (b->allistrue)
|
|
|
|
return 0;
|
|
|
|
else
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
return SIGLENBIT(siglen) - sizebitvec(b->sign, siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else if (b->allistrue)
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
return SIGLENBIT(siglen) - sizebitvec(a->sign, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
return hemdistsign(a->sign, b->sign, siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
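The case analysis in hemdistcache() can be illustrated with a small standalone sketch of Hamming distance over bit signatures. This is not the extension's implementation: sizebitvec(), hemdistsign() and SIGLENBIT() are defined elsewhere and are not shown here, so every name below is hypothetical, and the total bit count is assumed to be siglen * CHAR_BIT, which may differ from what the real SIGLENBIT macro yields.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the set bits in one byte without compiler builtins. */
static int
popcount_byte_sketch(unsigned char b)
{
	int			n = 0;

	while (b)
	{
		b &= (unsigned char) (b - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Number of set bits in a signature of siglen bytes. */
static int
sig_size_sketch(const unsigned char *sign, int siglen)
{
	int			total = 0;

	assert(siglen > 0);
	for (int i = 0; i < siglen; i++)
		total += popcount_byte_sketch(sign[i]);
	return total;
}

/* Hamming distance: number of bit positions where a and b differ. */
static int
sig_hemdist_sketch(const unsigned char *a, const unsigned char *b, int siglen)
{
	int			dist = 0;

	assert(siglen > 0);
	for (int i = 0; i < siglen; i++)
		dist += popcount_byte_sketch((unsigned char) (a[i] ^ b[i]));
	return dist;
}

/*
 * Same three cases as hemdistcache(): an "all true" signature is treated
 * as having every bit set, so its distance to another signature is the
 * number of unset bits in that other signature.
 */
static int
sig_hemdist_cached_sketch(bool a_alltrue, const unsigned char *a,
						  bool b_alltrue, const unsigned char *b,
						  int siglen)
{
	int			nbits = siglen * CHAR_BIT;	/* assumption, see lead-in */

	if (a_alltrue)
		return b_alltrue ? 0 : nbits - sig_size_sketch(b, siglen);
	if (b_alltrue)
		return nbits - sig_size_sketch(a, siglen);
	return sig_hemdist_sketch(a, b, siglen);
}
```

Treating the all-true case separately means no signature has to be materialized for keys that have already saturated.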
|
|
|
|
|
|
|
|
Datum
|
|
|
|
gtrgm_picksplit(PG_FUNCTION_ARGS)
|
|
|
|
{
|
2004-08-29 07:07:03 +02:00
|
|
|
GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
|
2020-11-12 15:34:37 +01:00
|
|
|
OffsetNumber maxoff = entryvec->n - 1;
|
2004-05-31 19:18:12 +02:00
|
|
|
GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
|
2020-11-12 04:19:16 +01:00
|
|
|
int siglen = GET_SIGLEN();
|
2004-05-31 19:18:12 +02:00
|
|
|
OffsetNumber k,
|
|
|
|
j;
|
2004-08-29 07:07:03 +02:00
|
|
|
TRGM *datum_l,
|
2004-05-31 19:18:12 +02:00
|
|
|
*datum_r;
|
|
|
|
BITVECP union_l,
|
|
|
|
union_r;
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 size_alpha,
|
2004-05-31 19:18:12 +02:00
|
|
|
size_beta;
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 size_waste,
|
2004-05-31 19:18:12 +02:00
|
|
|
waste = -1;
|
2012-06-25 00:51:46 +02:00
|
|
|
int32 nbytes;
|
2004-05-31 19:18:12 +02:00
|
|
|
OffsetNumber seed_1 = 0,
|
|
|
|
seed_2 = 0;
|
|
|
|
OffsetNumber *left,
|
|
|
|
*right;
|
|
|
|
BITVECP ptr;
|
|
|
|
int i;
|
|
|
|
CACHESIGN *cache;
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
char *cache_sign;
|
2004-05-31 19:18:12 +02:00
|
|
|
SPLITCOST *costvector;
|
|
|
|
|
2011-10-01 05:54:27 +02:00
|
|
|
/* cache the sign data for each existing item */
|
2020-11-12 15:34:37 +01:00
|
|
|
cache = (CACHESIGN *) palloc(sizeof(CACHESIGN) * (maxoff + 1));
|
|
|
|
cache_sign = palloc(siglen * (maxoff + 1));
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
|
2011-10-01 05:54:27 +02:00
|
|
|
for (k = FirstOffsetNumber; k <= maxoff; k = OffsetNumberNext(k))
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
fillcache(&cache[k], GETENTRY(entryvec, k), &cache_sign[siglen * k],
|
|
|
|
siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2011-10-01 05:54:27 +02:00
|
|
|
/* now find the two furthest-apart items */
|
2004-08-29 07:07:03 +02:00
|
|
|
for (k = FirstOffsetNumber; k < maxoff; k = OffsetNumberNext(k))
|
|
|
|
{
|
|
|
|
for (j = OffsetNumberNext(k); j <= maxoff; j = OffsetNumberNext(j))
|
|
|
|
{
|
Implement operator class parameters
2020-03-30 18:17:11 +02:00
|
|
|
size_waste = hemdistcache(&(cache[j]), &(cache[k]), siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
if (size_waste > waste)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
waste = size_waste;
|
|
|
|
seed_1 = k;
|
|
|
|
seed_2 = j;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2011-10-01 05:54:27 +02:00
|
|
|
/* just in case we didn't make a selection ... */
|
2004-08-29 07:07:03 +02:00
|
|
|
if (seed_1 == 0 || seed_2 == 0)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
seed_1 = 1;
|
|
|
|
seed_2 = 2;
|
|
|
|
}
|
|
|
|
|
2011-10-01 05:54:27 +02:00
|
|
|
/* initialize the result vectors */
|
2020-11-12 15:34:37 +01:00
|
|
|
nbytes = maxoff * sizeof(OffsetNumber);
|
2011-10-01 05:54:27 +02:00
|
|
|
v->spl_left = left = (OffsetNumber *) palloc(nbytes);
|
|
|
|
v->spl_right = right = (OffsetNumber *) palloc(nbytes);
|
|
|
|
v->spl_nleft = 0;
|
|
|
|
v->spl_nright = 0;
|
|
|
|
|
2004-05-31 19:18:12 +02:00
|
|
|
/* form initial .. */
|
2020-03-30 18:17:11 +02:00
|
|
|
datum_l = gtrgm_alloc(cache[seed_1].allistrue, siglen, cache[seed_1].sign);
|
|
|
|
datum_r = gtrgm_alloc(cache[seed_2].allistrue, siglen, cache[seed_2].sign);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
union_l = GETSIGN(datum_l);
|
|
|
|
union_r = GETSIGN(datum_r);
|
2020-03-30 18:17:11 +02:00
|
|
|
|
2004-05-31 19:18:12 +02:00
|
|
|
/* sort before ... */
|
|
|
|
costvector = (SPLITCOST *) palloc(sizeof(SPLITCOST) * maxoff);
|
2004-08-29 07:07:03 +02:00
|
|
|
for (j = FirstOffsetNumber; j <= maxoff; j = OffsetNumberNext(j))
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
costvector[j - 1].pos = j;
|
2020-03-30 18:17:11 +02:00
|
|
|
size_alpha = hemdistcache(&(cache[seed_1]), &(cache[j]), siglen);
|
|
|
|
size_beta = hemdistcache(&(cache[seed_2]), &(cache[j]), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
costvector[j - 1].cost = abs(size_alpha - size_beta);
|
|
|
|
}
|
|
|
|
qsort((void *) costvector, maxoff, sizeof(SPLITCOST), comparecost);
|
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
for (k = 0; k < maxoff; k++)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
j = costvector[k].pos;
|
2004-08-29 07:07:03 +02:00
|
|
|
if (j == seed_1)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
*left++ = j;
|
|
|
|
v->spl_nleft++;
|
|
|
|
continue;
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else if (j == seed_2)
|
|
|
|
{
|
2004-05-31 19:18:12 +02:00
|
|
|
*right++ = j;
|
|
|
|
v->spl_nright++;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
if (ISALLTRUE(datum_l) || cache[j].allistrue)
|
|
|
|
{
|
|
|
|
if (ISALLTRUE(datum_l) && cache[j].allistrue)
|
|
|
|
size_alpha = 0;
|
2004-05-31 19:18:12 +02:00
|
|
|
else
|
2020-03-30 18:17:11 +02:00
|
|
|
size_alpha = SIGLENBIT(siglen) -
|
2020-01-30 17:42:14 +01:00
|
|
|
sizebitvec((cache[j].allistrue) ? GETSIGN(datum_l) :
|
2020-03-30 18:17:11 +02:00
|
|
|
GETSIGN(cache[j].sign),
|
|
|
|
siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
2004-08-29 07:07:03 +02:00
|
|
|
else
|
2020-03-30 18:17:11 +02:00
|
|
|
size_alpha = hemdistsign(cache[j].sign, GETSIGN(datum_l), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
if (ISALLTRUE(datum_r) || cache[j].allistrue)
|
|
|
|
{
|
|
|
|
if (ISALLTRUE(datum_r) && cache[j].allistrue)
|
|
|
|
size_beta = 0;
|
2004-05-31 19:18:12 +02:00
|
|
|
else
|
2020-03-30 18:17:11 +02:00
|
|
|
size_beta = SIGLENBIT(siglen) -
|
2020-01-30 17:42:14 +01:00
|
|
|
sizebitvec((cache[j].allistrue) ? GETSIGN(datum_r) :
|
2020-03-30 18:17:11 +02:00
|
|
|
GETSIGN(cache[j].sign),
|
|
|
|
siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
2004-08-29 07:07:03 +02:00
|
|
|
else
|
2020-03-30 18:17:11 +02:00
|
|
|
size_beta = hemdistsign(cache[j].sign, GETSIGN(datum_r), siglen);
|
2004-05-31 19:18:12 +02:00
|
|
|
|
2004-08-29 07:07:03 +02:00
|
|
|
if (size_alpha < size_beta + WISH_F(v->spl_nleft, v->spl_nright, 0.1))
|
|
|
|
{
|
|
|
|
if (ISALLTRUE(datum_l) || cache[j].allistrue)
|
|
|
|
{
|
|
|
|
if (!ISALLTRUE(datum_l))
|
2020-03-30 18:17:11 +02:00
|
|
|
MemSet((void *) GETSIGN(datum_l), 0xff, siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
ptr = cache[j].sign;
|
2020-03-30 18:17:11 +02:00
|
|
|
LOOPBYTE(siglen)
|
2007-11-16 01:13:02 +01:00
|
|
|
union_l[i] |= ptr[i];
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
*left++ = j;
|
|
|
|
v->spl_nleft++;
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
if (ISALLTRUE(datum_r) || cache[j].allistrue)
|
|
|
|
{
|
|
|
|
if (!ISALLTRUE(datum_r))
|
2020-03-30 18:17:11 +02:00
|
|
|
MemSet((void *) GETSIGN(datum_r), 0xff, siglen);
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
ptr = cache[j].sign;
|
2020-03-30 18:17:11 +02:00
|
|
|
LOOPBYTE(siglen)
|
2007-11-16 01:13:02 +01:00
|
|
|
union_r[i] |= ptr[i];
|
2004-05-31 19:18:12 +02:00
|
|
|
}
|
|
|
|
*right++ = j;
|
|
|
|
v->spl_nright++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
v->spl_ldatum = PointerGetDatum(datum_l);
|
|
|
|
v->spl_rdatum = PointerGetDatum(datum_r);
|
|
|
|
|
|
|
|
PG_RETURN_POINTER(v);
|
|
|
|
}
|
2020-03-30 18:17:11 +02:00
|
|
|
|
|
|
|
Datum
|
|
|
|
gtrgm_options(PG_FUNCTION_ARGS)
|
|
|
|
{
|
|
|
|
local_relopts *relopts = (local_relopts *) PG_GETARG_POINTER(0);
|
|
|
|
|
|
|
|
init_local_reloptions(relopts, sizeof(TrgmGistOptions));
|
|
|
|
add_local_int_reloption(relopts, "siglen",
|
|
|
|
"signature length in bytes",
|
|
|
|
SIGLEN_DEFAULT, 1, SIGLEN_MAX,
|
|
|
|
offsetof(TrgmGistOptions, siglen));
|
|
|
|
|
|
|
|
PG_RETURN_VOID();
|
|
|
|
}
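/*
 * Usage sketch (not part of the original file).  gtrgm_options() above
 * registers one integer opclass parameter, "siglen", with default
 * SIGLEN_DEFAULT and allowed range 1 .. SIGLEN_MAX.  Assuming a
 * hypothetical table test_trgm with a text column t, the parameter would
 * be supplied at index creation roughly like this:
 *
 *     CREATE INDEX trgm_idx ON test_trgm
 *         USING gist (t gist_trgm_ops (siglen = 32));
 *
 * The table, column, and index names and the value 32 are illustrative
 * assumptions.  A longer signature generally makes the index larger but
 * its searches more precise.
 */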
|