Commit Graph

9 Commits

Author SHA1 Message Date
Michael Paquier 31acee4b66 Reduce dependency to money data type in main regression test suite
Most of these tests have been introduced in 6dd8b00807, to check for
behaviors related to hashing and hash plans, and money is a data type
with btree support but no hash functions.  These tests are switched to
use varbit instead, to provide the same coverage.

Some other tests historically used money but don't really need it for
what they wanted to test (see rules.sql).  Plans and coverage are
unchanged after the modifications done here.

Support for money may be removed a a later point, but this needs more
discussion.

Discussion: https://postgr.es/m/18240-c5da758d7dc1ecf0@postgresql.org
2024-01-15 09:30:16 +09:00
Tom Lane fd549145d5 Fix portability issue in tests from commit ce773f230.
Modern POSIX seems to require strtod() to accept "-NaN", but there's
nothing about NaN in SUSv2, and some of our oldest buildfarm members
don't like it.  Let's try writing it as -'NaN' instead; that seems
to produce the same result, at least on Intel hardware.

Per buildfarm.
2021-09-03 10:01:02 -04:00
Tom Lane ce773f230d Fix float4/float8 hash functions to produce uniform results for NaNs.
The IEEE 754 standard allows a wide variety of bit patterns for NaNs,
of which at least two ("NaN" and "-NaN") are pretty easy to produce
from SQL on most machines.  This is problematic because our btree
comparison functions deem all NaNs to be equal, but our float hash
functions know nothing about NaNs and will happily produce varying
hash codes for them.  That causes unexpected results from queries
that hash a column containing different NaN values.  It could also
produce unexpected lookup failures when using a hash index on a
float column, i.e. "WHERE x = 'NaN'" will not find all the rows
it should.

To fix, special-case NaN in the float hash functions, not too much
unlike the existing special case that forces zero and minus zero
to hash the same.  I arranged for the most vanilla sort of NaN
(that coming from the C99 NAN constant) to still have the same
hash code as before, to reduce the risk to existing hash indexes.

I dithered about whether to back-patch this into stable branches,
but ultimately decided to do so.  It's a clear improvement for
queries that hash internally.  If there is anybody who has -NaN
in a hash index, they'd be well advised to re-index after applying
this patch ... but the misbehavior if they don't will not be much
worse than the misbehavior they had before.

Per bug #17172 from Ma Liangzhu.

Discussion: https://postgr.es/m/17172-7505bea9e04e230f@postgresql.org
2021-09-02 17:24:41 -04:00
Alexander Korotkov 6df7a9698b Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.

Since v14, each range type automatically gets a corresponding multirange
datatype.  There are both manual and automatic mechanisms for naming multirange
types.  Once can specify multirange type name using multirange_type_name
attribute in CREATE TYPE.  Otherwise, a multirange type name is generated
automatically.  If the range type name contains "range" then we change that to
"multirange".  Otherwise, we add "_multirange" to the end.

Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids.  Altogether this format allows fetching a particular range by its index
in O(n).

Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.

Catversion is bumped.

Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 07:20:33 +03:00
Peter Eisentraut afaccbba78 Rename object in test to avoid conflict
In 01e658fa74, the hash_func test
creates a type t1, but apparently a test running in parallel might
also use that name, depending on timing.  Rename the type to avoid the
issue.
2020-11-19 11:34:54 +01:00
Peter Eisentraut 01e658fa74 Hash support for row types
Add hash functions for the record type as well as a hash operator
family and operator class for the record type.  This enables all the
hash functionality for the record type such as hash-based plans for
UNION/INTERSECT/EXCEPT DISTINCT, recursive queries using UNION
DISTINCT, hash joins, and hash partitioning.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/38eccd35-4e2d-6767-1b3c-dada1eac3124%402ndquadrant.com
2020-11-19 09:32:47 +01:00
Peter Eisentraut 6dd8b00807 Add more tests for hashing and hash-based plans
- Test hashing of an array of a non-hashable element type.

- Test UNION [DISTINCT] with hash- and sort-based plans.  (Previously,
  only INTERSECT and EXCEPT where tested there.)

- Test UNION [DISTINCT] with a non-hashable column type.  This
  currently reverts to a sort-based plan even if enable_hashagg is on.

- Test UNION/INTERSECT/EXCEPT hash- and sort-based plans with arrays
  as column types.  Also test an array with a non-hashable element
  type.

- Test UNION/INTERSECT/EXCEPT similarly with row types as column
  types.  Currently, this uses only sort-based plans because there is
  no hashing support for row types.

- Add a test case that shows that recursive queries using UNION
  [DISTINCT] require hashable column types.

- Add a currently failing test that uses UNION DISTINCT in a
  cycle-detection use case using row types as column types.

Discussion: https://www.postgresql.org/message-id/flat/38eccd35-4e2d-6767-1b3c-dada1eac3124%402ndquadrant.com
2020-11-18 08:29:50 +01:00
Peter Eisentraut afa0d53d4d Improve whitespace
The SQL file was using a mix of tabs and spaces inconsistently.
Convert all to spaces and adjust indentation accordingly.
2020-10-19 08:32:53 +02:00
Robert Haas 81c5e46c49 Introduce 64-bit hash functions with a 64-bit seed.
This will be useful for hash partitioning, which needs a way to seed
the hash functions to avoid problems such as a hash index on a hash
partitioned table clumping all values into a small portion of the
bucket space; it's also useful for anything that wants a 64-bit hash
value rather than a 32-bit hash value.

Just in case somebody wants a 64-bit hash value that is compatible
with the existing 32-bit hash values, make the low 32-bits of the
64-bit hash value match the 32-bit hash value when the seed is 0.

Robert Haas and Amul Sul

Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
2017-08-31 22:21:21 -04:00