Commit Graph

11 Commits

Author SHA1 Message Date
Tom Lane 3d2b31e30e Fix brin_summarize_new_values() to check index type and ownership.
brin_summarize_new_values() did not check that the passed OID was for
an index at all, much less that it was a BRIN index, and would fail in
obscure ways if it wasn't (possibly damaging data first?).  It also
lacked any permissions test; by analogy to VACUUM, we should only allow
the table's owner to summarize.

Noted by Jeff Janes, fix by Michael Paquier and me
2015-12-26 12:56:09 -05:00
Tom Lane 1676e4381f Fix brin regression test so it actually tests cidr.
The problem noted in my previous commit was simpler than I thought:
we weren't getting an index plan because the column wasn't indexed.
2015-06-04 15:24:22 -04:00
Tom Lane 79454c696b Tighten the per-operator testing done in brin regression test.
Verify that the number of matches is exactly what it should be, not just
that it not be zero.  This should help us detect any environment-dependent
issues.

Also, verify that we're getting the expected type of scan plan (either
bitmap or seqscan as appropriate).  Right now, this is failing on the
cidrcol test cases, as shown in the output file.  I'll look into that
in a bit, but it seems good to commit this as-is temporarily to verify
that it behaves as expected on the buildfarm.
2015-06-04 14:39:52 -04:00
Tom Lane 78e72794a7 Fix brin "char" test to actually test what it meant to test.
Casting to char, without quotes, does not give the same results as casting
to "char".  That meant we were not testing the brin "char" paths at all,
since we ended up with a text operator not a "char" operator.
2015-06-04 13:50:32 -04:00
Tom Lane bac99475eb Stabilize results of brin regression test.
This test used seqscans on tenk1, with LIMIT, to build test data.
That works most of the time, but if the synchronized-seqscan logic
kicks in, we get varying test data.  This seems likely to explain
the erratic test failures on buildfarm member chipmunk, which uses
smaller-than-default shared_buffers.  To fix, add ORDER BY clauses to
force the ordering to be what it was implicitly being assumed to be.

Peter Geoghegan had noticed this with respect to one of the trouble
spots, though not the ones actually causing the chipmunk issue.
2015-06-04 13:46:34 -04:00
Tom Lane 1f303fd1be Suppress occasional failures in brin regression test.
brin.sql included a call of brin_summarize_new_values(), and expected
it to always report exactly 5 summarization events.  This failed sometimes
during parallel regression tests, as a consequence of the database-wide
VACUUM in gist.sql getting there first.  The most future-proof way
to avoid variation in the test results is to forget about using
brin_summarize_new_values() and just do a plain "VACUUM brintest",
which will exercise the same code anyway.

Having done that, there's no need for preventing autovacuum on brintest;
doing so just reduces the scope of test coverage, so let's not.
2015-05-26 14:11:12 -04:00
Alvaro Herrera b0b7be6133 Add BRIN infrastructure for "inclusion" opclasses
This lets BRIN be used with R-Tree-like indexing strategies.

Also provided are operator classes for range types, box and inet/cidr.
The infrastructure provided here should be sufficient to create operator
classes for similar datatypes; for instance, opclasses for PostGIS
geometries should be doable, though we didn't try to implement one.

(A box/point opclass was also submitted, but we ripped it out before
commit because the handling of floating point comparisons in existing
code is inconsistent and would generate corrupt indexes.)

Author: Emre Hasegeli.  Cosmetic changes by me
Review: Andreas Karlsson
2015-05-15 18:05:22 -03:00
Alvaro Herrera db5f98ab4f Improve BRIN infra, minmax opclass and regression test
The minmax opclass was using the wrong support functions when
cross-datatypes queries were run.  Instead of trying to fix the
pg_amproc definitions (which apparently is not possible), use the
already correct pg_amop entries instead.  This requires jumping through
more hoops (read: extra syscache lookups) to obtain the underlying
functions to execute, but it is necessary for correctness.

Author: Emre Hasegeli, tweaked by Álvaro
Review: Andreas Karlsson

Also change BrinOpcInfo to record each stored type's typecache entry
instead of just the OID.  Turns out that the full type cache is
necessary in brin_deform_tuple: the original code used the indexed
type's byval and typlen properties to extract the stored tuple, which is
correct in Minmax; but in other implementations that want to store
something different, that's wrong.  The realization that this is a bug
comes from Emre also, but I did not use his patch.

I also adopted Emre's regression test code (with smallish changes),
which is more complete.
2015-05-07 13:02:22 -03:00
Alvaro Herrera 972bf7d6f1 Tweak BRIN minmax operator class
In the union support proc, we were not checking the hasnulls flag of
value A early enough, so it could be skipped if the "allnulls" flag in
value B is set.  Also, a check on the allnulls flag of value "B" was
redundant, so remove it.

Also change inet_minmax_ops to not be the default opclass for type inet,
as a future inclusion operator class would be more useful and it's
pretty difficult to change default opclass for a datatype later on.
(There is no catversion bump for this catalog change; this shouldn't be
a problem.)

Extracted from a larger patch to add an "inclusion" operator class.

Author: Emre Hasegeli
2015-01-22 17:01:09 -03:00
Alvaro Herrera 86cf9a5650 Reduce disk footprint of brin regression test
Per complaint from Tom.

While at it, throw in some extra tests for nulls as well, and make sure
that the set of data we insert on the second round is not identical to
the first one.  Both measures are intended to improve coverage of the
test.

Also uncomment the ON COMMIT DROP clause on the CREATE TEMP TABLE
commands.  This doesn't have any effect for someone examining the
regression database after the tests are done, but it reduces clutter for
those that execute the script directly.
2014-11-14 16:31:48 -03:00
Alvaro Herrera 7516f52594 BRIN: Block Range Indexes
BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes.  They work by maintaining "summary" data about
block ranges.  Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not.  Normal index scans are not supported
because these indexes do not store TIDs.

As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.

For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range.  This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results.  In this commit I only include minmax.

Catalog version bumped due to new builtin catalog entries.

There's more that could be done here, but this is a good step forwards.

Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.

Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.

PS:
  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633.
2014-11-07 16:38:14 -03:00