Commit Graph

65 Commits

Author SHA1 Message Date
Robert Haas 87d90ac61f Improve pg_amcheck's TAP test 003_check.pl.
Disable autovacuum, because we don't want it to run against
intentionally corrupted tables. Also, before corrupting the tables,
run pg_amcheck and ensure that it passes. Otherwise, if something
unexpected happens when we check the corrupted tables, it's not so
clear whether it would have also happened before we corrupted
them.

Mark Dilger

Discussion: http://postgr.es/m/AA5506CE-7D2A-42E4-A51D-358635E3722D@enterprisedb.com
2021-03-23 14:57:45 -04:00
Robert Haas bbe0a81db6 Allow configurable LZ4 TOAST compression.
There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option in up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.

In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.

Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compessed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compresison types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.

It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT.  It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data.  These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project.  However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.

Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those were patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allow a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored.  More recently, Justin Przyby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.

Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
2021-03-19 15:10:38 -04:00
Robert Haas 4078ce65a0 Fix a confusing amcheck corruption message.
Don't complain about the last TOAST chunk number being different
from what we expected if there are no TOAST chunks at all.
In such a case, saying that the final chunk number is 0 is not
really accurate, and the fact the value is missing from the
TOAST table is reported separately anyway.

Mark Dilger

Discussion: http://postgr.es/m/AA5506CE-7D2A-42E4-A51D-358635E3722D@enterprisedb.com
2021-03-16 15:42:50 -04:00
Alvaro Herrera acb7e4eb6b
Implement pipeline mode in libpq
Pipeline mode in libpq lets an application avoid the Sync messages in
the FE/BE protocol that are implicit in the old libpq API after each
query.  The application can then insert Sync at its leisure with a new
libpq function PQpipelineSync.  This can lead to substantial reductions
in query latency.

Co-authored-by: Craig Ringer <craig.ringer@enterprisedb.com>
Co-authored-by: Matthieu Garrigues <matthieu.garrigues@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aya Iwata <iwata.aya@jp.fujitsu.com>
Reviewed-by: Daniel Vérité <daniel@manitou-mail.org>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kirk Jamison <k.jamison@fujitsu.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Nikhil Sontakke <nikhils@2ndquadrant.com>
Reviewed-by: Vaishnavi Prabakaran <VaishnaviP@fast.au.fujitsu.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>

Discussion: https://postgr.es/m/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com
Discussion: https://postgr.es/m/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
2021-03-15 18:13:42 -03:00
Tom Lane 58f57490fa Doc: add note about how to run the pg_amcheck regression tests.
It's not immediately obvious what you have to do to get "make
installcheck" to work here, so document that along the same lines
as we've used elsewhere.
2021-03-13 11:10:30 -05:00
Robert Haas 945d2cb7d0 In pg_amcheck tests, don't depend on perl's Q/q pack code.
It does not work on all versions of perl across all platforms.

To avoid endian-ness issues, pick a new value for column a
that has the same upper 4 bytes as lower 4 bytes. Try to
make it something that isn't likely to occur anywhere nearby
in the page.

Discussion: http://postgr.es/m/29DA079B-0658-4E66-BDAA-0EFD7B64D9C6@enterprisedb.com
2021-03-13 10:57:01 -05:00
Tom Lane 9e294d0f34 pg_amcheck: Keep trying to fix the tests.
Fix another example of non-portable option ordering in the tests.
Oversight in 24189277f.

Mark Dilger

Discussion: https://postgr.es/m/C37D28BA-3BA3-4776-B812-17F05F3472D8@enterprisedb.com
2021-03-13 00:06:56 -05:00
Robert Haas b9164eab20 pg_amcheck: Keep trying to fix the tests.
Commit 24189277f6 managed to remove
one of the two places where we were checking for a "no such user"
error while leaving the other one right next to it. So remove that
too. In fact, remove the entire test, because the whole point of
this test was to see which message we got on a failure.
2021-03-12 21:59:56 -05:00
Robert Haas 24189277f6 pg_amcheck: Try to fix still more test failures.
Avoid use of non-portable option ordering in command_checks_all().
The use of bare command line arguments before switches doesn't work
everywhere.  Per buildfarm members drongo and hoverfly.

Avoid testing for the message "role \"%s\" does not exist", because
some buildfarm machines report a different error. fairywren complains
about "SSPI authentication failed for user \"%s\"", for example.

Mark Dilger

Discussion: http://postgr.es/m/9E76E46A-48B2-4869-BD0C-422204C1F767@enterprisedb.com
Discussion: http://postgr.es/m/F0A1FD70-A2F4-4528-8A03-8650CAEC0554%40enterprisedb.com
2021-03-12 20:11:47 -05:00
Robert Haas f371a4cdba Try to avoid apparent platform-dependency in IPC::Run
It's hard to believe, but buildfarm results from the new pg_amcheck
suggest that command_checks_all() perform shell expansion on some
machines but not others, apparently due to an underlying behavior
difference in IPC::Run. Let's see if we can work around that - and
confirm that it is the real problem - by passing '-S*' as a single
argument rather than '-S' and '*' as two separate ones.

Failures were observed on jacana and hoverfly.

Mark Dilger

Discussion: http://postgr.es/m/9E76E46A-48B2-4869-BD0C-422204C1F767@enterprisedb.com
2021-03-12 19:00:41 -05:00
Robert Haas 6611256127 Fix portability issues in pg_amcheck's 004_verify_heapam.pl.
Test #12 overwrote a 1-byte varlena header to make it look like the
initial byte of a 4-byte varlena header, but the results were
endian-dependent. Also, the byte "abc" that followed the overwritten
byte would be interpreted differently depending on endian-ness.
Overwrite 4 bytes instead, in an endian-aware manner.

Test #13 accidentally managed to depend on TOAST_MAX_CHUNK_SIZE,
which varies slightly depending on MAXIMUM_ALIGNOF. That's not
the point anyway, so make the regexp insensitive to the expected
number of chunks.

Mark Dilger

Discussion: http://postgr.es/m/A80D68F6-E38F-482D-9522-E2FB6AAFE8A1@enterprisedb.com
2021-03-12 17:34:32 -05:00
Robert Haas ac44595585 Move PG_USED_FOR_ASSERTS_ONLY before initializer.
Erik Rijkers reported a compile failure, and I think this is probably
the reason.
2021-03-12 15:04:10 -05:00
Robert Haas 7a1527c02c Adjust perl style.
Per buildfarm member crake.
2021-03-12 14:55:40 -05:00
Robert Haas d60e61de4f Try to fix compiler warnings.
Per report from Peter Geoghegan.

Discussion: http://postgr.es/m/CAH2-WznpwULZ3uJ1_6WXvNMXYbOy8k8tYs3r=qSdGmZeRd6tDw@mail.gmail.com
2021-03-12 14:35:10 -05:00
Robert Haas 9706092839 Add pg_amcheck, a CLI for contrib/amcheck.
This makes it a lot easier to run the corruption checks that are
implemented by contrib/amcheck against lots of relations and get
the result in an easily understandable format. It has a wide variety
of options for choosing which relations to check and which checks
to perform, and it can run checks in parallel if you want.

Mark Dilger, reviewed by Peter Geoghegan and by me.

Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com
2021-03-12 13:00:01 -05:00