Commit Graph

Tom Lane 44cd434ec4 Fix behavior of ecpg's "EXEC SQL elif name".
This ought to work much like C's "#elif defined(name)"; but the code
implemented it in a way equivalent to endif followed by ifdef, so that
it didn't matter whether any previous branch of the IF construct had
succeeded.  Fix that; add some test cases covering elif and nested IFs;
and improve the documentation, which also seemed a bit confused.

AFAICS the code has been like this since the feature was added in 1999
(commit b57b0e044).  So while it's surely wrong, there might be code
out there relying on the current behavior.  Hence, don't back-patch
into stable branches.  It seems all right to fix it in v13 though.
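
As a loose C-preprocessor analogue of the intended semantics (the macro
names here are invented for illustration):

    #ifdef FOO
        /* taken when FOO is defined */
    #elif defined(BAR)
        /* correctly skipped once the FOO branch has succeeded */
    #endif

whereas the buggy coding behaved like an endif followed by an
independent ifdef:

    #ifdef FOO
        /* taken when FOO is defined */
    #endif
    #ifdef BAR
        /* wrongly evaluated regardless of the earlier branch */
    #endif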

Per report from Ashutosh Sharma.  Reviewed by Ashutosh Sharma and
Michael Meskes.

Discussion: https://postgr.es/m/CAE9k0P=dQk9X0cU2tN49S7a9tv733-e1pVdpB1P-pWJ5PdTktg@mail.gmail.com
2020-08-03 09:46:12 -04:00
Thomas Munro f5293fb09e Fix rare failure in LDAP tests.
Instead of writing a query to psql's stdin, use -c.  This avoids a
failure where psql exits before we write, seen a few times on the build
farm.  Thanks to Tom Lane for the suggestion.

Back-patch to 11, where the LDAP tests arrived.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CA%2BhUKGLFmW%2BHQYPeKiwSp5sdFFHtFViCpw4Mh6yAgEx74r5-Cw%40mail.gmail.com
2020-08-03 12:43:11 +12:00
Tom Lane 719304a304 Fix minor issues in psql's new \dAc and related commands.
The type-name pattern in \dAc and \dAf was matched only to the actual
pg_type.typname string, which is fairly user-unfriendly in cases where
that is not what's shown to the user by format_type (compare "_int4"
and "integer[]").  Make this code match what \dT does, i.e. match the
pattern against either typname or format_type() output.  Also fix its
broken handling of schema-name restrictions.  (IOW, make these
processSQLNamePattern calls match \dT's.)  While here, adjust
whitespace to make the query a little prettier in -E output, too.

Also improve some inaccuracies and shaky grammar in the related
documentation.

Noted while working on a patch for intarray's opclasses; I wondered
why I couldn't get a match to "integer*" for the input type name.
2020-08-02 17:00:26 -04:00
David Rowley 22c105595f Use int64 instead of long in incremental sort code
64-bit Windows has 4-byte long values, which are not suitable for tracking
disk space usage in the incremental sort code. Let's just make all these
fields int64s.
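
For context, a standalone sketch (not the patched code itself) of why
long is the wrong type here on LLP64 platforms such as 64-bit Windows:

    #include <stdint.h>

    long    space_used_long;    /* 4 bytes on Win64: overflows past 2^31-1 bytes */
    int64_t space_used;         /* 8 bytes everywhere: safe for disk-usage tracking */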

Author: James Coleman
Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
Backpatch-through: 13, where the incremental sort code was added
2020-08-02 14:23:57 +12:00
Peter Geoghegan 725b43b9c3 Restore lost amcheck TOAST test coverage.
Commit eba77534 fixed an amcheck false positive bug involving
inconsistencies in TOAST input state between table and index.  A test
case was added that verified that such an inconsistency didn't result in
a spurious corruption-related error.

Test coverage from the test was accidentally lost by commit 501e41dd,
which propagated ALTER TABLE ...  SET STORAGE attstorage state to
indexes.  This broke the test because the test specifically relied on
attstorage not being propagated.  This artificially forced there to be
index tuples whose datums were equivalent to the datums in the heap
without the datums actually being bitwise equal.

Fix this by updating pg_attribute directly instead.  Commit 501e41dd
made similar changes to a test_decoding TOAST-related test case which
made the same assumption, but overlooked the amcheck test case.

Backpatch: 11-, just like commit eba77534 (and commit 501e41dd).
2020-07-31 15:34:26 -07:00
Tom Lane 5c439f189b Fix oversight in ALTER TYPE: typmodin/typmodout must propagate to arrays.
If a base type supports typmods, its array type does too, with the
same interpretation.  Hence changes in pg_type.typmodin/typmodout
must be propagated to the array type.

While here, improve AlterTypeRecurse to not recurse to domains if
there is nothing we'd need to change.

Oversight in fe30e7ebf.  Back-patch to v13 where that came in.
2020-07-31 17:11:28 -04:00
Tom Lane 11dce63d6c Fix recently-introduced performance problem in ts_headline().
The new hlCover() algorithm that I introduced in commit c9b0c678d
turns out to potentially take O(N^2) or worse time on long documents,
if there are many occurrences of individual query words but few or no
substrings that actually satisfy the query.  (One way to hit this
behavior is with a "common_word & rare_word" type of query.)  This
seems unavoidable given the original goal of checking every substring
of the document, so we have to back off that idea.  Fortunately, it
seems unlikely that anyone would really want headlines spanning all of
a long document, so we can avoid the worse-than-linear behavior by
imposing a maximum length of substring that we'll consider.

For now, just hard-wire that maximum length as a multiple of max_words
times max_fragments.  Perhaps at some point somebody will argue for
exposing it as a ts_headline parameter, but I'm hesitant to make such
a feature addition in a back-patched bug fix.
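
A sketch of the shape of that cap (the function name and the multiplier
are illustrative, not the shipped ones):

    static int
    max_cover_length(int max_words, int max_fragments)
    {
        int limit = max_words * 10;     /* refuse covers longer than this */

        if (max_fragments > 0)
            limit *= max_fragments;
        return limit;
    }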

I also noted that the hlFirstIndex() function I'd added in that
commit was unnecessarily stupid: it really only needs to check whether
a HeadlineWordEntry's item pointer is null or not.  This wouldn't make
all that much difference in typical cases with queries having just
a few terms, but a cycle shaved is a cycle earned.

In addition, add a CHECK_FOR_INTERRUPTS call in TS_execute_recurse.
This ensures that hlCover's loop is cancellable if it manages to take
a long time, and it may protect some other TS_execute callers as well.

Back-patch to 9.6 as the previous commit was.  I also chose to add the
CHECK_FOR_INTERRUPTS call to 9.5.  The old hlCover() algorithm seems
to avoid the O(N^2) behavior, at least on the test case I tried, but
nonetheless it's not very quick on a long document.

Per report from Stephen Frost.

Discussion: https://postgr.es/m/20200724160535.GW12375@tamriel.snowman.net
2020-07-31 11:43:12 -04:00
Tatsuo Ishii 0d9b64fe8d Doc: fix high availability solutions comparison.
In "High Availability, Load Balancing, and Replication" chapter,
certain descriptions of Pgpool-II were not correct at this point.  It
does not need conflict resolution. Also "Multiple-Server Parallel
Query Execution" is not supported anymore.

Discussion: https://postgr.es/m/20200726.230128.53842489850344110.t-ishii%40sraoss.co.jp
Author: Tatsuo Ishii
Reviewed-by: Bruce Momjian
Backpatch-through: 9.5
2020-07-31 07:46:25 +09:00
Jeff Davis 07cbcdd00c Use pg_bitutils for HyperLogLog.
Using pg_leftmost_one_pos32() yields substantial performance benefits.
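
The function boils down to locating the most significant set bit, which
on GCC-like compilers can be a single instruction; roughly (a sketch,
the real portable implementation is in src/include/port/pg_bitutils.h):

    #include <stdint.h>

    static inline int
    leftmost_one_pos32(uint32_t word)
    {
        return 31 - __builtin_clz(word);    /* word must be nonzero */
    }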

Backpatching to version 13 because HLL is used for HashAgg
improvements in 9878b643, which was also backpatched to 13.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-WzkGvDKVDo+0YvfvZ+1CE=iCi88DCOGFF3i1hTGGaxcKPw@mail.gmail.com
Backpatch-through: 13
2020-07-30 09:17:00 -07:00
Michael Paquier e710f3b8e7 doc: Mention index references in pg_inherits
Partitioned indexes are also registered in pg_inherits, but the
description of this catalog did not reflect that.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87k0ynj35y.fsf@wibble.ilmari.org
Backpatch-through: 11
2020-07-30 15:48:52 +09:00
Peter Geoghegan 78530c8e7a Add hash_mem_multiplier GUC.
Add a GUC that acts as a multiplier on work_mem.  It gets applied when
sizing executor node hash tables that were previously size constrained
using work_mem alone.

The new GUC can be used to preferentially give hash-based nodes more
memory than the generic work_mem limit.  It is intended to enable admin
tuning of the executor's memory usage.  Overall system throughput and
system responsiveness can be improved by giving hash-based executor
nodes more memory (especially over sort-based alternatives, which are
often much less sensitive to being memory constrained).

The default value for hash_mem_multiplier is 1.0, which is also the
minimum valid value.  This means that hash-based nodes continue to apply
work_mem in the traditional way by default.
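
Conceptually, the memory limit applied to hash-based nodes becomes the
following (a simplified sketch of the get_hash_mem() helper this commit
introduces; the sketch's names are illustrative):

    /* work_mem is in kilobytes, as is the returned limit */
    static size_t
    effective_hash_mem_kb(size_t work_mem_kb, double hash_mem_multiplier)
    {
        return (size_t) (work_mem_kb * hash_mem_multiplier);
    }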

hash_mem_multiplier is generally useful.  However, it is being added now
due to concerns about hash aggregate performance stability for users
that upgrade to Postgres 13 (which added disk-based hash aggregation in
commit 1f39bce0).  While the old hash aggregate behavior risked
out-of-memory errors, it is nevertheless likely that many users actually
benefited.  Hash agg's previous indifference to work_mem during query
execution was not just faster; it also accidentally made aggregation
resilient to grouping estimate problems (at least in cases where this
didn't create destabilizing memory pressure).

hash_mem_multiplier can provide a certain kind of continuity with the
behavior of Postgres 12 hash aggregates in cases where the planner
incorrectly estimates that all groups (plus related allocations) will
fit in work_mem/hash_mem.  This seems necessary because hash-based
aggregation is usually much slower when only a small fraction of all
groups can fit.  Even when it isn't possible to totally avoid hash
aggregates that spill, giving hash aggregation more memory will reliably
improve performance (the same cannot be said for external sort
operations, which appear to be almost unaffected by memory availability
provided it's at least possible to get a single merge pass).

The PostgreSQL 13 release notes should advise users that increasing
hash_mem_multiplier can help with performance regressions associated
with hash aggregation.  That can be taken care of by a later commit.

Author: Peter Geoghegan
Reviewed-By: Álvaro Herrera, Jeff Davis
Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-29 14:14:57 -07:00
Jeff Davis 3a232a3183 HashAgg: use better cardinality estimate for recursive spilling.
Use HyperLogLog to estimate the group cardinality in a spilled
partition. This estimate is used to choose the number of partitions if
we recurse.

The previous behavior was to use the number of tuples in a spilled
partition as the estimate for the number of groups, which led to
overpartitioning. That could cause the number of batches to be much
higher than expected (with each batch being very small), which made it
harder to interpret EXPLAIN ANALYZE results.
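
The shape of the improved choice, as a standalone sketch (names and
clamp values are illustrative, not the nodeAgg.c source):

    #include <math.h>

    static int
    choose_spill_partitions(double hll_estimated_groups,
                            double groups_per_partition)
    {
        int npartitions = (int) ceil(hll_estimated_groups / groups_per_partition);

        /* clamp to something sane, as any real implementation must */
        if (npartitions < 4)
            npartitions = 4;
        if (npartitions > 1024)
            npartitions = 1024;
        return npartitions;
    }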

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
Backpatch-through: 13
2020-07-28 23:17:23 -07:00
Peter Geoghegan cdd7bd695b Rename another "hash_mem" local variable.
Missed by my commit 564ce621.

Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-28 17:59:14 -07:00
Peter Geoghegan b6c15e71f3 Correct obsolete UNION hash aggs comment.
Oversight in commit 1f39bce0, which added disk-based hash aggregation.

Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-28 17:14:06 -07:00
Peter Geoghegan e362f469c5 Doc: Remove obsolete CREATE AGGREGATE note.
The planner is in fact willing to use hash aggregation when work_mem is
not set high enough for everything to fit in memory.  This has been the
case since commit 1f39bce0, which added disk-based hash aggregation.

There are a few remaining cases in which hash aggregation is avoided as
a matter of policy when the planner surmises that spilling will be
necessary.  For example, callers of choose_hashed_setop() still
conservatively avoid hash aggregation when spilling is anticipated.
That doesn't seem like a good enough reason to mention hash aggregation
in this context.

Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-28 16:58:59 -07:00
David Rowley a57c837e5c Make EXPLAIN ANALYZE of HashAgg more similar to Hash Join
There were various unnecessary differences between Hash Agg's EXPLAIN
ANALYZE output and Hash Join's.  Here we modify the Hash Agg output so
that it's better aligned to Hash Join's.

The following changes have been made:
1. Start batches counter at 1 instead of 0.
2. Always display the "Batches" property, even when we didn't spill to
   disk.
3. Use the text "Batches" instead of "HashAgg Batches" for text format.
4. Use the text "Memory Usage" instead of "Peak Memory Usage" for text
   format.
5. Include "Batches" before "Memory Usage" in both text and non-text
   formats.
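
After these changes, the text-format output for a HashAgg that stayed
in memory reads along these lines (illustrative, not captured from a
real run):

    HashAggregate
      Batches: 1  Memory Usage: 24kB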

In passing, also modify the "Planned Partitions" property so that we show
it in non-text EXPLAIN formats regardless of whether the value is 0.
This was pointed out by Justin Pryzby and probably should have been part
of 40efbf870.

Reviewed-by: Justin Pryzby, Jeff Davis
Discussion: https://postgr.es/m/CAApHDvrshRnA6C0VFnu7Fb9TVvgGo80PUMm5+2DiaS1gEkPvtw@mail.gmail.com
Backpatch-through: 13, where HashAgg batching was introduced
2020-07-29 11:43:11 +12:00
David Rowley dc6f2fb435 Doc: Improve documentation for pg_jit_available()
Per complaint from Scott Ribe. Based on wording suggestion from Tom Lane.

Discussion: https://postgr.es/m/1956E806-1468-4417-9A9D-235AE1D5FE1A@elevated-dev.com
Backpatch-through: 11, where pg_jit_available() was added
2020-07-28 22:52:43 +12:00
Fujii Masao 128fd0a65a doc: Mention the rename of wal_keep_segments GUC in release note.
Commit f5dff45962 renamed wal_keep_segments to wal_keep_size.
This commit adds a mention of this change to the release notes.

Author: Fujii Masao
Reviewed-by: David Steele
Discussion: https://postgr.es/m/dc4768f2-1eff-d2fc-35ba-6b2985b7cb6c@oss.nttdata.com
2020-07-28 11:23:02 +09:00
Etsuro Fujita cebe10a5f3 Fix some issues with step generation in partition pruning.
In the case of range partitioning, get_steps_using_prefix() assumes that
the passed-in prefix list contains at least one clause for each of the
partition keys earlier than the one specified in the passed-in
step_lastkeyno, but the caller (ie, gen_prune_steps_from_opexps())
didn't take it into account, which led to a server crash or incorrect
results when the list contained no clauses for such partition keys, as
reported in bug #16500 and #16501 from Kobayashi Hisanori.  Update the
caller to call that function only when the list created there contains
at least one clause for each of the earlier partition keys in the case
of range partitioning.

While at it, fix some other issues:

* The list to pass to get_steps_using_prefix() is allowed to contain
  multiple clauses for the same partition key, as described in the
  comment for that function, but that function actually assumed that the
  list contained just a single clause for each of middle partition keys,
  which led to an assertion failure when the list contained multiple
  clauses for such partition keys.  Update that function to match the
  comment.
* In the case of hash partitioning, partition keys are allowed to be
  NULL, in which case the list to pass to get_steps_using_prefix()
  contains no clauses for NULL partition keys, but that function treats
  that case like the case of range partitioning, which led to the
  assertion failure.  Update the assertion test to take into account
  NULL partition keys in the case of hash partitioning.
* Fix a typo in a comment in get_steps_using_prefix_recurse().
* gen_partprune_steps() failed to detect self-contradiction from
  strict-qual clauses and an IS NULL clause for the same partition key
  in some cases, producing incorrect partition-pruning steps, which led
  to incorrect results of partition pruning, but fortunately didn't
  cause any user-visible problems, as the self-contradiction is
  detected later in query planning.  Update that function to detect
  the self-contradiction.

Per bug #16500 and #16501 from Kobayashi Hisanori.  Patch by me, initial
diagnosis for the reported issue and review by Dmitry Dolgov.
Back-patch to v11, where partition pruning was introduced.

Discussion: https://postgr.es/m/16500-d1613f2a78e1e090%40postgresql.org
Discussion: https://postgr.es/m/16501-5234a9a0394f6754%40postgresql.org
2020-07-28 11:00:00 +09:00
Peter Geoghegan 5a6cc6ffa9 Remove hashagg_avoid_disk_plan GUC.
Note: This GUC was originally named enable_hashagg_disk when it appeared
in commit 1f39bce0, which added disk-based hash aggregation.  It was
subsequently renamed in commit 92c58fd9.

Author: Peter Geoghegan
Reviewed-By: Jeff Davis, Álvaro Herrera
Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-27 17:53:17 -07:00
Michael Paquier 0caf1fc6e8 Fix corner case with 16kB-long decompression in pgcrypto, take 2
A compressed stream may end with an empty packet.  In this case
decompression finishes before reading the empty packet and the
remaining stream packet causes a failure in reading the following
data.  This commit makes sure to consume such extra data, avoiding a
failure when decompressing the data.  This corner case was easily
reproducible with a data length of 16kB, and has existed since e94dd6a.  A cheap
regression test is added to cover this case based on a random,
incompressible string.

The first attempt at this patch made it possible to find an older failure
within the compression logic of pgcrypto, fixed by b9b6105.  This
involved SLES 15 on z390, where a custom flavor of libz gets used.
Bonus thanks to Mark Wong for providing access to the specific
environment.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-27 15:58:54 +09:00
Michael Paquier ed4a9dc9cf Fix handling of structure for bytea data type in ECPG
Some code paths dedicated to bytea used the structure for varchar.  This
did not lead to any actual bugs, as bytea and varchar have the same
definition, but it could become a trap if one of these definitions
changes for a new feature or a bug fix.

Issue introduced by 050710b.

Author: Shenhao Wang
Reviewed-by: Vignesh C, Michael Paquier
Discussion: https://postgr.es/m/07ac7dee1efc44f99d7f53a074420177@G08CNEXMBPEKD06.g08.fujitsu.local
Backpatch-through: 12
2020-07-27 10:29:08 +09:00
Jeff Davis 7f5f2249b2 Fix LookupTupleHashEntryHash() pipeline-stall issue.
Refactor hash lookups in nodeAgg.c to improve performance.

Author: Andres Freund and Jeff Davis
Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
Backpatch-through: 13
2020-07-26 16:08:41 -07:00
Michael Paquier 21b0055359 Tweak behavior of pg_stat_activity.leader_pid
The initial implementation of leader_pid in pg_stat_activity added by
b025f32 took the approach of strictly printing what a PGPROC entry
includes.  In short, if a backend has been involved in parallel query at
least once, leader_pid would remain set as long as the backend is alive.
For a parallel group leader, this means that the field would always be
set after it participated at least once in parallel query, which, after
more discussion, could be confusing when using for example a
connection pooler.

This commit changes the data printed so that leader_pid is always
NULL for a parallel group leader, and shows a non-NULL value only for
the parallel workers, and only while a parallel query is actually
running, since workers are shut down once the query has completed.

This does not change the definition of any catalog, so no catalog bump
is needed.  Per discussion with Justin Pryzby, Álvaro Herrera, Julien
Rouhaud and me.

Discussion: https://postgr.es/m/20200721035145.GB17300@paquier.xyz
Backpatch-through: 13
2020-07-26 16:32:20 +09:00
Amit Kapila b15367ae39 Fix buffer usage stats for nodes above Gather Merge.
Commit 85c9d347 addressed a similar problem for Gather and Gather
Merge nodes but forgot to account for nodes above parallel nodes.  This
still works for nodes above a Gather node because we shut down the workers
for Gather as soon as there are no more tuples.  We could do a similar
thing for Gather Merge as well, but it seems better to account for the stats
during node shutdown after completing the execution.

Reported-by: Stéphane Lorek, Jehan-Guillaume de Rorthais
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/20200718160206.584532a2@firost
2020-07-25 10:31:19 +05:30
Tom Lane 70eca6a9a6 Replace TS_execute's TS_EXEC_CALC_NOT flag with TS_EXEC_SKIP_NOT.
It's fairly silly that ignoring NOT subexpressions is TS_execute's
default behavior.  It's wrong on its face and it encourages errors
of omission.  Moreover, the only two remaining callers that aren't
specifying CALC_NOT are in ts_headline calculations, and it's very
arguable that those are bugs: if you've specified "!foo" in your
query, why would you want to get a headline that includes "foo"?

Hence, rip that out and change the default behavior to be to calculate
NOT accurately.  As a concession to the slim chance that there is still
somebody somewhere who needs the incorrect behavior, provide a new
SKIP_NOT flag to explicitly request that.

Back-patch into v13, mainly because it seems better to change this
at the same time as the previous commit's rejiggering of TS_execute
related APIs.  Any outside callers affected by this change are
probably also affected by that one.

Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
2020-07-24 15:43:56 -04:00
Tom Lane 92fe6895d6 Fix assorted bugs by changing TS_execute's callback API to ternary logic.
Text search sometimes failed to find valid matches, for instance
'!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during
an index search.  The root of the issue is that TS_execute's callback
functions were not changed to use ternary (yes/no/maybe) reporting
when we made the search logic itself do so.  It's somewhat annoying
to break that API, but on the other hand we now see that any code
using plain boolean logic is almost certainly broken since the
addition of phrase search.  There seem to be very few outside callers
of this code anyway, so we'll just break them intentionally to get
them to adapt.
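
The callback result is now three-valued rather than boolean, along
these lines (a sketch of the notion; the real declaration lives in the
text-search headers):

    typedef enum
    {
        TS_NO,      /* definitely does not match */
        TS_YES,     /* definitely matches */
        TS_MAYBE    /* can't tell without position information */
    } TSTernaryValue;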

This allows removal of tsginidx.c's private re-implementation of
TS_execute, since that's now entirely duplicative.  It's also no
longer necessary to avoid use of CALC_NOT in tsgistidx.c, since
the underlying callbacks can now do something reasonable.

Back-patch into v13.  We can't change this in stable branches,
but it seems not quite too late to fix it in v13.

Tom Lane and Pavel Borisov

Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
2020-07-24 15:26:51 -04:00
Tom Lane 7dab4569d1 Fix ancient violation of zlib's API spec.
contrib/pgcrypto mishandled the case where deflate() does not consume
all of the offered input on the first try.  It reset the next_in pointer
to the start of the input instead of leaving it alone, causing the wrong
data to be fed to the next deflate() call.

This has been broken since pgcrypto was committed.  The reason for the
lack of complaints seems to be that it's fairly hard to get stock zlib
to not consume all the input, so long as the output buffer is big enough
(which it normally would be in pgcrypto's usage; AFAICT the input is
always going to be packetized into packets no larger than ZIP_OUT_BUF).
However, IBM's zlibNX implementation for AIX evidently will do it
in some cases.

I did not add a test case for this, because I couldn't find one that
would fail with stock zlib.  When we put back the test case for
bug #16476, that will cover the zlibNX situation well enough.

While here, write deflate()'s second argument as Z_NO_FLUSH per its
API spec, instead of hard-wiring the value zero.
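
The API-conforming pattern leaves next_in alone and simply loops until
zlib has consumed everything, along these lines (a condensed sketch,
not the pgcrypto source; zs is an initialized z_stream and out_buf a
local buffer):

    zs.next_in = (Bytef *) src;
    zs.avail_in = src_len;
    do
    {
        zs.next_out = out_buf;
        zs.avail_out = sizeof(out_buf);
        (void) deflate(&zs, Z_NO_FLUSH);    /* per the API spec, not a bare 0 */
        /* ...hand the produced out_buf bytes downstream here... */
    } while (zs.avail_in > 0);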

Per buildfarm results and subsequent investigation.

Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
2020-07-23 17:20:01 -04:00
Peter Eisentraut dbd03482c6 doc: Document that ssl_ciphers does not affect TLS 1.3
TLS 1.3 uses a different way of specifying ciphers and a different
OpenSSL API.  PostgreSQL currently does not support setting those
ciphers.  For now, just document this.  In the future, support for
this might be added somehow.

Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2020-07-23 20:38:11 +02:00
Thomas Munro 6b366190d5 Fix error message.
Remove extra space.  Back-patch to all releases, like commit 7897e3bb.

Author: Lu, Chenyang <lucy.fnst@cn.fujitsu.com>
Discussion: https://postgr.es/m/795d03c6129844d3803e7eea48f5af0d%40G08CNEXMBPEKD04.g08.fujitsu.local
2020-07-23 21:15:01 +12:00
Michael Paquier 9f5162ef43 Revert "Fix corner case with PGP decompression in pgcrypto"
This reverts commit 9e10898, after finding out that buildfarm members
running SLES 15 on z390 complain about the compression and decompression
logic of the new test: pipistrelles, barbthroat and steamerduck.

Those hosts are apparently using hardware-specific changes to improve zlib
performance, requiring more investigation.

Thanks to Tom Lane for the discussion.

Discussion: https://postgr.es/m/20200722093749.GA2564@paquier.xyz
Backpatch-through: 9.5
2020-07-23 08:29:14 +09:00
Michael Paquier 35e142202b Fix corner case with PGP decompression in pgcrypto
A compressed stream may end with an empty packet, and PGP decompression
finished before reading this empty packet in the remaining stream.  This
caused a failure in pgcrypto, which handled this case as corrupted data.
This commit makes sure to consume such extra data, avoiding a failure
when decompressing the entire stream.  This corner case was reproducible
with a data length of 16kB, and has existed since its introduction in
e94dd6a.  A cheap regression test is added to cover this case.

Thanks to Jeff Janes for the extra investigation.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-22 14:52:36 +09:00
Tom Lane cc4dd2a7af neqjoinsel must now pass through collation to eqjoinsel.
Since commit 044c99bc5, eqjoinsel passes the passed-in collation
to any operators it invokes.  However, neqjoinsel failed to pass
on whatever collation it got, so that if we invoked a
collation-dependent operator via that code path, we'd get "could not
determine which collation to use for string comparison" or the like.

Per report from Justin Pryzby.  Back-patch to v12, like the previous
commit.

Discussion: https://postgr.es/m/20200721191606.GL5748@telsasoft.com
2020-07-21 19:40:44 -04:00
Alvaro Herrera ac25e7b039
Minor glossary tweaks
Add "(process)" qualifier to two terms, remove self-reference in one
term.

Author: Jürgen Purtz <juergen@purtz.de>
Discussion: https://postgr.es/m/95f90a5d-7692-701d-2c0c-0c88eb5cea7d@purtz.de
2020-07-21 13:08:16 -04:00
Tom Lane bca409e5b1 Assert that we don't insert nulls into attnotnull catalog columns.
The executor checks for this error, and so does the bootstrap catalog
loader, but we never checked for it in retail catalog manipulations.
The folly of that has now been exposed, so let's add assertions
checking it.  Checking in CatalogTupleInsert[WithInfo] and
CatalogTupleUpdate[WithInfo] should be enough to cover this.

Back-patch to v10; the aforesaid functions didn't exist before that,
and it didn't seem worth adapting the patch to the oldest branches.
But given the risk of JIT crashes, I think we certainly need this
as far back as v11.

Pre-v13, we have to explicitly exclude pg_subscription.subslotname
and pg_subscription_rel.srsublsn from the checks, since they are
mismarked.  (Even if we change our mind about applying BKI_FORCE_NULL
in the branch tips, it doesn't seem wise to have assertions that
would fire in existing databases.)

Discussion: https://postgr.es/m/298837.1595196283@sss.pgh.pa.us
2020-07-21 12:38:08 -04:00
Tom Lane e5372b48b9 Correctly mark pg_subscription_rel.srsublsn as nullable.
The code has always set this column to NULL when it's not valid,
but the catalog header's description failed to reflect that,
as did the SGML docs, as did some of the code.  To prevent future
coding errors of the same ilk, let's hide the field from C code
as though it were variable-length (which, in a sense, it is).

As with commit 72eab84a5, we can only fix this cleanly in HEAD
and v13; the problem extends further back but we'll need some
klugery in the released branches.

Discussion: https://postgr.es/m/367660.1595202498@sss.pgh.pa.us
2020-07-20 14:55:56 -04:00
Tom Lane 2f1f189cf8 Fix construction of updated-columns bitmap in logical replication.
Commit b9c130a1f failed to apply the publisher-to-subscriber column
mapping while checking which columns were updated.  Perhaps less
significantly, it didn't exclude dropped columns either.  This could
result in an incorrect updated-columns bitmap and thus wrong decisions
about whether to fire column-specific triggers on the subscriber while
applying updates.  In HEAD (since commit 9de77b545), it could also
result in accesses off the end of the colstatus array, as detected by
buildfarm member skink.  Fix the logic, and adjust 003_constraints.pl
so that the problem is exposed in unpatched code.

In HEAD, also add some assertions to check that we don't access off
the ends of these newly variable-sized arrays.

Back-patch to v10, as b9c130a1f was.

Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
2020-07-20 13:40:16 -04:00
Fujii Masao f5dff45962 Rename wal_keep_segments to wal_keep_size.
max_slot_wal_keep_size, which was added in v13, and wal_keep_segments are
the GUC parameters that specify how much WAL to retain for
standby servers.  While max_slot_wal_keep_size accepts a number of
bytes of WAL, wal_keep_segments accepts a number of WAL files.
This difference in setting units between these similar parameters could
be confusing to users.

To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and makes users specify the WAL size in it instead of
a number of WAL files.

The idea of renaming max_slot_wal_keep_size to max_slot_wal_keep_segments
instead also came up in the discussion.  But we have been moving
away from measuring in segments; for example, checkpoint_segments was
replaced by max_wal_size.  So we concluded that renaming wal_keep_segments
to wal_keep_size was the way to go.
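
For example, a setting previously expressed as a segment count carries
over as the equivalent size (assuming the default 16MB segment size;
the values are illustrative):

    # before, v12 and earlier
    wal_keep_segments = 64

    # after, v13
    wal_keep_size = '1GB'       # 64 segments * 16MB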

Back-patch to v13 where max_slot_wal_keep_size was added.

Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
2020-07-20 13:33:45 +09:00
Amit Kapila 4a1ae21750 Fix minor typo in nodeIncrementalSort.c.
Author: Vignesh C
Reviewed-by: James Coleman
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CALDaNm0WjZqRvdeL59ZfYH0o4mLbKQ23jm-bnjXcFzgpANx55g@mail.gmail.com
2020-07-20 07:54:04 +05:30
Tom Lane 914d2383ae Correctly mark pg_subscription.subslotname as nullable.
Due to the layout of this catalog, subslotname has to be explicitly
marked BKI_FORCE_NULL, else initdb will default to the assumption
that it's non-nullable.  Since, in fact, CREATE/ALTER SUBSCRIPTION
will store null values there, the existing marking is just wrong,
and has been since this catalog was invented.

We haven't noticed because not much in the system actually depends
on attnotnull being truthful.  However, JIT'ed tuple deconstruction
does depend on that in some cases, allowing crashes or wrong answers
in queries that inspect pg_subscription.  Commit 9de77b545 quite
accidentally exposed this on the buildfarm members that force JIT
activation.

Back-patch to v13.  The problem goes further back, but we cannot
force initdb in released branches, so some klugier solution will
be needed there.  Before working on that, push this simple fix
to try to get the buildfarm back to green.

Discussion: https://postgr.es/m/4118109.1595096139@sss.pgh.pa.us
2020-07-19 12:37:23 -04:00
Michael Paquier f2b65519e1 doc: Refresh more URLs in the docs
This updates some URLs that are redirections, mostly to an equivalent
using https.  One URL referring to generalized partial indexes was
outdated.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20200717.121308.1369606287593685396.horikyota.ntt@gmail.com
Backpatch-through: 9.5
2020-07-18 22:43:41 +09:00
Michael Paquier 580d8ac9b1 doc: Fix description of \copy for psql
The WHERE clause introduced by 31f3817 was not described.  While at it,
split the grammar of \copy FROM and TO into two distinct parts for
clarity, as they support different sets of options.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3zWr=OmxeNqOqfT=uZTSdam_j-gkX94CL8eTNfgUtf6A@mail.gmail.com
Backpatch-through: 12
2020-07-18 10:42:46 +09:00
Tom Lane 71e8e66f78 Cope with data-offset-less archive files during out-of-order restores.
pg_dump produces custom-format archive files that lack data offsets
when it is unable to seek its output.  Up to now that's been a hazard
for pg_restore.  But if pg_restore is able to seek in the archive
file, there is no reason to throw up our hands when asked to restore
data blocks out of order.  Instead, whenever we are searching for a
data block, record the locations of the blocks we passed over (that
is, fill in the missing data-offset fields in our in-memory copy of
the TOC data).  Then, when we hit a case that requires going
backwards, we can just seek back.

Also track the furthest point that we've searched to, and seek back
to there when beginning a search for a new data block.  This avoids
possible O(N^2) time consumption, by ensuring that each data block
is examined at most twice.  (On Unix systems, that's at most twice
per parallel-restore job; but since Windows uses threads here, the
threads can share block location knowledge, reducing the amount of
duplicated work.)

We can also improve the code a bit by using fseeko() to skip over
data blocks during the search.

This is all of some use even in simple restores, but it's really
significant for parallel pg_restore.  In that case, we require
seekability of the input already, and we will very probably need
to do out-of-order restores.

Back-patch to v12, as this fixes a regression introduced by commit
548e50976.  Before that, parallel restore avoided requesting
out-of-order restores, so it would work on a data-offset-less
archive.  Now it will again.

Ideally this patch would include some test coverage, but there are
other open bugs that need to be fixed before we can extend our
coverage of parallel restore very much.  Plan to revisit that later.

David Gilman and Tom Lane; reviewed by Justin Pryzby

Discussion: https://postgr.es/m/CALBH9DDuJ+scZc4MEvw5uO-=vRyR2=QF9+Yh=3hPEnKHWfS81A@mail.gmail.com
2020-07-17 13:04:06 -04:00
Tom Lane 447cf2f8e9 Remove manual tracking of file position in pg_dump/pg_backup_custom.c.
We do not really need to track the file position by hand.  We were
already relying on ftello() whenever the archive file is seekable,
while if it's not seekable we don't need the file position info
anyway because we're not going to be able to re-write the TOC.

Moreover, that tracking was buggy since it failed to account for
the effects of fseeko().  Somewhat remarkably, that seems not to
have made for any live bugs up to now.  We could fix the oversights,
but it seems better to just get rid of the whole error-prone mess.

In itself this is merely code cleanup.  However, it's necessary
infrastructure for an upcoming bug-fix patch (because that code
*does* need valid file position after fseeko).  The bug fix
needs to go back as far as v12; hence, back-patch that far.

Discussion: https://postgr.es/m/CALBH9DDuJ+scZc4MEvw5uO-=vRyR2=QF9+Yh=3hPEnKHWfS81A@mail.gmail.com
2020-07-17 13:04:06 -04:00
Peter Geoghegan 49eb96852b Avoid CREATE INDEX unique index deduplication.
There is no advantage to attempting deduplication for a unique index
during CREATE INDEX, since there cannot possibly be any duplicates.
Doing so wastes cycles due to unnecessary copying.  Make sure that we
avoid it consistently.

We already avoided unique index deduplication in the case where there
were some spool2 tuples to merge.  That didn't account for the fact that
spool2 is removed early/unset in the common case where it has no tuples
that need to be merged (i.e. it failed to account for the "spool2 turns
out to be unnecessary" optimization in _bt_spools_heapscan()).

Oversight in commit 0d861bbb, which added nbtree deduplication.

Backpatch: 13-, where nbtree deduplication was introduced.
2020-07-17 09:50:46 -07:00
Tom Lane a220e345c8 Ensure that distributed timezone abbreviation files are plain ASCII.
We had two occurrences of "Mitteleuropäische Zeit" in Europe.txt,
though the corresponding entries in Default were spelled
"Mitteleuropaeische Zeit".  Standardize on the latter spelling to
avoid questions of which encoding to use.

While here, correct a couple of other trivial inconsistencies between
the Default file and the supposedly-matching entries in the *.txt
files, as exposed by some checking with comm(1).  Also, add BDST to
the Europe.txt file; it previously was only listed in Default.
None of this has any direct functional effect.

Per complaint from Christoph Berg.  As usual for timezone data patches,
apply to all branches.

Discussion: https://postgr.es/m/20200716100743.GE3534683@msg.df7cb.de
2020-07-17 11:04:22 -04:00
Peter Eisentraut 6bab40bf60 Fix whitespace 2020-07-17 15:16:21 +02:00
Peter Eisentraut e7240ccecd Resolve gratuitous tabs in SQL file 2020-07-17 15:08:43 +02:00
Amit Kapila 35647ea9d2 Fix signal handler setup for SIGHUP in the apply launcher process.
Commit 1e53fe0e70 unified the usage of the config-file reload flag by
using the same signal handler function for the SIGHUP signal in many places
in the code.  By mistake, it registered the SIGHUP signal handler
function for the wrong signal in the apply launcher process.
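
The corrected registration is of this shape (a sketch;
SignalHandlerForConfigReload is the common handler introduced by
commit 1e53fe0e70):

    /* in the apply launcher's startup code */
    pqsignal(SIGHUP, SignalHandlerForConfigReload);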

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CALj2ACVzHCRnS20bOiEHaLtP5PVBENZQn4khdsSJQgOv_GM-LA@mail.gmail.com
2020-07-17 08:43:06 +05:30
Michael Paquier beebbb39d9 Switch pg_test_fsync to use binary mode on Windows
pg_test_fsync has always opened files in text mode on Windows, as
this is the default mode used if not enforced by _setmode().
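
For reference, forcing binary mode on an already-opened stream on
Windows looks like this (a generic sketch, not the exact patch):

    #ifdef WIN32
    #include <io.h>
    #include <fcntl.h>

    _setmode(_fileno(fp), _O_BINARY);   /* fp: the FILE * under test */
    #endif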

This fixes a failure when running pg_test_fsync in versions down to 12,
because O_DSYNC and text mode are not able to work together nicely.  We
fixed the handling of O_DSYNC in 12~ for the tool by switching to the
concurrent-safe version of fopen() in src/port/ with 0ba06e0.  Then
40cfe86, by enforcing text mode for compatibility reasons if O_TEXT
or O_BINARY is not specified by the caller, broke pg_test_fsync.  For
all versions, this avoids any translation overhead, and pg_test_fsync
should test binary writes, so it is a gain in all cases.

Note that O_DSYNC is still not handled correctly in ~11, leading
pg_test_fsync to show insanely high numbers for open_datasync() (using
this property it is easy to notice that binary mode is much
faster).  Fixing that would require a backpatch of 0ba06e0 and 40cfe86,
which could potentially break existing applications, so this is left out.

There are no TAP tests for this tool yet, so I have checked all builds
manually using MSVC.  We could invent a new option to run a single
transaction instead of using a duration of 1s to keep the test as
short as possible, but this is left as future work.

Thanks to Bruce Momjian for the discussion.

Reported-by: Jeff Janes
Author: Michael Paquier
Discussion: https://postgr.es/m/16526-279ded30a230d275@postgresql.org
Backpatch-through: 9.5
2020-07-16 15:52:54 +09:00