Commit Graph

37114 Commits

Author SHA1 Message Date
Tom Lane ea80138545 Fix psql's \connect command some more.
Jasen Betts reported yet another unintended side effect of commit
85c54287a: reconnecting with "\c service=whatever" did not have the
expected results.  The reason is that starting from the output of
PQconndefaults() effectively allows environment variables (such
as PGPORT) to override entries in the service file, whereas the
normal priority is the other way around.

Not using PQconndefaults at all would require yet a third main code
path in do_connect's parameter setup, so I don't really want to fix
it that way.  But we can have the logic effectively ignore all the
default values for just a couple more lines of code.

This patch doesn't change the behavior for "\c -reuse-previous=on
service=whatever".  That remains significantly different from before
85c54287a, because many more parameters will be re-used, and thus
not be possible for service entries to replace.  But I think this
is (mostly?) intentional.  In any case, since libpq does not report
where it got parameter values from, it's hard to do differently.

Per bug #16936 from Jasen Betts.  As with the previous patches,
back-patch to all supported branches.  (9.5 is unfortunately now
out of support, so this won't get fixed there.)

Discussion: https://postgr.es/m/16936-3f524322a53a29f0@postgresql.org
2021-03-23 14:27:50 -04:00
Tom Lane 9d523119fd Avoid possible crash while finishing up a heap rewrite.
end_heap_rewrite was not careful to ensure that the target relation
is open at the smgr level before performing its final smgrimmedsync.
In ordinary cases this is no problem, because it would have been
opened earlier during the rewrite.  However a crash can be reproduced
by re-clustering an empty table with CLOBBER_CACHE_ALWAYS enabled.

Although that exact scenario does not crash in v13, I think that's
a chance result of unrelated planner changes, and the problem is
likely still reachable with other test cases.  The true proximate
cause of this failure is commit c6b92041d, which replaced a call to
heap_sync (which was careful about opening smgr) with a direct call
to smgrimmedsync.  Hence, back-patch to v13.

Amul Sul, per report from Neha Sharma; cosmetic changes
and test case by me.

Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
2021-03-23 11:24:16 -04:00
Peter Eisentraut a6715af1e7 Add bit_count SQL function
This function for bit and bytea counts the set bits in the bit or byte
string.  Internally, we use the existing popcount functionality.

For the name, after some discussion, we settled on bit_count, which
also exists with this meaning in MySQL, Java, and Python.

Author: David Fetter <david@fetter.org>
Discussion: https://www.postgresql.org/message-id/flat/20201230105535.GJ13234@fetter.org
2021-03-23 10:13:58 +01:00
Michael Paquier 5aed6a1fc2 Add per-index stats information in verbose logs of autovacuum
Once a relation's autovacuum is completed, the logs include more
information about this relation state if the threshold of
log_autovacuum_min_duration (or its relation option) is reached, with
for example contents about the statistics of the VACUUM operation for
the relation, WAL and system usage.

This commit adds more information about the statistics of the relation's
indexes, with one line of logs generated for each index.  The index
stats were already calculated, but not printed in the context of
autovacuum yet.  While on it, some refactoring is done to keep track of
the index statistics directly within LVRelStats, simplifying some
routines related to parallel VACUUMs.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Euler Taveira
Discussion: https://postgr.es/m/CAD21AoAy6SxHiTivh5yAPJSUE4S=QRPpSZUdafOSz0R+fRcM6Q@mail.gmail.com
2021-03-23 13:25:14 +09:00
Amit Kapila 4b82ed6eca Fix dangling pointer reference in stream_cleanup_files.
We can't access the entry after it is removed from dynahash.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Ps-pL++f6CJwPx2+vUqXuew=Xt-9Bi-6kCyxn+Fwi2M7w@mail.gmail.com
2021-03-23 09:43:33 +05:30
Tomas Vondra a5f002ad9a Use correct spelling of statistics kind
A couple error messages and comments used 'statistic kind', not the
correct 'statistics kind'. Fix and backpatch all the way back to 10,
where extended statistics were introduced.

Backpatch-through: 10
2021-03-23 05:01:35 +01:00
Fujii Masao 1e3e8b51bd Change the type of WalReceiverWaitStart wait event from Client to IPC.
Previously the type of this wait event was Client. But while this
wait event is being reported, walreceiver process is waiting for
the startup process to set initial data for streaming replication.
It's not waiting for any activity on a socket connected to a user
application or walsender. So this commit changes the type for
WalReceiverWaitStart wait event to IPC.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/cdacc27c-37ff-f1a4-20e2-ce19933abfcc@oss.nttdata.com
2021-03-23 10:09:42 +09:00
Fujii Masao 51893c8463 pg_waldump: Fix bug in per-record statistics.
pg_waldump --stats=record identifies a record by a combination
of the RmgrId and the four bits of the xl_info field of the record.
But XACT records use the first bit of those four bits for an optional
flag variable, and the following three bits for the opcode to
identify a record. So previously the same type of XACT record
could have different four bits (three bits are the same but the
first one bit is different), and which could cause
pg_waldump --stats=record to show two lines of per-record statistics
for the same XACT record. This is a bug.

This commit changes pg_waldump --stats=record so that it processes
only XACT record differently, i.e., filters the opcode out of xl_info
and uses a combination of the RmgrId and those three bits as
the identifier of a record, only for XACT record. For other records,
the four bits of the xl_info field are still used.

Back-patch to all supported branches.

Author: Kyotaro Horiguchi
Reviewed-by: Shinya Kato, Fujii Masao
Discussion: https://postgr.es/m/2020100913412132258847@highgo.ca
2021-03-23 09:53:08 +09:00
Bruce Momjian 95d77149c5 Add macro RelationIsPermanent() to report relation permanence
Previously, to check relation permanence, the Relation's Form_pg_class
structure member relpersistence was compared to the value
RELPERSISTENCE_PERMANENT ("p"). This commit adds the macro
RelationIsPermanent() and is used in appropirate places to simplify the
code.  This matches other RelationIs* macros.

This macro will be used in more places in future cluster file encryption
patches.

Discussion: https://postgr.es/m/20210318153134.GH20766@tamriel.snowman.net
2021-03-22 20:23:52 -04:00
Tomas Vondra 8e4b332e88 Optimize allocations in bringetbitmap
The bringetbitmap function allocates memory for various purposes, which
may be quite expensive, depending on the number of scan keys. Instead of
allocating them separately, allocate one bit chunk of memory an carve it
into smaller pieces as needed - all the pieces have the same lifespan,
and it saves quite a bit of CPU and memory overhead.

Author: Tomas Vondra <tomas.vondra@postgresql.org>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Masahiko Sawada <masahiko.sawada@enterprisedb.com>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
2021-03-23 00:47:09 +01:00
Tomas Vondra 72ccf55cb9 Move IS [NOT] NULL handling from BRIN support functions
The handling of IS [NOT] NULL clauses is independent of an opclass, and
most of the code was exactly the same in both minmax and inclusion. So
instead move the code from support procedures to the AM.

This simplifies the code - especially the support procedures - quite a
bit, as they don't need to care about NULL values and flags at all. It
also means the IS [NOT] NULL clauses can be evaluated without invoking
the support procedure.

Author: Tomas Vondra <tomas.vondra@postgresql.org>
Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Reviewed-by: Nikita Glukhov <n.gluhov@postgrespro.ru>
Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Masahiko Sawada <masahiko.sawada@enterprisedb.com>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
2021-03-23 00:45:42 +01:00
Tomas Vondra a1c649d889 Pass all scan keys to BRIN consistent function at once
This commit changes how we pass scan keys to BRIN consistent function.
Instead of passing them one by one, we now pass all scan keys for a
given attribute at once. That makes the consistent function a bit more
complex, as it has to loop through the keys, but it does allow more
elaborate opclasses that can use multiple keys to eliminate ranges much
more effectively.

The existing BRIN opclasses (minmax, inclusion) don't really benefit
from this change. The primary purpose is to allow future opclases to
benefit from seeing all keys at once.

This does change the BRIN API, because the signature of the consistent
function changes (a new parameter with number of scan keys). So this
breaks existing opclasses, and will require supporting two variants of
the code for different PostgreSQL versions. We've considered supporting
two variants of the consistent, but we've decided not to do that.
Firstly, there's another patch that moves handling of NULL values from
the opclass, which means the opclasses need to be updated anyway.
Secondly, we're not aware of any out-of-core BRIN opclasses, so it does
not seem worth the extra complexity.

Bump catversion, because of pg_proc changes.

Author: Tomas Vondra <tomas.vondra@postgresql.org>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: Nikita Glukhov <n.gluhov@postgrespro.ru>
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
2021-03-23 00:45:03 +01:00
Tomas Vondra bfa2cee784 Move bsearch_arg to src/port
Until now the bsearch_arg function was used only in extended statistics
code, so it was defined in that code.  But we already have qsort_arg in
src/port, so let's move it next to it.
2021-03-23 00:11:22 +01:00
Tom Lane 063dd37ebc Short-circuit slice requests that are for more than the object's size.
substring(), and perhaps other callers, isn't careful to pass a
slice length that is no more than the datum's true size.  Since
toast_decompress_datum_slice's children will palloc the requested
slice length, this can waste memory.  Also, close study of the liblz4
documentation suggests that it is dependent on the caller to not ask
for more than the correct amount of decompressed data; this squares
with observed misbehavior with liblz4 1.8.3.  Avoid these problems
by switching to the normal full-decompression code path if the
slice request is >= datum's decompressed size.

Tom Lane and Dilip Kumar

Discussion: https://postgr.es/m/507597.1616370729@sss.pgh.pa.us
2021-03-22 14:01:20 -04:00
Tom Lane aeb1631ed2 Mostly-cosmetic adjustments of TOAST-related macros.
The authors of bbe0a81db hadn't quite got the idea that macros named
like SOMETHING_4B_C were only meant for internal endianness-related
details in postgres.h.  Choose more legible names for macros that are
intended to be used elsewhere.  Rearrange postgres.h a bit to clarify
the separation between those internal macros and ones intended for
wider use.

Also, avoid using the term "rawsize" for true decompressed size;
we've used "extsize" for that, because "rawsize" generally denotes
total Datum size including header.  This choice seemed particularly
unfortunate in tests that were comparing one of these meanings to
the other.

This patch includes a couple of not-purely-cosmetic changes: be
sure that the shifts aligning compression methods are unsigned
(not critical today, but will be when compression method 2 exists),
and fix broken definition of VARATT_EXTERNAL_GET_COMPRESSION (now
VARATT_EXTERNAL_GET_COMPRESS_METHOD), whose callers worked only
accidentally.

Discussion: https://postgr.es/m/574197.1616428079@sss.pgh.pa.us
2021-03-22 13:43:10 -04:00
Tom Lane 2c75f8a612 Remove useless configure probe for <lz4/lz4.h>.
This seems to have been just copied-and-pasted from some other
header checks.  But our C code is entirely unprepared to support
such a header name, so it's only wasting cycles to look for it.
If we did need to support it, some #ifdefs would be required.

(A quick trawl at codesearch.debian.net finds some packages that
reference lz4/lz4.h; but they use *only* that spelling, and
appear to be intending to reference their own copy rather than
a system-level installation of liblz4.  There's no evidence of
freestanding installations that require this spelling.)

Discussion: https://postgr.es/m/457962.1616362509@sss.pgh.pa.us
2021-03-22 11:20:44 -04:00
Robert Haas a4d5284a10 Error on invalid TOAST compression in CREATE or ALTER TABLE.
The previous coding treated an invalid compression method name as
equivalent to the default, which is certainly not right.

Justin Pryzby

Discussion: http://postgr.es/m/20210321235544.GD4203@telsasoft.com
2021-03-22 10:57:08 -04:00
Robert Haas 226e2be387 More code cleanup for configurable TOAST compression.
Remove unused macro. Fix confusion about whether a TOAST compression
method is identified by an OID or a char.

Justin Pryzby

Discussion: http://postgr.es/m/20210321235544.GD4203@telsasoft.com
2021-03-22 09:21:37 -04:00
Michael Paquier 909b449e00 Fix concurrency issues with WAL segment recycling on Windows
This commit is mostly a revert of aaa3aed, that switched the routine
doing the internal renaming of recycled WAL segments to use on Windows a
combination of CreateHardLinkA() plus unlink() instead of rename().  As
reported by several users of Postgres 13, this is causing concurrency
issues when manipulating WAL segments, mostly in the shape of the
following error:
LOG:  could not rename file "pg_wal/000000XX000000YY000000ZZ":
Permission denied

This moves back to a logic where a single rename() (well, pgrename() for
Windows) is used.  This issue has proved to be hard to hit when I tested
it, facing it only once with an archive_command that was not able to do
its work, so it is environment-sensitive.  The reporters of this issue
have been able to confirm that the situation improved once we switched
back to a single rename().  In order to check things, I have provided to
the reporters a patched build based on 13.2 with aaa3aed reverted, to
test if the error goes away, and an unpatched build of 13.2 to test if
the error still showed up (just to make sure that I did not mess up my
build process).

Extra thanks to Fujii Masao for pointing out what looked like the
culprit commit, and to all the reporters for taking the time to test
what I have sent them.

Reported-by: Andrus, Guy Burgess, Yaroslav Pashinsky, Thomas Trenz
Reviewed-by: Tom Lane, Andres Freund
Discussion: https://postgr.es/m/3861ff1e-0923-7838-e826-094cc9bef737@hot.ee
Discussion: https://postgr.es/m/16874-c3eecd319e36a2bf@postgresql.org
Discussion: https://postgr.es/m/095ccf8d-7f58-d928-427c-b17ace23cae6@burgess.co.nz
Discussion: https://postgr.es/m/16927-67c570d968c99567%40postgresql.org
Discussion: https://postgr.es/m/YFBcRbnBiPdGZvfW@paquier.xyz
Backpatch-through: 13
2021-03-22 14:02:26 +09:00
Fujii Masao 8c6eda2d1c pgbench: Improve error-handling in \sleep command.
This commit improves pgbench \sleep command so that it handles
the following three cases more properly.

(1) When only one argument was specified in \sleep command and
    it's not a number, previously pgbench reported a confusing error
    message like "unrecognized time unit, must be us, ms or s".
    This commit fixes this so that more proper error message like
    "invalid sleep time, must be an integer" is reported.

(2) When two arguments were specified in \sleep command and
    the first argument was not a number, previously pgbench treated
    that argument as the sleep time 0. No error was reported in this
    case. This commit fixes this so that an error is thrown in this
    case.

(3) When a variable was specified as the first argument in \sleep
    command and the variable stored non-digit value, previously
    pgbench treated that argument as the sleep time 0. No error
    was reported in this case. This commit fixes this so that
    an error is thrown in this case.

Author: Kota Miyake
Reviewed-by: Hayato Kuroda, Alvaro Herrera, Fujii Masao
Discussion: https://postgr.es/m/23b254daf20cec4332a2d9168505dbc9@oss.nttdata.com
2021-03-22 12:04:48 +09:00
Noah Misch e3f4aec027 Make a test endure log_error_verbosity=verbose.
Back-patch to v13, which introduced the test code in question.
2021-03-21 19:09:29 -07:00
Michael Paquier 992d353a19 Fix new TAP test for 2PC transactions and PITRs on Windows
The test added by 595b9cb forgot that on Windows it is necessary to set
up pg_hba.conf (see PostgresNode::set_replication_conf) with a specific
entry or base backups fail.  Any node that requires to support
replication just needs to pass down allows_streaming at initialization.
This updates the test to do so.  Simplify things a bit while on it.

Per buildfarm member fairywren.  Any Windows hosts running this test
would have failed, and I have reproduced the problem as well.

Backpatch-through: 10
2021-03-22 09:51:05 +09:00
Michael Paquier 11e1577a57 Simplify TAP tests of kerberos with expected log file contents
The TAP tests of kerberos rely on the logs generated by the backend to
check various connection scenarios.  In order to make sure that a given
test does not overlap with the log contents generated by a previous
test, the test suite relied on a logic with the logging collector and a
rotation of the log files to ensure the uniqueness of the log generated
with a wait phase.

Parsing the log contents for expected patterns is a problem that has
been solved in a simpler way by PostgresNode::issues_sql_like() where
the log file is truncated before checking for the contents generated,
with the backend sending its output to a log file given by pg_ctl
instead.  This commit switches the kerberos test suite to use such a
method, removing any wait phase and simplifying the whole logic,
resulting in less code.  If a failure happens in the tests, the contents
of the logs are still showed to the user at the moment of the failure
thanks to like(), so this has no impact on debugging capabilities.

I have bumped into this issue while reviewing a different patch set
aiming at extending the kerberos test suite to check for multiple
log patterns instead of one now.

Author: Michael Paquier
Reviewed-by: Stephen Frost, Bharath Rupireddy
Discussion: https://postgr.es/m/YFXcq2vBTDGQVBNC@paquier.xyz
2021-03-22 08:59:43 +09:00
Michael Paquier 595b9cba2a Fix timeline assignment in checkpoints with 2PC transactions
Any transactions found as still prepared by a checkpoint have their
state data read from the WAL records generated by PREPARE TRANSACTION
before being moved into their new location within pg_twophase/.  While
reading such records, the WAL reader uses the callback
read_local_xlog_page() to read a page, that is shared across various
parts of the system.  This callback, since 1148e22a, has introduced an
update of ThisTimeLineID when reading a record while in recovery, which
is potentially helpful in the context of cascading WAL senders.

This update of ThisTimeLineID interacts badly with the checkpointer if a
promotion happens while some 2PC data is read from its record, as, by
changing ThisTimeLineID, any follow-up WAL records would be written to
an timeline older than the promoted one.  This results in consistency
issues.  For instance, a subsequent server restart would cause a failure
in finding a valid checkpoint record, resulting in a PANIC, for
instance.

This commit changes the code reading the 2PC data to reset the timeline
once the 2PC record has been read, to prevent messing up with the static
state of the checkpointer.  It would be tempting to do the same thing
directly in read_local_xlog_page().  However, based on the discussion
that has led to 1148e22a, users may rely on the updates of
ThisTimeLineID when a WAL record page is read in recovery, so changing
this callback could break some cases that are working currently.

A TAP test reproducing the issue is added, relying on a PITR to
precisely trigger a promotion with a prepared transaction still
tracked.

Per discussion with Heikki Linnakangas, Kyotaro Horiguchi, Fujii Masao
and myself.

Author: Soumyadeep Chakraborty, Jimmy Yih, Kevin Yeap
Discussion: https://postgr.es/m/CAE-ML+_EjH_fzfq1F3RJ1=XaaNG=-Jz-i3JqkNhXiLAsM3z-Ew@mail.gmail.com
Backpatch-through: 10
2021-03-22 08:30:53 +09:00
Tom Lane ac897c4834 Fix assorted silliness in ATExecSetCompression().
It's not okay to scribble directly on a syscache entry.
Nor to continue accessing said entry after releasing it.

Also get rid of not-used local variables.

Per valgrind testing.
2021-03-21 18:43:07 -04:00
Peter Geoghegan 9dd963ae25 Recycle nbtree pages deleted during same VACUUM.
Maintain a simple array of metadata about pages that were deleted during
nbtree VACUUM's current btvacuumscan() call.  Use this metadata at the
end of btvacuumscan() to attempt to place newly deleted pages in the FSM
without further delay.  It might not yet be safe to place any of the
pages in the FSM by then (they may not be deemed recyclable), but we
have little to lose and plenty to gain by trying.  In practice there is
a very good chance that this will work out when vacuuming larger
indexes, where scanning the index naturally takes quite a while.

This commit doesn't change the page recycling invariants; it merely
improves the efficiency of page recycling within the confines of the
existing design.  Recycle safety is a part of nbtree's implementation of
what Lanin & Shasha call "the drain technique".  The design happens to
use transaction IDs (they're stored in deleted pages), but that in
itself doesn't align the cutoff for recycle safety to any of the
XID-based cutoffs used by VACUUM (e.g., OldestXmin).  All that matters
is whether or not _other_ backends might be able to observe various
inconsistencies in the tree structure (that they cannot just detect and
recover from by moving right).  Recycle safety is purely a question of
maintaining the consistency (or the apparent consistency) of a physical
data structure.

Note that running a simple serial test case involving a large range
DELETE followed by a VACUUM VERBOSE will probably show that any newly
deleted nbtree pages are not yet reusable/recyclable.  This is expected
in the absence of even one concurrent XID assignment.  It is an old
implementation restriction.  In practice it's unlikely to be the thing
that makes recycling remain unsafe, at least with larger indexes, where
recycling newly deleted pages during the same VACUUM actually matters.

An important high-level goal of this commit (as well as related recent
commits e5d8a999 and 9f3665fb) is to make expensive deferred cleanup
operations in index AMs rare in general.  If index vacuuming frequently
depends on the next VACUUM operation finishing off work that the current
operation started, then the general behavior of index vacuuming is hard
to predict.  This is relevant to ongoing work that adds a vacuumlazy.c
mechanism to skip index vacuuming in certain cases.  Anything that makes
the real world behavior of index vacuuming simpler and more linear will
also make top-down modeling in vacuumlazy.c more robust.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wzk76_P=67iUscb1UN44-gyZL-KgpsXbSxq_bdcMa7Q+wQ@mail.gmail.com
2021-03-21 15:25:39 -07:00
Tom Lane 4d399a6fbe Bring configure support for LZ4 up to snuff.
It's not okay to just shove the pkg_config results right into our
build flags, for a couple different reasons:

* This fails to maintain the separation between CPPFLAGS and CFLAGS,
as well as that between LDFLAGS and LIBS.  (The CPPFLAGS angle is,
I believe, the reason for warning messages reported when building
with MacPorts' liblz4.)

* If pkg_config emits anything other than -I/-D/-L/-l switches,
it's highly unlikely that we want to absorb those.  That'd be more
likely to break the build than do anything helpful.  (Even the -D
case is questionable; but we're doing that for libxml2, so I kept it.)

Also, it's not okay to skip doing an AC_CHECK_LIB probe, as
evidenced by recent build failure on topminnow; that should
have been caught at configure time.

Model fixes for this on configure's libxml2 support.

It appears that somebody overlooked an autoheader run, too.

Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com
2021-03-21 17:20:17 -04:00
Tom Lane fd1ac9a548 Make compression.sql regression test independent of default.
This test will fail in "make installcheck" if the installation's
default_toast_compression setting is not 'pglz'.  Make it robust
against that situation.

Dilip Kumar

Discussion: https://postgr.es/m/CAFiTN-t0w+Rc2U3S+y=7KWcLuOYNB5MfWeGdNa7+pg0UovVdcQ@mail.gmail.com
2021-03-21 16:26:44 -04:00
Andrew Dunstan ef82387384 Don't run recover crash_temp_files test in Windows perl
This reverts commit 677271a3a1.
"Unbreak recovery test on Windows"

The test hangs on Windows, and attempts to remedy the problem have
proved fragile at best. So we simply disable the test on Windows perl.
(Msys perl seems perfectly happy).

Discussion: https://postgr.es/m/5b748470-7335-5439-e876-6a88c951e1c5@dunslane.net
2021-03-21 15:12:03 -04:00
Alvaro Herrera 2b526ed2e1
Fix new memory leaks in libpq
My oversight in commit 9aa491abbf.

Per coverity.
2021-03-21 14:55:27 -03:00
Andrew Dunstan 677271a3a1 Unbreak recovery test on Windows
On Windows we need to send explicit quit messages to psql or the TAP tests
can hang.
2021-03-21 11:56:09 -04:00
Tom Lane 9fb9691a88 Suppress various new compiler warnings.
Compilers that don't understand that elog(ERROR) doesn't return
issued warnings here.  In the cases in libpq_pipeline.c, we were
not exactly helping things by failing to mark pg_fatal() as noreturn.

Per buildfarm.
2021-03-21 11:50:43 -04:00
Peter Eisentraut 96ae658e62 Move lwlock-release probe back where it belongs
The documentation specifically states that lwlock-release fires before
any released waiters have been awakened.  It worked that way until
ab5194e6f6, where is seems to have been
misplaced accidentally.  Move it back where it belongs.

Author: Craig Ringer <craig.ringer@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/CAGRY4nwxKUS_RvXFW-ugrZBYxPFFM5kjwKT5O+0+Stuga5b4+Q@mail.gmail.com
2021-03-21 08:02:30 +01:00
Tomas Vondra 882b2cdc08 Use valid compression method in brin_form_tuple
When compressing the BRIN summary, we can't simply use the compression
method from the indexed attribute.  The summary may use a different data
type, e.g. fixed-length attribute may have varlena summary, leading to
compression failures.  For the built-in BRIN opclasses this happens to
work, because the summary uses the same data type as the attribute.

When the data types match, we can inherit use the compression method
specified for the attribute (it's copied into the index descriptor).
Otherwise we don't have much choice and have to use the default one.

Author: Tomas Vondra
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/e0367f27-392c-321a-7411-a58e1a7e4817%40enterprisedb.com
2021-03-21 00:28:34 +01:00
Tom Lane aa25d1089a Fix up pg_dump's handling of per-attribute compression options.
The approach used in commit bbe0a81db would've been disastrous for
portability of dumps.  Instead handle non-default compression options
in separate ALTER TABLE commands.  This reduces chatter for the
common case where most columns are compressed the same way, and it
makes it possible to restore the dump to a server that lacks any
knowledge of per-attribute compression options (so long as you're
willing to ignore syntax errors from the ALTER TABLE commands).

There's a whole lot left to do to mop up after bbe0a81db, but
I'm fast-tracking this part because we need to see if it's
enough to make the buildfarm's cross-version-upgrade tests happy.

Justin Pryzby and Tom Lane

Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com
2021-03-20 15:01:10 -04:00
Tom Lane e835e89a0f Fix memory leak when rejecting bogus DH parameters.
While back-patching e0e569e1d, I noted that there were some other
places where we ought to be applying DH_free(); namely, where we
load some DH parameters from a file and then reject them as not
being sufficiently secure.  While it seems really unlikely that
anybody would hit these code paths in production, let alone do
so repeatedly, let's fix it for consistency.

Back-patch to v10 where this code was introduced.

Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org
2021-03-20 12:47:21 -04:00
Tom Lane f0c2a5bba6 Avoid leaking memory in RestoreGUCState(), and improve comments.
RestoreGUCState applied InitializeOneGUCOption to already-live
GUC entries, causing any malloc'd subsidiary data to be forgotten.
We do want the effect of resetting the GUC to its compiled-in
default, and InitializeOneGUCOption seems like the best way to do
that, so add code to free any existing subsidiary data beforehand.

The interaction between can_skip_gucvar, SerializeGUCState, and
RestoreGUCState is way more subtle than their opaque comments
would suggest to an unwary reader.  Rewrite and enlarge the
comments to try to make it clearer what's happening.

Remove a long-obsolete assertion in read_nondefault_variables: the
behavior of set_config_option hasn't depended on IsInitProcessingMode
since f5d9698a8 installed a better way of controlling it.

Although this is fixing a clear memory leak, the leak is quite unlikely
to involve any large amount of data, and it can only happen once in the
lifetime of a worker process.  So it seems unnecessary to take any
risk of back-patching.

Discussion: https://postgr.es/m/4105247.1616174862@sss.pgh.pa.us
2021-03-19 23:03:17 -04:00
Thomas Munro 61752afb26 Provide recovery_init_sync_method=syncfs.
Since commit 2ce439f3 we have opened every file in the data directory
and called fsync() at the start of crash recovery.  This can be very
slow if there are many files, leading to field complaints of systems
taking minutes or even hours to begin crash recovery.

Provide an alternative method, for Linux only, where we call syncfs() on
every possibly different filesystem under the data directory.  This is
equivalent, but avoids faulting in potentially many inodes from
potentially slow storage.

The new mode comes with some caveats, described in the documentation, so
the default value for the new setting is "fsync", preserving the older
behavior.

Reported-by: Michael Brown <michael.brown@discourse.org>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Paul Guo <guopa@vmware.com>
Reviewed-by: Bruce Momjian <bruce@momjian.us>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: David Steele <david@pgmasters.net>
Discussion: https://postgr.es/m/11bc2bb7-ecb5-3ad0-b39f-df632734cd81%40discourse.org
Discussion: https://postgr.es/m/CAEET0ZHGnbXmi8yF3ywsDZvb3m9CbdsGZgfTXscQ6agcbzcZAw%40mail.gmail.com
2021-03-20 12:07:28 +13:00
Tomas Vondra b822ae13ea Use lfirst_int in cmp_list_len_contents_asc
The function added in be45be9c33 is comparing integer lists (IntList) by
length and contents, but there were two bugs.  Firstly, it used intVal()
to extract the value, but that's for Value nodes, not for extracting int
values from IntList.  Secondly, it called it directly on the ListCell,
without doing lfirst().  So just do lfirst_int() instead.

Interestingly enough, this did not cause any crashes on the buildfarm,
but valgrind rightfully complained about it.

Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
2021-03-20 00:04:25 +01:00
Robert Haas d00fbdc431 Fix use-after-ReleaseSysCache problem in ATExecAlterColumnType.
Introduced by commit bbe0a81db6.

Per buildfarm member prion.
2021-03-19 17:17:48 -04:00
Robert Haas bbe0a81db6 Allow configurable LZ4 TOAST compression.
There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option in up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.

In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.

Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compessed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compresison types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.

It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT.  It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data.  These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project.  However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.

Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those were patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allow a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored.  More recently, Justin Przyby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.

Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
2021-03-19 15:10:38 -04:00
Tomas Vondra e589c4890b Fix race condition in remove_temp_files_after_crash TAP test
The TAP test was written so that it was not waiting for the correct SQL
command, but for output from the preceding one.  This resulted in race
conditions, allowing the commands to run in a different order, not block
as expected and so on.  This fixes it by inverting the order of commands
where possible, so that observing the output guarantees the data was
inserted properly, and waiting for a lock to appear in pg_locks.

Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
2021-03-19 18:12:39 +01:00
Tom Lane 27ab1981e7 Blindly try to fix test script's tar invocation for MSYS.
Buildfarm member fairywren doesn't like the test case I added
in commit 081876d75.  I'm guessing the reason is that I shouldn't
be using a perl2host-ified path in the tar command line.
2021-03-18 22:43:11 -04:00
Fujii Masao fd31214075 Fix comments in postmaster.c.
Commit 86c23a6eb2 changed the option to specify that postgres will
stop all other server processes by sending the signal SIGSTOP,
from -s to -T. But previously there were comments incorrectly
explaining that SIGSTOP behavior is set by -s option. This commit
fixes them.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20210316.165141.1400441966284654043.horikyota.ntt@gmail.com
2021-03-19 11:28:54 +09:00
Tom Lane 9bacdf9f53 Don't leak malloc'd error string in libpqrcv_check_conninfo().
We leaked the error report from PQconninfoParse, when there was
one.  It seems unlikely that real usage patterns would repeat
the failure often enough to create serious bloat, but let's
back-patch anyway to keep the code similar in all branches.

Found via valgrind testing.
Back-patch to v10 where this code was added.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18 22:22:47 -04:00
Tom Lane 377b7a8300 Don't leak malloc'd strings when a GUC setting is rejected.
Because guc.c prefers to keep all its string values in malloc'd
not palloc'd storage, it has to be more careful than usual to
avoid leaks.  Error exits out of string GUC hook checks failed
to clear the proposed value string, and error exits out of
ProcessGUCArray() failed to clear the malloc'd results of
ParseLongOption().

Found via valgrind testing.
This problem is ancient, so back-patch to all supported branches.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18 22:22:47 -04:00
Tom Lane d303849b05 Don't leak compiled regex(es) when an ispell cache entry is dropped.
The text search cache mechanisms assume that we can clean up
an invalidated dictionary cache entry simply by resetting the
associated long-lived memory context.  However, that does not work
for ispell affixes that make use of regular expressions, because
the regex library deals in plain old malloc.  Hence, we leaked
compiled regex(es) any time we dropped such a cache entry.  That
could quickly add up, since even a fairly trivial regex can use up
tens of kB, and a large one can eat megabytes.  Add a memory context
callback to ensure that a regex gets freed when its owning cache
entry is cleared.

Found via valgrind testing.
This problem is ancient, so back-patch to all supported branches.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18 22:22:47 -04:00
Tom Lane 415ffdc220 Don't run RelationInitTableAccessMethod in a long-lived context.
Some code paths in this function perform syscache lookups, which
can lead to table accesses and possibly leakage of cruft into
the caller's context.  If said context is CacheMemoryContext,
we eventually will have visible bloat.  But fixing this is no
harder than moving one memory context switch step.  (The other
callers don't have a problem.)

Andres Freund and I independently found this via valgrind testing.
Back-patch to v12 where this code was added.

Discussion: https://postgr.es/m/20210317023101.anvejcfotwka6gaa@alap3.anarazel.de
Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18 22:22:47 -04:00
Tom Lane 28644fac10 Don't leak rd_statlist when a relcache entry is dropped.
Although these lists are usually NIL, and even when not empty
are unlikely to be large, constant relcache update traffic could
eventually result in visible bloat of CacheMemoryContext.

Found via valgrind testing.
Back-patch to v10 where this field was added.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18 22:22:47 -04:00
Tomas Vondra a16b2b960f Fix TAP test for remove_temp_files_after_crash
The test included in cd91de0d17 had two simple flaws.

Firstly, the number of rows was low and on some platforms (e.g. 32-bit)
the sort did not require on-disk sort, so on those machines it was not
testing the automatic removal.  The test was however failing, because
without any temporary files the base/pgsql_tmp directory was not even
created.  Fixed by increasing the rowcount to 5000, which should be high
engough on any platform.

Secondly, the test used a simple sleep to wait for the temporary file to
be created.  This is obviously problematic, because on slow machines (or
with valgrind, CLOBBER_CACHE_ALWAYS etc.) it may take a while to create
the temporary file.  But we also want the tests run reasonably fast.
Fixed by instead relying on a UNIQUE constraint, blocking the query that
created the temporary file.

Author: Euler Taveira
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
2021-03-19 02:17:13 +01:00
Michael Paquier 5b2266e33f Improve tab completion of IMPORT FOREIGN SCHEMA with \h in psql
Only "IMPORT" was showing as result of the completion, while IMPORT
FOREIGN SCHEMA is the only command using this keyword in first
position.  This changes the completion to show the full command name
instead of just "IMPORT".

Reviewed-by: Georgios Kokolatos, Julien Rouhaud
Discussion: https://postgr.es/m/YFL6JneBiuMWYyoh@paquier.xyz
2021-03-19 09:18:41 +09:00
Tom Lane 1d581ce712 Fix misuse of foreach_delete_current().
Our coding convention requires this macro's result to be assigned
back to the original List variable.  In this usage, since the
List could not become empty, there was no actual bug --- but
some compilers warned about it.  Oversight in be45be9c3.

Discussion: https://postgr.es/m/35077b31-2d62-1e31-0e2e-ddb52d590b73@enterprisedb.com
2021-03-18 19:24:22 -04:00
Tomas Vondra be45be9c33 Implement GROUP BY DISTINCT
With grouping sets, it's possible that some of the grouping sets are
duplicate.  This is especially common with CUBE and ROLLUP clauses. For
example GROUP BY CUBE (a,b), CUBE (b,c) is equivalent to

  GROUP BY GROUPING SETS (
    (a, b, c),
    (a, b, c),
    (a, b, c),
    (a, b),
    (a, b),
    (a, b),
    (a),
    (a),
    (a),
    (c, a),
    (c, a),
    (c, a),
    (c),
    (b, c),
    (b),
    ()
  )

Some of the grouping sets are calculated multiple times, which is mostly
unnecessary.  This commit implements a new GROUP BY DISTINCT feature, as
defined in the SQL standard, which eliminates the duplicate sets.

Author: Vik Fearing
Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra
Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
2021-03-18 18:22:18 +01:00
Tomas Vondra cd91de0d17 Remove temporary files after backend crash
After a crash of a backend using temporary files, the files used to be
left behind, on the basis that it might be useful for debugging. But we
don't have any reports of anyone actually doing that, and it means the
disk usage may grow over time due to repeated backend failures (possibly
even hitting ENOSPC). So this behavior is a bit unfortunate, and fixing
it required either manual cleanup (deleting files, which is error-prone)
or restart of the instance (i.e. service disruption).

This implements automatic cleanup of temporary files, controled by a new
GUC remove_temp_files_after_crash. By default the files are removed, but
it can be disabled to restore the old behavior if needed.

Author: Euler Taveira
Reviewed-by: Tomas Vondra, Michael Paquier, Anastasia Lubennikova, Thomas Munro
Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
2021-03-18 17:38:28 +01:00
Magnus Hagander da18d829c2 Fix function name in error hint
pg_read_file() is the function that's in core, pg_file_read() is in
adminpack. But when using pg_file_read() in adminpack it calls the *C*
level function pg_read_file() in core, which probably threw the original
author off. But the error hint should be about the SQL function.

Reported-By: Sergei Kornilov
Backpatch-through: 11
Discussion: https://postgr.es/m/373021616060475@mail.yandex.ru
2021-03-18 11:22:20 +01:00
Amit Kapila c8f78b6161 Add a new GUC and a reloption to enable inserts in parallel-mode.
Commit 05c8482f7f added the implementation of parallel SELECT for
"INSERT INTO ... SELECT ..." which may incur non-negligible overhead in
the additional parallel-safety checks that it performs, even when, in the
end, those checks determine that parallelism can't be used. This is
normally only ever a problem in the case of when the target table has a
large number of partitions.

A new GUC option "enable_parallel_insert" is added, to allow insert in
parallel-mode. The default is on.

In addition to the GUC option, the user may want a mechanism to allow
inserts in parallel-mode with finer granularity at table level. The new
table option "parallel_insert_enabled" allows this. The default is true.

Author: "Hou, Zhijie"
Reviewed-by: Greg Nancarrow, Amit Langote, Takayuki Tsunakawa, Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
2021-03-18 07:25:27 +05:30
Andres Freund 5f79580ad6 Fix memory lifetime issues of replication slot stats.
When accessing replication slot stats, introduced in 9868167500,
pgstat_read_statsfiles() reads the data into newly allocated
memory. Unfortunately the current memory context at that point is the
callers, leading to leaks and use-after-free dangers.

The fix is trivial, explicitly use pgStatLocalContext. There's some
potential for further improvements, but that's outside of the scope of
this bugfix.

No backpatch necessary, feature is only in HEAD.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210317230447.c7uc4g3vbs4wi32i@alap3.anarazel.de
2021-03-17 16:21:46 -07:00
Tom Lane 8620a7f6db Code review for server's handling of "tablespace map" files.
While looking at Robert Foggia's report, I noticed a passel of
other issues in the same area:

* The scheme for backslash-quoting newlines in pathnames is just
wrong; it will misbehave if the last ordinary character in a pathname
is a backslash.  I'm not sure why we're bothering to allow newlines
in tablespace paths, but if we're going to do it we should do it
without introducing other problems.  Hence, backslashes themselves
have to be backslashed too.

* The author hadn't read the sscanf man page very carefully, because
this code would drop any leading whitespace from the path.  (I doubt
that a tablespace path with leading whitespace could happen in
practice; but if we're bothering to allow newlines in the path, it
sure seems like leading whitespace is little less implausible.)  Using
sscanf for the task of finding the first space is overkill anyway.

* While I'm not 100% sure what the rationale for escaping both \r and
\n is, if the idea is to allow Windows newlines in the file then this
code failed, because it'd throw an error if it saw \r followed by \n.

* There's no cross-check for an incomplete final line in the map file,
which would be a likely apparent symptom of the improper-escaping
bug.

On the generation end, aside from the escaping issue we have:

* If needtblspcmapfile is true then do_pg_start_backup will pass back
escaped strings in tablespaceinfo->path values, which no caller wants
or is prepared to deal with.  I'm not sure if there's a live bug from
that, but it looks like there might be (given the dubious assumption
that anyone actually has newlines in their tablespace paths).

* It's not being very paranoid about the possibility of random stuff
in the pg_tblspc directory.  IMO we should ignore anything without an
OID-like name.

The escaping rule change doesn't seem back-patchable: it'll require
doubling of backslashes in the tablespace_map file, which is basically
a basebackup format change.  The odds of that causing trouble are
considerably more than the odds of the existing bug causing trouble.
The rest of this seems somewhat unlikely to cause problems too,
so no back-patch.
2021-03-17 16:18:46 -04:00
Tom Lane a50e4fd028 Prevent buffer overrun in read_tablespace_map().
Robert Foggia of Trustwave reported that read_tablespace_map()
fails to prevent an overrun of its on-stack input buffer.
Since the tablespace map file is presumed trustworthy, this does
not seem like an interesting security vulnerability, but still
we should fix it just in the name of robustness.

While here, document that pg_basebackup's --tablespace-mapping option
doesn't work with tar-format output, because it doesn't.  To make it
work, we'd have to modify the tablespace_map file within the tarball
sent by the server, which might be possible but I'm not volunteering.
(Less-painful solutions would require changing the basebackup protocol
so that the source server could adjust the map.  That's not very
appetizing either.)
2021-03-17 16:10:37 -04:00
Tom Lane 081876d75e Add end-to-end testing of pg_basebackup's tar-format output.
The existing test script does run pg_basebackup with the -Ft option,
but it makes no real attempt to verify the sanity of the results.
We wouldn't know if the output is incompatible with standard "tar"
programs, nor if the server fails to start from the restored output.
Notably, this means that xlog.c's read_tablespace_map() is not being
meaningfully tested, since that code is used only in the tar-format
case.  (We do have reasonable coverage of restoring from plain-format
output, though it's over in src/test/recovery not here.)

Hence, attempt to untar the output and start a server from it,
rather just hoping it's OK.

This test assumes that the local "tar" has the "-C directory"
switch.  Although that's not promised by POSIX, my research
suggests that all non-extinct tar implementations have it.
Should the buildfarm's opinion differ, we can complicate the
test a bit to avoid requiring that.

Possibly this should be back-patched, but I'm unsure about
whether it could work on Windows before d66b23b03.
2021-03-17 14:52:55 -04:00
Thomas Munro 7f7f25f15e Revert "Fix race in Parallel Hash Join batch cleanup."
This reverts commit 378802e371.
This reverts commit 3b8981b6e1.

Discussion: https://postgr.es/m/CA%2BhUKGJmcqAE3MZeDCLLXa62cWM0AJbKmp2JrJYaJ86bz36LFA%40mail.gmail.com
2021-03-18 01:10:55 +13:00
Michael Paquier 9fd2952cf4 Fix comment in indexing.c
578b229, that removed support for WITH OIDS, has changed
CatalogTupleInsert() to not return an Oid, but one comment was still
mentioning that.

Author: Vik Fearing
Discussion: https://postgr.es/m/fef01975-ed10-3601-7b9e-80ecef72d00b@postgresfriends.org
2021-03-17 18:07:00 +09:00
Peter Eisentraut e1ae40f381 Small error message improvement 2021-03-17 08:17:33 +01:00
Thomas Munro 378802e371 Update the names of Parallel Hash Join phases.
Commit 3048898e dropped -ING from some wait event names that correspond
to barrier phases.  Update the phases' names to match.

While we're here making cosmetic changes, also rename "DONE" to "FREE".
That pairs better with "ALLOCATE", and describes the activity that
actually happens in that phase (as we do for the other phases) rather
than describing a state.  The distinction is clearer after bugfix commit
3b8981b6 split the phase into two.  As for the growth barriers, rename
their "ALLOCATE" phase to "REALLOCATE", which is probably a better
description of what happens then.  Also improve the comments about
the phases a bit.

Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
2021-03-17 18:43:04 +13:00
Thomas Munro 3b8981b6e1 Fix race in Parallel Hash Join batch cleanup.
With very unlucky timing and parallel_leader_participation off, PHJ
could attempt to access per-batch state just as it was being freed.
There was code intended to prevent that by checking for a cleared
pointer, but it was buggy.

Fix, by introducing an extra barrier phase.  The new phase
PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to
find a batch to help with, and PHJ_BUILD_DONE means that it is too late.
The last to detach will free the array of per-batch state as before, but
now it will also atomically advance the phase at the same time, so that
late attachers can avoid the hazard, without the data race.  This
mirrors the way per-batch hash tables are freed (see phases
PHJ_BATCH_PROBING and PHJ_BATCH_DONE).

Revealed by a one-off build farm failure, where BarrierAttach() failed a
sanity check assertion, because the memory had been clobbered by
dsa_free().

Back-patch to 11, where the code arrived.

Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
2021-03-17 18:05:39 +13:00
Thomas Munro 3792959949 Fix transaction.sql tests in higher isolation levels.
It seems like a useful sanity check to be able to run "installcheck"
against a cluster running with default_transaction_level set to
serializable or repeatable read.  Only one thing currently fails in
those configurations, so let's fix that.

No back-patch for now, because it fails in many other places in some of
the stable branches.  We'd have to go back and fix those too if we
included this configuration in automated testing.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGJUaHeK%3DHLATxF1JOKDjKJVrBKA-zmbPAebOM0Se2FQRg%40mail.gmail.com
2021-03-17 17:26:20 +13:00
Amit Kapila 6b67d72b60 Fix race condition in drop subscription's handling of tablesync slots.
Commit ce0fdbfe97 made tablesync slots permanent and allow Drop
Subscription to drop such slots. However, it is possible that before
tablesync worker could get the acknowledgment of slot creation, drop
subscription stops it and that can lead to a dangling slot on the
publisher. Prevent cancel/die interrupts while creating a slot in the
tablesync worker.

Reported-by: Thomas Munro as per buildfarm
Author: Amit Kapila
Reviewed-by: Vignesh C, Takamichi Osumi
Discussion: https://postgr.es/m/CA+hUKGJG9dWpw1cOQ2nzWU8PHjm=PTraB+KgE5648K9nTfwvxg@mail.gmail.com
2021-03-17 08:15:12 +05:30
Thomas Munro 9e7ccd9ef6 Enable parallelism in REFRESH MATERIALIZED VIEW.
Pass CURSOR_OPT_PARALLEL_OK to pg_plan_query() so that parallel plans
are considered when running the underlying SELECT query.  This wasn't
done in commit e9baa5e9, which did this for CREATE MATERIALIZED VIEW,
because it wasn't yet known to be safe.

Since REFRESH always inserts into a freshly created table before later
merging or swapping the data into place with separate operations, we can
enable such plans here too.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Hou, Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Luc Vlaming <luc@swarm64.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CALj2ACXg-4hNKJC6nFnepRHYT4t5jJVstYvri%2BtKQHy7ydcr8A%40mail.gmail.com
2021-03-17 15:04:17 +13:00
Peter Geoghegan fbe4cb3bd4 Fix comment about promising tuples.
Oversight in commit d168b66682, which added bottom-up index deletion.
2021-03-16 13:38:52 -07:00
Tom Lane 4b12ab18c9 Avoid corner-case memory leak in SSL parameter processing.
After reading the root cert list from the ssl_ca_file, immediately
install it as client CA list of the new SSL context.  That gives the
SSL context ownership of the list, so that SSL_CTX_free will free it.
This avoids a permanent memory leak if we fail further down in
be_tls_init(), which could happen if bogus CRL data is offered.

The leak could only amount to something if the CRL parameters get
broken after server start (else we'd just quit) and then the server
is SIGHUP'd many times without fixing the CRL data.  That's rather
unlikely perhaps, but it seems worth fixing, if only because the
code is clearer this way.

While we're here, add some comments about the memory management
aspects of this logic.

Noted by Jelte Fennema and independently by Andres Freund.
Back-patch to v10; before commit de41869b6 it doesn't matter,
since we'd not re-execute this code during SIGHUP.

Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org
2021-03-16 16:03:06 -04:00
Robert Haas 4078ce65a0 Fix a confusing amcheck corruption message.
Don't complain about the last TOAST chunk number being different
from what we expected if there are no TOAST chunks at all.
In such a case, saying that the final chunk number is 0 is not
really accurate, and the fact the value is missing from the
TOAST table is reported separately anyway.

Mark Dilger

Discussion: http://postgr.es/m/AA5506CE-7D2A-42E4-A51D-358635E3722D@enterprisedb.com
2021-03-16 15:42:50 -04:00
Stephen Frost c6fc50cb40 Use pre-fetching for ANALYZE
When we have posix_fadvise() available, we can improve the performance
of an ANALYZE by quite a bit by using it to inform the kernel of the
blocks that we're going to be asking for.  Similar to bitmap index
scans, the number of buffers pre-fetched is based off of the
maintenance_io_concurrency setting (for the particular tablespace or,
if not set, globally, via get_tablespace_maintenance_io_concurrency()).

Reviewed-By: Heikki Linnakangas, Tomas Vondra
Discussion: https://www.postgresql.org/message-id/VI1PR0701MB69603A433348EDCF783C6ECBF6EF0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2021-03-16 14:46:48 -04:00
Stephen Frost 94d13d474d Improve logging of auto-vacuum and auto-analyze
When logging auto-vacuum and auto-analyze activity, include the I/O
timing if track_io_timing is enabled.  Also, for auto-analyze, add the
read rate and the dirty rate, similar to how that information has
historically been logged for auto-vacuum.

Stephen Frost and Jakub Wartak

Reviewed-By: Heikki Linnakangas, Tomas Vondra
Discussion: https://www.postgresql.org/message-id/VI1PR0701MB69603A433348EDCF783C6ECBF6EF0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2021-03-16 14:46:48 -04:00
Tom Lane 1ea396362b Improve logging of bad parameter values in BIND messages.
Since commit ba79cb5dc, values of bind parameters have been logged
during errors in extended query mode.  However, we only did that after
we'd collected and converted all the parameter values, thus failing to
offer any useful localization of invalid-parameter problems.  Add a
separate callback that's used during parameter collection, and have it
print the parameter number, along with the input string if text input
format is used.

Justin Pryzby and Tom Lane

Discussion: https://postgr.es/m/20210104170939.GH9712@telsasoft.com
Discussion: https://postgr.es/m/CANfkH5k-6nNt-4cSv1vPB80nq2BZCzhFVR5O4VznYbsX0wZmow@mail.gmail.com
2021-03-16 11:16:41 -04:00
Alvaro Herrera 015061690c
(Blind) fix Perl splitting of strings at newlines
I forgot that Windows represents newlines as \r\n, so splitting a string
at /\s/ creates additional empty strings.  Let's rewrite that as /\s+/
to see if that avoids those.  (There's precedent for using that pattern
on Windows in other scripts.)

Previously: 91bdf499b3, 8ed428dc97, 650b967076.

Per buildfarm, via Tom Lane.
Discussion: https://postgr.es/m/3144460.1615860259@sss.pgh.pa.us
2021-03-16 10:36:28 -03:00
Michael Paquier 15639d5e8f Add some basic tests for progress reporting of COPY
This tests some basic features for progress reporting of COPY, relying
on an INSERT trigger that gets fired when doing COPY FROM with a file or
stdin, checking for sizes, number of tuples processed, and number of
tuples excluded by a WHERE clause.

Author: Josef Šimánek, Matthias van de Meent
Reviewed-by: Michael Paquier, Justin Pryzby, Bharath Rupireddy, Tomas
Vondra
Discussion: https://postgr.es/m/CAEze2WiOcgdH4aQA8NtZq-4dgvnJzp8PohdeKchPkhMY-jWZXA@mail.gmail.com
2021-03-16 09:55:43 +09:00
Alvaro Herrera 9aa491abbf
Add libpq pipeline mode support to pgbench
New metacommands \startpipeline and \endpipeline allow the user to run
queries in libpq pipeline mode.

Author: Daniel Vérité <daniel@manitou-mail.org>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/b4e34135-2bd9-4b8a-94ca-27d760da26d7@manitou-mail.org
2021-03-15 18:33:03 -03:00
Alvaro Herrera acb7e4eb6b
Implement pipeline mode in libpq
Pipeline mode in libpq lets an application avoid the Sync messages in
the FE/BE protocol that are implicit in the old libpq API after each
query.  The application can then insert Sync at its leisure with a new
libpq function PQpipelineSync.  This can lead to substantial reductions
in query latency.

Co-authored-by: Craig Ringer <craig.ringer@enterprisedb.com>
Co-authored-by: Matthieu Garrigues <matthieu.garrigues@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aya Iwata <iwata.aya@jp.fujitsu.com>
Reviewed-by: Daniel Vérité <daniel@manitou-mail.org>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kirk Jamison <k.jamison@fujitsu.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Nikhil Sontakke <nikhils@2ndquadrant.com>
Reviewed-by: Vaishnavi Prabakaran <VaishnaviP@fast.au.fujitsu.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>

Discussion: https://postgr.es/m/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com
Discussion: https://postgr.es/m/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
2021-03-15 18:13:42 -03:00
Tom Lane 146cb3889c Work around issues in MinGW-64's setjmp/longjmp support.
It's hard to avoid the conclusion that there is something wrong with
setjmp/longjmp on MinGW-64, as we have seen failures come and go after
entirely-unrelated-looking changes in our own code.  Other projects
such as Ruby have given up and started using gcc's setjmp/longjmp
builtins on that platform; this patch just follows that lead.

Note that this is a pretty fundamental ABI break for functions
containining either setjmp or longjmp, so we can't really consider
a back-patch.

Per reports from Regina Obe and Heath Lord, as well as recent failures
on buildfarm member walleye, and less-recent failures on fairywren.

Juan José Santamaría Flecha

Discussion: https://postgr.es/m/000401d716a0$1ed0fc70$5c72f550$@pcorp.us
Discussion: https://postgr.es/m/CA+BEBhvHhM-Bn628pf-LsjqRh3Ang7qCSBG0Ga+7KwhGqrNUPw@mail.gmail.com
Discussion: https://postgr.es/m/f1caef93-9640-022e-9211-bbe8755a56b0@2ndQuadrant.com
2021-03-15 12:34:17 -04:00
Thomas Munro eeb60e45d8 Drop SERIALIZABLE workaround from parallel query tests.
SERIALIZABLE no longer inhibits parallelism, so we can drop some
outdated workarounds and comments from regression tests.  The change
came in release 12, commit bb16aba5, but it's not really worth
back-patching.

Also fix a typo.

Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJUaHeK%3DHLATxF1JOKDjKJVrBKA-zmbPAebOM0Se2FQRg%40mail.gmail.com
2021-03-15 23:30:22 +13:00
Fujii Masao d75288fb27 Make archiver process an auxiliary process.
This commit changes WAL archiver process so that it's treated as
an auxiliary process and can use shared memory. This is an infrastructure
patch required for upcoming shared-memory based stats collector patch
series. These patch series basically need any processes including archiver
that can report the statistics to access to shared memory. Since this patch
itself is useful to simplify the code and when users monitor the status of
archiver, it's committed separately in advance.

This commit simplifies the code for WAL archiving. For example, previously
backends need to signal to archiver via postmaster when they notify
archiver that there are some WAL files to archive. On the other hand,
this commit removes that signal to postmaster and enables backends to
notify archier directly using shared latch.

Also, as the side of this change, the information about archiver process
becomes viewable at pg_stat_activity view.

Author: Kyotaro Horiguchi
Reviewed-by: Andres Freund, Álvaro Herrera, Julien Rouhaud, Tomas Vondra, Arthur Zakirov, Fujii Masao
Discussion: https://postgr.es/m/20180629.173418.190173462.horiguchi.kyotaro@lab.ntt.co.jp
2021-03-15 13:13:14 +09:00
Peter Geoghegan 0ea71c93a0 Notice that heap page has dead items during VACUUM.
Consistently set a flag variable that tracks whether the current heap
page has a dead item during lazy vacuum's heap scan.  We missed the
common case where there is an preexisting (or even a new) LP_DEAD heap
line pointer.

Also make it clear that the variable might be affected by an existing
line pointer, say from an earlier opportunistic pruning operation.  This
distinction is important because it's the main reason why we can't just
use the nearby tups_vacuumed variable instead.

No backpatch.  In theory failing to set the page level flag variable had
no consequences.  Currently it is only used to defensively check if a
page marked all visible has dead items, which should never happen anyway
(if it does then the table must be corrupt).

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Diagnosed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoAtZb4+HJT_8RoOXvu4HM-Zd4HKS3YSMCH6+-W=bDyh-w@mail.gmail.com
2021-03-14 18:05:57 -07:00
Tom Lane 58f57490fa Doc: add note about how to run the pg_amcheck regression tests.
It's not immediately obvious what you have to do to get "make
installcheck" to work here, so document that along the same lines
as we've used elsewhere.
2021-03-13 11:10:30 -05:00
Robert Haas 945d2cb7d0 In pg_amcheck tests, don't depend on perl's Q/q pack code.
It does not work on all versions of perl across all platforms.

To avoid endian-ness issues, pick a new value for column a
that has the same upper 4 bytes as lower 4 bytes. Try to
make it something that isn't likely to occur anywhere nearby
in the page.

Discussion: http://postgr.es/m/29DA079B-0658-4E66-BDAA-0EFD7B64D9C6@enterprisedb.com
2021-03-13 10:57:01 -05:00
Tom Lane 9e294d0f34 pg_amcheck: Keep trying to fix the tests.
Fix another example of non-portable option ordering in the tests.
Oversight in 24189277f.

Mark Dilger

Discussion: https://postgr.es/m/C37D28BA-3BA3-4776-B812-17F05F3472D8@enterprisedb.com
2021-03-13 00:06:56 -05:00
Amit Kapila c5be48f092 Improve FK trigger parallel-safety check added by 05c8482f7f.
Commit 05c8482f7f added special logic related to parallel-safety of FK
triggers. This is a bit of a hack and should have instead been done by
simply setting appropriate proparallel values on those trigger functions
themselves.

Suggested-by: Tom Lane
Author: Greg Nancarrow
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/2309260.1615485644@sss.pgh.pa.us
2021-03-13 09:20:52 +05:30
Robert Haas b9164eab20 pg_amcheck: Keep trying to fix the tests.
Commit 24189277f6 managed to remove
one of the two places where we were checking for a "no such user"
error while leaving the other one right next to it. So remove that
too. In fact, remove the entire test, because the whole point of
this test was to see which message we got on a failure.
2021-03-12 21:59:56 -05:00
Robert Haas 24189277f6 pg_amcheck: Try to fix still more test failures.
Avoid use of non-portable option ordering in command_checks_all().
The use of bare command line arguments before switches doesn't work
everywhere.  Per buildfarm members drongo and hoverfly.

Avoid testing for the message "role \"%s\" does not exist", because
some buildfarm machines report a different error. fairywren complains
about "SSPI authentication failed for user \"%s\"", for example.

Mark Dilger

Discussion: http://postgr.es/m/9E76E46A-48B2-4869-BD0C-422204C1F767@enterprisedb.com
Discussion: http://postgr.es/m/F0A1FD70-A2F4-4528-8A03-8650CAEC0554%40enterprisedb.com
2021-03-12 20:11:47 -05:00
Robert Haas f371a4cdba Try to avoid apparent platform-dependency in IPC::Run
It's hard to believe, but buildfarm results from the new pg_amcheck
suggest that command_checks_all() perform shell expansion on some
machines but not others, apparently due to an underlying behavior
difference in IPC::Run. Let's see if we can work around that - and
confirm that it is the real problem - by passing '-S*' as a single
argument rather than '-S' and '*' as two separate ones.

Failures were observed on jacana and hoverfly.

Mark Dilger

Discussion: http://postgr.es/m/9E76E46A-48B2-4869-BD0C-422204C1F767@enterprisedb.com
2021-03-12 19:00:41 -05:00
Robert Haas 6611256127 Fix portability issues in pg_amcheck's 004_verify_heapam.pl.
Test #12 overwrote a 1-byte varlena header to make it look like the
initial byte of a 4-byte varlena header, but the results were
endian-dependent. Also, the byte "abc" that followed the overwritten
byte would be interpreted differently depending on endian-ness.
Overwrite 4 bytes instead, in an endian-aware manner.

Test #13 accidentally managed to depend on TOAST_MAX_CHUNK_SIZE,
which varies slightly depending on MAXIMUM_ALIGNOF. That's not
the point anyway, so make the regexp insensitive to the expected
number of chunks.

Mark Dilger

Discussion: http://postgr.es/m/A80D68F6-E38F-482D-9522-E2FB6AAFE8A1@enterprisedb.com
2021-03-12 17:34:32 -05:00
Peter Geoghegan 02b5940dbe Consolidate nbtree VACUUM metapage routines.
Simplify _bt_vacuum_needs_cleanup() functions's signature (it only needs
a single 'rel' argument now), and move it next to its sibling function
in nbtpage.c.

I believe that _bt_vacuum_needs_cleanup() was originally located in
nbtree.c due to an include dependency issue.  That's no longer an issue.

Follow-up to commit 9f3665fb.
2021-03-12 13:11:47 -08:00
Robert Haas ac44595585 Move PG_USED_FOR_ASSERTS_ONLY before initializer.
Erik Rijkers reported a compile failure, and I think this is probably
the reason.
2021-03-12 15:04:10 -05:00
Robert Haas 7a1527c02c Adjust perl style.
Per buildfarm member crake.
2021-03-12 14:55:40 -05:00
Robert Haas d60e61de4f Try to fix compiler warnings.
Per report from Peter Geoghegan.

Discussion: http://postgr.es/m/CAH2-WznpwULZ3uJ1_6WXvNMXYbOy8k8tYs3r=qSdGmZeRd6tDw@mail.gmail.com
2021-03-12 14:35:10 -05:00
Robert Haas 9706092839 Add pg_amcheck, a CLI for contrib/amcheck.
This makes it a lot easier to run the corruption checks that are
implemented by contrib/amcheck against lots of relations and get
the result in an easily understandable format. It has a wide variety
of options for choosing which relations to check and which checks
to perform, and it can run checks in parallel if you want.

Mark Dilger, reviewed by Peter Geoghegan and by me.

Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com
2021-03-12 13:00:01 -05:00
Tom Lane 48d67fd897 Fix race condition in psql \e's detection of file modification.
psql's editing commands decide whether the user has edited the file
by checking for change of modification timestamp.  This is probably
fine for a pre-existing file, but with a temporary file that is
created within the command, it's possible for a fast typist to
save-and-exit in less than the one-second granularity of stat(2)
timestamps.  On Windows FAT filesystems the granularity is even
worse, 2 seconds, making the race a bit easier to hit.

To fix, try to set the temp file's mod time to be two seconds ago.
It's unlikely this would fail, but then again the race condition
itself is unlikely, so just ignore any error.

Also, we might as well check the file size as well as its mod time.

While this is a difficult bug to hit, it still seems worth
back-patching, to ensure that users' edits aren't lost.

Laurenz Albe, per gripe from Jacob Champion; based on fix suggestions
from Jacob and myself

Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b.camel@cybertec.at
2021-03-12 12:20:15 -05:00
Tom Lane f52c5d6749 Forbid marking an identity column as nullable.
GENERATED ALWAYS AS IDENTITY implies NOT NULL, but the code failed
to complain if you overrode that with "GENERATED ALWAYS AS IDENTITY
NULL".  One might think the old behavior was a feature, but it was
inconsistent because the outcome varied depending on the order of
the clauses, so it seems to have been just an oversight.

Per bug #16913 from Pavel Boev.  Back-patch to v10 where identity
columns were introduced.

Vik Fearing (minor tweaks by me)

Discussion: https://postgr.es/m/16913-3b5198410f67d8c6@postgresql.org
2021-03-12 11:08:42 -05:00
Thomas Munro 1b88b8908e Specialize checkpointer sort functions.
When sorting a potentially large number of dirty buffers, the
checkpointer can benefit from a faster sort routine.  One reported
improvement on a large buffer pool system was 1.4s -> 0.6s.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
2021-03-12 23:56:02 +13:00
Amit Kapila 519e4c9ee2 Fix size overflow in calculation introduced by commits d6ad34f3 and bea449c6.
Reported-by: Thomas Munro
Author: Takayuki Tsunakawa
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CA+hUKG+oPoFizjABt=GXZWTEHx3oev5rAe2scjW2r6F1rguo5w@mail.gmail.com
2021-03-12 15:42:08 +05:30
Amit Kapila e2cda3c20a Fix use of relcache TriggerDesc field introduced by commit 05c8482f7f.
The commit added code which used a relcache TriggerDesc field across
another cache access, which it shouldn't because the relcache doesn't
guarantee it won't get moved.

Diagnosed-by: Tom Lane
Author: Greg Nancarrow
Reviewed-by: Hou Zhijie, Amit Kapila
Discussion: https://postgr.es/m/2309260.1615485644@sss.pgh.pa.us
2021-03-12 15:14:41 +05:30
Thomas Munro 57dcc2ef33 Poll postmaster less frequently in recovery.
Since commits 9f095299 and f98b8476 we don't poll the postmaster
pipe at all during crash recovery on Linux and FreeBSD, but on other
operating systems we were still doing it for every WAL record.  Do it
less frequently on operating systems where system calls are required, at
the cost of delaying exit a bit after postmaster death.  This avoids
expensive system calls reported to slow down CPU-bound recovery by as
much as 10-30%.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
Discussion: https://postgr.es/m/7261eb39-0369-f2f4-1bb5-62f3b6083b5e@iki.fi
2021-03-12 19:45:42 +13:00
Thomas Munro de829ddf23 Add condition variable for walreceiver shutdown.
Use this new CV to wait for walreceiver shutdown without a sleep/poll
loop, while also benefiting from standard postmaster death handling.

Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
2021-03-12 19:45:42 +13:00
Thomas Munro 600f2f50b7 Add condition variable for recovery resume.
Replace a sleep loop with a CV, to get a fast reaction time when
recovery is resumed or the postmaster exits via standard infrastructure.
Unfortunately we still need to wake up every second to perform extra
polling during the recovery pause loop.

Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
2021-03-12 19:45:42 +13:00
Fujii Masao b82640df00 Send statistics collected during shutdown checkpoint to the stats collector.
When shutdown is requested, checkpointer performs checkpoint or
restartpoint, and updates the statistics, before it exits. But previously
checkpointer didn't send those statistics to the stats collector.

Shutdown checkpoint and restartpoint are treated as requested ones
instead of scheduled ones, so the number of them are counted in
pg_stat_bgwriter.checkpoints_req column.

Author: Masahiro Ikeda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
2021-03-12 14:23:00 +09:00
Fujii Masao 33394ee6f2 Force to send remaining WAL stats to the stats collector at walwriter exit.
In walwriter's main loop, WAL stats message is only sent if enough time
has passed since last one was sent to reach PGSTAT_STAT_INTERVAL msecs.
This is necessary to avoid overloading to the stats collector. But this
can cause recent WAL stats to be unsent when walwriter exits.

To ensure that all the WAL stats are sent, this commit makes walwriter
force to send remaining WAL stats to the collector when it exits because
of shutdown request. Note that those remaining WAL stats can still be
unsent when walwriter exits with non-zero exit code (e.g., FATAL error).
This is OK because that walwriter exit leads to server crash and
subsequent recovery discards all the stats. So there is no need to send
remaining stats in that case.

Author: Masahiro Ikeda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
2021-03-12 13:29:59 +09:00
Thomas Munro 43c6662496 Minor modernization for README.barrier.
Itanium is very uncommon and being discontinued.  ARM is everywhere.
Prefer ARM as an example of an architecture with weak memory ordering.
2021-03-12 15:36:16 +13:00
Peter Geoghegan 7bb97211a5 Save a few cycles during nbtree VACUUM.
Avoid calling RelationGetNumberOfBlocks() unnecessarily in the common
case where there are no deleted but not yet recycled pages to recycle
during a cleanup-only nbtree VACUUM operation.

Follow-up to commit e5d8a999, which (among other things) taught the
"skip full scan" nbtree VACUUM mechanism to only trigger a full index
scan when the absolute number of deleted pages in the index is
considered excessive.
2021-03-11 14:18:23 -08:00
Peter Geoghegan effdd3f3b6 Add back vacuum_cleanup_index_scale_factor parameter.
Commit 9f3665fb removed the vacuum_cleanup_index_scale_factor storage
parameter.  However, that creates dump/reload hazards when moving across
major versions.

Add back the vacuum_cleanup_index_scale_factor parameter (though not the
GUC of the same name) purely to avoid problems when using tools like
pg_upgrade.  The parameter remains disabled and undocumented.

No backpatch to Postgres 13, since vacuum_cleanup_index_scale_factor was
only disabled by REL_13_STABLE's version of master branch commit
9f3665fb in the first place -- the parameter already looks like this on
REL_13_STABLE.

Discussion: https://postgr.es/m/YEm/a3Ko3nKnBuVq@paquier.xyz
2021-03-11 12:42:46 -08:00
Robert Haas 32fd2b57d7 Be clear about whether a recovery pause has taken effect.
Previously, the code and documentation seem to have essentially
assumed than a call to pg_wal_replay_pause() would take place
immediately, but that's not the case, because we only check for a
pause in certain places. This means that a tool that uses this
function and then wants to do something else afterward that is
dependent on the pause having taken effect doesn't know how long it
needs to wait to be sure that no more WAL is going to be replayed.

To avoid that, add a new function pg_get_wal_replay_pause_state()
which returns either 'not paused', 'paused requested', or 'paused'.
After calling pg_wal_replay_pause() the status will immediate change
from 'not paused' to 'pause requested'; when the startup process
has noticed this, the status will change to 'pause'.  For backward
compatibility, pg_is_wal_replay_paused() still exists and returns
the same thing as before: true if a pause has been requested,
whether or not it has taken effect yet; and false if not.
The documentation is updated to clarify.

To improve the changes that a pause request is quickly confirmed
effective, adjust things so that WaitForWALToBecomeAvailable will
swiftly reach a call to recoveryPausesHere() when a pause request
is made.

Dilip Kumar, reviewed by Simon Riggs, Kyotaro Horiguchi, Yugo Nagata,
Masahiko Sawada, and Bharath Rupireddy.

Discussion: http://postgr.es/m/CAFiTN-vcLLWEm8Zr%3DYK83rgYrT9pbC8VJCfa1kY9vL3AUPfu6g%40mail.gmail.com
2021-03-11 15:07:03 -05:00
Tom Lane 51c54bb603 Re-simplify management of inStart in pqParseInput3's subroutines.
Commit 92785dac2 copied some logic related to advancement of inStart
from pqParseInput3 into getRowDescriptions and getAnotherTuple,
because it wanted to allow user-defined row processor callbacks to
potentially longjmp out of the library, and inStart would have to be
updated before that happened to avoid an infinite loop.  We later
decided that that API was impossibly fragile and reverted it, but
we didn't undo all of the related code changes, and this bit of
messiness survived.  Undo it now so that there's just one place in
pqParseInput3's processing where inStart is advanced; this will
simplify addition of better tracing support.

getParamDescriptions had grown similar processing somewhere along
the way (not in 92785dac2; I didn't track down just when), but it's
actually buggy because its handling of corrupt-message cases seems to
have been copied from the v2 logic where we lacked a known message
length.  The cases where we "goto not_enough_data" should not simply
return EOF, because then we won't consume the message, potentially
creating an infinite loop.  That situation now represents a
definitively corrupt message, and we should report it as such.

Although no field reports of getParamDescriptions getting stuck in
a loop have been seen, it seems appropriate to back-patch that fix.
I chose to back-patch all of this to keep the logic looking more alike
in supported branches.

Discussion: https://postgr.es/m/2217283.1615411989@sss.pgh.pa.us
2021-03-11 14:43:45 -05:00
Robert Haas f71519e545 Refactor and generalize the ParallelSlot machinery.
Create a wrapper object, ParallelSlotArray, to encapsulate the
number of slots and the slot array itself, plus some other relevant
bits of information. This reduces the number of parameters we have
to pass around all over the place.

Allow for a ParallelSlotArray to contain slots connected to
different databases within a single cluster. The current clients
of this mechanism don't need this, but it is expected to be used
by future patches.

Defer connecting to databases until we actually need the connection
for something. This is a slight behavior change for vacuumdb and
reindexdb. If you specify a number of jobs that is larger than the
number of objects, the extra connections will now not be used.
But, on the other hand, if you specify a number of jobs that is
so large that it's going to fail, the failure would previously have
happened before any operations were actually started, and now it
won't.

Mark Dilger, reviewed by me.

Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com
2021-03-11 13:17:46 -05:00
Michael Paquier 2c0cefcd18 Set libcrypto callbacks for all connection threads in libpq
Based on an analysis of the OpenSSL code with Jacob, moving to EVP for
the cryptohash computations makes necessary the setup of the libcrypto
callbacks that were getting set only for SSL connections, but not for
connections without SSL.  Not setting the callbacks makes the use of
threads potentially unsafe for connections calling cryptohashes during
authentication, like MD5 or SCRAM, if a failure happens during a
cryptohash computation.  The logic setting the libssl and libcrypto
states is then split into two parts, both using the same locking, with
libcrypto being set up for SSL and non-SSL connections, while SSL
connections set any libssl state afterwards as needed.

Prior to this commit, only SSL connections would have set libcrypto
callbacks that are necessary to ensure a proper thread locking when
using multiple concurrent threads in libpq (ENABLE_THREAD_SAFETY).  Note
that this is only required for OpenSSL 1.0.2 and 1.0.1 (oldest version
supported on HEAD), as 1.1.0 has its own internal locking and it has
dropped support for CRYPTO_set_locking_callback().

Tests with up to 300 threads with OpenSSL 1.0.1 and 1.0.2, mixing SSL
and non-SSL connection threads did not show any performance impact after
some micro-benchmarking.  pgbench can be used here with -C and a
mostly-empty script (with one \set meta-command for example) to stress
authentication requests, and we have mixed that with some custom
programs for testing.

Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion
Discussion: https://postgr.es/m/fd3ba610085f1ff54623478cf2f7adf5af193cbb.camel@vmware.com
2021-03-11 17:14:25 +09:00
Thomas Munro 049d9b872d Improve comment for struct BufferDesc.
Add a note that per-buffer I/O condition variables currently live
outside the BufferDesc struct.  Follow-up for commit d8725104.

Reported-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://postgr.es/m/20210311031118.hucytmrgwlktjxgq%40nol
2021-03-11 16:38:45 +13:00
Bruce Momjian 2950ff32f3 tutorial: land height is "elevation", not "altitude"
This is a follow-on patch to 92c12e46d5.  In that patch, we renamed
"altitude" to "elevation" in the docs, based on these details:

   https://mapscaping.com/blogs/geo-candy/what-is-the-difference-between-elevation-relief-and-altitude

This renames the tutorial SQL files to match the documentation.

Reported-by: max1@inbox.ru

Discussion: https://postgr.es/m/161512392887.1046.3137472627109459518@wrigleys.postgresql.org

Backpatch-through: 9.6
2021-03-10 20:25:19 -05:00
Peter Geoghegan 5f8727f5a6 VACUUM ANALYZE: Always update pg_class.reltuples.
vacuumlazy.c sometimes fails to update pg_class entries for each index
(to ensure that pg_class.reltuples is current), even though analyze.c
assumed that that must have happened during VACUUM ANALYZE.  There are
at least a couple of reasons for this.  For example, vacuumlazy.c could
fail to update pg_class when the index AM indicated that its statistics
are merely an estimate, per the contract for amvacuumcleanup() routines
established by commit e57345975c back in 2006.

Stop assuming that pg_class must have been updated with accurate
statistics within VACUUM ANALYZE -- update pg_class for indexes at the
same time as the table relation in all cases.  That way VACUUM ANALYZE
will never fail to keep pg_class.reltuples reasonably accurate.

The only downside of this approach (compared to the old approach) is
that it might inaccurately set pg_class.reltuples for indexes whose heap
relation ends up with the same inaccurate value anyway.  This doesn't
seem too bad.  We already consistently called vac_update_relstats() (to
update pg_class) for the heap/table relation twice during any VACUUM
ANALYZE -- once in vacuumlazy.c, and once in analyze.c.  We now make
sure that we call vac_update_relstats() at least once (though often
twice) for each index.

This is follow up work to commit 9f3665fb, which dealt with issues in
btvacuumcleanup().  Technically this fixes an unrelated issue, though.
btvacuumcleanup() no longer provides an accurate num_index_tuples value
following commit 9f3665fb (when there was no btbulkdelete() call during
the VACUUM operation in question), but hashvacuumcleanup() has worked in
the same way for many years now.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com
Backpatch: 13-, just like commit 9f3665fb.
2021-03-10 17:07:57 -08:00
Peter Geoghegan 9f3665fbfc Don't consider newly inserted tuples in nbtree VACUUM.
Remove the entire idea of "stale stats" within nbtree VACUUM (stop
caring about stats involving the number of inserted tuples).  Also
remove the vacuum_cleanup_index_scale_factor GUC/param on the master
branch (though just disable them on postgres 13).

The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM
partially responsible for deciding when pg_class.reltuples stats needed
to be updated.  This seems contrary to the spirit of the index AM API,
though -- it is not actually necessary for an index AM's bulk delete and
cleanup callbacks to provide accurate stats when it happens to be
inconvenient.  The core code owns that.  (Index AMs have the authority
to perform or not perform certain kinds of deferred cleanup based on
their own considerations, such as page deletion and recycling, but that
has little to do with pg_class.reltuples/num_index_tuples.)

This issue was fairly harmless until the introduction of the
autovacuum_vacuum_insert_threshold feature by commit b07642db, which had
an undesirable interaction with the vacuum_cleanup_index_scale_factor
mechanism: it made insert-driven autovacuums perform full index scans,
even though there is no real benefit to doing so.  This has been tied to
a regression with an append-only insert benchmark [1].

Also have remaining cases that perform a full scan of an index during a
cleanup-only nbtree VACUUM indicate that the final tuple count is only
an estimate.  This prevents vacuumlazy.c from setting the index's
pg_class.reltuples in those cases (it will now only update pg_class when
vacuumlazy.c had TIDs for nbtree to bulk delete).  This arguably fixes
an oversight in deduplication-related bugfix commit 48e12913.

[1] https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoA4WHthN5uU6+WScZ7+J_RcEjmcuH94qcoUPuB42ShXzg@mail.gmail.com
Backpatch: 13-, where autovacuum_vacuum_insert_threshold was added.
2021-03-10 16:27:01 -08:00
Bruce Momjian 845ac7f847 C comments: improve description of GiST NSN and GistBuildLSN
GiST indexes are complex, so adding more details in the code might help
someone.

Discussion: https://postgr.es/m/20210302164021.GA364@momjian.us
2021-03-10 17:03:10 -05:00
Thomas Munro d87251048a Replace buffer I/O locks with condition variables.
1.  Backends waiting for buffer I/O are now interruptible.

2.  If something goes wrong in a backend that is currently performing
I/O, waiting backends no longer wake up until that backend reaches
AbortBufferIO() and broadcasts on the CV.  Previously, any waiters would
wake up (because the I/O lock was automatically released) and then
busy-loop until AbortBufferIO() cleared BM_IO_IN_PROGRESS.

3.  LWLockMinimallyPadded is removed, as it would now be unused.

Author: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version, 2016)
Discussion: https://postgr.es/m/CA%2BhUKGJ8nBFrjLuCTuqKN0pd2PQOwj9b_jnsiGFFMDvUxahj_A%40mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmoaj2aPti0yho7FeEf2qt-JgQPRWb0gci_o1Hfr=C56Xng@mail.gmail.com
2021-03-11 10:36:17 +13:00
Tom Lane c3ffe34863 Avoid creating duplicate cached plans for inherited FK constraints.
When a foreign key constraint is applied to a partitioned table, each
leaf partition inherits a similar FK constraint.  We were processing all
of those constraints independently, meaning that in large partitioning
trees we'd build up large collections of cached FK-checking query plans.
However, in all cases but one, the generated queries are actually
identical for all members of the inheritance tree (because, in most
cases, the query only mentions the topmost table of the other side of
the FK relationship).  So we can share a single cached plan among all
the partitions, saving memory, not to mention time to build and maintain
the cached plans.

Keisuke Kuroda and Amit Langote

Discussion: https://postgr.es/m/cab4b85d-9292-967d-adf2-be0d803c3e23@nttcom.co.jp_1
2021-03-10 14:22:31 -05:00
Peter Eisentraut bbaf315309 Add bound check before bsearch() for performance
In the current lazy vacuum implementation, some index AMs such as
btree indexes call lazy_tid_reaped() for each index tuple during
ambulkdelete to check if the index tuple points to the (collected)
garbage tuple.  In that function, we simply call bsearch(), but we
should be able to know the result without bsearch() if the index tuple
points to the heap tuple that is out of range of the collected garbage
tuples.  Therefore, add a simple bound check before resorting to
bsearch().  Testing has shown that this can give significant
performance benefits.

Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+fd4k76j8jKzJzcx8UqEugvayaMSnQz0iLUt_XgBp-_-bd22A@mail.gmail.com
2021-03-10 15:19:37 +01:00
Thomas Munro c427de427a Fix another portability bug in recent pgbench commit.
Commit 547f04e7 produced errors on AIX/xlc while building plpython.  The
new code appears to be incompatible with the hack installed by commit
a11cf433.  Without access to an AIX system to check, my guess is that
_POSIX_C_SOURCE may be required for <time.h> to declare the things the
header needs to see, but plpython.h undefines it.

For now, to unbreak build farm animal hoverfly, just move the new
pg_time_usec_t support into pgbench.c.  Perhaps later we could figure
out what to rearrange to put it back into a header for wider use.

Discussion: https://postgr.es/m/CA%2BhUKG%2BP%2BjcD%3Dx9%2BagyTdWtjpOT64MYiGic%2Bcbu_TD8CV%3D6A3w%40mail.gmail.com
2021-03-10 23:20:41 +13:00
Thomas Munro 68b34b2338 Try to fix portability bugs in recent pgbench commits.
1.  pg_time_usec_t needs to be printed with INT64_FORMAT, not %ld, or 32
bit systems complain, per lapwing.

2.  Some Windows compilers didn't like a thread function not marked with
__stdcall, per whelk; let's see if this fixes the problem.
2021-03-10 21:12:11 +13:00
Peter Eisentraut 1657b37d7c Small debug message tweak
This makes the wording of the delete case match the update case.
2021-03-10 08:16:38 +01:00
Michael Paquier 6c788d9f6a Move tablespace path re-creation from the makefiles to pg_regress
Moving this logic into pg_regress fixes a potential failure with
parallel tests when pg_upgrade and the main regression test suite both
trigger the makefile rule that cleaned up testtablespace/ under
src/test/regress.  Even if pg_upgrade was triggering this rule, it has
no need to do so as it uses a different tablespace path.  So if
pg_upgrade triggered the makefile rule for the tablespace setup while
the main regression test suite ran the tablespace cases, it would fail.

61be85a was a similar attempt at achieving that, but that broke cases
where the regression tests require to run under an Administrator
account, like with Appveyor.

Reported-by: Andres Freund, Kyotaro Horiguchi
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20201209012911.uk4d6nxcnkp7ehrx@alap3.anarazel.de
2021-03-10 14:50:00 +09:00
Thomas Munro aeb57af8e6 pgbench: Synchronize client threads.
Wait until all pgbench threads are connected before benchmarking begins.
This fixes a problem where some connections could take a very long time
to be established because of lock contention from earlier connections,
making results unstable and bogus with high connection counts.

Author: Andres Freund <andres@anarazel.de>
Author: Fabien COELHO <coelho@cri.ensmp.fr>
Reviewed-by: Marina Polyakova <m.polyakova@postgrespro.ru>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de
2021-03-10 17:44:04 +13:00
Thomas Munro 44bf3d5083 Add missing pthread_barrier_t.
Supply a simple implementation of the missing pthread_barrier_t type and
functions, for macOS.

Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de
2021-03-10 17:44:04 +13:00
Thomas Munro 547f04e734 pgbench: Improve time logic.
Instead of instr_time (struct timespec) and the INSTR_XXX macros,
introduce pg_time_usec_t and use integer arithmetic.  Don't include the
connection time in TPS unless using -C mode, but report it separately.

Author: Fabien COELHO <coelho@cri.ensmp.fr>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de
2021-03-10 17:44:04 +13:00
Thomas Munro b1d6a8f868 pgbench: Refactor thread portability support.
Instead of maintaining an incomplete emulation of POSIX threads for
Windows, let's use an extremely minimalist macro-based abstraction for
now.  A later patch will extend this, without the need to supply more
complicated pthread emulation code.  (There may be a need for a more
serious portable thread abstraction in later projects, but this is not
it.)

Minor incidental problems fixed: it wasn't OK to use (pthread_t) 0 as a
special value, it wasn't OK to compare thread_t values with ==, and we
incorrectly assumed that pthread functions set errno.

Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de
2021-03-10 17:44:04 +13:00
Amit Kapila e4e87a32cc Fix valgrind issue in commit 05c8482f7f.
Initialize other newly added variables in max_parallel_hazard_context via
is_parallel_safe() because we don't check the parallel-safety of target
relations in that function.

Reported-by: Tom Lane as per buildfarm
Author: Amit Kapila
Discussion: https://postgr.es/m/2060179.1615347455@sss.pgh.pa.us
2021-03-10 10:06:39 +05:30
Amit Kapila 05c8482f7f Enable parallel SELECT for "INSERT INTO ... SELECT ...".
Parallel SELECT can't be utilized for INSERT in the following cases:
- INSERT statement uses the ON CONFLICT DO UPDATE clause
- Target table has a parallel-unsafe: trigger, index expression or
  predicate, column default expression or check constraint
- Target table has a parallel-unsafe domain constraint on any column
- Target table is a partitioned table with a parallel-unsafe partition key
  expression or support function

The planner is updated to perform additional parallel-safety checks for
the cases listed above, for determining whether it is safe to run INSERT
in parallel-mode with an underlying parallel SELECT. The planner will
consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided
nothing unsafe is found from the additional parallel-safety checks, or
from the existing parallel-safety checks for SELECT.

While checking parallel-safety, we need to check it for all the partitions
on the table which can be costly especially when we decide not to use a
parallel plan. So, in a separate patch, we will introduce a GUC and or a
reloption to enable/disable parallelism for Insert statements.

Prior to entering parallel-mode for the execution of INSERT with parallel
SELECT, a TransactionId is acquired and assigned to the current
transaction state. This is necessary to prevent the INSERT from attempting
to assign the TransactionId whilst in parallel-mode, which is not allowed.
This approach has a disadvantage in that if the underlying SELECT does not
return any rows, then the TransactionId is not used, however that
shouldn't happen in practice in many cases.

Author: Greg Nancarrow, Amit Langote, Amit Kapila
Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila
Tested-by: Tang, Haiying
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
2021-03-10 07:38:58 +05:30
Michael Paquier 0ba71107ef Revert changes for SSL compression in libpq
This partially reverts 096bbf7 and 9d2d457, undoing the libpq changes as
it could cause breakages in distributions that share one single libpq
version across multiple major versions of Postgres for extensions and
applications linking to that.

Note that the backend is unchanged here, and it still disables SSL
compression while simplifying the underlying catalogs that tracked if
compression was enabled or not for a SSL connection.

Per discussion with Tom Lane and Daniel Gustafsson.

Discussion: https://postgr.es/m/YEbq15JKJwIX+S6m@paquier.xyz
2021-03-10 09:35:42 +09:00
Peter Eisentraut 14d9b37607 libpq: Remove deprecated connection parameters authtype and tty
The authtype parameter was deprecated and made inactive in commit
d5bbe2aca5, but the environment variable was left defined and thus
tested with a getenv call even though the value is of no use.  Also,
if it would exist it would be copied but never freed as the cleanup
code had been removed.

tty was deprecated in commit cb7fb3ca95 but most of the
infrastructure around it remained in place.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/DDDF36F3-582A-4C02-8598-9B464CC42B34@yesql.se
2021-03-09 15:01:22 +01:00
Michael Paquier 096bbf7c93 Switch back sslcompression to be a normal input field in libpq
Per buildfarm member crake, any servers including a postgres_fdw server
with this option set would fail to do a pg_upgrade properly as the
option got hidden in f9264d1 by becoming a debug option, making the
restore of the FDW server fail.

This changes back the option in libpq to be visible, but still inactive
to fix this upgrade issue.

Discussion: https://postgr.es/m/YEbq15JKJwIX+S6m@paquier.xyz
2021-03-09 19:52:36 +09:00
Fujii Masao ff99918c62 Track total amounts of times spent writing and syncing WAL data to disk.
This commit adds new GUC track_wal_io_timing. When this is enabled,
the total amounts of time XLogWrite writes and issue_xlog_fsync syncs
WAL data to disk are counted in pg_stat_wal. This information would be
useful to check how much WAL write and sync affect the performance.

Enabling track_wal_io_timing will make the server query the operating
system for the current time every time WAL is written or synced,
which may cause significant overhead on some platforms. To avoid such
additional overhead in the server with track_io_timing enabled,
this commit introduces track_wal_io_timing as a separate parameter from
track_io_timing.

Note that WAL write and sync activity by walreceiver has not been tracked yet.

This commit makes the server also track the numbers of times XLogWrite
writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal,
regardless of the setting of track_wal_io_timing. This counters can be
used to calculate the WAL write and sync time per request, for example.

Bump PGSTAT_FILE_FORMAT_ID.

Bump catalog version.

Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
2021-03-09 16:52:06 +09:00
Michael Paquier 9d2d457009 Add support for more progress reporting in COPY
The command (TO or FROM), its type (file, pipe, program or callback),
and the number of tuples excluded by a WHERE clause in COPY FROM are
added to the progress reporting already available.

The column "lines_processed" is renamed to "tuples_processed" to
disambiguate the meaning of this column in the cases of CSV and BINARY
COPY and to be more consistent with the other catalog progress views.

Bump catalog version, again.

Author: Matthias van de Meent
Reviewed-by: Michael Paquier, Justin Pryzby, Bharath Rupireddy, Josef
Šimánek, Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiOcgdH4aQA8NtZq-4dgvnJzp8PohdeKchPkhMY-jWZXA@mail.gmail.com
2021-03-09 14:21:03 +09:00
Michael Paquier f9264d1524 Remove support for SSL compression
PostgreSQL disabled compression as of e3bdb2d and the documentation
recommends against using it since.  Additionally, SSL compression has
been disabled in OpenSSL since version 1.1.0, and was disabled in many
distributions long before that.  The most recent TLS version, TLSv1.3,
disallows compression at the protocol level.

This commit removes the feature itself, removing support for the libpq
parameter sslcompression (parameter still listed for compatibility
reasons with existing connection strings, just ignored), and removes
the equivalent field in pg_stat_ssl and de facto PgBackendSSLStatus.

Note that, on top of removing the ability to activate compression by
configuration, compression is actively disabled in both frontend and
backend to avoid overrides from local configurations.

A TAP test is added for deprecated SSL parameters to check after
backwards compatibility.

Bump catalog version.

Author: Daniel Gustafsson
Reviewed-by: Peter Eisentraut, Magnus Hagander, Michael Paquier
Discussion:  https://postgr.es/m/7E384D48-11C5-441B-9EC3-F7DB1F8518F6@yesql.se
2021-03-09 11:16:47 +09:00
Tom Lane d4545dc19b Complain if a function-in-FROM returns a set when it shouldn't.
Throw a "function protocol violation" error if a function in FROM
tries to return a set though it wasn't marked proretset.  Although
such cases work at the moment, it doesn't seem like something we
want to guarantee will keep working.  Besides, there are other
negative consequences of not setting the proretset flag, such as
potentially bad plans.

No back-patch, since if there is any third-party code violating
this expectation, people wouldn't appreciate us breaking it in
a minor release.

Discussion: https://postgr.es/m/1636062.1615141782@sss.pgh.pa.us
2021-03-08 18:54:55 -05:00
Tom Lane fed10d4eec Properly mark pg_stat_get_subscription() as returning a set.
The initial catalog data for this function failed to set proretset
or provide a prorows estimate.  It accidentally worked anyway when
invoked in the FROM clause, because the executor isn't too picky
about this; but the planner didn't expect the function to return
multiple rows, which could lead to bad plans.  Also the function
would fail if invoked in the SELECT list.

We can't easily back-patch this fix, but fortunately the bug's
consequences aren't awful in most cases.  Getting this right is
mainly an exercise in future-proofing.

Discussion: https://postgr.es/m/1636062.1615141782@sss.pgh.pa.us
2021-03-08 18:47:23 -05:00
Tom Lane 5c06abb9b9 Validate the OID argument of pg_import_system_collations().
"SELECT pg_import_system_collations(0)" caused an assertion failure.
With a random nonzero argument --- or indeed with zero, in non-assert
builds --- it would happily make pg_collation entries with garbage
values of collnamespace.  These are harmless as far as I can tell
(unless maybe the OID happens to become used for a schema, later on?).
In any case this isn't a security issue, since the function is
superuser-only.  But it seems like a gotcha for unwary DBAs, so let's
add a check that the given OID belongs to some schema.

Back-patch to v10 where this function was introduced.
2021-03-08 18:21:51 -05:00
Tom Lane 6c20bdb2a2 Further tweak memory management for regex DFAs.
Coverity is still unhappy after commit 190c79884, and after looking
closer I think it might be onto something.  The callers of newdfa()
typically drop out if v->err has been set nonzero, which newdfa()
is faithfully doing if it fails.  However, what if v->err was already
nonzero before we entered newdfa()?  Then newdfa() could succeed and
the caller would promptly leak its result.

I don't think this scenario can actually happen, but the predicate
"v->err is always zero when newdfa() is called" seems difficult to be
entirely sure of; there's a good deal of code that potentially could
get that wrong.

It seems better to adjust the callers to directly check for a null
result instead of relying on ISERR() tests.  This is slightly cheaper
than the previous coding anyway.

Lacking evidence that there's any real bug, no back-patch.
2021-03-08 16:32:29 -05:00
Amit Kapila 8a812e5106 Track replication origin progress for rollbacks.
Commit 1eb6d6527a allowed to track replica origin replay progress for 2PC
but it was not complete. It misses to properly track the progress for
rollback prepared especially it missed updating the code for recovery.
Additionally, we need to allow tracking it on subscriber nodes where
wal_level might not be logical.

It is required to track decoding of 2PC which is committed in PG14
(a271a1b50e) and also nobody complained about this till now so not
backpatching it.

Author: Amit Kapila
Reviewed-by: Michael Paquier and Ajin Cherian
Discussion: https://postgr.es/m/CAA4eK1L-kHmMnSdrRW6UhRbCjR7cgh04c+6psY15qzT6ktcd+g@mail.gmail.com
2021-03-08 07:54:03 +05:30
Peter Eisentraut f9a0392e1c Add bit_xor aggregate function
This can be used as a checksum for unordered sets.  bit_and and bit_or
already exist.

Author: Alexey Bashtanov <bashtanov@imap.cc>
Reviewed-by: Ibrar Ahmed <ibrar.ahmad@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/9d4582ae-ecfc-3a13-2238-6ab5a37c1f41@imap.cc
2021-03-06 19:28:05 +01:00
Michael Paquier f1516ad7b3 pgbench: Simplify some port, host, user and dbname assignments
Using pgbench in an environment with both PGPORT and PGUSER set would
have caused the generation of a debug log with an incorrect database
name due to an oversight in 412893b.  Not specifying user, port and/or
database using the option switches, without their respective environment
variables, generated a log entry with empty strings, which was
rather useless.

This commit fixes this set of issues by simplifying the logic grabbing
the connection information, removing a set of getenv() calls that
emulated what libpq already does.  The faulty debug log now directly
uses the information from the libpq connection, and it gets generated
after the connection to the backend is completed, not before it (in the
event of a failure libpq would complain with more information about the
connection attempt so the log is not really useful before anyway).

Author: Kota Miyake
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/026b3ae6fc339a18394d053c32a4463d@oss.nttdata.com
2021-03-06 21:26:34 +09:00
Michael Paquier 5bca69a76b Add support for PROVE_TESTS and PROVE_FLAGS in MSVC scripts
These can be set in buildenv.pl or a "set" command within a Windows
terminal.  The MSVC script vcregress.pl parses the values available in
the environment to build the resulting prove commands, and the parsing
of PROVE_TESTS is able to handle name patterns in the same way as other
platforms.

Not specifying those environment values makes vcregress.pl fall back to
the previous default, with no extra flags for the prove command, and all
the tests run within t/.

Author: Michael Paquier
Reviewed-by: Juan José Santamaría Flecha, Julien Rouhaud
Discussion: https://postgr.es/m/YD9GigwHoL6lFY2y@paquier.xyz
2021-03-05 10:12:49 +09:00
Andrew Dunstan d3676a2e9f Close psql processes gracefully in recovery tests
Under windows, psql processes need to be ended explicitly, or the TAP
tests hang. However, the recovery tests were doing this via
IPC::Run::kill_kill(), which causes other major problems on Windows.

We solve this by instead sending '\q' to psql so it quits of its own
accord, and then simply waiting for it. This means we can now run almost
all the recovery tests on all Windows platforms.

Discussion: https://postgr.es/m/20210301200715.tdjpuesfzebpffgn@alap3.anarazel.de
2021-03-04 13:23:41 -05:00
Peter Eisentraut 040af77938 pg_upgrade: Fix oversight in version checking
Mistake in f06b1c598254f8adb2b7f51d6a7685618a7fb121: We should only
check the version of the binaries in the target installation.  The
source installation can of course be of a different version.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/E1lHNKN-0005IC-V6%40gemulon.postgresql.org
2021-03-04 10:34:17 +01:00
Fujii Masao 4a4241e15b Remove redundant getenv() for PGUSER, in psql help.
Previously psql obtained the value of PGUSER twice to display
a default user in its help message.

Author: Kota Miyake
Reviewed-by: Nitin Jadhav, Fujii Masao
Discussion: https://postgr.es/m/2a3c612babdd6ed63a9d877bb575d793@oss.nttdata.com
2021-03-04 18:23:22 +09:00
Heikki Linnakangas 85d94c5753 Avoid extra newline in errors received in FE protocol version 2.
Contrary to what the comment said, the postmaster does in fact end all
its messages in a newline, since server version 7.2. Be tidy and don't
add an extra newline if the error message already has one.

Discussion: https://www.postgresql.org/message-id/CAFBsxsEdgMXc%2BtGfEy9Y41i%3D5pMMjKeH8t8vSAypR3ZnAoQnHg%40mail.gmail.com
2021-03-04 10:56:33 +02:00
Heikki Linnakangas 3174d69fb9 Remove server and libpq support for old FE/BE protocol version 2.
Protocol version 3 was introduced in PostgreSQL 7.4. There shouldn't be
many clients or servers left out there without version 3 support. But as
a courtesy, I kept just enough of the old protocol support that we can
still send the "unsupported protocol version" error in v2 format, so that
old clients can display the message properly. Likewise, libpq still
understands v2 ErrorResponse messages when establishing a connection.

The impetus to do this now is that I'm working on a patch to COPY
FROM, to always prefetch some data. We cannot do that safely with the
old protocol, because it requires parsing the input one byte at a time
to detect the end-of-copy marker.

Reviewed-by: Tom Lane, Alvaro Herrera, John Naylor
Discussion: https://www.postgresql.org/message-id/9ec25819-0a8a-d51a-17dc-4150bb3cca3b%40iki.fi
2021-03-04 10:45:55 +02:00
Tom Lane 0a687c8f10 Add trim_array() function.
This has been in the SQL spec since 2008.  It's a pretty thin
wrapper around the array slice functionality, but the spec
says we should have it, so here it is.

Vik Fearing, reviewed by Dian Fay

Discussion: https://postgr.es/m/fc92ce17-9655-8ff1-c62a-4dc4c8ccd815@postgresfriends.org
2021-03-03 16:39:57 -05:00
Tom Lane 3769e11a31 Make test_target_session_attrs more robust against connection failure.
Feed the desired command to psql via "-c" not stdin, else Perl
may complain that it can't push stdin to an already-failed psql
process, as seen in intermittent buildfarm failures.

Make some minor cosmetic improvements while at it.

Before commit ee28cacf6 we had no tests here that expected failure
to connect, so there seems no need for a back-patch.

Discussion: https://postgr.es/m/CALDaNm2mo8YED=M2ZJKGf1U3F3mw6SaQuLXWCK8rZP6sECYcrA@mail.gmail.com
2021-03-03 13:51:43 -05:00
Peter Eisentraut f06b1c5982 pg_upgrade: Check version of target cluster binaries
This expands the binary validation in pg_upgrade with a version
check per binary to ensure that the target cluster installation
only contains binaries from the target version.

In order to reduce duplication, validate_exec is exported from
port.h and the local copy in pg_upgrade is removed.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/9328.1552952117@sss.pgh.pa.us
2021-03-03 09:45:56 +01:00
Peter Eisentraut e527a99055 Some copy-editing of GUC descriptions 2021-03-03 07:14:35 +01:00
Tom Lane d422a2a94b Silence perlcritic warning in commit ee28cacf6.
Per buildfarm; this fix is from Michael Paquier (vignesh C
proposed nearly the same).

Discussion: https://postgr.es/m/YD8IZ9OKfUf9X1eF@paquier.xyz
2021-03-02 23:32:43 -05:00
Thomas Munro 8eda3eba30 Use sort_template.h for qsort_tuple() and qsort_ssup().
Replace the Perl code previously used to generate specialized sort
functions with sort_template.h.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
2021-03-03 17:02:32 +13:00
Thomas Munro f374f4d664 Use sort_template.h for qsort() and qsort_arg().
Reduce duplication by using the new template.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
2021-03-03 17:02:32 +13:00
Thomas Munro 0a1f1d3cac Add sort_template.h for making sort functions.
Move our qsort implementation into a header that can be used to define
specialized functions for better performance and reduced duplication.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com
2021-03-03 17:02:22 +13:00
Amit Kapila 19890a064e Add option to enable two_phase commits via pg_create_logical_replication_slot.
Commit 0aa8a01d04 extends the output plugin API to allow decoding of
prepared xacts and allowed the user to enable/disable the two-phase option
via pg_logical_slot_get_changes(). This can lead to a problem such that
the first time when it gets changes via pg_logical_slot_get_changes()
without two_phase option enabled it will not get the prepared even though
prepare is after consistent snapshot. Now next time during getting changes,
if the two_phase option is enabled it can skip prepare because by that
time start decoding point has been moved. So the user will only get commit
prepared.

Allow to enable/disable this option at the create slot time and default
will be false. It will break the existing slots which is fine in a major
release.

Author: Ajin Cherian
Reviewed-by: Amit Kapila and Vignesh C
Discussion: https://postgr.es/m/d0f60d60-133d-bf8d-bd70-47784d8fabf3@enterprisedb.com
2021-03-03 07:34:11 +05:30
Tom Lane ee28cacf61 Extend the abilities of libpq's target_session_attrs parameter.
In addition to the existing options of "any" and "read-write", we
now support "read-only", "primary", "standby", and "prefer-standby".
"read-write" retains its previous meaning of "transactions are
read-write by default", and "read-only" inverts that.  The other
three modes test specifically for hot-standby status, which is not
quite the same thing.  (Setting default_transaction_read_only on
a primary server renders it read-only to this logic, but not a
standby.)

Furthermore, if talking to a v14 or later server, no extra network
round trip is needed to detect the session's status; the GUC_REPORT
variables delivered by the server are enough.  When talking to an
older server, a SHOW or SELECT query is issued to detect session
read-only-ness or server hot-standby state, as needed.

Haribabu Kommi, Greg Nancarrow, Vignesh C, Tom Lane; reviewed at
various times by Laurenz Albe, Takayuki Tsunakawa, Peter Smith.

Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
2021-03-02 20:17:48 -05:00
Michael Paquier 57e6db706e Add --tablespace option to reindexdb
This option provides REINDEX (TABLESPACE) for reindexdb, applying the
tablespace value given by the caller to all the REINDEX queries
generated.

While on it, this commit adds some tests for REINDEX TABLESPACE, with
and without CONCURRENTLY, when run on toast indexes and tables.  Such
operations are not allowed, and toast relation names are not stable
enough to be part of the main regression test suite (even if using a PL
function with a TRY/CATCH logic, as CONCURRENTLY could not be tested).

Author: Michael Paquier
Reviewed-by: Mark Dilger, Daniel Gustafsson
Discussion: https://postgr.es/m/YDiaDMnzLICqeukl@paquier.xyz
2021-03-03 10:14:21 +09:00
Peter Geoghegan 5b2f2af3d9 nbtree page deletion: Add leaftopparent assertion.
Add documenting assertion.  This makes it easier to follow how we
maintain the top parent link in target subtree's half-dead/leaf level
page.
2021-03-02 14:06:07 -08:00
Peter Geoghegan 3d8d5787a3 Fix nbtree page deletion error messages.
Adjust some "can't happen" error messages that assumed that the page
deletion target page must be a half-dead page.  This assumption was
wrong in the case of an internal target page.  Simply refer to these
pages as the target page instead.

Internal pages are never marked half-dead.  There is exactly one
half-dead page for each subtree undergoing deletion.  The half-dead page
is also the target subtree's leaf-level page.  This has been the case
since commit efada2b8, which totally overhauled nbtree page deletion.
2021-03-02 13:02:24 -08:00
Tom Lane d16f8c8e41 Mark default_transaction_read_only as GUC_REPORT.
This allows clients to find out the setting at connection time without
having to expend a query round trip to do so; which is helpful when
trying to identify read/write servers.  (One must also look at
in_hot_standby, but that's already GUC_REPORT, cf bf8a662c9.)
Modifying libpq to make use of this will come soon, but I felt it
cleaner to push the server change separately.

Haribabu Kommi, Greg Nancarrow, Vignesh C; reviewed at various times
by Laurenz Albe, Takayuki Tsunakawa, Peter Smith.

Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
2021-03-02 13:53:54 -05:00
Alvaro Herrera 75dbfe4ca7
Use native path separators to pg_ctl in initdb
On Windows, CMD.EXE allegedly does not run a command that uses forward slashes,
so let's convert the path to use backslashes instead.

Backpatch to 10.

Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CAMm1aWaNDuaPYFYMAqDeJrZmPtNvLcJRS++CcZWY8LT6KcoBZw@mail.gmail.com
2021-03-02 15:39:34 -03:00
Tom Lane 4604f83fdf Suppress unnecessary regex subre nodes in a couple more cases.
This extends the changes made in commit cebc1d34e, teaching
parseqatom() to generate fewer or cheaper subre nodes in some edge
cases.  The case of interest here is a quantified atom that is "messy"
only because it has greediness opposite to what preceded it (whereas
captures and backrefs are intrinsically messy).  In this case we don't
need an iteration node, since we don't care where the sub-matches of
the quantifier are; and we might also not need a second concatenation
node.  This seems of only marginal real-world use according to my
testing, but I wanted to get it in before wrapping up this series of
regex performance fixes.

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us
2021-03-02 12:14:14 -05:00
Tom Lane 0c3405cf11 Improve performance of regular expression back-references.
In some cases, at the time that we're doing an NFA-based precheck
of whether a backref subexpression can match at a particular place
in the string, we already know which substring the referenced
subexpression matched.  If so, we might as well forget about the NFA
and just compare the substring; this is faster and it gives an exact
rather than approximate answer.

In general, this optimization can help while we are prechecking within
the second child expression of a concat node, while the capture was
within the first child expression; then the substring was saved during
cdissect() of the first child and will be available to NFA checks done
while cdissect() recurses into the second child.  It can help quite a
lot if the tree looks like

              concat
             /      \
      capture        concat
                    /      \
     expensive stuff        backref

as we will be able to avoid recursively dissecting the "expensive
stuff" before discovering that the backref isn't satisfied with a
particular midpoint that the lower concat node is testing.  This
doesn't help if the concat tree is left-deep, as the capture node
won't get set soon enough (and it's hard to fix that without changing
the engine's match behavior).  Fortunately, right-deep concat trees
are the common case.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us
2021-03-02 11:55:12 -05:00
Tom Lane 4aea704a5b Fix semantics of regular expression back-references.
POSIX defines the behavior of back-references thus:

    The back-reference expression '\n' shall match the same (possibly
    empty) string of characters as was matched by a subexpression
    enclosed between "\(" and "\)" preceding the '\n'.

As far as I can see, the back-reference is supposed to consider only
the data characters matched by the referenced subexpression.  However,
because our engine copies the NFA constructed from the referenced
subexpression, it effectively enforces any constraints therein, too.
As an example, '(^.)\1' ought to match 'xx', or any other string
starting with two occurrences of the same character; but in our code
it does not, and indeed can't match anything, because the '^' anchor
constraint is included in the backref's copied NFA.  If POSIX intended
that, you'd think they'd mention it.  Perl for one doesn't act that
way, so it's hard to conclude that this isn't a bug.

Fix by modifying the backref's NFA immediately after it's copied from
the reference, replacing all constraint arcs by EMPTY arcs so that the
constraints are treated as automatically satisfied.  This still allows
us to enforce matching rules that depend only on the data characters;
for example, in '(^\d+).*\1' the NFA matching step will still know
that the backref can only match strings of digits.

Perhaps surprisingly, this change does not affect the results of any
of a rather large corpus of real-world regexes.  Nonetheless, I would
not consider back-patching it, since it's a clear compatibility break.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us
2021-03-02 11:34:53 -05:00
Michael Paquier c5530d8474 Fix duplicated test case in TAP tests of reindexdb
The same test for REINDEX (VERBOSE) was done twice, while it is clear
that the second test should use --concurrently.  Issue introduced in
5dc92b8, for what looks like a copy-paste mistake.

Reviewed-by: Mark Dilger
Discussion: https://postgr.es/m/A7AE97EA-F4B0-4CAB-8FFF-3FECD31F9D63@enterprisedb.com
Backpatch-through: 12
2021-03-02 13:18:06 +09:00
Michael Paquier fabde52fab Simplify code to switch pg_class.relrowsecurity in tablecmds.c
The same code pattern was repeated twice to enable or disable ROW LEVEL
SECURITY with an ALTER TABLE command.  This makes the code slightly
cleaner.

Author: Justin Pryzby
Reviewed-by: Zhihong Yu
Discussion: https://postgr.es/m/20210228211854.GC20769@telsasoft.com
2021-03-02 12:30:21 +09:00
Tom Lane ffd3944ab9 Improve reporting for syntax errors in multi-line JSON data.
Point to the specific line where the error was detected; the
previous code tended to include several preceding lines as well.
Avoid re-scanning the entire input to recompute which line that
was.  Simplify the logic a bit.  Add test cases.

Simon Riggs and Hamid Akhtar, reviewed by Daniel Gustafsson and myself

Discussion: https://postgr.es/m/CANbhV-EPBnXm3MF_TTWBwwqgn1a1Ghmep9VHfqmNBQ8BT0f+_g@mail.gmail.com
2021-03-01 16:44:17 -05:00
Thomas Munro bd69ddfcdb Remove obsolete comment for WaitForProcSignalBarrier().
Commit 814f1d8b removed the behavior described.

Reported-by: Amit Kapila <amit.kapila16@gmail.com>
2021-03-02 09:30:57 +13:00
Andres Freund 1e6e404471 Fix recovery test hang in 021_row_visibility.pl on windows.
The psql processes were not explicitly killed (but would eventually
exit due postgres shutting down). For some reason windows perl doesn't
like that, resulting in errors like
Warning: unable to close filehandle GEN20 properly: Bad file descriptor during global destruction.

The test was introduced in d6734a897e3, so no backpatching necessary.
2021-03-01 11:25:24 -08:00
Thomas Munro f5a5773a9d Allow condition variables to be used in interrupt code.
Adjust the condition variable sleep loop to work correctly when code
reached by its internal CHECK_FOR_INTERRUPTS() call interacts with
another condition variable.

There are no such cases currently, but a proposed patch would do this.

Discussion: https://postgr.es/m/CA+hUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg@mail.gmail.com
2021-03-01 17:24:47 +13:00
Thomas Munro 814f1d8bc3 Use condition variables for ProcSignalBarriers.
Instead of a poll/sleep loop, use a condition variable for precise
wake-up whenever a backend's pss_barrierGeneration advances.

Discussion: https://postgr.es/m/CA+hUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg@mail.gmail.com
2021-03-01 17:23:43 +13:00
Amit Kapila 8bdb1332eb Avoid repeated decoding of prepared transactions after a restart.
In commit a271a1b50e, we allowed decoding at prepare time and the prepare
was decoded again if there is a restart after decoding it. It was done
that way because we can't distinguish between the cases where we have not
decoded the prepare because it was prior to consistent snapshot or we have
decoded it earlier but restarted. To distinguish between these two cases,
we have introduced an initial_consistent_point at the slot level which is
an LSN at which we found a consistent point at the time of slot creation.
This is also the point where we have exported a snapshot for the initial
copy. So, prepare transaction prior to this point are sent along with
commit prepared.

This commit bumps SNAPBUILD_VERSION because of change in SnapBuild. It
will break existing slots which is fine in a major release.

Author: Ajin Cherian, based on idea by Andres Freund
Reviewed-by: Amit Kapila and Vignesh C
Discussion: https://postgr.es/m/d0f60d60-133d-bf8d-bd70-47784d8fabf3@enterprisedb.com
2021-03-01 09:11:18 +05:30
Thomas Munro 6230912f23 Use FeBeWaitSet for walsender.c.
This avoids the need to set up and tear down a fresh WaitEventSet every
time we need need to wait.  We have to add an explicit exit on
postmaster exit (FeBeWaitSet isn't set up to do that automatically), so
move the code to do that into a new function to avoid repetition.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
2021-03-01 16:19:38 +13:00
Thomas Munro a042ba2ba7 Introduce symbolic names for FeBeWaitSet positions.
Previously we used 0 and 1 to refer to the socket and latch in far flung
parts of the tree, without any explanation.  Also use PGINVALID_SOCKET
rather than -1 in a couple of places that didn't already do that.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
2021-03-01 16:10:16 +13:00
Amit Kapila b4e3dc7fd4 Update the docs and comments for decoding of prepared xacts.
Commit a271a1b50e introduced decoding at prepare time in ReorderBuffer.
This can lead to deadlock for out-of-core logical replication solutions
that uses this feature to build distributed 2PC in case such transactions
lock [user] catalog tables exclusively. They need to inform users to not
have locks on catalog tables (via explicit LOCK command) in such
transactions.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de
2021-03-01 08:14:33 +05:30
Thomas Munro 6148656a0b Use EVFILT_SIGNAL for kqueue latches.
Cut down on system calls and other overheads by waiting for SIGURG
explicitly with kqueue instead of using a signal handler and self-pipe.
Affects *BSD and macOS systems.

This leaves only the poll implementation with a signal handler and the
traditional self-pipe trick.

Discussion: https://postgr.es/m/CA+hUKGJjxPDpzBE0a3hyUywBvaZuC89yx3jK9RFZgfv_KHU7gg@mail.gmail.com
2021-03-01 14:20:04 +13:00
Thomas Munro 6a2a70a020 Use signalfd(2) for epoll latches.
Cut down on system calls and other overheads by reading from a signalfd
instead of using a signal handler and self-pipe.  Affects Linux sytems,
and possibly others including illumos that implement the Linux epoll and
signalfd interfaces.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJjxPDpzBE0a3hyUywBvaZuC89yx3jK9RFZgfv_KHU7gg@mail.gmail.com
2021-03-01 14:12:02 +13:00
Thomas Munro 83709a0d5a Use SIGURG rather than SIGUSR1 for latches.
Traditionally, SIGUSR1 has been overloaded for ad-hoc signals,
procsignal.c signals and latch.c wakeups.  Move that last use over to a
new dedicated signal.  SIGURG is normally used to report out-of-band
socket data, but PostgreSQL doesn't use that facility.

The signal handler is now installed in all postmaster children by
InitializeLatchSupport().  Those wishing to disconnect from it should
call ShutdownLatchSupport().

Future patches will use this separation of signals to avoid the need for
a signal handler on some operating systems.

Discussion: https://postgr.es/m/CA+hUKGJjxPDpzBE0a3hyUywBvaZuC89yx3jK9RFZgfv_KHU7gg@mail.gmail.com
2021-03-01 12:44:12 +13:00
Thomas Munro c8f3bc2401 Optimize latches to send fewer signals.
Don't send signals to processes that aren't sleeping.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJjxPDpzBE0a3hyUywBvaZuC89yx3jK9RFZgfv_KHU7gg@mail.gmail.com
2021-03-01 12:44:12 +13:00
Thomas Munro d1b90995e8 Remove latch.c workaround for Linux < 2.6.27.
Commit 82ebbeb0 added a workaround for systems with no epoll_create1()
and EPOLL_CLOEXEC.  Linux < 2.6.27 and glibc < 2.9 are long gone.  Now
seems like a good time to drop the extra code, because otherwise we'd
have to add similar already-dead workaround code to new patches using
XXX_CLOEXEC flags that arrived in the same kernel release.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGKL_%3DaO%3Dr30N%3Ds9VoDgTqHpRSzePRbA9dkYO7snc7HsxA%40mail.gmail.com
2021-03-01 11:24:28 +13:00
Michael Paquier 943eb47880 pgbench: Remove now-dead CState->ecnt
The last use of ecnt was in 12788ae.  It was getting incremented after a
backend error without any purpose since then, so let's get rid of it.

Author: Kota Miyake
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/786c3d9fbe067763d899e78c296f9f0f@oss.nttdata.com
2021-02-28 07:50:26 +09:00
Alvaro Herrera 25936fd46c
Fix use-after-free bug with AfterTriggersTableData.storeslot
AfterTriggerSaveEvent() wrongly allocates the slot in execution-span
memory context, whereas the correct thing is to allocate it in
a transaction-span context, because that's where the enclosing
AfterTriggersTableData instance belongs into.

Backpatch to 12 (the test back to 11, where it works well with no code
changes, and it's good to have to confirm that the case was previously
well supported); this bug seems introduced by commit ff11e7f4b9.

Reported-by: Bertrand Drouvot <bdrouvot@amazon.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/39a71864-b120-5a5c-8cc5-c632b6f16761@amazon.com
2021-02-27 18:09:15 -03:00
David Rowley 977b2c0853 Add missing TidRangeScan readfunc
Mistakenly forgotten in bb437f995
2021-02-27 23:21:21 +13:00
David Rowley bb437f995d Add TID Range Scans to support efficient scanning ranges of TIDs
This adds a new executor node named TID Range Scan.  The query planner
will generate paths for TID Range scans when quals are discovered on base
relations which search for ranges on the table's ctid column.  These
ranges may be open at either end. For example, WHERE ctid >= '(10,0)';
will return all tuples on page 10 and over.

To support this, two new optional callback functions have been added to
table AM.  scan_set_tidrange is used to set the scan range to just the
given range of TIDs.  scan_getnextslot_tidrange fetches the next tuple
in the given range.

For AMs were scanning ranges of TIDs would not make sense, these functions
can be set to NULL in the TableAmRoutine.  The query planner won't
generate TID Range Scan Paths in that case.

Author: Edmund Horner, David Rowley
Reviewed-by: David Rowley, Tomas Vondra, Tom Lane, Andres Freund, Zhihong Yu
Discussion: https://postgr.es/m/CAMyN-kB-nFTkF=VA_JPwFNo08S0d-Yk0F741S2B7LDmYAi8eyA@mail.gmail.com
2021-02-27 22:59:36 +13:00
Peter Eisentraut f4adc41c4f Enhanced cycle mark values
Per SQL:202x draft, in the CYCLE clause of a recursive query, the
cycle mark values can be of type boolean and can be omitted, in which
case they default to TRUE and FALSE.

Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
2021-02-27 08:13:24 +01:00
Tom Lane 0fc1af174c Improve memory management in regex compiler.
The previous logic here created a separate pool of arcs for each
state, so that the out-arcs of each state were physically stored
within it.  Perhaps this choice was driven by trying to not include
a "from" pointer within each arc; but Spencer gave up on that idea
long ago, and it's hard to see what the value is now.  The approach
turns out to be fairly disastrous in terms of memory consumption,
though.  In the first place, NFAs built by this engine seem to have
about 4 arcs per state on average, with a majority having only one
or two out-arcs.  So pre-allocating 10 out-arcs for each state is
already cause for a factor of two or more bloat.  Worse, the NFA
optimization phase moves arcs around with abandon.  In a large NFA,
some of the states will have hundreds of out-arcs, so towards the
end of the optimization phase we have a significant number of states
whose arc pools have room for hundreds of arcs each, even though only
a few of those arcs are in use.  We have seen real-world regexes in
which this effect bloats the memory requirement by 25X or even more.

Hence, get rid of the per-state arc pools in favor of a single arc
pool for the whole NFA, with variable-sized allocation batches
instead of always asking for 10 at a time.  While we're at it,
let's batch the allocations of state structs too, to further reduce
the malloc traffic.

This incidentally allows moveouts() to be optimized in a similar
way to moveins(): when moving an arc to another state, it's now
valid to just re-link the same arc struct into a different outchain,
where before the code invariants required us to make a physically
new arc and then free the old one.

These changes reduce the regex compiler's typical space consumption
for average-size regexes by about a factor of two, and much more for
large or complicated regexes.  In a large test set of real-world
regexes, we formerly had half a dozen cases that failed with "regular
expression too complex" due to exceeding the REG_MAX_COMPILE_SPACE
limit (about 150MB); we would have had to raise that limit to
something close to 400MB to make them work with the old code.  Now,
none of those cases need more than 13MB to compile.  Furthermore,
the test set is about 10% faster overall due to less malloc traffic.

Discussion: https://postgr.es/m/168861.1614298592@sss.pgh.pa.us
2021-02-26 13:52:10 -05:00
Peter Eisentraut b3a9e9897e Extend a test case a little
This will possibly help a subsequent patch by making sure the notice
messages are distinct so that it's clear that they come out in the
right order.

Author: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1904240654120.3407%40lancre
2021-02-26 09:11:15 +01:00
Thomas Munro 8556267b2b Revert "pg_collation_actual_version() -> pg_collation_current_version()."
This reverts commit 9cf184cc05.  Name
change less well received than anticipated.

Discussion: https://postgr.es/m/afcfb97e-88a1-a540-db95-6c573b93bc2b%40eisentraut.org
2021-02-26 15:29:27 +13:00
Tom Lane 80ca8464fe Fix list-manipulation bug in WITH RECURSIVE processing.
makeDependencyGraphWalker and checkWellFormedRecursionWalker
thought they could hold onto a pointer to a list's first
cons cell while the list was modified by recursive calls.
That was okay when the cons cell was actually separately
palloc'd ... but since commit 1cff1b95a, it's quite unsafe,
leading to core dumps or incorrect complaints of faulty
WITH nesting.

In the field this'd require at least a seven-deep WITH nest
to cause an issue, but enabling DEBUG_LIST_MEMORY_USAGE
allows the bug to be seen with lesser nesting depths.

Per bug #16801 from Alexander Lakhin.  Back-patch to v13.

Michael Paquier and Tom Lane

Discussion: https://postgr.es/m/16801-393c7922143eaa4d@postgresql.org
2021-02-25 20:47:32 -05:00
Peter Geoghegan 2376361839 VACUUM VERBOSE: Count "newly deleted" index pages.
Teach VACUUM VERBOSE to report on pages deleted by the _current_ VACUUM
operation -- these are newly deleted pages.  VACUUM VERBOSE continues to
report on the total number of deleted pages in the entire index (no
change there).  The former is a subset of the latter.

The distinction between each category of deleted index page only arises
with index AMs where page deletion is supported and is decoupled from
page recycling for performance reasons.

This is follow-up work to commit e5d8a999, which made nbtree store
64-bit XIDs (not 32-bit XIDs) in pages at the point at which they're
deleted.  Note that the btm_last_cleanup_num_delpages metapage field
added by that commit usually gets set to pages_newly_deleted.  The
exceptions (the scenarios in which they're not equal) all seem to be
tricky cases for the implementation (of page deletion and recycling) in
general.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX%3Dd5u-tRYhi-F4wnV2uN2zHpMUXw%40mail.gmail.com
2021-02-25 14:32:18 -08:00
Tom Lane 301ed8812e Doc: remove src/backend/regex/re_syntax.n.
We aren't publishing this file as documentation, and it's been
much more haphazardly maintained than the real docs in func.sgml,
so let's just drop it.  I think the only reason I included it in
commit 7bcc6d98f was that the Berkeley-era sources had had a man
page in this directory.

Discussion: https://postgr.es/m/4099447.1614186542@sss.pgh.pa.us
2021-02-25 13:33:27 -05:00
Tom Lane 7dc13a0f08 Change regex \D and \W shorthands to always match newlines.
Newline is certainly not a digit, nor a word character, so it is
sensible that it should match these complemented character classes.
Previously, \D and \W acted that way by default, but in
newline-sensitive mode ('n' or 'p' flag) they did not match newlines.

This behavior was previously forced because explicit complemented
character classes don't match newlines in newline-sensitive mode;
but as of the previous commit that implementation constraint no
longer exists.  It seems useful to change this because the primary
real-world use for newline-sensitive mode seems to be to match the
default behavior of other regex engines such as Perl and Javascript
... and their default behavior is that these match newlines.

The old behavior can be kept by writing an explicit complemented
character class, i.e. [^[:digit:]] or [^[:word:]].  (This means
that \D and \W are not exactly equivalent to those strings, but
they weren't anyway.)

Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us
2021-02-25 13:29:06 -05:00
Tom Lane 2a0af7fe46 Allow complemented character class escapes within regex brackets.
The complement-class escapes \D, \S, \W are now allowed within
bracket expressions.  There is no semantic difficulty with doing
that, but the rather hokey macro-expansion-based implementation
previously used here couldn't cope.

Also, invent "word" as an allowed character class name, thus "\w"
is now equivalent to "[[:word:]]" outside brackets, or "[:word:]"
within brackets.  POSIX allows such implementation-specific
extensions, and the same name is used in e.g. bash.

One surprising compatibility issue this raises is that constructs
such as "[\w-_]" are now disallowed, as our documentation has always
said they should be: character classes can't be endpoints of a range.
Previously, because \w was just a macro for "[:alnum:]_", such a
construct was read as "[[:alnum:]_-_]", so it was accepted so long as
the character after "-" was numerically greater than or equal to "_".

Some implementation cleanup along the way:

* Remove the lexnest() hack, and in consequence clean up wordchrs()
to not interact with the lexer.

* Fix colorcomplement() to not be O(N^2) in the number of colors
involved.

* Get rid of useless-as-far-as-I-can-see calls of element()
on single-character character element names in brackpart().
element() always maps these to the character itself, and things
would be quite broken if it didn't --- should "[a]" match something
different than "a" does?  Besides, the shortcut path in brackpart()
wasn't doing this anyway, making it even more inconsistent.

Discussion: https://postgr.es/m/2845172.1613674385@sss.pgh.pa.us
Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us
2021-02-25 13:00:40 -05:00
Fujii Masao 6b40d9bdbd Improve tab-completion for TRUNCATE.
Author: Kota Miyake
Reviewed-by: Muhammad Usama
Discussion: https://postgr.es/m/f5d30053d00dcafda3280c9e267ecb0f@oss.nttdata.com
2021-02-25 18:20:57 +09:00
Peter Geoghegan e5d8a99903 Use full 64-bit XIDs in deleted nbtree pages.
Otherwise we risk "leaking" deleted pages by making them non-recyclable
indefinitely.  Commit 6655a729 did the same thing for deleted pages in
GiST indexes.  That work was used as a starting point here.

Stop storing an XID indicating the oldest bpto.xact across all deleted
though unrecycled pages in nbtree metapages.  There is no longer any
reason to care about that condition/the oldest XID.  It only ever made
sense when wraparound was something _bt_vacuum_needs_cleanup() had to
consider.

The btm_oldest_btpo_xact metapage field has been repurposed and renamed.
It is now btm_last_cleanup_num_delpages, which is used to remember how
many non-recycled deleted pages remain from the last VACUUM (in practice
its value is usually the precise number of pages that were _newly
deleted_ during the specific VACUUM operation that last set the field).

The general idea behind storing btm_last_cleanup_num_delpages is to use
it to give _some_ consideration to non-recycled deleted pages inside
_bt_vacuum_needs_cleanup() -- though never too much.  We only really
need to avoid leaving a truly excessive number of deleted pages in an
unrecycled state forever.  We only do this to cover certain narrow cases
where no other factor makes VACUUM do a full scan, and yet the index
continues to grow (and so actually misses out on recycling existing
deleted pages).

These metapage changes result in a clear user-visible benefit: We no
longer trigger full index scans during VACUUM operations solely due to
the presence of only 1 or 2 known deleted (though unrecycled) blocks
from a very large index.  All that matters now is keeping the costs and
benefits in balance over time.

Fix an issue that has been around since commit 857f9c36, which added the
"skip full scan of index" mechanism (i.e. the _bt_vacuum_needs_cleanup()
logic).  The accuracy of btm_last_cleanup_num_heap_tuples accidentally
hinged upon _when_ the source value gets stored.  We now always store
btm_last_cleanup_num_heap_tuples in btvacuumcleanup().  This fixes the
issue because IndexVacuumInfo.num_heap_tuples (the source field) is
expected to accurately indicate the state of the table _after_ the
VACUUM completes inside btvacuumcleanup().

A backpatchable fix cannot easily be extracted from this commit.  A
targeted fix for the issue will follow in a later commit, though that
won't happen today.

I (pgeoghegan) have chosen to remove any mention of deleted pages in the
documentation of the vacuum_cleanup_index_scale_factor GUC/param, since
the presence of deleted (though unrecycled) pages is no longer of much
concern to users.  The vacuum_cleanup_index_scale_factor description in
the docs now seems rather unclear in any case, and it should probably be
rewritten in the near future.  Perhaps some passing mention of page
deletion will be added back at the same time.

Bump XLOG_PAGE_MAGIC due to nbtree WAL records using full XIDs now.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX=d5u-tRYhi-F4wnV2uN2zHpMUXw@mail.gmail.com
2021-02-24 18:41:34 -08:00
Amit Kapila 8a4f9522d0 Fix relcache reference leak introduced by ce0fdbfe97.
Author: Sawada Masahiko
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoA7ZEfsOXQ9HQqMv3QYGsEm2H5Wk5ic5S=mvzDf-3a3SA@mail.gmail.com
2021-02-25 07:48:24 +05:30
Michael Paquier bcf2667bf6 Fix some typos, grammar and style in docs and comments
The portions fixing the documentation are backpatched where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210210235557.GQ20012@telsasoft.com
backpatch-through: 9.6
2021-02-24 16:13:17 +09:00