This adds tests to cover more code paths to ignore backslash commands in
false branches when using \if|\elif|\else, and improves the coverage of
\elif.
Author: Fabien Coelho
Discussion: https://postgr.es/m/alpine.DEB.2.21.1908281618520.28828@lancre
This is a second try at what commit 57431a911 tried to do, namely,
launch the syslogger before we open postmaster sockets so that our
messages about the sockets end up in the syslogger files. That
commit fell foul of a bunch of subtle issues caused by trying to
launch a postmaster child process before creating shared memory.
Rather than messing with that interaction, let's postpone opening
the sockets till after we launch the syslogger.
This would not have been terribly safe before commit 7de19fbc0,
because we relied on socket opening to detect whether any competing
postmasters were using the same port number. But now that we choose
IPC keys without regard to the port number, there's no interaction
to worry about.
Also delay creation of the external PID file (if requested) till after
the sockets are open, since external code could plausibly be relying
on that ordering of events. And postpone most of the work of
RemovePgTempFiles() so that that potentially-slow processing still
happens after we make the external PID file. We have to be a bit
careful about that last though: as noted in the discussion subsequent to
bug #15804, EXEC_BACKEND builds still have to clear the parameter-file
temp dir before launching the syslogger.
Patch by me; thanks to Michael Paquier for review/testing.
Discussion: https://postgr.es/m/15804-3721117bf40fb654@postgresql.org
Procedures are supported since v11 and \dfp can be used since this
version, but it was not mentioned as a supported option in the
description of describeFunctions() which handles \df in psql.
Extracted from a larger patch.
Author: Fabien Coelho
Discussion: https://postgr.es/m/alpine.DEB.2.21.1908281618520.28828@lancre
Depending on the system used, t/*.pl may not be expanded into a list of
tests which can be consumed by prove when attempting to run TAP tests on
a given path. Fix that by using glob() directly in the script, to make
sure that a complete list of tests is provided. This has not proved to
be an issue with MSVC as the list was properly expanded, but it is on
Linux with perl's system().
This is extracted from a larger patch.
Author: Tom Lane
Discussion: https://postgr.es/m/6628.1567958876@sss.pgh.pa.us
Backpatch-through: 9.4
When building statistics, we need to decide how many rows to sample and
how accurate the resulting statistics should be. Until now, it was not
possible to explicitly define statistics target for extended statistics
objects, the value was always computed from the per-attribute targets
with a fallback to the system-wide default statistics target.
That's a bit inconvenient, as it ties together the statistics target set
for per-column and extended statistics. In some cases it may be useful
to require larger sample / higher accuracy for extended statics (or the
other way around), but with this approach that's not possible.
So this commit introduces a new command, allowing to specify statistics
target for individual extended statistics objects, overriding the value
derived from per-attribute targets (and the system default).
ALTER STATISTICS stat_name SET STATISTICS target_value;
When determining statistics target for an extended statistics object we
first look at this explicitly set value. When this value is -1, we fall
back to the old formula, looking at the per-attribute targets first and
then the system default. This means the behavior is backwards compatible
with older PostgreSQL releases.
Author: Tomas Vondra
Discussion: https://postgr.es/m/20190618213357.vli3i23vpkset2xd@development
Reviewed-by: Kirk Jamison, Dean Rasheed
Up to now, async.c scanned its whole array of per-backend state
whenever it needed to find listening backends. That's expensive
if MaxBackends is large, so extend the data structure with list
links that thread the active entries together.
A downside of this change is that asyncQueueUnregister (unregister
a listening backend at backend exit) now requires exclusive not shared
lock, and it can take awhile if there are many other listening
backends. We could improve the latter issue by using a doubly- not
singly-linked list, but it's probably not worth the storage space;
typical usage patterns for LISTEN/NOTIFY have fairly long-lived
listeners.
In return for that, Exec_ListenPreCommit (initially register a
listening backend), SignalBackends, and asyncQueueAdvanceTail
get significantly faster when MaxBackends is much larger than
the number of listening backends. If most of the potential
backend slots are listening, we don't win, but that's a case
where the actual interprocess-signal overhead is going to swamp
these considerations anyway.
Martijn van Oosterhout, hacked a bit more by me
Discussion: https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
There was some duplicate code to run SHOW transaction_read_only to
determine whether the server is read-write or read-only. Reduce it by
adding another state to the state machine.
Author: Hari Babu Kommi
Reviewed-by: Takayuki Tsunakawa, Álvaro Herrera
Discussion: https://postgr.es/m/CAJrrPGe_qgdbbN+yBgEVpd+YLHXXjTruzk6RmTMhqrFig+32ag@mail.gmail.com
Assert that _bt_binsrch() binary searches with scantid set in insertion
scankey cannot be performed on leaf pages. Leaf-level binary searches
where scantid is set must use _bt_binsrch_insert() instead.
_bt_binsrch_insert() is likely to have additional responsibilities in
the future, such as searching within GIN-style posting lists using
scantid. It seems like a good idea to tighten things up now.
Don't just assume that the next port is free; it might not be, or
if we're really unlucky it might even be out of the TCP range.
Do it honestly with two get_free_port() calls instead.
This is surely a pretty low-probability problem, but I think it
explains a buildfarm failure seen today, so let's fix it.
Back-patch to v11 where this script was added.
Discussion: https://postgr.es/m/25124.1568052346@sss.pgh.pa.us
Modern versions of msys2 have changed the treatment of "cmd /c" so that
the runtime will try to convert the switch to a native file path. This
patch adds a setting to inhibit that behaviour.
Discussion: https://postgr.es/m/3227042f-cfcc-745a-57dd-fb8c471f8ddf@2ndQuadrant.com
Backpatch to all live branches.
In ad0bda5d24 I changed the EvalPlanQual machinery to store
substitution tuples in slot, instead of using plain HeapTuples. The
main motivation for that was that using HeapTuples will be inefficient
for future tableams. But it turns out that that conversion was buggy
for non-locking rowmarks - the wrong tuple descriptor was used to
create the slot.
As a secondary issue 5db6df0c0 changed ExecLockRows() to begin EPQ
earlier, to allow to fetch the locked rows directly into the EPQ
slots, instead of having to copy tuples around. Unfortunately, as Tom
complained, that forces some expensive initialization to happen
earlier.
As a third issue, the test coverage for EPQ was clearly insufficient.
Fixing the first issue is unfortunately not trivial: Non-locked row
marks were fetched at the start of EPQ, and we don't have the type
information for the rowmarks available at that point. While we could
change that, it's not easy. It might be worthwhile to change that at
some point, but to fix this bug, it seems better to delay fetching
non-locking rowmarks when they're actually needed, rather than
eagerly. They're referenced at most once, and in cases where EPQ
fails, might never be referenced. Fetching them when needed also
increases locality a bit.
To be able to fetch rowmarks during execution, rather than
initialization, we need to be able to access the active EPQState, as
that contains necessary data. To do so move EPQ related data from
EState to EPQState, and, only for EStates creates as part of EPQ,
reference the associated EPQState from EState.
To fix the second issue, change EPQ initialization to allow use of
EvalPlanQualSlot() to be used before EvalPlanQualBegin() (but
obviously still requiring EvalPlanQualInit() to have been done).
As these changes made struct EState harder to understand, e.g. by
adding multiple EStates, significantly reorder the members, and add a
lot more comments.
Also add a few more EPQ tests, including one that fails for the first
issue above. More is needed.
Reported-By: yi huang
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion:
https://postgr.es/m/CAHU7rYZo_C4ULsAx_LAj8az9zqgrD8WDd4hTegDTMM1LMqrBsg@mail.gmail.comhttps://postgr.es/m/24530.1562686693@sss.pgh.pa.us
Backpatch: 12-, where the EPQ changes were introduced
f2e40380 introduces support of non-key attributes in GiST indexes. Then if
get_index_column_opclass() is asked by gistproperty() to get an opclass of
non-key column, it returns garbage past oidvector value. This commit fixes
that by making get_index_column_opclass() return InvalidOid in this case.
Discussion: https://postgr.es/m/20190902231948.GA5343%40alvherre.pgsql
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
Some of these are quite old, but that doesn't make them not bugs.
We'd rather report a failure via elog than SIGSEGV.
While at it, uniformly spell the error check as !RelationIsValid(rel)
rather than a bare rel == NULL test. The machine code is the same
but it seems better to be consistent.
Coverity complained about this today, not sure why, because the
mistake is in fact old.
In order to implement NULL LAST semantic GiST previously assumed distance to
the NULL value to be Inf. However, our distance functions can return Inf and
NaN for non-null values. In such cases, NULL LAST semantic appears to be
broken. This commit fixes that by introducing separate array of null flags for
distances.
Backpatch to all supported versions.
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
Previously plain float comparison was used in GiST pairing heap. Such
comparison doesn't provide proper ordering for value sets containing Inf and Nan
values. This commit fixes that by usage of float8_cmp_internal(). Note, there
is remaining problem with NULL distances, which are represented as Inf in
pairing heap. It would be fixes in subsequent commit.
Backpatch to all supported versions.
Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit
transaction block, the previous implementation would leave a
transaction block active in the ROLLBACK case but not the COMMIT case.
To fix for now, error out when using these commands not in an explicit
transaction block. This restriction could be lifted if a sensible
definition and implementation is found.
Bug: #15977
Author: fn ln <emuser20140816@gmail.com>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Commit 6f6b99d13 stuck an INFO message into the fast path for
checking partition constraints, for no very good reason except
that it made it easy for the regression tests to verify that
that path was taken. Assorted later patches did likewise,
increasing the unsuppressable-chatter level from ALTER TABLE
even more. This isn't good for the user experience, so let's
drop these messages down to DEBUG1 where they belong. So as
not to have a loss of test coverage, create a TAP test that
runs the relevant queries with client_min_messages = DEBUG1
and greps for the expected messages.
This testing method is a bit brute-force --- in particular,
it duplicates the execution of a fair amount of the core
create_table and alter_table tests. We experimented with
other solutions, but running any significant amount of
standard testing with client_min_messages = DEBUG1 seems
to have a lot of output-stability pitfalls, cf commits
bbb96c370 and 5655565c0. Possibly at some point we'll look
into whether we can reduce the amount of test duplication.
Backpatch into v12, because some of these messages are new
in v12 and we don't really want to ship it that way.
Sergei Kornilov
Discussion: https://postgr.es/m/81911511895540@web58j.yandex.ru
Discussion: https://postgr.es/m/4859321552643736@myt5-02b80404fd9e.qloud-c.yandex.net
As a result of some long-ago quick hacks, the SIMILAR TO operator
and the corresponding flavor of substring() interpreted "ESCAPE NULL"
as selecting the default escape character '\'. This is both
surprising and not per spec: the standard is clear that these
functions should return NULL for NULL input.
Additionally, because of inconsistency of the strictness markings
of 3-argument substring() and similar_escape(), the planner could not
inline the SQL definition of substring(), resulting in a substantial
performance penalty compared to the underlying POSIX substring()
function.
The simplest fix for this would be to change the strictness marking
of similar_escape(), but if we do that we risk breaking existing views
that depend on that function. Hence, leave similar_escape() as-is
as a compatibility function, and instead invent a new function
similar_to_escape() that comes in two strict variants.
There are a couple of other behaviors in this area that are also
not per spec, but they are documented and seem generally at least
as sane as the spec's definition, so leave them alone. But improve
the documentation to describe them fully.
Patch by me; thanks to Álvaro Herrera and Andrew Gierth for review
and discussion.
Discussion: https://postgr.es/m/14047.1557708214@sss.pgh.pa.us
The test for SysV support currently involves looking for the perl
modules IPC::SharedMem and IPC::SysV. However, the perl on msys2 has
these modules but the tests fail. Therefore, force skipping the tests on
Windows platforms unconditionally.
Discussion: https://postgr.es/m/176e86ba-1a46-9d8c-5ae4-9865a463b411@2ndQuadrant.com
This moves much of the non-heap-specific logic from toast_delete and
toast_insert_or_update into a helper functions accessible via a new
header, toast_helper.h. Using the functions in this module, a table
AM can implement creation and deletion of TOAST table rows with
much less code duplication than was possible heretofore. Some
table AMs won't want to use the TOAST logic at all, but for those
that do this will make that easier.
Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.
Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
The old code didn't differentiate between a read error and a
concurrent truncation. fread reports both of these by returning 0;
you have to use feof() or ferror() to distinguish between them,
which this code did not do.
It might be a better idea to use read() rather than fread() here,
so that we can display a less-generic error message, but I'm not
sure that would qualify as a back-patchable bug fix, so just do
this much for now.
Jeevan Chalke, reviewed by Jeevan Ladhe and by me.
Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
Previously even if postmaster died and WaitLatch() woke up with that event
while pg_promote() was waiting for the standby promotion to finish,
pg_promote() did nothing special and kept waiting until timeout occurred.
This could cause a busy loop.
This patch make pg_promote() return false immediately when postmaster
dies, to avoid such a busy loop.
Back-patch to v12 where pg_promote() was added.
Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
This approach provides a much tighter binding between a data directory
and the associated SysV shared memory block (and SysV or named-POSIX
semaphores, if we're using those). Key collisions are still possible,
but only between data directories stored on different filesystems,
so the situation should be negligible in practice. More importantly,
restarting the postmaster with a different port number no longer
risks failing to identify a relevant shared memory block, even when
postmaster.pid has been removed. A standalone backend is likewise
much more certain to detect conflicting leftover backends.
(In the longer term, we might now think about deprecating the port as
a cluster-wide value, so that one postmaster could support sockets
with varying port numbers. But that's for another day.)
The hazards fixed here apply only on Unix systems; our Windows code
paths already use identifiers derived from the data directory path
name rather than the port.
src/test/recovery/t/017_shm.pl, which intends to test key-collision
cases, has been substantially rewritten since it can no longer use
two postmasters with identical port numbers to trigger the case.
Instead, use Perl's IPC::SharedMem module to create a conflicting
shmem segment directly. The test script will be skipped if that
module is not available. (This means that some older buildfarm
members won't run it, but I don't think that that results in any
meaningful coverage loss.)
Patch by me; thanks to Noah Misch and Peter Eisentraut for discussion
and review.
Discussion: https://postgr.es/m/16908.1557521200@sss.pgh.pa.us
detoast.c/h contain functions required to detoast a datum, partially
or completely, plus a few other utility functions for examining the
size of toasted datums.
toast_internals.c/h contain functions that are used internally to the
TOAST subsystem but which (mostly) do not need to be accessed from
outside.
heaptoast.c/h contains code that is intrinsically specific to the
heap AM, either because it operates on HeapTuples or is based on the
layout of a heap page.
detoast.c and toast_internals.c are placed in
src/backend/access/common rather than src/backend/access/heap. At
present, both files still have dependencies on the heap, but that will
be improved in a future commit.
Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.
Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
Use the explicit_bzero() function in places where it is important that
security information such as passwords is cleared from memory. There
might be other places where it could be useful; this is just an
initial collection.
For platforms that don't have explicit_bzero(), provide various
fallback implementations. (explicit_bzero() itself isn't standard,
but as Linux/glibc, FreeBSD, and OpenBSD have it, it's the most common
spelling, so it makes sense to make that the invocation point.)
Discussion: https://www.postgresql.org/message-id/flat/42d26bde-5d5b-c90d-87ae-6cab875f73be%402ndquadrant.com
The logic ending progress reporting for a backend entry introduced by
b6fb647 causes callers of pgstat_progress_end_command() to do some extra
work when track_activities is enabled as the process fields are reset in
the backend entry even if no command were started for reporting.
This resets the fields only if a command is registered for progress
reporting, and only if track_activities is enabled.
Author: Masahiho Sawada
Discussion: https://postgr.es/m/CAD21AoCry_vJ0E-m5oxJXGL3pnos-xYGCzF95rK5Bbi3Uf-rpA@mail.gmail.com
Backpatch-through: 9.6
Since the addition of fsync requests in bc34223 to make base backup data
consistent on disk once pg_basebackup finishes, each tablespace tar file
is individually flushed once completed, with an additional flush of the
parent directory when the base backup finishes. While holding a
connection to the server, a fsync request taking a long time may cause a
failure of the base backup, which is annoying for any integration. A
recent example of breakage can involve tcp_user_timeout, but
wal_sender_timeout can cause similar problems.
While reviewing the code, there was a second issue causing too many
fsync requests to be done for the same WAL data. As recursive fsyncs
are done at the end of the backup for both the plain and tar formats
from the base target directory where everything is written, it is fine
to disable fsyncs when fetching or streaming WAL.
Reported-by: Ryohei Takahashi
Author: Michael Paquier
Reviewed-by: Ryohei Takahashi
Discussion: https://postgr.es/m/OSBPR01MB4550DAE2F8C9502894A45AAB82BE0@OSBPR01MB4550.jpnprd01.prod.outlook.com
Backpatch-through: 10
This function is only used by xlogreader.c itself, so there's no need to
export it. It was introduced by commit 3b02ea4f07 with the apparent
intention that it could be used externally, but I couldn't find any
external code calling it.
I (Álvaro) couldn't resist the urge to sort nearby function prototypes
properly while at it.
Author: Antonin Houska
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
The message was included as a parameter when this function was added in
dcb2bda9b7, but I don't think it has ever served any useful purpose.
Let's stop spreading it pointlessly.
Reviewed by Amit Langote and Peter Eisentraut.
Discussion: https://postgr.es/m/20190806224728.GA17233@alvherre.pgsql
Clarify in the help output and documentation that -n, -t etc. take a
"pattern" rather than a "schema" or "table" etc. This was especially
confusing now that the new pg_dumpall --exclude-database option was
documented with "pattern" and the others not, even though they all
behave the same.
Discussion: https://www.postgresql.org/message-id/flat/b85f3fa1-b350-38d1-1893-4f7911bd7310%402ndquadrant.com
Document that the tablespace sizes are in units of kilobytes. Make
the pg_basebackup source code a bit clearer about this, too.
Reviewed-by: Magnus Hagander <magnus@hagander.net>
The leak happens in str_tolower, str_toupper and str_initcap, which are
used in several places including their equivalent SQL-level functions,
and can only be triggered when using an ICU-provided collation when
converting the input string.
b615920 fixed a similar leak. Backpatch down 10 where ICU collations
have been introduced.
Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/94c0ad0a-cbc2-e4a3-7829-2bdeaf9146db@postgrespro.ru
Backpatch-through: 10
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it. Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.
The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do. This is more consistent and likely a bit
faster, since it avoids a cache lookup. Hence, change to doing it
that way.
While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null. Possibly those should
be actual test-and-elog, but I'll leave it like this for now.
In principle, this is a bug that's been there since this code was
introduced (in 9.4). In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely. Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.
Per report from Hadi Moshayedi.
Discussion: https://postgr.es/m/CAK=1=Wrek44Ese1V7LjKiQS-Nd-5LgLi_5_CskGbpggKEf3tKQ@mail.gmail.com
After an unexpected connection loss and successful reconnection,
psql neglected to resynchronize its internal state about the server,
such as server version. Ordinarily we'd be reconnecting to the same
server and so this isn't really necessary, but there are scenarios
where we do need to update --- one example is where we have a list
of possible connection targets and they're not all alike.
Define "resynchronize" as including connection_warnings(), so that
this case acts the same as \connect. This seems useful; for example,
if the server version did change, the user might wish to know that.
An attuned user might also notice that the new connection isn't
SSL-encrypted, for example, though this approach isn't especially
in-your-face about such changes. Although this part is a behavioral
change, it only affects interactive sessions, so it should not break
any applications.
Also, in do_connect, make sure that we desynchronize correctly when
abandoning an old connection in non-interactive mode.
These problems evidently are the result of people patching only one
of the two places where psql deals with connection changes, so insert
some cross-referencing comments in hopes of forestalling future bugs
of the same ilk.
Lastly, in Windows builds, issue codepage mismatch warnings only at
startup, not during reconnections. psql's codepage can't change
during a reconnect, so complaining about it again seems like useless
noise.
Peter Billen and Tom Lane. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAMTXbE8e6U=EBQfNSe01Ej17CBStGiudMAGSOPaw-ALxM-5jXg@mail.gmail.com
Similarly to the signed versions added in 4d6ad31, this adds a set of
inline functions for overflow checks with unsigned integers, including
uint16, uint32 and uint64. This relies on compiler built-in overflow
checks by default if available. The behavior of unsigned integers is
well-defined so the fallback implementations checks are simple for
additions and subtractions. Multiplications avoid division-based checks
which are expensive if possible, still this can happen for uint64 if
128-bit integers are not available.
While on it, the code in common/int.h is reorganized to avoid too many
duplicated comments. The new macros will be used in a follow-up patch.
All thanks to Andres Freund for the input provided.
Author: Fabien Coelho, Michael Paquier
Discussion: https://postgr.es/m/20190830073423.GB2354@paquier.xyz
The comment describing the string format was a lie. Make it agree with
reality, add/improve some other comments, fix coding style for loops with
empty bodies. Also add an Assert that we counted parameters correctly,
because the spread-out logic for that looks pretty fragile.
No actual bugs fixed here, so no need to back-patch.
Discussion: https://postgr.es/m/848B1649C8A6274AA527C4472CA11EDD5FC70CBE@G01JPEXMBYT02
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.
The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.
Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.
Backpatch to all supported versions.
This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.
Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
RelationAllowsEarlyPruning() performed a catalog scan, but is used
in two contexts where that was a bad idea:
1. In heap_page_prune_opt(), which runs very frequently in some large
scans. This caused major performance problems in a field report
that was easy to reproduce.
2. In TestForOldSnapshot(), which runs while we hold a buffer content
lock. It's not clear if this was guaranteed to be free of buffer
deadlock risk.
The check was introduced in commit 2cc41acd8 and defended against a
real problem: 9.6's hash indexes have no page LSN and so we can't
allow early pruning (ie the snapshot-too-old feature). We can remove
the check from all later releases though: hash indexes are now logged,
and there is no way to create UNLOGGED indexes on regular logged
tables.
If a future release allows such a combination, it might need to put
a similar check in place, but it'll need some more thought.
Back-patch to 10.
Author: Thomas Munro
Reviewed-by: Tom Lane, who spotted the second problem
Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
check_float4_val() checks after underflow and overflow of values
converted from float8 to float4, but there has never been any regression
tests for that. This brings the coverage of float.h to 100%.
Author: Movead Li
Discussion: https://postgr.es/m/20190822174636998766188@highgo.ca
In this case, the transfer uses a libpq connection, which is subject to
the timeout parameters set at system level, and this can make the rewind
operation suddenly canceled which is not good for automation. One
workaround to such issues would be to use PGOPTIONS to enforce the
wanted timeout parameters, but that's annoying, and for example pg_dump,
which can run potentially long-running queries disables all types of
timeouts.
lock_timeout and statement_timeout are the ones which can cause problems
now. Note that pg_rewind does not use transactions, so disabling
idle_in_transaction_session_timeout is optional, but it feels safer to
do so for the future.
This is back-patched down to 9.5. idle_in_transaction_session_timeout
is only present since 9.6.
Author: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=krcVXksxiwVQh1SoY+ziJ-JC=6FcuoBL3yce_40Es5_g@mail.gmail.com
Backpatch-through: 9.5
Commit a4327296d taught pg_regress proper to do this, but
missed the opportunity to do likewise in the isolationtester
and ecpg variants of pg_regress. Seems like this might be
helpful for tracking down issues exposed by those tests.
Turns out that returning "unrecognized signal" is confusing.
Make it explicit that the platform lacks any support for signal names.
(At least of the machines in the buildfarm, only HPUX lacks it.)
Back-patch to v12 where we invented this function.
Discussion: https://postgr.es/m/3067.1566870481@sss.pgh.pa.us
Commit efada2b8e9, which made the nbtree page deletion algorithm more
robust, removed the concept of a half-dead internal page. Remove a
comment about half dead parent pages that was overlooked.
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions. Let's throw a specific error for that. It takes
only slightly more code to detect all-blank names, so do so.
Also, detect direct recursion, ie a file calling itself. As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.
Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.
In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.
Per a trouble report from Brent Bates. Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.
Ian Barwick and Tom Lane
Discussion: https://postgr.es/m/8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com
FD_SETSIZE needs to be declared before winsock2.h, or it is possible to
run into buffer overflow issues when using --jobs. This is similar to
pgbench's solution done in a23c641.
This has been introduced by 71d84ef, and older versions have been using
the default value of FD_SETSIZE, defined at 64.
Per buildfarm member jacana, but this impacts all Windows animals
running the TAP tests. I have reproduced the failure locally to check
the patch.
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190826054000.GE7005@paquier.xyz
Backpatch-through: 9.5
If a test case tried to set an invalid value of synchronous_standby_names,
the test script didn't detect that, which seems like a bad idea.
Noticed while testing a proposed patch that broke some of these
test cases.
A report from Alvaro Herrera shows that if we're in PM_STARTUP
state, and we spawn a dead_end child to reject some incoming
connection request, and that child dies with an unexpected exit
code, the postmaster does not respond well. We correctly send
SIGQUIT to the startup process, but then:
* if the startup process exits with nonzero exit code, as expected,
we thought that that indicated a crash and aborted startup.
* if the startup process exits with zero exit code, which is possible
due to the inherent race condition, we'd advance to PM_RUN state
which is fine --- but the code forgot that AbortStartTime would be
nonzero in this situation. We'd either die on the Asserts saying
that it was zero, or perhaps misbehave later on. (A quick look
suggests that the only misbehavior might be busy-waiting due to
DetermineSleepTime doing the wrong thing.)
To fix the first point, adjust the state-machine logic to recognize
that a nonzero exit code is expected after sending SIGQUIT, and have
it transition to a state where we can restart the startup process.
To fix the second point, change the Asserts to clear the variable
rather than just claiming it should be clear already.
Perhaps we could improve this further by not treating a crash of
a dead_end child as a reason for panic'ing the database. However,
since those child processes are connected to shared memory, that
seems a bit risky. There are few good reasons for a dead_end child
to report failure anyway (the cause of this in Alvaro's report is
quite unclear). On balance, therefore, a minimal fix seems best.
This is an oversight in commit 45811be94. While that was back-patched,
I'm hesitant to back-patch this change. The lack of reasons for a
dead_end child to fail suggests that the case should be very rare in
the field, which squares with the lack of reports; so it seems like
this might not be worth the risk of introducing new issues. In any
case we can let it bake awhile in HEAD before considering a back-patch.
Discussion: https://postgr.es/m/20190615160950.GA31378@alvherre.pgsql
Previously 'uname -r' on Msys2 reported a kernele release starting with
2. The latest version starts with 3. In commit 1638623f we specifically
looked for one starting with 2. This is now changed to look for any
digit between 2 and 9.
backpatch to release 10.
On msys2, 'uname -s' reports a string starting MSYS instead on MINGW
as happens on msys1. Treat these both the same way. This reverts
608a710195 in favor of a more general solution.
Backpatch to all live branches.
When trying to use a high number of jobs, vacuumdb (and more recently
reindexdb) has only checked for a maximum number of jobs used, causing
confusing failures when running out of file descriptors when the jobs
open connections to Postgres. This commit changes the error handling so
as we do not check anymore for a maximum number of allowed jobs when
parsing the option value with FD_SETSIZE, but check instead if a file
descriptor is within the supported range when opening the connections
for the jobs so as this is detected at the earliest time possible.
Also, improve the error message to give a hint about the number of jobs
recommended, using a wording given by the reviewers of the patch.
Reported-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5
POSIX permits getopt() to advance optind beyond argc when the last
argv entry is an option that requires an argument and hasn't got one.
It seems that no major platforms actually do that, but musl does,
so that something like "psql -f" would crash with that libc.
Add a check that optind is in range before trying to look at the
possibly-bogus option.
Report and fix by Quentin Rameau. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/20190825100617.GA6087@fifth.space
We were setting extra_float_digits = 0 to avoid platform-dependent
output in this test, but that's still able to expose platform-specific
roundoff behavior in some new test cases added by commit a3d284485,
as reported by Peter Eisentraut. Reduce it to -1 to hide that.
(Over in geometry.sql, we're using -3, which is an ancient decision
dating to 337f73b1b. I wonder whether that's overkill now. But
there's probably little value in trying to change it.)
Back-patch to v12 where a3d284485 came in; there's no evidence that
we have any platform-dependent issues here before that.
Discussion: https://postgr.es/m/15551268-e224-aa46-084a-124b64095ee3@2ndquadrant.com
Bleeding-edge LLVM has stopped supplying replacements for various
C++14 library features, for people on older C++ versions. Since we're
not ready to require C++14 yet, just use plain old new instead of
make_unique. As revealed by buildfarm animal seawasp.
Back-patch to 11.
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com
The Postgres approach to coupling locks during an ascent of the tree is
slightly different to the approach taken by Lehman and Yao. Add a new
paragraph to the "Differences to the Lehman & Yao algorithm" section of
the nbtree README that explains the similarities and differences.
This is useful for developers to find out if an isolation spec is
over-engineered or if it needs more work by warning at the end of a
test run if a step is not used, generating a failure with extra diffs.
While on it, clean up all the specs which include steps not used in any
permutations to simplify them.
Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
The original purpose of the dry-run mode is to be able to print all the
possible permutations from a spec file, but it has become less useful
since isolation tests has improved regarding deadlock detection as one
step not wanted by the author could block indefinitely now (originally
the step blocked would have been detected rather quickly). Per
discussion, let's remove it.
Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Adjust the struct comment that describes how page splits use their
descent stack to cascade up the tree from the leaf level.
In passing, fix up some unrelated nbtree comments that had typos or were
obsolete.
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.
Reported-by: Craig Ringer
Using pg_pltemplate as test data was probably not very forward-looking,
considering we've had many discussions around removing that catalog
altogether. Use a nearby temp table instead, to make these two test
scripts more self-contained. This is a better test case anyway, since
it exercises the scenario where the entries in the anyarray column
actually vary in type intra-query.
FD_SETSIZE is included in sys/select.h per POSIX, and this header
inclusion has been moved to scripts_parallel.c as of 5f38403 without
moving the variable, causing a compilation failure on recent versions of
OpenBSD (6.6 was the version used in the report).
In order to take care of the failure, move FD_SETSIZE directly to
scripts_parallel.c with a wrapper controlling the maximum number of
parallel slots supported, based on a suggestion by Andres Freund.
While on it, reduce the maximum number to be less than FD_SETSIZE,
leaving some room for stdin, stdout and such as they consume some file
descriptors.
The buildfarm did not complain about that, as it happens to only be
an issue on recent versions of OpenBSD and there is no coverage in this
area. 51c3e9f fixed a similar set of issues.
Bug: #15964
Reported-by: Sean Farrell
Discussion: https://postgr.es/m/15964-c1753bdfed722e04@postgresql.org
If the record argument is NULL and has no declared type more concrete
than RECORD, we can't extract useful information about the desired
rowtype from it. In this case, see if we're in FROM with an AS clause,
and if so extract the needed rowtype info from AS.
It worked like this before v11, but commit 37a795a60 removed the
behavior, reasoning that it was undocumented, inefficient, and utterly
not self-consistent. If you want to take type info from an AS clause,
you should be using the json_to_record() family of functions not the
json_populate_record() family. Also, it was already the case that
the "populate" functions would fail for a null-valued RECORD input
(with an unfriendly "record type has not been registered" error)
when there wasn't an AS clause at hand, and it wasn't obvious that
that behavior wasn't OK when there was one. However, it emerges
that some people were depending on this to work, and indeed the
rather off-point error message you got if you left off AS encouraged
slapping on AS without switching to the json_to_record() family.
Hence, put back the fallback behavior of looking for AS. While at it,
improve the run-time error you get when there's no place to obtain type
info; we can do a lot better than "record type has not been registered".
(We can't, unfortunately, easily improve the parse-time error message
that leads people down this path in the first place.)
While at it, I refactored the code a bit to avoid duplicating the
same logic in several different places.
Per bug #15940 from Jaroslav Sivy. Back-patch to v11 where the
current coding came in. (The pre-v11 deficiencies in this area
aren't regressions, so we'll leave those branches alone.)
Patch by me, based on preliminary analysis by Dmitry Dolgov.
Discussion: https://postgr.es/m/15940-2ab76dc58ffb85b6@postgresql.org
We already had "cpluspluscheck", which served the dual purposes of
verifying that headers compile standalone and that they compile as C++.
However, C++ compilers don't have the exact same set of error conditions
as C compilers, so this doesn't really prove that a header will compile
standalone as C.
Hence, add a second script that's largely similar but runs the C
compiler not C++.
Also add a bit more documentation than the none-at-all we had before.
Discussion: https://postgr.es/m/14803.1566175851@sss.pgh.pa.us
IANA tzcode release 2019b adds an option that tells zic not to emit
the old 32-bit section of the timezone files, and to skip some other
space-wasting hacks needed for compatibility with old timezone client
libraries. Since we only expect our own code to use the timezone data
we install, and our code is up-to-date with 2019b, there's no apparent
reason not to generate the smallest possible files.
Unfortunately, while the individual zone files do get significantly
smaller in many cases, they were not that big to begin with; which
means that no real space savings ensues on filesystems that don't
optimize small files. (For instance, on ext4 with 4K block size,
"du" says the installed timezone tree is the same size as before.)
Still, it seems worth making the change, if only because this is
presumably the wave of the future. At the very least, we'll save
some cycles while reading a zone file.
But given the marginal value and the fact that this is a new code
path, it doesn't seem worth the risk of back-patching this change
into stable branches. Hence, unlike most of our timezone-related
changes, apply to HEAD only.
Discussion: https://postgr.es/m/24998.1563403327@sss.pgh.pa.us
Prefix inet_net_ntop and sibling routines with "pg_" to ensure that
they aren't mistaken for C-library functions. This fixes warnings
from cpluspluscheck on some platforms, and should help reduce reader
confusion everywhere, since our functions aren't exactly interchangeable
with the library versions (they may have different ideas about address
family codes).
This shouldn't be fixing any actual bugs, unless somebody's linker
is misbehaving, so no need to back-patch.
Discussion: https://postgr.es/m/20518.1559494394@sss.pgh.pa.us
Remove use of "register" keyword in hashfn.c. It's obsolescent
according to recent C++ compilers, and no modern C compiler pays
much attention to it either.
Also fix one cosmetic warning about signed vs unsigned comparison.
Discussion: https://postgr.es/m/20518.1559494394@sss.pgh.pa.us
This has to have <time.h>, or the references to "struct tm" don't
mean what they should.
We have some other recently-introduced issues of the same ilk,
but this one seems old. No backpatch though, as it's only a
latent problem for most purposes.
If a table inherits from multiple unrelated parents, we must disallow
changing the type of a column inherited from multiple such parents, else
it would be out of step with the other parents. However, it's possible
for the column to ultimately be inherited from just one common ancestor,
in which case a change starting from that ancestor should still be
allowed. (I would not be excited about preserving that option, were
it not that we have regression test cases exercising it already ...)
It's slightly annoying that this patch looks different from the logic
with the same end goal in renameatt(), and more annoying that it
requires an extra syscache lookup to make the test. However, the
recursion logic is quite different in the two functions, and a
back-patched bug fix is no place to be trying to unify them.
Per report from Manuel Rigger. Back-patch to 9.5. The bug exists in
9.4 too (and doubtless much further back); but the way the recursion
is done in 9.4 is a good bit different, so that substantial refactoring
would be needed to fix it in 9.4. I'm disinclined to do that, or risk
introducing new bugs, for a bug that has escaped notice for this long.
Discussion: https://postgr.es/m/CA+u7OA4qogDv9rz1HAb-ADxttXYPqQdUdPY_yd4kCzywNxRQXA@mail.gmail.com
This test failed fairly reproducibly on some CLOBBER_CACHE_ALWAYS
buildfarm animals. The cause seems to be that if a parallel worker
is slow enough to reach its lock wait, it may not be released by
the first deadlock check run, and then later deadlock checks might
decide to unblock the d2 session instead of the d1 session, leaving
us in an undetected deadlock state (since the isolationtester client
is waiting for d1 to complete first).
Fix by introducing an additional lock wait at the end of the d2a1
step, ensuring that the deadlock checker will recognize that d1
has to be unblocked before d2a1 completes.
Also reduce max_parallel_workers_per_gather to 3 in this test. With the
default max_worker_processes value, we were only getting one parallel
worker for the d2a1 step, which is not the case I hoped to test. We
should get 3 for d1a2 and 2 for d2a1, as the code stands; and maybe 3
for d2a1 if somebody figures out why the last parallel worker slot isn't
free already.
Discussion: https://postgr.es/m/22195.1566077308@sss.pgh.pa.us
If an assertion expression contained a macro, the failed assertion
message would print the expanded macro, which is usually unhelpful and
confusing. Restructure the Assert macros to not expand any macros
when constructing the failure message.
This also fixes that the existing output for Assert et al. shows
the *inverted* condition, which is also confusing and not how
assertions usually work.
Discussion: https://www.postgresql.org/message-id/flat/6c68efe3-117a-dcc1-73d4-18ba1ec532e2%402ndquadrant.com
For most uses of acl.h the details of how "Acl" internally looks like
are irrelevant. It might make sense to move a lot of the
implementation details into a separate header at a later point.
The main motivation of this change is to avoid including fmgr.h (via
array.h, which needs it for exposed structs) in a lot of files that
otherwise don't need it. A subsequent commit will remove the fmgr.h
include from a lot of files.
Directly include utils/array.h and utils/expandeddatum.h from the
files that need them, but previously included them indirectly, via
acl.h.
Author: Andres Freund
Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de
This is a variant of the problem fixed in commit 25b692568, which
unfortunately we failed to detect at the time. If an update trigger
returns the "old" tuple, as it's entitled to do, then a subsequent
iteration of the loop in ExecBRUpdateTriggers would have "oldtuple"
equal to "trigtuple" and would fail to notice that it shouldn't
free that.
In addition to fixing the code, extend the test case added by
25b692568 so that it covers multiple-trigger-iterations cases.
This problem does not manifest in v12/HEAD, as a result of the
relevant code having been largely rewritten for slotification.
However, include the test case into v12/HEAD anyway, since this
is clearly an area that someone could break again in future.
Per report from Piotr Gabriel Kosinski. Back-patch into all
supported branches, since the bug seems quite old.
Diagnosis and code fix by Thomas Munro, test case by me.
Discussion: https://postgr.es/m/CAFMLSdP0rd7LqC3j-H6Fh51FYSt5A10DDh-3=W4PPc4LLUQ8YQ@mail.gmail.com