5a1e1d8302 was a minimal bug fix for dc7420c2c9. To avoid future bugs of
that kind, deduplicate the choice of a relation's horizon into a new helper,
GlobalVisHorizonKindForRel().
As the code in question was only introduced in dc7420c2c9 it seems worth
backpatching this change as well, otherwise 14 will look different from all
other branches.
A different approach to this was suggested by Matthias van de Meent.
Author: Andres Freund
Discussion: https://postgr.es/m/20210621122919.2qhu3pfugxxp3cji@alap3.anarazel.de
Backpatch: 14, like 5a1e1d8302
We have an implementation restriction that PREPARE TRANSACTION can't
handle cases where both session-lifespan and transaction-lifespan locks
are held on the same lockable object. (That's because we'd otherwise
need to acquire a new PROCLOCK entry during post-prepare cleanup, which
is an operation that might fail. The situation can only arise with odd
usages of advisory locks, so removing the restriction is probably not
worth the amount of effort it would take.) AtPrepare_Locks attempted
to enforce this, but its logic was many bricks shy of a load, because
it only detected cases where the session and transaction locks had the
same lockmode. Locks of different modes on the same object would lead
to the rather unhelpful message "PANIC: we seem to have dropped a bit
somewhere".
To fix, build a transient hashtable with one entry per locktag,
not one per locktag + mode, and use that to detect conflicts.
Per bug #17122 from Alexander Pyhalov. This bug is ancient,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17122-04f3c32098a62233@postgresql.org
We previously took a hard-line attitude that callers should never print
a null string pointer, and doing so is worthy of an assertion failure
or crash. However, we've long since flushed out any easy-to-find bugs
of that nature. What remains is a lot of code that perhaps could fail
that way in hard-to-reach corner cases. For example, in something as
simple as
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("constraint \"%s\" for table \"%s\" does not exist",
conname, get_rel_name(relid))));
one must wonder whether it's completely guaranteed that get_rel_name
cannot return NULL in this context. If such a situation did occur,
the existing policy converts what might be a pretty minor bug into
a server crash condition. This is not good for robustness.
Hence, let's follow the lead of glibc and print "(null)" instead
of failing. We should, of course, still consider it a bug if that
behavior is reachable in ordinary use; but crashing seems less
desirable than not crashing.
This fix works across-the-board in v12 and up, where we always use
src/port/snprintf.c. Before that, on most platforms we're at the mercy
of the local libc, but it appears that Solaris 10 is the only supported
platform where we'd still get a crash. Most other platforms such as
*BSD, macOS, and Solaris 11 have adopted glibc's behavior at some
point. (AIX and HPUX just print "" not "(null)", but that's close
enough.) I've not checked what Windows' native printf would do, but
it doesn't matter because we've long used snprintf.c on that platform.
In v12 and up, also const-ify related code so that we're not casting
away const on the constant string. This is just neatnik-ism, since
next to no compilers will warn about that.
Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org
This testing was useful when it was written, nigh twenty years ago,
but it seems fairly pointless for any platform built in the last
dozen or more years. (Compare also the comments at 8a2121185.)
Also we now have reports that the test program itself fails under
ThreadSanitizer. Rather than invest effort in fixing it, let's
just drop it, and assume that the few people who still care
already know they need to use --disable-thread-safety.
Back-patch into v14, for consistency with 8a2121185.
Discussion: https://postgr.es/m/CADhDkKzPSiNvA3Hyq+wSR_icuPmazG0cFe=YnC3U-CFcYLc8Xw@mail.gmail.com
Code inlined by LLVM can crash or fail with "Relocation type not
implemented yet!" if it tries to access thread local variables. Don't
inline such code.
Back-patch to 11, where LLVM arrived. Bug #16696.
Author: Dmitry Marakasov <amdmi3@amdmi3.ru>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/16696-29d944a33801fbfe@postgresql.org
In postgresql.conf, memory and file size GUCs can be specified with "B"
(bytes) as of b06d8e58b. Likewise, time GUCs can be specified with "us"
(microseconds) as of caf626b2c. Update postgres.conf.sample to reflect
that fact.
Pavel Luzanov
Backpatch to v12, which is the earliest version that allows both of
these units. A separate commit will document the "B" case for v11.
Discussion: https://www.postgresql.org/message-id/flat/f10d16fc-8fa0-1b3c-7371-cb3a35a13b7a%40postgrespro.ru
If an error was raised during our initial attempt to check whether
a successfully-compiled expression is "simple", subsequent calls of
exec_stmt_execsql would suppose that stmt->mod_stmt was already computed
when it had not been. This could lead to assertion failures in debug
builds; in production builds the effect would typically be to act as
if INTO STRICT had been specified even when it had not been. Of course
that only matters if the subsequent attempt to execute the expression
succeeds, so that the problem can only be reached by fixing a failure
in some referenced, inline-able SQL function and then retrying the
calling plpgsql function in the same session.
(There might be even-more-obscure ways to change the expression's
behavior without changing the plpgsql function, but that one seems
like the only one people would be likely to hit in practice.)
The most foolproof way to fix this would be to arrange for
exec_prepare_plan to not set expr->plan until we've finished the
subsidiary simple-expression check. But it seems hard to do that
without creating reference-count leak issues. So settle for documenting
the hazard in a comment and fixing exec_stmt_execsql to test separately
for whether it's computed stmt->mod_stmt. (That adds a test-and-branch
per execution, but hopefully that's negligible in context.) In v11 and
up, also fix exec_stmt_call which had a variant of the same issue.
Per bug #17113 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/17113-077605ce00e0e7ec@postgresql.org
The logic handling the opening of new WAL segments was fuzzy when using
--compress if a partial, non-compressed, segment with the same base name
existed in the repository storing those files. In this case, using
--compress would cause the code to first check for the existence and the
size of a non-compressed segment, followed by the opening of a new
compressed, partial, segment. The code was accidentally working
correctly on most platforms as the buildfarm has proved, except
bowerbird where gzflush() could fail in this code path. It is wrong
anyway to take the code path used pre-padding when creating a new
partial, non-compressed, segment, so let's fix it.
Note that this issue exists when users mix successive runs of
pg_receivewal with or without compression, as discovered with the tests
introduced by ffc9dda.
While on it, this refactors the code so as code paths that need to know
about the ".gz" suffix are down from four to one in walmethods.c, easing
a bit the introduction of new compression methods. This addresses a
second issue where log messages generated for an unexpected failure
would not show the compressed segment name involved, which was
confusing, printing instead the name of the non-compressed equivalent.
Reported-by: Georgios Kokolatos
Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz
Backpatch-through: 10
Commit 3499df0d added a comment that incorrectly suggested that
--force-index-cleanup did not appear in the same major version as the
similar --no-index-cleanup option. In fact, both options are new to
PostgreSQL 14.
Backpatch: 14-, where both options were introduced.
We don't allow to create replication slot_name as an empty string ('') via
SQL API pg_create_logical_replication_slot() but it is allowed to be set
via Alter Subscription command. This will lead to apply worker repeatedly
keep trying to stream data via slot_name '' and the user is not allowed to
create the slot with that name.
Author: Japin Li
Reviewed-By: Ranier Vilela, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/MEYP282MB1669CBD98E721C77CA696499B61A9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
It has been spotted that multiranges lack of ability to decompose them into
individual ranges. Subscription and proper expanded object representation
require substantial work, and it's too late for v14. This commit
provides the implementation of unnest(multirange), which is quite trivial.
unnest(multirange) is defined as a polymorphic procedure.
Catversion is bumped.
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org
Author: Alexander Korotkov
Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu, Tom Lane
Reviewed-by: Alvaro Herrera
The new test code added in ead9e51e82 is racy -- it hinges on
shared-memory state, which changes before the WARNING message is logged.
Put it the other way around.
Backpatch to 13.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202107161809.zclasccpfcg3@alvherre.pgsql
We had documentation of default_transaction_isolation et al,
but for some reason not of transaction_isolation et al.
AFAICS this is just an ancient oversight, so repair.
Per bug #17077 from Yanliang Lei.
Discussion: https://postgr.es/m/17077-ade8e166a01e1374@postgresql.org
pg_dump failed to preserve the 'enabled' flag (which can be not only
disabled, but also REPLICA or ALWAYS) for partitions which had it
changed from their respective parents. Attempt to handle that by
including a definition for such triggers in the dump, but replace the
standard CREATE TRIGGER line with an ALTER TRIGGER line.
Backpatch to 11, where these triggers can exist. In branches 11 and 12,
pick up a few test lines from commit b9b408c487 to verify that
pg_upgrade is okay with these arrangements.
Co-authored-by: Justin Pryzby <pryzby@telsasoft.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
When triggers are cloned from partitioned tables to their partitions,
the 'tgenabled' flag (origin/replica/always/disable) was not propagated.
Make it so that the flag on the trigger on partition is initially set to
the same value as on the partitioned table.
Add a test case to verify the behavior.
Backpatch to 11, where this appeared in commit 86f575948c.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
When some slots are invalidated due to the max_slot_wal_keep_size limit,
the old segment horizon should move forward to stay within the limit.
However, in commit c655077639 we forgot to call KeepLogSeg again to
recompute the horizon after invalidating replication slots. In cases
where other slots remained, the limits would be recomputed eventually
for other reasons, but if all slots were invalidated, the limits would
not move at all afterwards. Repair.
Backpatch to 13 where the feature was introduced.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Marcin Krupowicz <mk@071.ovh>
Discussion: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org
Ensure to properly mark up function parameters in text with <parameter>,
avoid using <acronym> for terms which aren't acronyms and properly place
the ", and" in a value list. The acronym removal is a follow-up to commit
fb72a7b8c3 which removed it for minmax-multi. In passing, also fix an
incorrectly cased word.
Author: Ekaterina Kiryanova <e.kiryanova@postgrespro.ru>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/c050ecbc-80b2-b360-3c1d-9fe6a6a11bb5@postgrespro.ru
Backpatch-through: v14
Autoconf's AC_CHECK_DECLS() always defines HAVE_DECL_whatever
as 1 or 0, but some of the entries in msvc/Solution.pm showed
such symbols as "undef" instead of 0. Fix that for consistency.
There's no live bug in current usages AFAICS, but it's not hard
to imagine one creeping in if more-complex #if tests get added.
Back-patch to v13, which is as far back as Solution.pm contains
this data. The inconsistency still exists in the manually-filled
pg_config_ext.h.win32 files of older branches; but as long as the
problem is only latent, it doesn't seem worth the trouble to
clean things up there.
Discussion: https://postgr.es/m/3185430.1626133592@sss.pgh.pa.us
This commit fixes the description of a couple of multirange operators and
oprjoin for another multirange operator. The change of oprjoin is more
cosmetic since both old and new functions return the same constant.
These cosmetic changes don't worth catalog incompatibility between 14beta2
and 14beta3. So, catversion isn't bumped.
Discussion: https://postgr.es/m/CAPpHfdv9OZEuZDqOQoUKpXhq%3Dmc-qa4gKCPmcgG5Vvesu7%3Ds1w%40mail.gmail.com
Backpatch-throgh: 14
Some commands of ALTER TABLE could fail with the following error:
ERROR: "tab" is of the wrong type
This error is unexpected, as all the code paths leading to
ATWrongRelkindError() should use a supported set of relkinds to generate
correct error messages. This commit closes the gap with such mistakes,
by adding all the missing relkind combinations. Tests are added to
check all the problems found. Note that some combinations are not used,
but these are left around as it could have an impact on applications
relying on this code.
2ed532e has done a much larger refactoring on HEAD to make such error
messages easier to manage in the long-term, so nothing is needed there.
Author: Kyotaro Horiguchi
Reviewed-by: Peter Eisentraut, Ahsan Hadi, Michael Paquier
Discussion: https://postgr.es/m/20210216.181415.368926598204753659.horikyota.ntt@gmail.com
Backpatch-through: 11
"Result Cache" was never a great name for this node, but nobody managed
to come up with another name that anyone liked enough. That was until
David Johnston mentioned "Node Memoization", which Tom Lane revised to
just "Memoize". People seem to like "Memoize", so let's do the rename.
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us
Backpatch-through: 14, where Result Cache was introduced
41469253e went to the trouble of removing a theoretical bug from
free_sort_tuple by checking if the tuple was NULL before freeing it. Let's
make this a little more robust by also setting the tuple to NULL so that
should we be called again we won't end up doing a pfree on the already
pfree'd tuple. Per advice from Tom Lane.
Discussion: https://postgr.es/m/3188192.1626136953@sss.pgh.pa.us
Backpatch-through: 9.6, same as 41469253e
This fixes a theoretical bug in tuplesort.c which, if a bounded sort was
used in combination with a byval Datum sort (tuplesort_begin_datum), when
switching the sort to a bounded heap in make_bounded_heap(), we'd call
free_sort_tuple(). The problem was that when sorting Datums of a byval
type, the tuple is NULL and free_sort_tuple() would free the memory for it
regardless of that. This would result in a crash.
Here we fix that simply by adding a check to see if the tuple is NULL
before trying to disassociate and free any memory belonging to it.
The reason this bug is only theoretical is that nowhere in the current
code base do we do tuplesort_set_bound() when performing a Datum sort.
However, let's backpatch a fix for this as if any extension uses the code
in this way then it's likely to cause problems.
Author: Ronan Dunklau
Discussion: https://postgr.es/m/CAApHDvpdoqNC5FjDb3KUTSMs5dg6f+XxH4Bg_dVcLi8UYAG3EQ@mail.gmail.com
Backpatch-through: 9.6, oldest supported version
Apple's mechanism for dealing with functions that are available
in only some OS versions confuses AC_CHECK_FUNCS, and therefore
AC_REPLACE_FUNCS. We can use AC_CHECK_DECLS instead, so long as
we enable -Werror=unguarded-availability-new. This allows people
compiling for macOS to control whether or not preadv/pwritev are
used by setting MACOSX_DEPLOYMENT_TARGET, rather than supplying
a back-rev SDK. (Of course, the latter still works, too.)
James Hilliard
Discussion: https://postgr.es/m/20210122193230.25295-1-james.hilliard1@gmail.com
This should have been removed in commit 7e30c186da, which split the loop
into two. Only the first loop uses the 'from' variable; updating it in
the second loop is bogus. It was never read after the first loop, so this
was harmless and surely optimized away by the compiler, but let's be tidy.
Backpatch to all supported versions.
Author: Ranier Vilela
Discussion: https://www.postgresql.org/message-id/CAEudQAoWq%2BAL3BnELHu7gms2GN07k-np6yLbukGaxJ1vY-zeiQ%40mail.gmail.com
This reverts commit 54fb8c7, as per the issues reported by fairywren
when it comes to MinGW because of the lack of microsoft_native_stat()
there. Using just stat() for MSVC is not sufficient to take care of the
concurrency problems with files pending on deletion. It may be possible
to paint some __MINGW64__ in the code to switch to a different
implementation of stat() in this build context, but I am not sure either
if relying on the implementation of stat() in MinGW to take care of the
problems we are trying to fix is enough or not. So this needs more
study.
Discussion: https://postgr.es/m/YOvOlfRrIO0yGtgw@paquier.xyz
Backpatch-through: 14
The code introduced by bed9075 to enhance the stat() implementation on
Windows for file sizes larger than 4GB fails to properly detect files
pending for deletion with its method based on NtQueryInformationFile()
or GetFileInformationByHandleEx(), as proved by Alexander Lakhin in a
custom TAP test of his own.
The method used in the implementation of open() to sleep and loop when
when failing on ERROR_ACCESS_DENIED (EACCES) is showing much more
stability, so switch to this method. This could still lead to issues if
the permission problem stays around for much longer than the timeout of
1 second used, but that should (hopefully) never happen in
performance-critical paths. Still, there could be a point in increasing
the timeouts for the sake of machines that handle heavy loads.
Note that WIN32's open() now uses microsoft_native_stat() as it should
be similar to stat() when working around issues with concurrent file
deletions.
I have spent some time testing this patch with pgbench in combination
of the SQL functions from genfile.c, as well as running the TAP test
provided on the thread with MSVC builds, and this looks much more
stable than the previous method.
Author: Alexander Lakhin
Reviewed-by: Tom Lane, Michael Paquier, Justin Pryzby
Discussion: https://postgr.es/m/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
Backpatch-through: 14
Although we were careful to lock the object being added or dropped,
we failed to get any sort of lock on the extension itself. This
allowed the ALTER to proceed in parallel with a DROP EXTENSION,
which is problematic for a couple of reasons. If both commands
succeeded we'd be left with a dangling link in pg_depend, which
would cause problems later. Also, if the ALTER failed for some
reason, it might try to print the extension's name, and that could
result in a crash or (in older branches) a silly error message
complaining about extension "(null)".
Per bug #17098 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org
Commit 547f04e changed pgbench to use microsecond accounting, but
introduced a couple of logging and aggregation bugs:
1. We print Unix epoch timestamps so that you can correlate them with
other logs, but these were inadvertently changed to use a
system-dependent reference epoch. Compute Unix timestamps, and begin
aggregation intervals on the boundaries of whole Unix epoch seconds, as
before.
2. The user-supplied aggregation interval needed to be scaled.
Back-patch to 14.
Author: Fabien COELHO <coelho@cri.ensmp.fr>
Author: Yugo NAGATA <nagata@sraoss.co.jp>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: YoungHwan Joo <rulyox@gmail.com>
Reported-by: Gregory Smith <gregsmithpgsql@gmail.com>
Discussion: https://postgr.es/m/CAF7igB1r6wRfSCEAB-iZBKxkowWY6%2BdFF2jObSdd9%2BiVK%2BvHJg%40mail.gmail.com
Discussion: https://postgr.es/m/CAHLJuCW_8Vpcr0%3Dt6O_gozrg3wXXWXZXDioYJd3NhvKriqgpfQ%40mail.gmail.com
If an error occurred in the wrong place, it was possible to leave an
unintialized entry in the hash table, leading to a crash. Fixed.
Also, be more careful about the order of operations so that an
allocation error doesn't leak memory in CacheMemoryContext or
unnecessarily advance NextRecordTypmod.
Backpatch through version 11. Earlier versions (prior to 35ea75632a)
do not exhibit the problem, because an uninitialized hash entry
contains a valid empty list.
Author: Sait Talha Nisanci <Sait.Nisanci@microsoft.com>
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/HE1PR8303MB009069D476225B9A9E194B8891779@HE1PR8303MB0090.EURPRD83.prod.outlook.com
Backpatch-through: 11
Sigh ... I was expecting AC_CHECK_LIB to do something it didn't,
namely update LIBS. This led to not finding ldap_initialize.
Fix by moving the probe for ldap_initialize. In some sense this
is more correct anyway, since (at least for now) we care about
whether ldap_initialize exists in libldap not libldap_r.
Per buildfarm member elver and local testing.
Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org
This fixes an overflow error when using the numeric * operator if the
result has more than 16383 digits after the decimal point by rounding
the result. Overflow errors should only occur if the result has too
many digits *before* the decimal point.
Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com
I incorrectly diagnosed the reason why hoverfly is unhappy.
Looking closer, it appears that it fails to link libldap
unless libssl is also present; so the problem was my
idea of clearing LIBS before making the check. Revert
to essentially the original coding, except that instead
of failing when libldap_r isn't there, use libldap.
Per buildfarm member hoverfly.
Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org
When sending queries in pipeline mode, we were careless about leaving
the connection in the right state so that PQgetResult would behave
correctly; trying to read further results after sending a query after
having read a result with an error would sometimes hang. Fix by
ensuring internal libpq state is changed properly. All the state
changes were being done by the callers of pqAppendCmdQueueEntry(); it
would have become too repetitious to have this logic in each of them, so
instead put it all in that function and relieve callers of the
responsibility.
Add a test to verify this case. Without the code fix, this new test
hangs sometimes.
Also, document that PQisBusy() would return false when no queries are
pending result. This is not intuitively obvious, and NULL would be
obtained by calling PQgetResult() at that point, which is confusing.
Wording by Boris Kolpackov.
In passing, fix bogus use of "false" to mean "0", per Ranier Vilela.
Backpatch to 14.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Boris Kolpackov <boris@codesynthesis.com>
Discussion: https://postgr.es/m/boris.20210624103805@codesynthesis.com
In commit d0a02bdb8, I'd supposed that uniformly probing for
ldap_bind would make the intent clearer. However, that seems
not to work on AIX, for obscure reasons (maybe it's a macro
there?). Revert to the former behavior of probing
ldap_simple_bind for thread-safe cases and ldap_bind otherwise.
Per buildfarm member hoverfly.
Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org