As demonstrated by David Johnston, the Memoize cache hit ratio calculation
wasn't quite correct.
This change only affects the estimated hit ratio when the estimated
number of entries to cache is not expected to fit inside the cache.
For example, suppose we expect 2000 distinct cache key values and 10000
calls, but only have enough memory to cache 1000 entries at once. If we
could store all entries, the hit ratio would be 80%, since the first
occurrence of each of the 2000 key values among the 10000 calls must be
a cache miss. Because we can only store 1000 of the 2000 distinct
values at once, that 80% should be halved, giving a final estimate of
40%. Previously, the calculation produced an estimated hit ratio of
30%, which wasn't correct.
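Roughly, the intended estimate works out as follows (illustrative names
only, not the planner's actual variables):

    est_hit_ratio = (ncalls - ndistinct) / ncalls * (est_cache_entries / ndistinct)
                  = (10000 - 2000) / 10000 * (1000 / 2000)
                  = 0.8 * 0.5
                  = 0.4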
Apply to master only so as not to destabilize plans in the back branches.
Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwZEmcNk3YQo2Xj4EDUOdY6qakad31rOD1Vc4q1_s68-Ew@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvrV44LwiF4W_qf_RpbGYWSgp1kF=cZr+kTRRaALUfmXqw@mail.gmail.com
These are set after a \! command or a backtick substitution.
SHELL_ERROR is just "true" for error (nonzero exit status) or "false"
for success, while SHELL_EXIT_CODE records the actual exit status
following standard shell/system(3) conventions.
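For example, in a hypothetical psql session:

    postgres=# \! false
    postgres=# \echo :SHELL_ERROR :SHELL_EXIT_CODE
    true 1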
Corey Huinker, reviewed by Maxim Orlov and myself
Discussion: https://postgr.es/m/CADkLM=cWao2x2f+UDw15W1JkVFr_bsxfstw=NGea7r9m4j-7rQ@mail.gmail.com
The check for the number of roles in the target cluster for an upgrade
selects the existing roles and performs a COUNT(*) over the result. A
result of one is expected, indicating that only the install user is
present in the new cluster. The result was converted with the function
for converting a string containing an Oid into a numeric, which avoids
potential overflow but makes the code less readable since the value
isn't actually an Oid at all.
Discussion: https://postgr.es/m/41AB5F1F-4389-4B25-9668-5C430375836C@yesql.se
This makes it easier to specify values taken directly from WAL file
names.
The option parsing is arranged in the style of option_parse_int() (but
we need to parse unsigned int), to allow future refactoring in the
same manner.
Reviewed-by: Sébastien Lardière <sebastien@lardiere.net>
Discussion: https://www.postgresql.org/message-id/flat/8fef346e-2541-76c3-d768-6536ae052993@lardiere.net
We failed to apply updates and deletes when REPLICA IDENTITY FULL was
used for a table with dropped columns, because we did not ignore dropped
columns while comparing tuples from the publisher and subscriber during
apply of updates and deletes.
Author: Onder Kalaci, Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
With unlucky timing and parallel_leader_participation=off (not the
default), PHJ could attempt to access per-batch shared state just as it
was being freed. There was code intended to prevent that by checking
for a cleared pointer, but it was racy. Fix, by introducing an extra
barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to
access the per-batch state to find a batch to help with, and
PHJ_BUILD_DONE means that it is too late. The last to detach will free
the array of per-batch state as before, but now it will also atomically
advance the phase, so that late attachers can avoid the hazard. This
mirrors the way per-batch hash tables are freed (see phases
PHJ_BATCH_PROBING and PHJ_BATCH_DONE).
An earlier attempt to fix this (commit 3b8981b6, later reverted) missed
one special case. When the inner side is empty (the "empty inner
optimization"), the build barrier would only make it to
PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from
the hashtable. In that case, fast-forward the build barrier to
PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold
and we can still negotiate who is cleaning up.
Revealed by build farm failures, where BarrierAttach() failed a sanity
check assertion, because the memory had been clobbered by dsa_free().
In non-assert builds, the result could be a segmentation fault.
Back-patch to all supported releases.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: David Geier <geidav.pg@gmail.com>
Tested-by: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
Counting writes only for io_context = 'normal' is unreliable, as backends
using a buffer access strategy could flush all of the dirty buffers out from
under the other backends and checkpointer. Change the test to count writes in
any context. This achieves roughly the same coverage anyway.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/ZAnWU8WbXEDjrfUE%40telsasoft.com
@extschema:name@ extends the existing @extschema@ feature so that
we can also insert the schema name of some required extension,
thus making cross-extension references robust even if they are in
different schemas.
However, this has the same hazard as @extschema@: if the schema
name is embedded literally in an installed object, rather than being
looked up once during extension script execution, then it's no longer
safe to relocate the other extension to another schema. To deal with
that without restricting things unnecessarily, add a "no_relocate"
option to extension control files. This allows an extension to
specify that it cannot handle relocation of some of its required
extensions, even if in themselves those extensions are relocatable.
We detect "no_relocate" requests of dependent extensions during
ALTER EXTENSION SET SCHEMA.
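As a rough sketch (extension and object names here are illustrative), a
dependent extension might use this like so:

    # control file of the dependent extension
    requires = 'hstore'
    no_relocate = 'hstore'

    -- in the dependent extension's script file
    CREATE VIEW my_view AS
      SELECT @extschema:hstore@.hstore('key', 'value') AS h;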
Regina Obe, reviewed by Sandro Santilli and myself
Discussion: https://postgr.es/m/003001d8f4ae$402282c0$c0678840$@pcorp.us
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes (such as BRIN),
which hold no references to individual tuples that would need to be
cleaned up.
A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.
Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.
Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
When calculating distance in brin_minmax_multi_distance_inet(), the
netmask was applied incorrectly. This resulted in (seemingly) incorrect
ordering of values, triggering an assert.
For builds without asserts this is mostly harmless - we may merge other
ranges, possibly resulting in a slightly less efficient index. But it's
still correct and the greedy algorithm doesn't guarantee optimality
anyway.
Backpatch to 14, where minmax-multi indexes were introduced.
Reported by Dmitry Dolgov, investigation and fix by me.
Reported-by: Dmitry Dolgov
Backpatch-through: 14
Discussion: https://postgr.es/m/17774-c6f3e36dd4471e67@postgresql.org
The Memoize executor node stores the cache key values along with the
tuple(s) found in the outer node which match each key value. However,
when the planner tried to estimate how many entries could be stored in
the cache, it didn't take into account that the cache key must also be
stored. In many cases, this won't make a large difference as the key is
likely small in comparison to the tuple(s) being stored; however, it's
not impossible to craft cases where the key could take more memory than
the tuple(s) stored for it.
Here we adjust the planner so it takes into account the estimated amount
of memory to store the cache key. Effectively, this change will reduce
the estimated cache hit ratio when it's thought that not all items will
fit in the cache, thus Memoize will become more expensive in such cases.
The executor already takes into account the memory consumed by the cache
key, so here we only need to adjust the planner.
Discussion: https://postgr.es/m/CAApHDvqGErGuyBfQvBQrTCHDbzLTqoiW=_G9sOzeFxWEc_7auA@mail.gmail.com
When probing the Memoize cache to check if the current cache key values
exist in the cache, we perform an evaluation of the expressions making up
the cache key before probing the hash table for those values. This
operation could leak memory as it is possible that the cache key is an
expression which requires allocation of memory, as was the case in bug
17844.
Here we fix this by correctly switching to the per tuple context before
evaluating the cache expressions so that the memory is freed next time the
per tuple context is reset.
Bug: 17844
Reported-by: Alexey Ermakov
Discussion: https://postgr.es/m/17844-d2f6f9e75a622bed@postgresql.org
Backpatch-through: 14, where Memoize was introduced
nodeRead() will have created a Node struct that's only allocated big
enough for the specific node type, so copying sizeof(union ValUnion)
can be copying too much. This provokes valgrind complaints, and with
very bad luck could perhaps result in SIGSEGV.
While at it, tidy up _equalA_Const to avoid duplicate checks of isnull.
Per report from Alexander Lakhin. This code is new as of a6bc33019,
so no need to back-patch.
Discussion: https://postgr.es/m/4995256b-cc65-170e-0b22-60ad2cd535f1@gmail.com
Add versions of timestamptz + interval, timestamptz - interval, and
generate_series(timestamptz, ...) in which a timezone can be specified
explicitly instead of defaulting to the TimeZone GUC setting.
The new functions for the first two are named date_add and
date_subtract. This might seem too generic, but we could use
overloading to add additional variants if that seems useful.
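For example (illustrative queries, not taken from the regression tests):

    SELECT date_add('2021-10-31 00:00:00+02'::timestamptz,
                    '1 day'::interval, 'Europe/Warsaw');
    SELECT * FROM generate_series('2021-12-31 23:00:00+00'::timestamptz,
                                  '2022-01-02 00:00:00+00'::timestamptz,
                                  '1 day'::interval, 'Europe/Warsaw');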
Along the way, improve the docs' pretty inadequate explanation
of how timestamptz +- interval works.
Przemysław Sztoch and Gurjeet Singh; cosmetic changes and most of
the docs work by me
Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
We already had five copies of essentially the same logic, and an
upcoming patch introduces yet another use-case. That's past my
threshold of pain, so introduce a common subroutine. There's not
that much net code savings, but the chance of typos should go down.
Inspired by a patch from Przemysław Sztoch, but different in detail.
Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
Check whether the datctype is C to determine whether t_isspace() and
related functions use isspace() or iswspace().
Previously, t_isspace() checked whether the database default collation
was C, which is incorrect when the default collation uses the ICU
provider.
Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com
Reviewed-by: Peter Eisentraut
Backpatch-through: 15
Instead of trying to optimize this by skipping creation of the
links for tables we don't plan to dump, just create them all in
bulk with a single scan over the pg_inherits data. The previous
approach was more or less O(N^2) in the number of pg_inherits
entries, not to mention being way too complicated.
Also, don't create useless TableAttachInfo objects.
It's silly to create a TableAttachInfo object that we're not
going to dump, when we know perfectly well at creation time
that it won't be dumped.
Patch by me; thanks to Julien Rouhaud for review.
Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
Hash partitioning on an enum is problematic because the hash codes are
derived from the OIDs assigned to the enum values, which will almost
certainly be different after a dump-and-reload than they were before.
This means that some rows probably end up in different partitions than
before, causing restore to fail because of partition constraint
violations. (pg_upgrade dodges this problem by using hacks to force
the enum values to keep the same OIDs, but that's not possible nor
desirable for pg_dump.)
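For example, a dump of a schema like this (illustrative only) is what
gets the new treatment:

    CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
    CREATE TABLE t (m mood) PARTITION BY HASH (m);
    CREATE TABLE t_p0 PARTITION OF t FOR VALUES WITH (MODULUS 2, REMAINDER 0);
    CREATE TABLE t_p1 PARTITION OF t FOR VALUES WITH (MODULUS 2, REMAINDER 1);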
Users can work around that by specifying --load-via-partition-root,
but since that's a dump-time not restore-time decision, one might
find out the need for it far too late. Instead, teach pg_dump to
apply that option automatically when dealing with a partitioned
table that has hash-on-enum partitioning.
Also deal with a pre-existing issue for --load-via-partition-root
mode: in a parallel restore, we try to TRUNCATE target tables just
before loading them, in order to enable some backend optimizations.
This is bad when using --load-via-partition-root because (a) we're
likely to suffer deadlocks from restore jobs trying to restore rows
into other partitions than they came from, and (b) if we miss getting
a deadlock we might still lose data due to a TRUNCATE removing rows
from some already-completed restore job.
The fix for this is conceptually simple: just don't TRUNCATE if we're
dealing with a --load-via-partition-root case. The tricky bit is for
pg_restore to identify those cases. In dumps using COPY commands we
can inspect each COPY command to see if it targets the nominal target
table or some ancestor. However, in dumps using INSERT commands it's
pretty impractical to examine the INSERTs in advance. To provide a
solution for that going forward, modify pg_dump to mark TABLE DATA
items that are using --load-via-partition-root with a comment.
(This change also responds to a complaint from Robert Haas that
the dump output for --load-via-partition-root is pretty confusing.)
pg_restore checks for the special comment as well as checking the
COPY command if present. This will fail to identify the combination
of --load-via-partition-root and --inserts in pre-existing dump files,
but that should be a pretty rare case in the field. If it does
happen you will probably get a deadlock failure that you can work
around by not using parallel restore, which is the same as before
this bug fix.
Having done this, there seems no remaining reason for the alarmism
in the pg_dump man page about combining --load-via-partition-root
with parallel restore, so remove that warning.
Patch by me; thanks to Julien Rouhaud for review. Back-patch to
v11 where hash partitioning was introduced.
Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
Mainly move some detail from errmsg to errdetail, remove explicit
mention of superuser where appropriate, since that is implied in most
permission checks, and make messages more uniform.
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230316234701.GA903298@nathanxps13
Since commit 7627b91cd5, libpq has used FD_CLOEXEC so that sockets
wouldn't be leaked to subprograms. With enough bad luck, a
multi-threaded program might fork in between the socket() and fcntl()
calls. We can close that tiny gap by using SOCK_CLOEXEC instead of a
separate call. While here, we might as well do the same for
SOCK_NONBLOCK, to save another syscall.
These flags are expected to appear in the next revision of the POSIX
standard, specifically to address this problem. Our Unixoid targets
except macOS and AIX have had them for a long time, and macOS would
hopefully use guarded availability to roll them out, so it seems enough
to use a simple ifdef test for availability until we hear otherwise.
Windows doesn't have them, but has non-inheritable sockets by default.
Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
Unfortunately it turns out that the logfile-only option added in b9f8d1cbad
is only available in openldap starting in 2.6.
Luckily the option to control the log level (loglevel/-s) has been around for
much longer. As it turns out, loglevel/-s only controls what goes into syslog,
not what ends up in the file specified with 'logfile' and stderr.
While we currently are specifying 'logfile', nothing ends up in it, as the
option only controls debug messages, and we didn't set a debug level. The
debug level can only be configured on the commandline and also prevents
forking. That'd require larger changes, so this commit doesn't tackle that
issue.
Specify the syslog level when starting slapd using -s, as that makes it
possible to prevent all syslog messages by using '0' instead of 'none', while
loglevel doesn't prevent the first message.
Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
Currently, there are quite a few places in reorderbuffer.c that try to
access the top-transaction for a subtransaction. This change makes the code
that accesses the top-transaction consistent and easier to follow.
Author: Peter Smith
Reviewed-by: Vignesh C, Sawada Masahiko
Discussion: https://postgr.es/m/CAHut+PuCznOyTqBQwjRUu-ibG-=KHyCv-0FTcWQtZUdR88umfg@mail.gmail.com
The logic added in 9d9c02ccd to determine when a qual can be used as a
WindowClause run condition failed to correctly check for subqueries in the
qual. This was being done correctly for normal subquery qual pushdowns,
it's just that 9d9c02ccd failed to follow the lead on that.
This also fixes various other cases where transforming the qual into a
WindowClause run condition in the subquery should have been disallowed.
Bug: #17826
Reported-by: Anban Company
Discussion: https://postgr.es/m/17826-7d8750952f19a5f5@postgresql.org
Backpatch-through: 15, where 9d9c02ccd was introduced.
Until now the tests using slapd spammed syslog for every connection /
query. Use logfile-only to prevent syslog activity. Unfortunately that only
takes effect after logging the first message, but that's still much better
than the prior situation.
Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
Support for SCM credential authentication was removed from the backend
in 9.1, but libpq has kept some code to handle it for compatibility.
Commit be4585b, which cleaned up the backend code, did so because the
code was not really portable originally. And, as there is little chance
that this is still used these days, this removes the remaining code
from libpq. libpq will now raise an error if it attempts to connect to
a server that returns AUTH_REQ_SCM_CREDS.
References to SCM credential authentication are removed from the
protocol documentation. This removes some meson and configure checks.
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZBLH8a4otfqgd6Kn@paquier.xyz
Further to commit 6a9229da, checking for NULL is now redundant. An "out
of memory" error would have been thrown already by palloc() and treated
as FATAL, so we can delete a few more lines.
Back-patch to all releases, like those other commits.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us
create_append_path() would only apply get_baserel_parampathinfo
when the path is for a partitioned table, but it's also potentially
useful for paths for UNION ALL appendrels. Specifically, that
supports building a Memoize path atop this one.
While we're in the vicinity, delete some dead code in
create_merge_append_plan(): there's no need for it to support
parameterized MergeAppend paths, and it doesn't look like that
is going to change anytime soon. It'll be easy enough to undo
this when/if it becomes useful.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_ABSu4PWG2rE1q10tJugEXHWgru3U8dAgkoFvgrb6aEA@mail.gmail.com
gcc 12+ has complaints like the following:
../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
1893 | pdst[nb] = ~pip[nb];
| ~~~~~~~~~^~~~~~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
27 | unsigned char ipaddr[16]; /* up to 128 bits of address */
| ^~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
This is due to a compiler bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986
It has been a year since the bug was reported, without it getting fixed. As
the warnings are verbose and use of gcc 12 is becoming more common, it seems
worth working around the bug. Particularly because a simple reformulation of
the loop condition fixes the issue and isn't any less readable.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us
Backpatch: 11-
A comment was left behind claiming that we needed to use malloc() rather
than palloc() because the corresponding free would run in another
thread, but that's not true anymore. Remove that comment. And, with
the reason being gone, we might as well actually use palloc().
Back-patch to supported releases, like d41a178b.
Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com
DecodeDateTime and DecodeTimeOnly had support for date input in the
style "Y2023M03D16", which the comments claimed to be an "ISO" format.
However, so far as I can find there is no such format in ISO 8601;
they write units before numbers in intervals, but not in datetimes.
Furthermore, the lesser-known ISO 8601-2 spec actually defines an
incompatible format "2023Y03M16D". None of our documentation mentions
such a format either. So let's just drop it.
That leaves us with only two cases for a prefix unit specifier in
datetimes: Julian dates written as Jnnnn, and the "T" separator
defined by ISO 8601. Add checks to catch misuse of these specifiers,
that is consecutive specifiers or a dangling specifier at the end of
the string. We do not however disallow a specifier that is separated
from the field that it disambiguates (by noise words or unrelated
fields). That being the case, remove some overly-aggressive error
checks from the ISOTIME cases.
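For reference, the two remaining prefix forms look like this
(illustrative queries):

    SELECT 'J2451187'::date;                   -- Julian date written as Jnnnn
    SELECT '2023-03-16T20:38:40'::timestamptz; -- ISO 8601 "T" separator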
Joseph Koshakow, editorialized a bit by me; thanks also to
Peter Eisentraut for some standards-reading.
Discussion: https://postgr.es/m/CAAvxfHf2Q1gKLiHGnuPOiyf0ASvKUM4BnMfsXuwgtYEb_Gx0Zw@mail.gmail.com
.../src/common/file_utils.c: In function ‘pg_pwrite_zeros’:
.../src/common/file_utils.c:543:9: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
543 | const static PGAlignedBlock zbuffer = {{0}}; /* worth BLCKSZ */
The user receiving the message might not understand where the
server's "current directory" is. "Data directory" seems clearer.
(This would not be good for frontend code, but both of these
messages are only issued in the backend.)
Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230316.111646.1564684434328830712.horikyota.ntt@gmail.com
In the .pc (pkg-config) files generated by the make and meson builds,
the Requires.private entries use different delimiters. The make build
uses spaces, the meson build uses commas. The pkg-config documentation
says that it should be comma-separated, but apparently about half the
.pc in the wild use just spaces. The pkg-config source code
acknowledges that both commas and spaces work.
This changes the make build to use commas, for consistency.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/1fb52d61-0964-2d8e-87d9-e8be830e2b24%40enterprisedb.com
This structure included only PgStat_FunctionCounts, and removing it
facilitates some upcoming refactoring for pgstatfuncs.c to use more
macros rather than mostly-duplicated functions.
Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/11d531fe-52fc-c6ea-7e8e-62f1b6ec626e@gmail.com
A failure in parsing the interval value given to the \watch command
was silently turned into a 1s interval between two queries, which can
be confusing. This commit improves the error handling, and a couple of
tests are added to check the behavior with:
- An incorrect value.
- An out-of-range value.
- A negative value.
A value of zero now works, meaning that there is no pause between two
queries in a \watch loop. No backpatch is done, as this could break
existing applications.
Author: Andrey Borodin
Reviewed-by: Kyotaro Horiguchi, Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
This adds the ability to pretty-print XML documents ... according to
libxml's somewhat idiosyncratic notions of what's pretty, anyway.
One notable divergence from a strict reading of the spec is that
libxml is willing to collapse empty nodes "<node></node>" to just
"<node/>", whereas SQL and the underlying XML spec say that this
option should only result in whitespace tweaks. Nonetheless,
it seems close enough to justify using the SQL-standard syntax.
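For example (illustrative):

    SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>'::xml
                        AS text INDENT);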
Jim Jones, reviewed by Peter Smith and myself
Discussion: https://postgr.es/m/2f5df461-dad8-6d7d-4568-08e10608a69b@uni-muenster.de
The hook can be installed by a shared_preload library.
A similar mechanism could be used for RADIUS passwords, for example, and
the type name auth_password_hook_typ has been chosen with that in mind.
John Naylor and Andrew Dunstan
Discussion: https://postgr.es/m/469b06ed-69de-ba59-c13a-91d2372e52a9@dunslane.net
preprocess_targetlist thought PHVs couldn't appear here.
It was mistaken, as per report from Önder Kalacı.
Surveying other pull_var_clause calls, I noted no similar errors,
but I did notice that qual_is_pushdown_safe's assertion about
!contain_window_function was pointless, because the following
pull_var_clause call would complain about them anyway. In HEAD
only, remove the redundant Assert and improve the commentary.
Discussion: https://postgr.es/m/CACawEhUuum-gC_2S3sXLTcsk7bUSPSHOD+g1ZpfKaDK-KKPPWA@mail.gmail.com
The previous coding based on select() had commentary about historical
portability concerns. Use POSIX nanosleep() instead.
This has independently been suggested a couple of times before, but
never managed to stick. Since recent and proposed work removes other
uses of select(), and associated code and comments relating to its
non-portable interaction with signals, it seems like a good time to tidy
up this case, too.
Also modernize the explanation of why WaitLatch() is a better way to
wait.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Suggested-by: Paul Guo <paulguo@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
Discussion: https://postgr.es/m/CABQrizfxpBLZT5mZeE0js5oCh1tqEWvcGF3vMRCv5P-RwUY5dQ@mail.gmail.com
Discussion: https://postgr.es/m/4902.1552349020@sss.pgh.pa.us
There are still some systems that use traditional tick-based sleep
timing, but many including Linux, FreeBSD and macOS started using high
resolution timer hardware more directly a decade or two ago. Update our
comment about that. Also highlight that Windows is like the older
Unixen in that respect.
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BogAon8_V223Ldv6taPR2uKH3X_UJ_A7LJAf3-VRARPA%40mail.gmail.com
The current implementation of _pgfstat64() is ineffective in detecting a
terminal handle or an anonymous named pipe. This commit improves our
port of fstat() to detect such cases more efficiently by relying on
GetFileType(), and to return more correct data when the type found is
either a FILE_TYPE_PIPE (_S_IFIFO) or a FILE_TYPE_CHAR (_S_IFCHR).
This is part of a more global fix to address failures when feeding the
output generated by pg_dump to pg_restore through a pipe, for example,
but not all of it. We are also going to need to do something about
fseek() and ftello() which are not reliable on WIN32 for the same cases
where fstat() was incorrect. Fixing fstat() is independent of the rest,
though, which is why both fixes are handled separately, and this is the
first part of it.
Reported-by: Daniel Watzinger
Author: Daniel Watzinger, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/b1448cd7-871e-20e3-8398-895e2d1d3bf9@gmail.com
Backpatch-through: 14
Using REPLICA IDENTITY FULL on the publisher can lead to a full table scan
per tuple change on the subscription when REPLICA IDENTITY or PK index is
not available. This makes REPLICA IDENTITY FULL impractical to use apart
from some small number of use cases.
This patch allows using indexes other than PRIMARY KEY or REPLICA
IDENTITY on the subscriber during apply of update/delete. The index that
can be used must be a btree index, not a partial index, and it must have
at least one column reference (i.e. cannot consist of only expressions).
We can lift these restrictions in the future. There is no smart
mechanism to pick the index. If there is more than one index that
satisfies these requirements, we just pick the first one. We discussed
using some of the optimizer's low-level APIs for this but ruled it out
as that can be a maintenance burden in the long run.
This patch improves the performance in the vast majority of cases and the
improvement is proportional to the amount of data in the table. However,
there could be some regression in a small number of cases where the indexes
have a lot of duplicate and dead rows. It was discussed that those are
mostly impractical cases but we can provide a table or subscription level
option to disable this feature if required.
Author: Onder Kalaci, Amit Kapila
Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.
Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12. WaitLatch() works in
whole milliseconds.
For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death. That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again. (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)
Back-patch to 14, where commit 4753ef37 arrived.
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid(). The postmaster could end up
tracking more than one child process with the same PID, and confuse
them.
Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously. The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.
This seems to explain a couple of failures on CI. It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.
Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.
Back-patch to all supported branches.
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de
The band-aid applied in commit f0bedf3e4 turns out to still need
some work: it made sure we didn't set Np->last_relevant too small
(to the left of the decimal point), but it didn't prevent setting
it too large (off the end of the partially-converted string).
This could result in fetching data beyond the end of the allocated
space, which with very bad luck could cause a SIGSEGV, though
I don't see any hazard of interesting memory disclosure.
Per bug #17839 from Thiago Nunes. The bug's pretty ancient,
so back-patch to all supported versions.
Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org
This patch adds new pg_dump switches
--table-and-children=pattern
--exclude-table-and-children=pattern
--exclude-table-data-and-children=pattern
which act the same as the existing --table, --exclude-table, and
--exclude-table-data switches, except that any partitions or
inheritance child tables of the table(s) matching the pattern
are also included or excluded.
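For example (hypothetical invocation):

    pg_dump --table-and-children='measurement*' mydb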
Gilles Darold, reviewed by Stéphane Tachoires
Discussion: https://postgr.es/m/5aa393b5-5f67-8447-b83e-544516990ee2@migops.com
Scanning the expression for compatible Vars isn't really necessary,
because the subsequent match against StatisticExtInfo entries will
eliminate expressions containing other Vars just fine. Moreover,
this code hadn't stopped to think about what to do with
PlaceHolderVars or Aggrefs in the clause; and at least for the PHV
case, that demonstrably leads to failures. Rather than work out
whether it's reasonable to ignore those, let's just remove the
whole stanza.
Per report from Richard Guo. Back-patch to v14 where this code
was added.
Discussion: https://postgr.es/m/CAMbWs48Mmvm-acGevXuwpB=g5JMqVSL6i9z5UaJyLGJqa-XPAA@mail.gmail.com
Expose the standard error functions as SQL-callable functions. These
are expected to be useful to people working with normal distributions,
and we use them here to test the distribution from random_normal().
Since these functions are defined in the POSIX and C99 standards, they
should in theory be available on all supported platforms. If that
turns out not to be the case, more work will be needed.
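For example (illustrative):

    SELECT erf(1.0), erfc(1.0);
    -- approximately 0.8427007929497149 and 0.15729920705028513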
On all platforms tested so far, using extra_float_digits = -1 in the
regression tests is sufficient to allow for variations between
implementations. However, past experience has shown that there are
almost certainly going to be additional unexpected portability issues,
so these tests may well need further adjustments, based on the
buildfarm results.
Dean Rasheed, reviewed by Nathan Bossart and Thomas Munro.
Discussion: https://postgr.es/m/CAEZATCXv5fi7+Vu-POiyai+ucF95+YMcCMafxV+eZuN1B-=MkQ@mail.gmail.com
The new connection parameter require_auth allows a libpq client to
define a list of comma-separated acceptable authentication types for use
with the server. There is no negotiation: if the server does not
present one of the allowed authentication requests, the connection
attempt done by the client fails.
The following keywords can be defined in the list:
- password, for AUTH_REQ_PASSWORD.
- md5, for AUTH_REQ_MD5.
- gss, for AUTH_REQ_GSS[_CONT].
- sspi, for AUTH_REQ_SSPI and AUTH_REQ_GSS_CONT.
- scram-sha-256, for AUTH_REQ_SASL[_CONT|_FIN].
- creds, for AUTH_REQ_SCM_CREDS (perhaps this should be removed entirely
now).
- none, to control unauthenticated connections.
All the methods that can be defined in the list can be negated, like
"!password", in which case the server must NOT use the listed
authentication type. The special method "none" allows/disallows the use
of unauthenticated connections (but it does not govern transport-level
authentication via TLS or GSSAPI).
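For example (hypothetical connection strings), a client can insist on
SCRAM with

    host=example.com dbname=postgres require_auth=scram-sha-256

or refuse plain and MD5 password exchanges with

    host=example.com dbname=postgres require_auth=!password,!md5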
Internally, the patch logic is tied to check_expected_areq(), that was
used for channel_binding, ensuring that an incoming request is
compatible with conn->require_auth. It also introduces a new flag,
conn->client_finished_auth, which is set by various authentication
routines when the client side of the handshake is finished. This
signals to check_expected_areq() that an AUTH_REQ_OK from the server is
expected, and allows the client to complain if the server bypasses
authentication entirely, with for example the reception of a too-early
AUTH_REQ_OK message.
Regression tests are added in authentication TAP tests for all the
keywords supported (except "creds", because it is around only for
compatibility reasons). A new TAP script has been added for SSPI, as
there was no script dedicated to it yet. It relies on SSPI being the
default authentication method on Windows, as set by pg_regress.
Author: Jacob Champion
Reviewed-by: Peter Eisentraut, David G. Johnston, Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
The majority of error exit cases in json_lex_string() failed to
set lex->token_terminator, causing problems for the error context
reporting code: it would see token_terminator less than token_start
and do something more or less nuts. In v14 and up the end result
could be as bad as a crash in report_json_context(). Older
versions accidentally avoided that fate; but all versions produce
error context lines that are far less useful than intended,
because they'd stop at the end of the prior token instead of
continuing to where the actually-bad input is.
To fix, invent some macros that make it less notationally painful
to do the right thing. Also add documentation about what the
function is actually required to do; and in >= v14, add an assertion
in report_json_context about token_terminator being sufficiently
far advanced.
Per report from Nikolay Shaplov. Back-patch to all supported
versions.
Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
check_agg_arguments_walker() supposed that it needn't descend into
the arguments of a lower-level aggregate function, but this is
just wrong in the presence of multiple levels of sub-select. The
oversight would lead to executor failures on queries that should
be rejected. (Prior to v11, they actually were rejected, thanks
to a "redundant" execution-time check.)
Per bug #17835 from Anban Company. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17835-4f29f3098b2d0ba4@postgresql.org
This allows for a string which, if an input field matches it, causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows while non-default data is
supplied for others.
The file_fdw extension is also modified to allow use of this
option.
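A minimal sketch of the new option (table and marker string are
illustrative):

    CREATE TABLE t (id int, created timestamptz DEFAULT now());
    COPY t FROM STDIN WITH (FORMAT csv, DEFAULT '@D@');
    1,2023-01-01 00:00:00+00
    2,@D@
    \.

Here the second row gets now() for its created column.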
Israel Barth Rubio
Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
This ensures that the row count in the command tag for a MERGE is
correctly computed in the case where UPDATEs or DELETEs are skipped
due to a BEFORE ROW trigger returning NULL (the INSERT case was
already handled correctly by ExecMergeNotMatched() calling
ExecInsert()).
Back-patch to v15, where MERGE was introduced.
Discussion: https://postgr.es/m/CAEZATCU8XEmR0JWKDtyb7iZ%3DqCffxS9uyJt0iOZ4TV4RT%2Bow1w%40mail.gmail.com
If MERGE attempts an UPDATE or DELETE on a table with BEFORE ROW
triggers, or a cross-partition UPDATE (with or without triggers), and
a concurrent UPDATE or DELETE happens, the merge code would fail.
In some cases this would lead to a crash, while in others it would
cause the wrong merge action to be executed, or no action at all. The
immediate cause of the crash was the trigger code calling
ExecGetUpdateNewTuple() as part of the EPQ mechanism, which fails
because during a merge ri_projectNew is NULL, since merge has its own
per-action projection information, which ExecGetUpdateNewTuple() knows
nothing about.
Fix by arranging for the trigger code to exit early, returning the
TM_Result and TM_FailureData information, if a concurrent modification
is detected, allowing the merge code to do the necessary EPQ handling
in its own way. Similarly, prevent the cross-partition update code
from doing any EPQ processing for a merge, allowing the merge code to
work out what it needs to do.
This leads to a number of simplifications in nodeModifyTable.c. Most
notably, the ModifyTableContext->GetUpdateNewTuple() callback is no
longer needed, and mergeGetUpdateNewTuple() can be deleted, since
there is no longer any requirement for get-update-new-tuple during a
merge. Similarly, ModifyTableContext->cpUpdateRetrySlot is no longer
needed. Thus ExecGetUpdateNewTuple() and the retry_slot handling of
ExecCrossPartitionUpdate() can be restored to how they were in v14,
before the merge code was added, and ExecMergeMatched() no longer
needs any special-case handling for cross-partition updates.
While at it, tidy up ExecUpdateEpilogue() a bit, making it handle
recheckIndexes locally, rather than passing it in as a parameter,
ensuring that it is freed properly. This dates back to when it was
split off from ExecUpdate() to support merge.
Per bug #17809 from Alexander Lakhin, and follow-up investigation of
bug #17792, also from Alexander Lakhin.
Back-patch to v15, where MERGE was introduced, taking care to preserve
backwards-compatibility of the trigger API in v15 for any extensions
that might use it.
Discussion: https://postgr.es/m/17809-9e6650bef133f0fe%40postgresql.org
Discussion: https://postgr.es/m/17792-0f89452029662c36%40postgresql.org
Most of these calls were to generate some random data. These can be
replaced by appropriately adapted sha256() calls. To keep the diff
smaller, we wrap this into a helper function that produces the same
output format and length as the md5() call.
This will eventually allow these tests to pass in OpenSSL FIPS mode
(which does not allow MD5 use).
Similar work for other test suites will follow later.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/dbbd927f-ef1f-c9a1-4ec6-c759778ac852@enterprisedb.com
The recently added standard collation UNICODE (0d21d4b9bc) doesn't
give consistent results on some build farm members with old ICU
versions. Apparently, the ICU locale specification 'und' (language
tag style) misbehaves on some older ICU versions. Replacing it with
'' (ICU locale ID style) fixes it at least on some OS versions. Let's
see what the build farm says.
The error cases for TLS and GSS encryption were inconsistent. After TLS
fails, the connection is marked as dead and follow-up calls of
PQconnectPoll() would return immediately, but GSS encryption was not
doing that, so the connection would still have been allowed to enter the
GSS handling code. This was handled incorrectly when gssencmode was set
to "require". "prefer" was working correctly, and this could not happen
under "disable" as GSS encryption would not be attempted.
This commit makes the error handling of GSS encryption on par with the TLS
portion, fixing the case of gssencmode=require.
Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion, Stephen Frost
Discussion: https://postgr.es/m/23787477-5fe1-a161-6d2a-e459f74c4713@timescale.com
Backpatch-through: 12
The 'ssl' option is of type 'combo', but we add a choice 'auto' that
simulates the behavior of a feature option. This way, openssl is used
automatically by default if present, but we retain the ability to
potentially select another ssl library.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ad65ffd1-a9a7-fda1-59c6-f7dc763c3051%40enterprisedb.com
adjust_appendrel_attrs can't transfer nullingrel labeling to a non-Var
translation expression (mainly because it's too late to wrap such an
expression in a PlaceHolderVar). I'd supposed in commit 2489d76c4
that that restriction was unreachable because we'd not attempt to push
problematic clauses down to an appendrel child relation. I forgot that
set_append_rel_size blindly converts all the parent rel's joininfo
clauses to child clauses, and that list could well contain clauses
from above a nulling outer join.
We might eventually have to devise a direct fix for this implementation
restriction, but for now it seems enough to filter out troublesome
clauses while constructing the child's joininfo list. Such clauses
are certainly not useful while constructing paths for the child rel;
they'll have to be applied later when we join the completed appendrel
to something else. So we don't need them here, and omitting them from
the list should save a few cycles while processing the child rel.
Per bug #17832 from Marko Tiikkaja.
Discussion: https://postgr.es/m/17832-d0a8106cdf1b722e@postgresql.org
This was an omission in the original creation of the module.
Also slightly adjust some wording to avoid a double "is".
Backpatch the non-meson piece of this to release 12, where the module
was introduced.
Discussion: https://postgr.es/m/be869e1c-8e3f-4cde-8609-212c899cccf9@dunslane.net
Freezing the relation N times and fetching the tuples one-by-one isn't that
cheap. On my machine this reduces test times by a bit less than one second;
on Windows CI it's a few seconds.
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230309001558.b7shzvio645ebdta@awork3.anarazel.de
64bit xids can't represent xids before epoch 0 (see also be504a3e97). When
FullTransactionIdFromXidAndCtx() was passed such an xid, it'd create a 64bit
xid far into the future. Noticed while adding assertions in the course of
investigating be504a3e97, as amcheck's tests create such xids.
To fix the issue, just return FirstNormalFullTransactionId in this case. A
freshly initdb'd cluster already has a newer horizon. The most minimal version
of this would make the messages for some detected corruptions differently
inaccurate. To make those cases accurate, switch
FullTransactionIdFromXidAndCtx() to use the 32bit modulo difference between
xid and nextxid to compute the 64bit xid, yielding sensible "in the future" /
"in the past" answers.
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 14-, where heapam verification was introduced
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have. However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached. Fix that.
Report and patch by Antonin Houska (comment adjustments and test case
by me). Back-patch to all supported branches.
Discussion: https://postgr.es/m/3472.1675251957@antos
Previously, the default encoding was derived from the locale when
using libc, while the default was always UTF-8 when using ICU. That
would throw an error when the locale was not compatible with UTF-8.
This commit causes initdb to derive the default encoding from the
locale for both providers. If --no-locale is specified (or if the
locale is C or POSIX), the default encoding will be UTF-8 for ICU
(because ICU does not support SQL_ASCII) and SQL_ASCII for libc.
Per buildfarm failure on system "hoverfly" related to commit
27b62377b4.
Discussion: https://postgr.es/m/d191d5841347301a8f1238f609471ddd957fc47e.camel%40j-davis.com
Datetime input formerly accepted combinations such as
'1995-08-06 infinity', but this seems like a clear error.
Reject any combination of regular y/m/d/h/m/s fields with
these special tokens.
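For example (illustrative):

    SELECT 'infinity'::timestamptz;             -- still valid
    SELECT '1995-08-06 infinity'::timestamptz;  -- now rejected with an error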
Joseph Koshakow, reviewed by Keisuke Kuroda and myself
Discussion: https://postgr.es/m/CAAvxfHdm8wwXwG_FFRaJ1nTHiMWb7YXS2YKCzCt8Q0a2ZoMcHg@mail.gmail.com
Previously, pg_upgrade checked that the old and new clusters were
compatible, including the locale and encoding. But the new cluster was
just created, and only template0 from the new cluster will be
preserved (template1 and postgres are both recreated during the
upgrade process).
Because template0 is not sensitive to locale or encoding, just update
the pg_database entry to be the same as template0 from the original
cluster.
This commit makes it easier to change the default initdb locale or
encoding settings without causing needless incompatibilities.
Discussion: https://postgr.es/m/d62b2874-729b-d26a-2d0a-0d64f509eca4@enterprisedb.com
Reviewed-by: Peter Eisentraut
In our Kerberos test suite, there isn't much need to worry about the
normal canonicalization that Kerberos provides by looking up the reverse
DNS for the IP address connected to, and in some cases it can actively
cause problems (eg: a captive portal wifi where the normally not
resolvable localhost address used ends up being resolved anyway, and
not to the domain we are using for testing, causing the entire
regression test to fail with errors about not being able to get a TGT
for the remote realm for cross-realm trust).
Therefore, disable it by adding rdns = false into the krb5.conf that's
generated for the test.
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/Y/QD2zDkDYQA1GQt@tamriel.snowman.net
Change comments for pg_cryptohash_init(), pg_cryptohash_update(),
pg_cryptohash_final() in cryptohash.c to match cryptohash_openssl.c.
In particular, the claim that these functions were "designed" to never
fail was incorrect, since by design callers need to be prepared to
handle failures, for compatibility with the cryptohash_openssl.c
versions.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/301F4EDD-27B9-460F-B462-B9DB2BDE4ACF@yesql.se
A patch sent by Jacob Champion has been touching this area of the code,
and the set of changes done in a9e9a9f has made a run of pgindent on
these files a bit annoying to handle. So let's clean up the area a bit
first, to ease the work on follow-up patches.
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
Commit bdaabb9b started skipping doomed transactions when building the
list of possible conflicts for SERIALIZABLE READ ONLY. That makes
sense, because doomed transactions won't commit, but a couple of subtle
things broke:
1. If all uncommitted r/w transactions are doomed, a READ ONLY
transaction would arbitrarily not benefit from the safe snapshot
optimization. It would not be taken immediately, and yet no other
transaction would set SXACT_FLAG_RO_SAFE later.
2. In the same circumstances but with DEFERRABLE, GetSafeSnapshot()
would correctly exit its wait loop without sleeping and then take the
optimization in non-assert builds, but assert builds would fail a sanity
check that SXACT_FLAG_RO_SAFE had been set by another transaction.
This is similar to the case for PredXact->WritableSxactCount == 0. We
should opt out immediately if our possibleUnsafeConflicts list is empty
after filtering.
The code to maintain the serializable global xmin is moved down below
the new opt out site, because otherwise we'd have to reverse its effects
before returning.
Back-patch to all supported releases. Bug #17368.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Discussion: https://postgr.es/m/20110707212159.GF76634%40csail.mit.edu
396d348b0 omitted adding with_icu to the pg_dump tests under
meson. Conversely, e6927270c exported ZSTD for pg_basebackup's tests, despite
pg_basebackup's ZSTD support not having any tests.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20230226225239.GL1653@telsasoft.com
The changes in b6a0d469ca prevented installation of the test files during a
normal install. However, the buildfarm intentionally tries to run the tests
against a "real" installation. The new install-test-files target provides that
ability.
Because we want to install into a normal directory, I removed the necessary
munging of the target paths from meson.build and moved it into
install-test-files. I also added DESTDIR support, so that installing can
redirect the directory if desired. That's used for the tmp_install/
installation now.
I didn't like the number of arguments necessary for install_test_files, so I
changed it to use
--install target list of files
which makes it easier to use for further directories, if/when we need them.
Discussion: https://postgr.es/m/20230308012940.edexipb3vqylcu6r@awork3.anarazel.de
This exposes the ICU facility to add custom collation rules to a
standard collation.
New options are added to CREATE COLLATION, CREATE DATABASE, createdb,
and initdb to set the rules.
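For example (an illustrative tailoring rule):

    CREATE COLLATION custom (provider = icu, locale = 'und',
                             rules = '&V << w <<< W');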
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com
Reformat some of the comments in MergeAttributes(). A lot of code has
been added here over time, and the comments could use a bit of editing
to make the code flow read better.
One file per line seems best. We already did this in some cases.
This adopts the same format everywhere (except in some cases where the
list reasonably fits on one line).
There was apparently an attempt here to list all the object types that
ACL_USAGE applies to, but it wasn't complete. So instead of trying to
keep up, put in a more timeless comment.
When vacuum_defer_cleanup_age is bigger than the current xid, including the
epoch, the subtraction of vacuum_defer_cleanup_age would lead to a wrapped
around xid. While that normally is not a problem, the subsequent conversion to
a 64bit xid results in a 64bit-xid very far into the future. As that xid is
used as a horizon to detect whether rows versions are old enough to be
removed, that allows removal of rows that are still visible (i.e. corruption).
If vacuum_defer_cleanup_age was never changed from the default, there is no
chance of this bug occurring.
This bug was introduced in dc7420c2c9. A lesser version of it exists in
12-13, introduced by fb5344c969, affecting only GiST.
The 12-13 version of the issue can, in rare cases, lead to pages in a gist
index getting recycled too early, potentially causing index entries to be
found multiple times.
The fix is fairly simple - don't allow vacuum_defer_cleanup_age to retreat
further than FirstNormalTransactionId.
Patches to make similar bugs easier to find, by adding asserts to the 64bit
xid infrastructure, have been proposed, but are not suitable for backpatching.
Currently there are no tests for vacuum_defer_cleanup_age. A patch introducing
infrastructure to make writing a test easier has been posted to the list.
Reported-by: Michail Nikolaev <michail.nikolaev@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 12-, but impact/fix is smaller for 12-13
Previously, all the nodes of CallStmt were included in the jumbling,
causing a duplicate in the computation as the transformed state of the
CALL query was included as well as the parsed state (transformed
FuncCall with all the input arguments and potential output arguments).
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz
We already didn't use touch (some earlier version of the meson build did),
and cp is only used for updating unicode files. The latter already depends on
the optional availability of 'wget', so doing the same for 'cp' makes sense.
Eventually we probably want a portable command for updating source code as
part of a target, but for now...
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/70e96c34-64ee-e549-8c4a-f91a7a668804@dunslane.net
IntoClause.viewQuery is a copy of the parsed-but-not-rewritten SELECT
clause copied to IntoClause when transforming CreateTableAsStmt for a
materialized view. Including a second copy of the SELECT Query into the
query jumbling was leading to an incorrect numbering of the Const node
locations, as these would be counted twice instead of once.
This becomes visible once the query normalization is applied to CREATE
MATERIALIZED VIEW in pg_stat_statements in the shape of a query string
using only odd numbers for the normalized constants, (regression tests
added in pg_stat_statements as of de2aca2 would show the difference).
Including the original Query from CreateTableAsStmt is enough for the
query jumbling.
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz
4211fbd has been handling PROCESS_MAIN in vacuum_rel() with an "if/else
if" structure to avoid an extra level of indentation, but this has been
found to be rather hard to read. This commit updates the code so that
we check for PROCESS_MAIN in a single place and then handle its
subpaths, FULL or non-FULL vacuums. Some comments are added to make
that clearer for the reader.
Reported-by: Melanie Plageman
Author: Nathan Bossart
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/20230306194009.5cn6sp3wjotd36nu@liskov
If a view is defined atop another view, and then CREATE OR REPLACE
VIEW is used to add columns to the lower view, then when the upper
view's referencing RTE is expanded by ApplyRetrieveRule we will have
a subquery RTE with fewer eref->colnames than output columns. This
confuses various code that assumes those lists are always in sync,
as they are in plain parser output.
We have seen such problems before (cf commit d5b760ecb), and now
I think the time has come to do what was speculated about in that
commit: let's make ApplyRetrieveRule synthesize some column names to
preserve the invariant that holds in parser output. Otherwise we'll
be chasing this class of bugs indefinitely. Moreover, it appears from
testing that this actually gives us better results in the test case
d5b760ecb added, and likely in other corner cases that we lack
coverage for.
In HEAD, I replaced d5b760ecb's hack to make expandRTE exit early with
an elog(ERROR) call, since the case is now presumably unreachable.
But it seems like changing that in back branches would bring more risk
than benefit, so there I just updated the comment.
Per bug #17811 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17811-d31686b78f0dffc9@postgresql.org
This routine is able to retrieve the OID of the schema used with an
extension (pg_extension.extnamespace), or InvalidOid if this information
is not available. plpgsql_check embeds a copy of this code when
performing checks on functions, as one out-of-core example.
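For illustration, a rough sketch of how an out-of-core module could use
it (hypothetical wrapper, error handling kept minimal):

    #include "postgres.h"
    #include "commands/extension.h"

    static Oid
    my_extension_namespace(const char *extname)   /* hypothetical */
    {
        Oid         extoid = get_extension_oid(extname, true);

        if (!OidIsValid(extoid))
            return InvalidOid;
        return get_extension_schema(extoid);
    }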
Author: Pavel Stehule
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/CAFj8pRD+9x55hjDoi285jCcjPc8uuY_D+FLn5RpXggdz+4O2sQ@mail.gmail.com
If UPDATE is forced to retry after an EvalPlanQual check, it neglected
to repeat GENERATED-column computations, even though those might well
have changed since we're dealing with a different tuple than before.
Fixing this is mostly a matter of looping back a bit further when
we retry. In v15 and HEAD that's most easily done by altering the API
of ExecUpdateAct so that it includes computing GENERATED expressions.
Also, if an UPDATE in a partitioned table turns into a cross-partition
INSERT operation, we failed to recompute GENERATED columns. That's a
bug since 8bf6ec3ba allowed partitions to have different generation
expressions; although it seems to have no ill effects before that.
Fixing this is messier because we can now have situations where the same
query needs both the UPDATE-aligned set of GENERATED columns and the
INSERT-aligned set, and it's unclear which set will be generated first
(else we could hack things by forcing the INSERT-aligned set to be
generated, which is indeed how fe9e658f4 made it work for MERGE).
The best fix seems to be to build and store separate sets of expressions
for the INSERT and UPDATE cases. That would create ABI issues in the
back branches, but so far it seems we can leave this alone in the back
branches.
Per bug #17823 from Hisahiro Kauchi. The first part of this affects all
branches back to v12 where GENERATED columns were added.
Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
While testing a fix for bug #17823, I discovered that EvalPlanQualStart
failed to copy es_rteperminfos from the parent EState, resulting in
failure if anything in EPQ execution wanted to consult that information.
This led me to conclude that commit a61b1f748 had been too haphazard
about where to fill es_rteperminfos, and that we need to be sure that
that happens exactly where es_range_table gets filled. So I changed the
signature of ExecInitRangeTable to help ensure that this new requirement
doesn't get missed. (Indeed, pgoutput.c was also failing to fill it.
Maybe we don't ever need it there, but I wouldn't bet on that.)
No test case yet; one will arrive with the fix for #17823.
But that needs to be back-patched, while this fix is HEAD-only.
Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
The functions that follow are concerned with various things, of
which the tar format is only one, so this comment doesn't really
seem helpful. The file isn't really divided into sections in the
way that this comment seems to contemplate -- or at least, not
any more.
Patch by me, reviewed by Michael Paquier.
Discussion: http://postgr.es/m/CA+TgmoZ_fFAoU6mrHt9QBs+dcYhN6yXenGTTMRebZNhtwPwHyg@mail.gmail.com
We read blocks of data from files that we're backing up in chunks,
some multiple of BLCKSZ for each read. If checksum verification fails,
we then try rereading just the one block for which validation failed.
If that block happened to be the first block of the chunk, and if
the file was concurrently truncated to remove that block, then we'd
reach a call to bbsink_archive_contents() with a buffer length of 0.
That causes an assertion failure.
As far as I can see, there are no particularly bad consequences if
this happens in a non-assert build, and it's pretty unlikely to happen
in the first place because it requires a series of somewhat unlikely
things to happen in very quick succession. However, assertion failures
are bad, so rearrange the code to avoid that possibility.
Patch by me, reviewed by Michael Paquier.
Discussion: http://postgr.es/m/CA+TgmoZ_fFAoU6mrHt9QBs+dcYhN6yXenGTTMRebZNhtwPwHyg@mail.gmail.com
Add description of which one is the default between the two
complementary options of --bypassrls and --replication in the help text
and docs. Correspondingly, let the command always include the tokens
corresponding to every option of that kind in the SQL command sent to
the server. Tests are updated accordingly.
Also fix the checks of some trivalue vars which were using literal zero
for checking the default value instead of the enum label TRI_DEFAULT.
While not a bug, since TRI_DEFAULT is defined as zero, fixing this
improves readability (and avoids bugs if the enum is changed).
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220810.151243.1073197628358749087.horikyota.ntt@gmail.com
Disabling this option is useful to run VACUUM (with or without FULL) on
only the toast table of a relation, bypassing the main relation. This
option is enabled by default.
Running VACUUM directly on a toast table was already possible without
this feature, by using the non-deterministic name of the toast relation
(of the form pg_toast.pg_toast_N, where N is the OID of the parent
relation) in the VACUUM command, and it required a scan of pg_class to
know the name of the toast table. So this feature is basically a
shortcut to be able to run VACUUM or VACUUM FULL on a toast relation,
using only the name of the parent relation.
A new switch called --no-process-main is added to vacuumdb, to work as
an equivalent of PROCESS_MAIN set to false.
Regression tests are added to cover VACUUM and VACUUM FULL, looking at
pg_stat_all_tables.vacuum_count to see how many vacuums have run on
each table, main or toast.
Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
All the regression tests of VACUUM (PROCESS_TOAST) were only checking if
the commands were able to run, without checking if VACUUM was really
running on what it should. This expands this set of tests so that we now
look at pg_stat_all_tables.vacuum_count to see how many vacuums have
been run on a given table and its toast relation.
Extracted from a larger patch by the same author, as this is useful on
its own.
Special thanks to Álvaro Herrera for the idea of using
pg_stat_all_tables to check the state of the toast relation.
Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
The following changes are made to pg_pwrite_zeros(), the API able to
write series of zeros using vectored I/O:
- Addition of an "offset" parameter, so that the write starts at this
position (the 'p' of "pwrite" seems to mean position, though POSIX does
not outline that directly); hence the name of the routine would be
incorrect if it were not able to handle offsets. (See the sketch after
this list.)
- Avoid memset() of "zbuffer" on every call.
- Avoid initialization of the whole IOV array if not needed.
- Group the trailing write() call with the main write() call,
simplifying the function logic.
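As a sketch of the resulting call shape (assuming the signature exposed
in file_utils.h), pre-extending a file with "len" zero bytes starting at
"offset" looks about like this:

    ssize_t     written = pg_pwrite_zeros(fd, len, offset);

    if (written < 0)
        elog(ERROR, "could not write zeros: %m");   /* errno from the syscall */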
Author: Andres Freund
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/20230215005525.mrrlmqrxzjzhaipl@awork3.anarazel.de
1. Make sure that we don't decrement SxactGlobalXminCount twice when
the SXACT_FLAG_RO_SAFE optimization is reached in a parallel query.
This could trigger a sanity check failure in assert builds. Non-assert
builds recompute the count in SetNewSxactGlobalXmin(), so the problem
was hidden, explaining the lack of field reports. Add a new isolation
test to exercise that case.
2. Remove an assertion that the DOOMED flag can't be set on a partially
released SERIALIZABLEXACT. Instead, ignore the flag (our transaction
was already determined to be read-only safe, and DOOMED is in fact set
during partial release, and there was already an assertion that it
wasn't set sooner). Improve an existing isolation test so that it
reaches that case (previously it wasn't quite testing what it was
supposed to be testing; see discussion).
Back-patch to 12. Bug #17116. Defects in commit 47a338cf.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Add support for non-decimal integer literals and underscores in
numeric literals to SQL JSON path language. This follows the rules of
ECMAScript, as referred to by the SQL standard.
Internally, all the numeric literal parsing of jsonpath goes through
numeric_in, which already supports all this, so this patch is just a
bit of lexer work and some tests and documentation.
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b11b25bb-6ec1-d42f-cedd-311eae59e1fb@enterprisedb.com
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s). The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs. We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.
Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.
While here, update the documentation, which was not touched when
the semantics were changed.
Per bug #17817 from Alan Hodgson. Back-patch to v15; older versions
do not have this behavior.
Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
We allow our header files to depend on the appropriate one of
postgres.h, postgres_fe.h, or c.h having already been included.
However, there are a few headers such as libpq-fe.h that are
meant to be used by client applications and therefore must
compile without any assumptions about previous inclusions.
These test scripts failed to consider that, which seems quite
hazardous since we might not immediately notice such a problem
otherwise. Hence, adjust these scripts to test relevant libpq
and ecpg headers with no prior inclusion.
While at it, we can also make an effort to actually use the
relevant one of postgres.h, postgres_fe.h, or c.h. I added
some rules that guess which one to use based on the first-level
src subdirectory, e.g. use postgres_fe.h under src/bin/.
These rules are hardly water-tight but they seem to work today,
and we can always refine them in the future.
These changes don't reveal any live problems today, which is good,
but they should make these scripts more able to catch future bugs.
Discussion: https://postgr.es/m/2488193.1677863247@sss.pgh.pa.us
The comments claim that certain pieces of data are part of the main
WAL record data when in reality they are part of the data for
block 0. Repair.
Bertrand Drouvot, reviewed by Amit Kapila. Originally reported by me.
Discussion: http://postgr.es/m/80db7836-4415-d54a-64c3-66b88b1430e7@gmail.com
Previously, meson installed modules under src/test/modules/ as part of
a normal installation, even though these files are only meant for use
by tests. This is because there is no way to set up the build
system to install extra things only when told.
This patch fixes that with a workaround: We don't install these
modules as part of meson install, but we create a new "test" that runs
before the real tests whose action it is to install these files. The
installation is done by manual copies using a small helper script.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/2a039e8e-f31f-31e8-afe7-bab3130ad2de%40enterprisedb.com
Coverage of the query jumbling code has always relied on the queries
included in the regression tests of pg_stat_statements. This has its
limitations, as a lot of query patterns have never really stressed the
query jumbling code. The situation got a bit worse since the query
jumbling has been added in the backend core code (5fd9dfa), hence new
nodes that should be included in the jumbling could easily be missed,
resulting in failures in pg_stat_statements or any modules that require
query ID computations. Forcing a load of pg_stat_statements in
027_stream_regress.pl ensures that nodes are never missed in the
computations, without having to rely on a buildfarm member for this
check.
Before this commit, the line coverage of queryjumblefuncs.funcs.c was
around 48.5%, now up to 94.6% just by running 027_stream_regress.pl.
A basic check is added to show that pg_stat_statements reports are
generated after the main regression test suite is finished.
Discussion: https://postgr.es/m/Y+nD9LN70w+8eaG9@paquier.xyz
Our previous habit of showing the full function body is really
pretty unfriendly for tabular viewing of functions, and now that
we have \sf and \ef commands there seems no good reason why \df+
has to do it. It still seems to make sense to show prosrc for
internal and C-language functions, since in those cases prosrc
is just the C function name; but then let's rename the column to
"Internal name" which is a more accurate descriptor.
Isaac Morland
Discussion: https://postgr.es/m/CAMsGm5eqKc6J1=Lwn=ZONG=6ZDYWRQ4cgZQLqMuZGB1aVt_JBg@mail.gmail.com
Open long-lived data and WAL file descriptors with O_CLOEXEC. This flag
was introduced by SUSv4 (POSIX.1-2008), and by now all of our target
Unix systems have it. Our open() implementation for Windows already had
that behavior, so provide a dummy O_CLOEXEC flag on that platform.
For now, callers of open() and the "thin" wrappers in fd.c that deal in
raw descriptors need to pass in O_CLOEXEC explicitly if desired. This
commit does that for WAL files, and automatically for everything
accessed via VFDs including SMgrRelation and BufFile. (With more
discussion we might decide to turn it on automatically for the thin
open()-wrappers too to avoid risk of missing places that need it, but
these are typically used for short-lived descriptors where we don't
expect to fork/exec, and it's remotely possible that extensions could be
using these APIs and passing descriptors to subprograms deliberately, so
that hasn't been done here.)
Do the same for sockets and the postmaster pipe with FD_CLOEXEC. (Later
commits might use modern interfaces to remove these extra fcntl() calls
and more where possible, but we'll need them as a fallback for a couple
of systems, so do it that way in this initial commit.)
With this change, subprograms executed for archiving, copying etc will
no longer have access to the server's descriptors, other than the ones
that we decide to pass down.
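As an illustrative sketch (not the exact call sites), a long-lived WAL
file descriptor now gets opened along these lines:

    /*
     * O_CLOEXEC keeps the descriptor from being inherited by subprograms
     * such as archive_command.
     */
    fd = BasicOpenFile(path, O_RDWR | PG_BINARY | O_CLOEXEC);
    if (fd < 0)
        elog(ERROR, "could not open file \"%s\": %m", path);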
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
When I designed the Bitmapset module, I set things up so that an empty
Bitmapset could be represented either by a NULL pointer, or by an
allocated object all of whose bits are zero. I've recently come to
the conclusion that that was a bad idea and we should instead have a
convention like the longstanding invariant for Lists, whereby an empty
list is represented by NIL and nothing else.
To do this, we need to fix bms_intersect, bms_difference, and a couple
of other functions to check for having produced an empty result; but
then we can replace bms_is_empty(a) by a simple "a == NULL" test.
This is very likely a (marginal) win performance-wise, because we
call bms_is_empty many more times than those other functions put
together. However, the real reason to do it is that we have various
places that have hand-implemented a rule about "this Bitmapset
variable must be exactly NULL if empty", so that they can use
checks-for-null in place of bms_is_empty calls in particularly hot
code paths. That is a really fragile, mistake-prone way to do things,
and I'm surprised that we've seldom been bitten by it. It's not well
documented at all which variables have this property, so you can't
readily tell which code might be violating those conventions. By
making the convention universal, we can eliminate a subtle source of
bugs.
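A minimal sketch of what the universal convention buys (variable names
hypothetical):

    Bitmapset  *common = bms_intersect(a, b);

    if (common == NULL)         /* previously required bms_is_empty(common) */
    {
        /* handle the no-overlap case */
    }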
Patch by me; thanks to Nathan Bossart and Richard Guo for review.
Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
nodeAppend.c used non-nullness of appendstate->as_valid_subplans as
a state flag to indicate whether it'd done ExecFindMatchingSubPlans
(or some sufficient approximation to that). This was pretty
questionable even in the beginning, since it wouldn't really work
right if there are no valid subplans. It got more questionable
after commit 27e1f1456 added logic that could reduce as_valid_subplans
to an empty set: at that point we were depending on unspecified
behavior of bms_del_members, namely that it'd not return an empty
set as NULL. It's about to start doing that, which breaks this
logic entirely. Hence, add a separate boolean flag to signal
whether as_valid_subplans has been computed.
Also fix a previously-cosmetic bug in nodeAgg.c, wherein it ignored
the return value of bms_del_member instead of updating its pointer.
Patch by me; thanks to Nathan Bossart and Richard Guo for review.
Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
This function has been semi-deprecated ever since we invented
bms_next_member(). Its habit of scribbling on the input bitmapset
isn't great, plus for sufficiently large bitmapsets it would take
O(N^2) time to complete a loop. Now we have the additional problem
that reducing the input to empty while leaving it still accessible
would violate a planned invariant. So let's just get rid of it,
after updating the few extant callers to use bms_next_member().
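For reference, a minimal sketch of the replacement idiom used at the
updated call sites (process_member() is a stand-in):

    int         x = -1;

    while ((x = bms_next_member(inputset, x)) >= 0)
        process_member(x);      /* non-destructive: inputset is left intact */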
Patch by me; thanks to Nathan Bossart and Richard Guo for review.
Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
Some deprecated options were not marked as such in usage output. This
does so across the installed binaries in an attempt to provide consistent
markup for this.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/062C6A8A-A4E8-4F52-9E31-45F0C9E9915E@yesql.se
Commit 0a20ff54f split out the GUC variables from guc.c into a new file
guc_tables.c. This updates comments referencing guc.c regarding variables
which are now in guc_tables.c.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6B50C70C-8C1F-4F9A-A7C0-EEAFCC032406@yesql.se
When you have some invalid WAL, you often get a message like "wanted
24, got 0". This is a bit incorrect, since it really wanted *at
least* 24, not exactly 24. This updates the messages to that effect,
and also adds that detail to one message where it was available but
not printed.
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Jeevan Ladhe <jeevanladhe.os@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/726d782b-5e45-0c3e-d775-6686afe9aa83%40enterprisedb.com
This is usually harmless, but if you were very unlucky it could
provoke a segfault due to the "to" string being right up against
the end of memory. Found via valgrind testing (so we might've
found it earlier, except that our regression tests lacked any
exercise of translate()'s deletion feature).
Fix by switching the order of the test-for-end-of-string and
advance-pointer steps. While here, compute "to_ptr + tolen"
just once. (Smarter compilers might figure that out for
themselves, but let's just make sure.)
Report and fix by Daniil Anisimov, in bug #17816.
Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
As it stands, flagInhAttrs() can make changes in table properties that
change decisions made at other tables during other iterations of its
loop. This is a pretty bad idea, since we visit the tables in OID
order which is not necessarily related to inheritance relationships.
So far as I can tell, the consequences are just cosmetic: we might
dump DEFAULT or GENERATED expressions that we don't really need to
because they match properties of the parent. Nonetheless, it's buggy,
and somebody might someday add functionality here that fails less
benignly when the traversal order varies.
One issue is that when we decide we needn't dump a particular
GENERATED expression, we physically unlink the struct for it,
so that it will now look like the table has no such expression,
causing the wrong choice to be made at any child visited later.
We can improve that by instead clearing the dobj.dump flag,
and taking care to check that flag when it comes time to dump
the expression or not.
The other problem is that if we decide we need to fake up a DEFAULT
NULL clause to override a default that would otherwise get inherited,
we modify the data structure in the reverse fashion, creating an
attrdefs entry where there hadn't been one. It's harder to avoid
doing that, but since the backend won't report a plain "DEFAULT NULL"
property we can modify the code to recognize ones we just added.
Add some commentary to perhaps forestall future mistakes of the
same ilk.
Since the effects of this seem only cosmetic, no back-patch.
Discussion: https://postgr.es/m/1506298.1676323579@sss.pgh.pa.us
Per buildfarm members snakefly, parula and prion, that reflect the
results coming from the latest versions of libxml2.
Oversight in b8da37b in the shape of an incorrect copy-paste. The CI
was green, but it does not stress this expected output.
pg_input_error_info() is now a SQL function able to return a row with
more than just the error message generated for incorrect data type
inputs when these are able to handle soft failures, returning more
contents of ErrorData, as of:
- The error message (same as before).
- The error detail, if set.
- The error hint, if set.
- SQL error code.
All the regression tests that relied on pg_input_error_message() are
updated to reflect the effects of the rename.
Per discussion with Tom Lane and Andrew Dunstan.
Author: Nathan Bossart
Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net
Some clang versions whine about comparing an enum variable to
a value outside the range of the enum, on the grounds that the
result must be constant. In the cases we fix here, the loops
will terminate only if the enum variable can in fact hold a
value one beyond its declared range. While that's very likely
to always be true for these enum types, it still seems like a
poor coding practice to assume it; so use "int" loop variables
instead to silence the warnings. (This matches what we've done
in other places, for example loops over the range of ForkNumber.)
While at it, let's drop the XXX_FIRST macros for these enums and just
write zeroes for the loop start values. The apparent flexibility
seems rather illusory given that iterating up to one-less-than-
the-number-of-values is only correct for a zero-based range.
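A minimal sketch of the pattern, using the ForkNumber example mentioned
above (process_fork() is a stand-in):

    /* an int loop variable, not ForkNumber, to silence the warning */
    for (int fork = 0; fork <= MAX_FORKNUM; fork++)
        process_fork((ForkNumber) fork);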
Melanie Plageman
Discussion: https://postgr.es/m/20520.1677435600@sss.pgh.pa.us
This will create two bytes of padding space in xl_hash_vacuum_one_page which
can be used for future patches. This makes the datatype of
xl_hash_vacuum_one_page.ntuples the same as gistxlogDelete.ntodelete, which is
advisable as both are used for the same purpose.
Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/b0e20c40-cb7a-fc1c-c607-2a78dac5021e@gmail.com
It's been this way for a very long time, but it appears to have been
masking an issue that only manifests with different settings. Therefore,
run the tests in the installation's default encoding/locale.
Backpatch to all live branches.
We already tried to fix this in commits 3f7323cbb et al (and follow-on
fixes), but now it emerges that there are still unfixed cases;
moreover, these cases affect all branches not only pre-v14. I thought
we had eliminated all cases of making multiple clones of an UPDATE's
target list when we nuked inheritance_planner. But it turns out we
still do that in some partitioned-UPDATE cases, notably including
INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks
it's okay to clone and modify the parent's targetlist.
This fix is based on a suggestion from Andres Freund: let's stop
abusing the ParamExecData.execPlan mechanism, which was only ever
meant to handle initplans, and instead solve the execution timing
problem by having the expression compiler move MULTIEXPR_SUBLINK steps
to the front of their expression step lists. This is feasible because
(a) all branches still in support compile the entire targetlist of
an UPDATE into a single ExprState, and (b) we know that all
MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried
inside a CASE, for example. There is a minor semantics change
concerning the order of execution of the MULTIEXPR's subquery versus
other parts of the parent targetlist, but that seems like something
we can get away with. By doing that, we no longer need to worry
about whether different clones of a MULTIEXPR_SUBLINK share output
Params; their usage of that data structure won't overlap.
Per bug #17800 from Alexander Lakhin. Back-patch to all supported
branches. In v13 and earlier, we can revert 3f7323cbb and follow-on
fixes; however, I chose to keep the SubPlan.subLinkId field added
in ccbb54c72. We don't need that anymore in the core code, but it's
cheap enough to fill, and removing a plan node field in a minor
release seems like it'd be asking for trouble.
Andres Freund and Tom Lane
Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org
If a rule action contains a subquery that refers to columns from OLD
or NEW, then those are really lateral references, and the planner will
complain if it sees such things in a subquery that isn't marked as
lateral. However, at rule-definition time, the user isn't required to
mark the subquery with LATERAL, and so it can fail when the rule is
used.
Fix this by marking such subqueries as lateral in the rewriter, at the
point where they're used.
Dean Rasheed and Tom Lane, per report from Alexander Lakhin.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/5e09da43-aaba-7ea7-0a51-a2eb981b058b%40gmail.com
A unique index which is created with non-distinct NULLS cannot be
used for backing a primary key constraint. Make sure to disallow
such table alterations and teach pg_dump to drop the non-distinct
NULLS clause on indexes where this has been set.
Bug: 17720
Reported-by: Reiner Peterke <zedaardv@drizzle.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/17720-dab8ee0fa85d316d@postgresql.org
Multiple cycles of starting up and shutting down the plugin within a
single session would eventually lead to "out of relcache_callback_list
slots", because pgoutput_startup blindly re-registered its cache
callbacks each time. Fix it to register them only once, as all other
users of cache callbacks already take care to do.
This has been broken all along, so back-patch to all supported branches.
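A minimal sketch of the register-once pattern (the flag name is
illustrative; the relcache callback name matches pgoutput.c):

    static bool callbacks_registered = false;

    if (!callbacks_registered)
    {
        CacheRegisterRelcacheCallback(rel_sync_cache_relation_cb, (Datum) 0);
        /* ... matching syscache callbacks are registered here as well ... */
        callbacks_registered = true;
    }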
Shi Yu
Discussion: https://postgr.es/m/OSZPR01MB631004A78D743D68921FFAD3FDA79@OSZPR01MB6310.jpnprd01.prod.outlook.com
Expand pg_dump's compression streaming and file APIs to support the lz4
algorithm. The newly added compress_lz4.{c,h} files cover all the
functionality of the aforementioned APIs. Minor changes were necessary
in various pg_backup_* files, where code for the 'lz4' file suffix has
been added, as well as pg_dump's compression option parsing.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Shi Yu, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
Offers a generally better separation of responsibilities for collation
code. Also, a step towards multi-lib ICU, which should be based on a
clean separation of the routines required for collation providers.
Callers with NUL-terminated strings should call pg_strcoll() or
pg_strxfrm(); callers with strings and their length should call the
variants pg_strncoll() or pg_strnxfrm().
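Sketched usage, assuming a pg_locale_t "loc" obtained from the
collation cache:

    int         cmp;

    cmp = pg_strcoll(str1, str2, loc);                  /* NUL-terminated */
    cmp = pg_strncoll(buf1, len1, buf2, len2, loc);     /* length-delimited */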
Reviewed-by: Peter Eisentraut, Peter Geoghegan
Discussion: https://postgr.es/m/a581136455c940d7bd0ff482d3a2bd51af25a94f.camel%40j-davis.com
Switch pg_dump to use the Compression API, implemented by bf9aa490db.
The CompressFileHandle replaces the cfp* family of functions with a
struct of callbacks for accessing (compressed) files. This allows adding
new compression methods simply by introducing a new struct instance with
appropriate implementation of the callbacks.
Archives compressed using custom compression methods store an identifier
of the compression algorithm in their header instead of the compression
level. The header version is bumped.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
It's possible, in admittedly-rather-contrived cases, for an eclass
to generate a derived "join" qual that constrains the post-outer-join
value(s) of some RHS variable(s) without mentioning the LHS at all.
While the mechanisms were set up to work for this, we fell foul of
the "get_common_eclass_indexes" filter installed by commit 3373c7155:
it could decide that such an eclass wasn't relevant to the join, so
that the required qual clause wouldn't get emitted there or anywhere
else.
To fix, apply get_common_eclass_indexes only at inner joins, where
its rule is still valid. At an outer join, fall back to examining all
eclasses that mention either input (or the OJ relid, though it should
be impossible for an eclass to mention that without mentioning either
input). Perhaps we can improve on that later, but the cost/benefit of
adding more complexity to skip some irrelevant eclasses is dubious.
To allow cheaply distinguishing outer from inner joins, pass the
ojrelid to generate_join_implied_equalities as a separate argument.
This also allows cleaning up some sloppiness that had crept into
the definition of its join_relids argument, and it allows accurate
calculation of nominal_join_relids for a child outer join. (The
latter oversight seems not to have been a live bug, but it certainly
could have caused problems in future.)
Also fix what might be a live bug in check_index_predicates: it was
being sloppy about what it passed to generate_join_implied_equalities.
Per report from Richard Guo.
Discussion: https://postgr.es/m/CAMbWs4-DsTBfOvXuw64GdFss2=M5cwtEhY=0DCS7t2gT7P6hSA@mail.gmail.com
Commit bf9aa490db introduced a compression API in compress_io.{c,h} to
make reuse easier, and allow adding more compression algorithms.
However, pg_backup_archiver.c was not switched to this API and continued
to call the compression directly.
This commit teaches pg_backup_archiver.c about the compression API, so
that it can benefit from bf9aa490db (simpler code, easier addition of
new compression methods).
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
If the source server was just promoted, and it hasn't written the
checkpoint record yet, pg_rewind considered the server to be still on
the old timeline. Because of that, it would claim incorrectly that no
rewind is required. Fix that by looking at minRecoveryPointTLI in the
control file in addition to the ThisTimeLineID on the checkpoint.
This has been a known issue since forever, and we had worked around it
in the regression tests by issuing a checkpoint after each promotion,
before running pg_rewind. But that was always quite hacky, so better
to fix this properly. This doesn't add any new tests for this, but
removes the previously-added workarounds from the existing tests, so
that they should occasionally hit this codepath again.
This is arguably a bug fix, but don't backpatch because we haven't
really treated it as a bug so far. Also, the patch didn't apply
cleanly to v13 and below. I'm sure it could be made to work on
v13, but it doesn't seem worth the risk and effort.
Reviewed-by: Kyotaro Horiguchi, Ibrar Ahmed, Aleksander Alekseev
Discussion: https://www.postgresql.org/message-id/9f568c97-87fe-a716-bd39-65299b8a60f4%40iki.fi
Given an updatable view with a DO ALSO INSERT ... SELECT rule, a
multi-row INSERT ... VALUES query on the view fails if the VALUES list
contains any DEFAULTs that are not replaced by view defaults. This
manifests as an "unrecognized node type" error, or an Assert failure,
in an assert-enabled build.
The reason is that when RewriteQuery() attempts to replace the
remaining DEFAULT items with NULLs in any product queries, using
rewriteValuesRTEToNulls(), it assumes that the VALUES RTE is located
at the same rangetable index in each product query. However, if the
product query is an INSERT ... SELECT, then the VALUES RTE is actually
in the SELECT part of that query (at the same index), rather than the
top-level product query itself.
Fix, by descending to the SELECT in such cases. Note that we can't
simply use getInsertSelectQuery() for this, since that expects to be
given a raw rule action with OLD and NEW placeholder entries, so we
duplicate its logic instead.
While at it, beef up the checks in getInsertSelectQuery() by checking
that the jointree->fromlist node is indeed a RangeTblRef, and that the
RTE it points to has rtekind == RTE_SUBQUERY.
Per bug #17803, from Alexander Lakhin. Back-patch to all supported
branches.
Dean Rasheed, reviewed by Tom Lane.
Discussion: https://postgr.es/m/17803-53c63ed4ecb4eac6%40postgresql.org
This was previously only documented in a comment. Given the size of the
struct, it's not hard to miss that comment, as evidenced by the commits
leading up to fe3caa1439 and 67b26703b4.
It's possible, but not likely, that we might have to weaken these assertions
on a less commonly used architecture.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/295606.1677101684@sss.pgh.pa.us
A couple of code paths in CONNECTION_AWAITING_RESPONSE will eagerly read
bytes off a connection that should be closed. Don't let a misbehaving
server chew up client resources here; a v2 error can't be infinitely
long, and a v3 error should be bounded by its original message length.
For the existing error_return cases, I added some additional error
messages for symmetry with the new ones, and cleaned up some message
rot.
Author: Jacob Champion
Discussion: https://www.postgresql.org/message-id/8e729daf-7d71-6965-9687-8bc0630599b3%40timescale.com
initsplan.c figured that it could push Var-free qual clauses to
the top of the current JoinDomain, which is okay in the abstract.
But if the current domain is inside some outer join, and we later
commute an inside-the-domain outer join with one outside it,
we end up placing the pushed-up qual clause incorrectly.
In distribute_qual_to_rels, avoid this by using the syntactic scope
of the qual clause; with the exception that if we're in the top-level
join domain we can still use the full query relid set, ensuring the
resulting gating Result node goes to the top of the plan. (This is
approximately as smart as the pre-v16 code was. Perhaps we can do
better later, but it's not clear that such cases are worth a lot of
sweat.)
In process_implied_equality, we don't have a clear notion of syntactic
scope, but we do have the results of SpecialJoinInfo construction.
Thumb through those and remove any lower outer joins that might get
commuted to above the join domain. Again, we can make an exception
for the top-level join domain. It'd be possible to work harder here
(for example, by keeping outer joins that aren't shown as potentially
commutable), but I'm going to stop here for the moment. This issue
has convinced me that the current representation of join domains
probably needs further refinement, so I'm disinclined to write
inessential dependent logic just yet.
In passing, tighten the qualscope passed to process_implied_equality
by generate_base_implied_equalities_no_const; there's no need for
it to be larger than the rel we are currently considering.
Tom Lane and Richard Guo, per report from Tender Wang.
Discussion: https://postgr.es/m/CAHewXNk9eJ35ru5xATWioTV4+xZPHptjy9etdcNPjUfY9RQ+uQ@mail.gmail.com
When decoding a transactional logical message, logicalmsg_decode called
SnapBuildGetOrBuildSnapshot. But we may not have a consistent snapshot
yet at that point. We don't actually need the snapshot in this case
(during replay we'll have the snapshot from the transaction), so in
practice this is harmless. But in an assert-enabled build this crashes.
Fixed by requesting the snapshot only in non-transactional case, where
we are guaranteed to have SNAPBUILD_CONSISTENT.
Backpatch to 11. The issue exists since 9.6.
Backpatch-through: 11
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/84d60912-6eab-9b84-5de3-41765a5449e8@enterprisedb.com
In ExecInitPartitionInfo(), the Assert when building the WITH CHECK
OPTION list for the new partition assumed that the command would be an
INSERT or UPDATE, but it can also be a MERGE. This can be triggered by
a MERGE into a partitioned table with RLS checks to enforce.
Fix, and back-patch to v15, where MERGE was introduced.
Discussion: https://postgr.es/m/CAEZATCWWFtQmW67F3XTyMU5Am10Oxa_b8oe0x%2BNu5Mo%2BCdRErg%40mail.gmail.com
This ensures that the row count in the command tag for a MERGE is
correctly computed. Previously, if MERGE updated a partitioned table,
the row count would be incorrect if any row was moved to a different
partition, since such updates were counted twice.
Back-patch to v15, where MERGE was introduced.
Discussion: https://postgr.es/m/CAEZATCWRMG7XX2QEsVL1LswmNo2d_YG8tKTLkpD3=Lp644S7rg@mail.gmail.com
SQL:2023 defines an ANY_VALUE aggregate whose purpose is to emit an
implementation-dependent (i.e. non-deterministic) value from the
aggregated rows.
Author: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5cff866c-10a8-d2df-32cb-e9072e6b04a2@postgresfriends.org
WAL_LOG does a scan of the template's pg_class to determine the set of
relations that need to be copied from a template database to the new
one. However, as coded in 9c08aea, this copy strategy would load the
pages of pg_class without considering it as a permanent relation,
causing the loaded pages to never be flushed when they should. Any
modification of the template's pg_class, mostly through DDLs, would then
be missed, causing corruptions.
STRATEGY = WAL_LOG has been the default over FILE_COPY since it was
introduced, so any changes done to pg_class on a database template could
be lost. Updates of database templates should be a rare thing, so the
impact of this bug should hopefully be limited. The pre-15 default
strategy FILE_COPY is safe, and can be used as a workaround.
Ryo Matsumura has found and analyzed the issue, and Nathan has written a
test able to reproduce the failure (with few tweaks from me).
Backpatch down to 15, where STRATEGY = WAL_LOG has been introduced.
Author: Nathan Bossart, Ryo Matsumura
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/TYCPR01MB6868677E499C9AD5123084B5E8A39@TYCPR01MB6868.jpnprd01.prod.outlook.com
Backpatch-through: 15
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
Commits 04cad8f7 and 0c088568 supported old macOS systems that didn't
define O_CLOEXEC or O_DSYNC yet, but those arrived in macOS releases
10.7 and 10.6 (respectively), which themselves reached EOL around a
decade ago. We've already made use of other POSIX features that early
macOS vintages can't compile (for example commits 623cc673, d2e15083).
A later commit will use O_CLOEXEC on POSIX systems so it would be
strange to pretend here that it's optional, and we might as well give
O_DSYNC the same treatment since the reference is also guarded by a test
for a macOS-specific macro, and we know that current Macs have it.
Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
Failing to do so results in an error when a pgbench script tries to
start a serializable transaction inside a pipeline, because by the time
BEGIN ISOLATION LEVEL SERIALIZABLE is executed, we're already in a
transaction that has acquired a snapshot, so the server rightfully
complains.
We can work around that by preparing all commands in the pipeline before
actually starting the pipeline. This changes the existing code in two
aspects: first, we now prepare each command individually at the point
where that command is about to be executed; previously, we would prepare
all commands in a script as soon as the first command of that script
would be executed. It's hard to see that this would make much of a
difference (particularly since it only affects the first time to execute
each script in a client), but I didn't actually try to measure it.
Secondly, we no longer use PQsendPrepare() in pipeline mode, but only
PQprepare. There's no specific reason for this change other than no
longer needing to do things differently in pipeline mode. (Previously we had
no choice, because in pipeline mode PQprepare could not be used.)
Backpatch to 14, where pgbench got support for pipeline mode.
Reported-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20210716153013.fc53b1c780b06fccc07a7f0d@sraoss.co.jp
Historically we've accepted interval input like 'P.1e10D'. This
is probably an accident of having used strtod() to do the parsing,
rather than something anyone intended, but it's been that way for
a long time. Commit e39f99046 broke this by trying to parse the
integer and fractional parts separately, without accounting for
the possibility of an exponent. In principle that coding allowed
for precise conversions of field values wider than 15 decimal
digits, but that does not seem like a goal worth sweating bullets
for. So, rather than trying to manage an exponent on top of the
existing complexity, let's just revert to the previous coding that
used strtod() by itself. We can still improve on the old code to
the extent of allowing the value to range up to 1.0e15 rather than
only INT_MAX. (Allowing more than that risks creating problems
due to precision loss: the converted fractional part might have
absolute value more than 1. Perhaps that could be dealt with in
some way, but it really does not seem worth additional effort.)
Per bug #17795 from Alexander Lakhin. Back-patch to v15 where
the faulty code came in.
Discussion: https://postgr.es/m/17795-748d6db3ed95d313@postgresql.org
This was not something that required consideration before MERGE
was invented; but MERGE builds a join tree that left-joins to the
result relation, meaning that remove_useless_joins will consider
removing it. That should generally be stopped by the query's use
of output variables from the result relation. However, if the
result relation is inherited (e.g. a partitioned table) then
we don't add any row identity variables to the query until
expand_inherited_rtentry, which happens after join removal.
This was exposed as of commit 3c569049b, which made it possible
to deduce that a partitioned table could contain at most one row
matching a join key, enabling removal of the not-yet-expanded
result relation. Ooops.
To fix, let's just teach join_is_removable that the query result
rel is never removable. It's a cheap enough test in any case,
and it'll save some cycles that we'd otherwise expend in proving
that it's not removable, even in the cases we got right.
Back-patch to v15 where MERGE was added. Although I think the
case cannot be reached in v15, this seems like cheap insurance.
Per investigation of a report from Alexander Lakhin.
Discussion: https://postgr.es/m/36bee393-b351-16ac-93b2-d46d83637e45@gmail.com
For no clearly good reason, make_modifytable assumed that it
could not reach its get-the-FDW-info-the-hard-way path in MERGE.
It's currently possible to demonstrate that assertion failing,
which seems to be due to an upstream planner bug; but there's no
good reason to do it like this at all. Let's apply the principle
of separation of concerns and make the MERGE check separately,
after getting or not getting the fdwroutine pointer.
Per report from Alexander Lakhin. No test case, since I think
the potential test condition will go away soon.
Discussion: https://postgr.es/m/36bee393-b351-16ac-93b2-d46d83637e45@gmail.com
The RelOptInfo->userid field (the user ID to check permissions as) of an
"otherrel" relation was being copied from its parent relation, which is
correct in most cases but wrong when the parent is a subquery. In that
case, using the value from the RTEPermissionInfo of the child itself is
the appropriate thing to do.
Coming up with a test case where user-visible behavior changes proves
hard enough, so we don't add one here.
Bug introduced by a61b1f7482, discovered by Amit while reviewing
nearby code.
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqE0WY_AhLnGtTsY7eYebG212XWbM-D8gr2A_ToOHyCywQ@mail.gmail.com
In generate_orderedappend_paths(), when match_partition_order_desc was
true, we would lcons() items to various lists in a loop over each live
partition. When the number of live partitions was large, the lcons()
could show up in profiles due to it having to perform memmove() to make
way for the new list item.
Here we adjust things so that we just perform the loop over the live
partitions backwards when match_partition_order_desc is true. This allows
us to simplify the logic in the loop. Now, as far as the guts of the loop
knows, there's no difference between match_partition_order and
match_partition_order_desc. We can just set match_partition_order to true
so that we build the correct list of paths for the asc and desc case. Per
idea from Andres Freund.
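Illustrative only (not the committed loop, and "live_parts" is
hypothetical): walking the inputs in reverse and appending builds the
same descending list as prepending in forward order, without the
memmove that lcons() incurs:

    List       *paths = NIL;

    for (int i = list_length(live_parts) - 1; i >= 0; i--)
        paths = lappend(paths, list_nth(live_parts, i));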
Discussion: https://postgr.es/m/20230217002351.nyt4y5tdzg6hugdt@awork3.anarazel.de
To allow testing for general support for fast bitscan intrinsics,
add symbols HAVE_BITSCAN_REVERSE and HAVE_BITSCAN_FORWARD.
Also do related cleanup in AllocSetFreeIndex(): Previously, we
tested for HAVE__BUILTIN_CLZ and copied the relevant internals of
pg_leftmost_one_pos32(), with a special fallback that does less
work than the general fallback for that function. Now that we have
a more general test, we just call pg_leftmost_one_pos32() directly
for platforms with intrinsic support. On gcc at least, there is no
difference in the binary for non-assert builds.
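A minimal sketch of the idea with a hypothetical helper: on platforms
with HAVE_BITSCAN_REVERSE the generic function compiles down to the
same bitscan instruction, so an open-coded __builtin_clz path is no
longer needed:

    #include "port/pg_bitutils.h"

    /* smallest p such that (1 << p) >= sz, for sz > 1 */
    static inline int
    size_to_power(uint32 sz)        /* hypothetical helper */
    {
        return pg_leftmost_one_pos32(sz - 1) + 1;
    }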
Discussion: https://www.postgresql.org/message-id/CAFBsxsEPc%2BFnX_0vmmQ5DHv60sk4rL_RZJ%2BMD6ei%3D76L0kFMvA%40mail.gmail.com
The setting of the process title could be seen on profiles of very
fast-to-execute queries. In many locations where we call
set_ps_display() we pass along a string constant, the length of which is
known during compilation. Here we effectively rename set_ps_display() to
set_ps_display_with_len() and then add a static inline function named
set_ps_display() which calls strlen() on the given string. This allows
the compiler to optimize away the strlen() call when dealing with
call sites passing a string constant. We can then also use memcpy()
instead of strlcpy() to copy the string into the destination buffer.
That's significantly faster than strlcpy's byte-at-a-time way of
copying.
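The wrapper amounts to roughly this sketch (see ps_status.h for the
real definition):

    static inline void
    set_ps_display(const char *activity)
    {
        /* strlen() of a string literal is folded at compile time */
        set_ps_display_with_len(activity, strlen(activity));
    }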
Here we also take measures to improve some code which was adjusting the
process title to add a " waiting" suffix to it. Call sites which require
this can now just call set_ps_display_suffix() to add or adjust the suffix
and call set_ps_display_remove_suffix() to remove it again.
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CAApHDvocBvvk-0gWNA2Gohe+sv9fMcv+fK_G+siBKJrgDG4O7g@mail.gmail.com
When evaluating clauses on multiple scan keys of a multi-column BRIN
index, we can stop processing as soon as we find a scan key eliminating
the range, and the range should not be added to the bitmap.
That's how it worked before 14, but since a681e3c107 the code treated
the range as matching if it matched at least the last scan key.
Backpatch to 14, where this code was introduced.
Backpatch-through: 14
Discussion: https://postgr.es/m/ebc18613-125e-60df-7520-fcbe0f9274fc%40enterprisedb.com
ruleutils.c blindly printed the user-given alias (or nothing if there
hadn't been one) for the target table of INSERT/UPDATE/DELETE queries.
That works a large percentage of the time, but not always: for queries
appearing in WITH, it's possible that we chose a different alias to
avoid conflict with outer-scope names. Since the chosen alias would
be used in any Var references to the target table, this'd lead to an
inconsistent printout with consequences such as dump/restore failures.
The correct logic for printing (or not) a relation alias was embedded
in get_from_clause_item. Factor it out to a separate function so that
we don't need a jointree node to use it. (Only a limited part of that
function can be reached from these new call sites, but this seems like
the cleanest non-duplicative factorization.)
In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql.
Initial report from Jonathan Katz; thanks to Vignesh C for analysis.
This has been broken for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
A new callback named startup_cb, called shortly after a module is
loaded, is added. This makes possible the initialization of any
additional state data required by a module. This initial state data can
be saved in a ArchiveModuleState, that is now passed down to all the
callbacks that can be defined in a module. With this design, it is
possible to have a per-module state, aimed at opening the door to the
support of more than one archive module.
The initialization of the callbacks is changed so that
_PG_archive_module_init() no longer receives as input an
ArchiveModuleCallbacks that a module has to fill in with callback
definitions. Instead, a module now needs to return a const
ArchiveModuleCallbacks pointer.
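As a rough sketch, a module following the new convention looks about
like this (the static callback names are hypothetical; the types come
from the new archive_module.h described below):

    #include "postgres.h"
    #include "fmgr.h"
    #include "archive/archive_module.h"

    PG_MODULE_MAGIC;

    static void
    my_startup(ArchiveModuleState *state)
    {
        /* allocate per-module state and stash it in state->private_data */
    }

    static bool
    my_archive_file(ArchiveModuleState *state, const char *file,
                    const char *path)
    {
        return true;            /* report the segment as archived */
    }

    static const ArchiveModuleCallbacks my_callbacks = {
        .startup_cb = my_startup,
        .archive_file_cb = my_archive_file,
    };

    const ArchiveModuleCallbacks *
    _PG_archive_module_init(void)
    {
        return &my_callbacks;   /* returned, no longer filled in by the caller */
    }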
All the structure and callback definitions of archive modules are moved
into their own header, named archive_module.h, from pgarch.h.
Command-based archiving follows the same line, with a new set of files
named shell_archive.{c,h}.
There are a few more items that are under discussion to improve the
design of archive modules, like the fact that basic_archive calls
sigsetjmp() by itself to define its own error handling flow. These will
be adjusted later; the changes done here already cover a good portion
of what has been discussed.
Any modules created for v15 will need to be adjusted to this new
design.
Author: Nathan Bossart
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20230130194810.6fztfgbn32e7qarj@awork3.anarazel.de
In commit 8bf6ec3ba, I mistakenly supposed that MergeAttributes'
loop over saved_schema was reprocessing column definitions that
had already been checked earlier: there is a variant syntax for
creating a child partition in which that's not true. So we need
to duplicate the full check appearing further up.
(Actually, I believe that the "if (restdef->identity)" part is
not reachable, because we reject identity on partitions earlier.
But it seems wise to keep the check, in case that's ever relaxed,
and to keep this code in sync with the other instance.)
Per report from Alexander Lakhin.
Discussion: https://postgr.es/m/4a8200ca-8378-653e-38ed-b2e1f1611aa6@gmail.com
d9d7fe68d3 made use of an existing wait event when sending data from the
apply worker, but we should have invented a new wait event since this is a
new place to wait.
This patch corrects the mistake by using a new wait event
"LogicalApplySendData".
Author: Hou Zhijie
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CA+TgmobWzbr9H3yN3dLVckviEZKemPwd+XyCFKEgyZQZhgP66Q@mail.gmail.com
Support for regexps in database and role entries for pg_hba.conf has
been added in 8fea8683, and efb6f4a has extended the support of pg-user in
pg_ident.conf, but both of them missed a short description of
the new patterns supported in their respective sample files.
This commit closes the gap, by providing a short description of all the
new features supported for each entry type.
Reported-by: Pavel Luzanov
Reviewed-by: Jelte Fennema, Pavel Luzanov
Discussion: https://postgr.es/m/e495112d-8741-e651-64a2-ecb5728f1a56@postgrespro.ru
On MERGE / WHEN MATCHED DELETE it's not possible to get cross-partition
updates, so we don't initialize cpUpdateRetrySlot; however, the code was
not careful to ignore the value in that case. Make it do so.
Backpatch to 15.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/17792-0f89452029662c36@postgresql.org
force_parallel_mode is meant to be used to allow us to exercise the
parallel query infrastructure to ensure that it's working as we expect.
It seems some users think this GUC is for forcing the query planner into
picking a parallel plan regardless of the costs. A quick look at the
documentation would have made them realize that they were wrong, but the
GUC is likely too conveniently named, which evidently often leads users
to expect that it forces the planner into usefully parallelizing
queries.
Here we rename the GUC to something which casual users are less likely to
mistakenly think is what they need to make their query run more quickly.
For now, the old name can still be used. We'll revisit whether the old-name
mapping can be removed once the buildfarm configs have all been updated.
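As a usage sketch, assuming the new name is debug_parallel_query (the old
spelling continues to work for now):

    -- Sketch: exercise the parallel-query infrastructure for testing
    -- purposes only; this does not make queries run faster.
    SET debug_parallel_query = regress;  -- also accepts on / off
    EXPLAIN (COSTS OFF) SELECT count(*) FROM pg_class;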
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvrsOi92_uA7PEaHZMH-S4Xv+MGhQWA+GrP8b1kjpS1HjQ@mail.gmail.com
OpenSSL 1.1.1 and newer versions have added support for RSA-PSS
certificates, which requires the use of a specific routine in OpenSSL to
determine which hash function to use when computing the certificate hash
for channel binding in SCRAM-SHA-256. X509_get_signature_nid(), the
routine the channel binding code originally relied on, is not able to
determine which hash algorithm to use for such certificates. However,
X509_get_signature_info(), new to OpenSSL 1.1.1, is able to do so. This
commit switches the channel binding logic to rely on
X509_get_signature_info() instead of X509_get_signature_nid() when
building with 1.1.1 or newer.
The error could have been triggered on the client or the server, hence
libpq and the backend need to have their related code paths patched.
Note that attempting to load an RSA-PSS certificate with OpenSSL 1.1.0
or older leads to a failure due to an unsupported algorithm.
The discovery of X509_get_signature_info() comes from Jacob, the tests
were written by Heikki (with a few tweaks from me), and I bundled the
whole together and added the bits needed for MSVC and meson.
This issue has existed for as long as channel binding has, so backpatch
all the way down. Some tests are added in 15~, triggered when compiling
with OpenSSL 1.1.1 or newer, where the certificate and key files can
easily be generated for RSA-PSS.
Reported-by: Gunnar "Nick" Bluth
Author: Jacob Champion, Heikki Linnakangas
Discussion: https://postgr.es/m/17760-b6c61e752ec07060@postgresql.org
Backpatch-through: 11
Previously, make_etags always ran make_ctags -e. However, non-Exuberant
ctags (as found on Mac and some other platforms, including old Linux)
does not support the -e option, so ctags failed. To avoid the failure,
change make_ctags so that if non-Exuberant ctags is in use and the -e
option is requested, the etags command is run instead. If the etags
command does not exist, make_ctags will fail.
Also refactor make_ctags and tweak make_etags to emit a proper usage
message.
Author: Fujii Masao
Reviewed-by: Tatsuo Ishii
Discussion: https://www.postgresql.org/message-id/369c13b9-8b0f-d6f9-58fc-61258ec8f713%40oss.nttdata.com
In commit b78f6264e I opined that it was "too risky" to delete a
relation's RelOptInfo from the planner's data structures when we have
realized that we don't need to join to it; so instead we just marked
it as a dead relation. In hindsight that judgment seems flawed: any
subsequent access to such a dead relation is arguably a bug in
itself, so leaving the RelOptInfo present just helps to mask bugs.
Let's delete it instead, allowing removal of the whole notion of a
"dead relation". So far as the regression tests can find, this
requires no other code changes, except for one Assert in equivclass.c
that was very dubiously not complaining about access to a dead rel.
Discussion: https://postgr.es/m/229905.1676062220@sss.pgh.pa.us
Late in the development of commit 2489d76c4, I (tgl) incorrectly
concluded that the new function have_unsafe_outer_join_ref couldn't
ever reach its inner loop. That should be the case if the inner
rel's parameterization is based on just one Var, but it could be
based on Vars from several relations, and then not only is the
inner loop reachable but it's wrongly coded.
Despite those errors, it still appears that the whole thing is
redundant given previous join_is_legal checks, so let's arrange
to only run it in assert-enabled builds.
Diagnosis and patch by Richard Guo, per fuzz testing by Justin Pryzby.
Discussion: https://postgr.es/m/20230212235823.GW1653@telsasoft.com
Now that we have the sources for pg_bsd_indent in our code base these
are redundant.
It is now required to provide a list of files or directories to pgindent,
either by using --commit or on the command line. The equivalent of
previously running pgindent with no parameters is now `pgindent .`
Some extra checks are also added: duplicate files in the file list are
skipped, and there is a warning if no files are specified.
If the --commit option is used, the script now chdir's to the source
root, as git always reports files relative to that. (Fixes a gripe from
Justin Pryzby)
Reviewed by Tom Lane
Discussion: https://postgr.es/m/842819.1676219054@sss.pgh.pa.us
In commit ad89a5d115, we added an unhelpful 'ON' that doesn't match
the input syntax. This was discovered while adding code to support
DDL in logical replication.
No backpatch because of the change of behavior, however improbable it
may be that somebody is depending on this.
Author: Zheng Li <zhengli10@gmail.com>
Discussion: https://postgr.es/m/CAAD30UKg8rXeGM8Oy_MAmxKBL_K5DiHXdeNF=hUefcu1C_6VfQ@mail.gmail.com
The logic in this area was recently changed in 7da51590e; however, in that
commit I neglected to consider that the conditions under which we should
pfree the old Datum needed updating too. This could result in trying to
pfree a NULL value, as was demonstrated by Alexander Lakhin.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/4103db46-d888-6d1d-e88d-87c21ed99472@gmail.com
Here we fix a faulty "if" condition which failed to correctly handle two
or more consecutive NULL transition values when checking if the new value
is DISTINCT from the old value for presorted aggregates. Given a suitably
non-strict aggregate transition function, a byref aggregate could cause a
crash due to calling the type's equality function and passing along a
(Datum) 0 value to test for equality; the equality function would then try
to dereference that 0 Datum and segfault. For byval types, there'd have
been no crash and the equality function would have seen that the two 0
Datums matched, which (only by chance) meant the calling code would have
worked correctly.
Here we ensure that we only call the equality function when neither of
the input values is NULL.
This code is all new as of 1349d2790, so no backpatch needed.
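A query of roughly this shape could hit the problem (illustrative, not the
reported case; it assumes array_agg's non-strict transition function and
a by-reference input type such as text):

    -- Illustrative sketch: DISTINCT presorting places the two NULL inputs
    -- next to each other, so the distinctness check compares NULL with NULL.
    SELECT array_agg(DISTINCT v)
      FROM (VALUES ('a'::text), (NULL), (NULL)) AS t(v);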
Reported-by: Fujii Masao
Discussion: https://postgr.es/m/860c6d6f-a3c5-3ae9-9da2-827177bede06@oss.nttdata.com
When an aggregate function is used as a WindowFunc and a tuple transitions
out of the window frame, we ordinarily try to make use of the aggregate
function's inverse transition function to "unaggregate" the exiting tuple.
This optimization is disabled for various cases, including when the
aggregate contains a volatile function. In such a case we'd be unable to
ensure that the transition value was calculated to the same value during
transitions and inverse transitions. Unfortunately, we did this check by
calling contain_volatile_functions(), which does not recursively search
SubPlans for volatile functions. If the aggregate function's arguments or
its FILTER clause contained a subplan with volatile functions then we'd
fail to notice this.
Here we fix this by just disabling the optimization when the WindowFunc
contains any subplans. Volatile functions are not the only reason that a
subplan may have nonrepeatable results.
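A hedged illustration of the affected query shape (made-up names): the
correlated sub-select keeps the volatile random() call inside a SubPlan
within the FILTER clause, where contain_volatile_functions() alone would
not see it:

    -- Illustrative only: the volatile function is hidden inside a subplan
    -- attached to the window aggregate's FILTER clause.
    SELECT sum(amount) FILTER (WHERE amount > (SELECT random() * p.limit_amt))
             OVER (ORDER BY paid_at ROWS BETWEEN 9 PRECEDING AND CURRENT ROW)
      FROM payments AS p;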
Bug: #17777
Reported-by: Anban Company
Discussion: https://postgr.es/m/17777-860b739b6efde977%40postgresql.org
Reviewed-by: Tom Lane
Backpatch-through: 11
This commit removes most of the Plan and Path nodes, which should never
be included in the query jumbling because we ignore these in Query
nodes. This is facilitated by making no_query_jumble an inherited
attribute, like no_copy, no_equal and no_read: it now applies when the
supertype of a node is marked with it.
RawStmt is not used in parsed queries, so it can be removed from the
query jumbling. A couple of nodes defined in pathnodes.h, plannodes.h
and primnodes.h with NodeTag as supertype need to be marked
individually.
Forcing the execution of the query jumbling code with compute_query_id =
auto while pg_stat_statements is loaded brings the code coverage of
queryjumblefuncs.funcs.c to 95.6%.
The core code does not yet include a way to force the execution of the
query jumbling outside pg_stat_statements, so the numbers mentioned above
will not be reflected in the default coverage report with just what is
done in this commit.
Reported-by: Tom Lane
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/3344827.1675809127@sss.pgh.pa.us
Commit e39f99046 moved some code up closer to the start of
DecodeInterval(), without noticing that it had been implicitly
relying on previous checks to reject the case of empty input.
Given empty input, we'd now dereference a pointer that hadn't been
set, possibly leading to a core dump. (But if we fail to provoke
a SIGSEGV, nothing bad happens, and the expected syntax error is
thrown a bit later.)
Per bug #17788 from Alexander Lakhin. Back-patch to v15 where
the fault was introduced.
Discussion: https://postgr.es/m/17788-dabac9f98f7eafd5@postgresql.org
Update the Makefile and build directions for in-tree build,
and add Meson build infrastructure. Also convert the ad-hoc
test target into a TAP test.
Currently, the Make build system will not build pg_bsd_indent
by default, while the Meson system will. Both will test it
during "make check-world" or "ninja test". Neither will install
it automatically. (We might change some of these decisions later.)
Also fix a few portability nits noted during early testing.
Also, exclude pg_bsd_indent from pgindent's purview; at least for
now, we'll leave it formatted similarly to the FreeBSD original.
Tom Lane and Andres Freund
Discussion: https://postgr.es/m/3935719.1675967430@sss.pgh.pa.us
Discussion: https://postgr.es/m/20200812223409.6di3y2qsnvynao7a@alap3.anarazel.de
An upcoming test needs to use a tablespace. Historically, we wanted
tablespace creation to be done in a dedicated file, so it was easy to
disable when testing replication. But that is no longer necessary, thanks
to allow_in_place_tablespaces.
Create the regress_tblspace tablespace in test_setup. Move the tablespace
test to the end of the parallel schedule, so other tests can use it.
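For context, an in-place tablespace of the kind the test relies on can be
created roughly like this (sketch; the regression test's exact setup may
differ):

    -- Sketch: with allow_in_place_tablespaces enabled, an empty LOCATION
    -- creates the tablespace directory inside pg_tblspc/, avoiding absolute
    -- paths that would collide between a primary and standby on one host.
    SET allow_in_place_tablespaces = true;
    CREATE TABLESPACE regress_tblspace LOCATION '';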
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
Builds on 28e626bde0 and f30d62c2fc. See the former for motivation.
Each row of the view shows IO operations for a particular combination of
backend type, IO target object, and IO context (e.g. a client backend's
operations on permanent relations in shared buffers), and each column in
the view is the total number of IO operations done (e.g. writes). So a
cell in the view would
be, for example, the number of blocks of relation data written from shared
buffers by client backends since the last stats reset.
In anticipation of tracking WAL IO and non-block-oriented IO (such as
temporary file IO), the "op_bytes" column specifies the unit of the "reads",
"writes", and "extends" columns for a given row.
Rows for combinations of backend type, target object and context that can
never occur are omitted entirely. For example, the checkpointer will never
operate on temporary relations.
Similarly, if a particular IO operation never occurs for such a
combination, that operation's cell will be null, to distinguish it from 0
observed IO operations. For example, the bgwriter should not perform reads.
Note that some of the cells in the view are redundant with fields in
pg_stat_bgwriter (e.g. buffers_backend). For now, these have been kept for
backwards compatibility.
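As a hedged example of reading the view (column names as described above;
consult the catalog definition for the exact spellings):

    -- Example: block-level IO done by client backends since the last stats
    -- reset; op_bytes gives the unit of the reads/writes/extends counters.
    SELECT backend_type, reads, writes, extends, op_bytes
      FROM pg_stat_io
     WHERE backend_type = 'client backend';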
Bumps catversion.
Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Samay Sharma <smilingsamay@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
analyzejoins.c took care to clean out removed relids from the
clause_relids and required_relids of RestrictInfos associated with
the doomed rel ... but it paid no attention to the fact that if such a
RestrictInfo contains an OR clause, there will be sub-RestrictInfos
containing similar fields.
I'm more than a bit surprised that this oversight hasn't caused
visible problems before. In any case, it's certainly broken now,
so add logic to clean out the sub-RestrictInfos recursively.
We might need to back-patch this someday.
Per bug #17786 from Robins Tharakan.
Discussion: https://postgr.es/m/17786-f1ea7fbdab97daec@postgresql.org
One of the add_nulling_relids calls in deconstruct_distribute_oj_quals
added an OJ relid to too few Vars, while the other added it to too
many. We should consider the syntactic structure, not min_left/righthand,
while deciding which Vars to decorate; and when considering pushing up a
lower outer join pursuant to transforming the second form of OJ identity 3
to the first form, we only want to decorate Vars coming from its LHS.
In a related bug, I realized that make_outerjoininfo was failing to
check a very basic property that's needed to apply OJ identity 3:
the syntactically-upper outer join clause can't refer to the lower
join's LHS. This didn't break the join order restriction logic,
but it led to setting bogus commute_xxx bits, possibly resulting
in bogus nullingrel markings in modified quals.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/CAMbWs497CmBruMx1SOjepWEz+T5NWa4scqbdE9v7ZzSXqH_gQw@mail.gmail.com
Discussion: https://postgr.es/m/CAEP4nAx9C5gXNBfEA0JBfz7B+5f1Bawt-RWQWyhev-wdps8BZA@mail.gmail.com
Commit 28e626bde0 introduced the infrastructure for tracking more detailed IO
statistics. This commit adds the actual collection of the new IO statistics
for relations and temporary relations. See aforementioned commit for goals and
high-level design.
The changes in this commit are fairly straightforward. The bulk of the
change is in passing sufficient information to the callsites of
pgstat_count_io_op().
A somewhat unsightly detail is that it currently is hard to find a better
place to count fsyncs than in md.c, whereas the other pgstat_count_io_op()
calls are in bufmgr.c/localbuf.c. As the number of fsyncs is tied to md.c
implementation details, it's not obvious there is a better answer.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune the background writer or checkpointer.
Similarly, pg_stat_database.blks_read cannot easily be used to tune
shared_buffers or to compute a cache hit ratio, as the use of buffer
access strategies will often prevent a large fraction of the read blocks
from ending up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of
buffer hits, would make that even easier, but were left out for now to
keep the scope of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
While removing the use of SHM_QUEUE from predicate.c, in 9600371764, I made
two mistakes in GetSafeSnapshotBlockingPids():
- Removed the check for output_size
- Previously, when the first loop didn't find a matching proc, sxact would be
NULL. But with naive use of dlist_foreach() it ends up as the value of the
last iteration.
The second issue is the cause of occasional failures in the deadlock-hard and
deadlock-soft isolation tests that we have been observing on CI. The issue was
very hard to reproduce, as it requires the transactions.sql regression test to
run at the same time as the deadlock-{hard,soft} isolation test.
I did not find other similar mistakes in 9600371764.
Discussion: https://postgr.es/m/20230208221145.bwzhancellclrgia@awork3.anarazel.de
A new --commit option will add all the files in a commit to the file
list. The option can be specified more than once.
Also, if a directory is given on the command line, all the files in that
directory tree will be added to the file list.
Per suggestions from Robert Haas
Reviewed by Jelte Fennema
Discussion: https://postgr.es/m/CA+TgmoY59Ksso81RNLArNxj0a7xaqV_F_u7gSMHbgdc2kG5Vpw@mail.gmail.com
The code I added in fee7b77b9 could misbehave if commute_above_r
contains multiple relids. While adding too many relids here is
probably harmless (pre-fee7b77b9, we did it all the time), it's
not very expensive to be accurate: we just have to intersect
commute_above_r with the join's relids.
Discussion: https://postgr.es/m/17781-c0405c8b3cd5e072@postgresql.org
No GUCs that use NO_SHOW_ALL are reported in pg_show_all_settings(),
hence trying to check combinations of flags related to it is pointless.
These queries have been introduced by d10e41d, so backpatch down to 15
to keep all the branches consistent. Equivalent checks based on
NO_SHOW_ALL could be added in check_GUC_init() when a GUC is initially
loaded, but this can be done only on HEAD.
Author: Nitin Jadhav
Discussion: https://postgr.es/m/CAMm1aWaYe0muu3ABo7iSAgK+OWDS9yNe8GGRYnCyeEpScYKa+g@mail.gmail.com
Backpatch-through: 15
DDL commands such as REFRESH MATERIALIZED VIEW that generate lots of
temporary data due to rewrite rules may not be processed by output plugins
(for example, pgoutput). So, we won't send keep-alive messages for a long
time while processing such commands, and that can cause the subscriber
side to time out. We previously fixed a similar case for large
transactions in commit f95d53eded, where the output plugin filters all or
most of the changes, but we missed handling such DDL.
We decided not to backpatch this, as it adds a new callback to an existing
exposed structure; moreover, users can increase wal_sender_timeout and
wal_receiver_timeout to avoid the problem.
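As a sketch of that workaround (example values; pick timeouts suited to
the workload):

    -- On the publisher:
    ALTER SYSTEM SET wal_sender_timeout = '10min';
    SELECT pg_reload_conf();
    -- On the subscriber:
    ALTER SYSTEM SET wal_receiver_timeout = '10min';
    SELECT pg_reload_conf();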
Author: Wang wei, Hou Zhijie
Reviewed-by: Peter Smith, Ashutosh Bapat, Shi yu, Amit Kapila
Discussion: https://postgr.es/m/OS3PR01MB6275478E5D29E4A563302D3D9E2B9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
The logic for when to add the current outer join's own relid
to the nullingrels sets of output Vars and PHVs was overly
complicated and underly correct. Not sure why I didn't think
of this before, but since what we want is marking per the
syntactic structure, we can just consult our records about
the syntactic structure, ie syn_righthand/syn_lefthand.
Also, tighten the rule about when to add the commute_above_r
bits, in hopes of eliminating some squishy reasoning. I do not
know of a reason to think that that's broken as-is, but this way
seems better.
Per bug #17781 from Robins Tharakan.
Discussion: https://postgr.es/m/17781-c0405c8b3cd5e072@postgresql.org
Commit c3382a3c3, which moved the implementation of PG_TEST_EXTRA
from src/test/Makefile into individual test scripts, broke the
directions given in the subdirectory README files about how to run
these tests by hand. Update. Also mention wal_consistency_checking
in recovery/README --- that omission isn't the fault of c3382a3c3,
but it's still an omission.
Currently, we reuse WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE in the
apply worker while sending data to the parallel apply worker via a shared
memory queue. This is not appropriate, as one won't be able to distinguish
whether the worker is waiting to send data or waiting for the state change.
The patch instead uses the wait event WAIT_EVENT_MQ_SEND, which is already
used in blocking mode while sending data via a shared memory queue.
Author: Hou Zhijie
Reviewed-by: Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57161C680B22E4C591628EE994DA9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Here we further simplify the code in heapgettup() and
heapgettup_pagemode() to make better use of the helper functions added in
the recent refactors in this area.
In passing, remove an unneeded cast added in 8ca6d49f6.
Author: Melanie Plageman
Reviewed-by: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
Like the implementations for node copy, write and read, this node
requires a custom implementation so that the query jumbling is able to
consider the correct value assigned to it, depending on its type (int,
float, bool, string, bitstring).
Based on a dump of pg_stat_statements from the regression database, the
query jumbling of the following queries would otherwise be confused:
- SET.
- COPY TO with SELECT queries.
- START TRANSACTION with different isolation levels.
- ALTER TABLE with default expressions.
- CREATE TABLE with partition bounds.
Note that there may be a long-term argument for tracking the location of
such nodes so that query strings holding them could be normalized,
but this is left as a separate discussion.
Oversight in 3db72eb.
Discussion: https://postgr.es/m/Y9+HuYslMAP6yyPb@paquier.xyz