All backends should have a BackendType to enable statistics reporting
per BackendType.
Add a new BackendType for standalone backends, B_STANDALONE_BACKEND (and
alphabetize the BackendTypes). Both the bootstrap backend and single
user mode backends will have BackendType B_STANDALONE_BACKEND.
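For illustration, a heavily abbreviated sketch of the resulting enum (the
real BackendType in miscadmin.h has more members; only the ordering idea
is shown):

    typedef enum BackendType
    {
        B_INVALID = 0,
        B_ARCHIVER,
        B_AUTOVAC_LAUNCHER,
        B_AUTOVAC_WORKER,
        B_BACKEND,
        B_STANDALONE_BACKEND,   /* bootstrap and single-user backends */
        B_STARTUP,
        B_WAL_WRITER,
    } BackendType;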
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com
Somewhere during the development of the patch, acquiring a lock during read
access to variable-numbered stats got lost. The missing lock acquisition won't
cause corruption, but can lead to reading torn values when accessing
stats. Add the missing lock acquisitions.
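The restored pattern looks roughly like this (illustrative names, not the
exact pgstat internals):

    /* Take the entry's lock in shared mode while copying out the counters,
     * so a concurrent writer cannot leave us with a torn read. */
    LWLockAcquire(&shent->lock, LW_SHARED);
    memcpy(&local_copy, &shent->stats, sizeof(local_copy));
    LWLockRelease(&shent->lock);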
Reported-by: Greg Stark <stark@mit.edu>
Reviewed-by: "Drouvot, Bertrand" <bdrouvot@amazon.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CAM-w4HMYkM_DkYhWtUGV+qE_rrBxKOzOF0+5faozxO3vXrc9wA@mail.gmail.com
Backpatch: 15-
Previously, membership of role A in role B could be recorded in the
catalog tables only once. This meant that a new grant of role A to
role B would overwrite the previous grant. For other object types, a
new grant of permission on an object - in this case role A - exists
alongside the existing grant, provided that the grantor is different.
Either grant can be revoked independently of the other, and
permissions remain so long as at least one grant remains. Make role
grants work similarly.
Previously, when granting membership in a role, the superuser could
specify any role whatsoever as the grantor, but for other object types,
the grantor of record must be either the owner of the object, or a
role that currently has privileges to perform a similar GRANT.
Implement the same scheme for role grants, treating the bootstrap
superuser as the role owner since roles do not have owners. This means
that attempting to revoke a grant, or admin option on a grant, can now
fail if there are dependent privileges, and that CASCADE can be used
to revoke these. It also means that you can't grant ADMIN OPTION on
a role back to a user who granted it directly or indirectly to you,
similar to how you can't give WITH GRANT OPTION on a privilege back
to a role which granted it directly or indirectly to you.
Previously, only the superuser could specify GRANTED BY with a user
other than the current user. Relax that rule to allow the grantor
to be any role whose privileges the current user possesses. This
doesn't improve compatibility with what we do for other object types,
where support for GRANTED BY is entirely vestigial, but it makes this
feature more usable and seems to make sense to change at the same time
we're changing related behaviors.
Along the way, fix "ALTER GROUP group_name ADD USER user_name" to
require the same privileges as "GRANT group_name TO user_name".
Previously, CREATEROLE privileges were sufficient for either, but
only the former form was permissible with ADMIN OPTION on the role.
Now, either CREATEROLE or ADMIN OPTION on the role suffices for
either spelling.
Patch by me, reviewed by Stephen Frost.
Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com
Remove four probes for members of sockaddr_storage. Keep only the probe
for sockaddr's sa_len, which is enough for our two remaining places that
know about _len fields:
1. ifaddr.c needs to know if sockaddr has sa_len to understand the
result of ioctl(SIOCGIFCONF). Only AIX is still using the relevant code
today, but it seems like a good idea to keep it compilable on Linux.
2. ip.c was testing for presence of ss_len to decide whether to fill in
sun_len in our getaddrinfo_unix() function. It's just as good to test
for sa_len. If you have one, you have them all.
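The surviving probe defines HAVE_STRUCT_SOCKADDR_SA_LEN, which gets used
roughly like this (a sketch of the ip.c usage):

    struct sockaddr_un unaddr;

    memset(&unaddr, 0, sizeof(unaddr));
    unaddr.sun_family = AF_UNIX;
    #ifdef HAVE_STRUCT_SOCKADDR_SA_LEN
    unaddr.sun_len = sizeof(unaddr);    /* BSD-style length field */
    #endif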
(The code in #2 isn't actually needed at all on several OSes I checked
since modern versions ignore sa_len on input to system calls. Proving
that's the case for all relevant OSes is left for another day, but that
wouldn't get rid of that last probe anyway if we still want it for #1.)
Discussion: https://postgr.es/m/CA%2BhUKGJJjF2AqdU_Aug5n2MAc1gr%3DGykNjVBZq%2Bd6Jrcp3Dyvg%40mail.gmail.com
As such, the current usage of & won't produce incorrect results, but it
would be better to use && to short-circuit the evaluation of the second
condition when it is not required.
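A generic illustration of the difference (hypothetical names):

    /* & evaluates both operands unconditionally: */
    if (is_active & needs_flush(lsn))   /* needs_flush() always runs */
        do_flush();

    /* && skips the second operand when the first is false: */
    if (is_active && needs_flush(lsn))  /* needs_flush() runs only if is_active */
        do_flush();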
Author: Ranier Vilela
Reviewed-by: Tom Lane, Bharath Rupireddy
Backpatch-through: 15, where it was introduced
Discussion: https://postgr.es/m/CAEudQApL8QcoYwQuutkWKY_h7gBY8F0Xs34YKfc7-G0i83K_pw@mail.gmail.com
The ecpg tests have their input directory in the build directory, as the tests
need to be built. Until now that required copying the expected/ directory to
the build directory in VPATH builds. To avoid needing to implement the same
for the meson build, add support for specifying the location of the expected
directory.
Now that that's not needed anymore, remove the copying of ecpg's expected
directory to the build directory in VPATH builds.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20220718202327.pspcqz5mwbi2yb7w@awork3.anarazel.de
Compiling with -Wshadow=compatible-local yields quite a few warnings about
local variables being shadowed by compatible local variables in an inner
scope. Of course, this is perfectly valid in C, but we have had bugs in
the past as a result of developers failing to notice this. af7d270dd is a
recent example.
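A contrived example of the pattern the flag warns about:

    static int
    shadow_example(int n)
    {
        int     total = 0;

        for (int i = 0; i < n; i++)
        {
            int     total = i;  /* shadows the outer "total";
                                 * -Wshadow=compatible-local warns here */

            (void) total;
        }
        return total;           /* still 0: the inner assignments went to
                                 * a different variable */
    }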
Here we do a cleanup of warnings we receive from -Wshadow=compatible-local
for code which is new to PostgreSQL 15. We've yet to have the discussion
about if we actually ever want to run that as a standard compilation flag.
We'll need to at least get the number of warnings down to something easier
to manage before we can realistically consider if we want this or not.
This commit is the first step towards reducing the warnings.
The changes being made here are all fairly trivial. Because of that, and
the fact that v15 is still in beta, this is being back-patched into 15.
It seems more risky not to do this as the risk of future bugs is increased
by the additional conflicts that this commit could cause for any future
bug fixes touching the same areas as this commit.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220817145434.GC26426%40telsasoft.com
Backpatch-through: 15
Consistently avoid trusting a sample of only one page at the point that
VACUUM determines a new reltuples for the target table (though only when
the table is larger than a single page). This is follow-up work to
commit 74388a1a, which added a heuristic to prevent reltuples from
becoming distorted by successive VACUUM operations that each scan only a
single heap page (which was itself more or less a bugfix for an issue in
commit 44fa8488, which simplified VACUUM's handling of scanned pages).
The original bugfix commit did not account for certain remaining cases
that were not affected by its "2% of total relpages" heuristic. This
happened with a relation small enough that just one of its pages
exceeded the 2% threshold, yet still big enough for VACUUM to deem
skipping most of its pages via the visibility map worthwhile. reltuples
could still become distorted over time with such a table, at least in
scenarios where the VACUUM command is run repeatedly and without the
table itself ever changing.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzk7d4m3oEbEWkWQKd+gz-eD_peBvdXVk1a_KBygXadFeg@mail.gmail.com
Backpatch: 15-, where the rules for scanned pages changed.
Initialize shared memory allocated for index stats to avoid a hard
crash. This was possible when parallel VACUUM became confused about the
current phase of index processing.
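The shape of the fix is plain zero-initialization of the shared area (a
sketch; the exact names in vacuumparallel.c differ):

    /* Zero the shared index-stats array right after allocating it from
     * the DSM segment, so no processing phase can read garbage. */
    shared_indstats = shm_toc_allocate(pcxt->toc, est_indstats_len);
    memset(shared_indstats, 0, est_indstats_len);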
Oversight in commit 8e1fae1938, which refactored parallel VACUUM.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220818133406.GL26426@telsasoft.com
Backpatch: 15-, the first version with the refactoring commit.
Previously, "GRANT foo TO bar" or "GRANT foo TO bar GRANTED BY baz"
would record the OID of the grantor in pg_auth_members.grantor, but
that role could later be dropped without modifying or removing the
pg_auth_members record. That's not great, because we typically try
to avoid dangling references in catalog data.
Now, a role grant depends on the grantor, and the grantor can't be
dropped without removing the grant or changing the grantor. "DROP
OWNED BY" will remove the grant, just as it does for other kinds of
privileges. "REASSIGN OWNED BY" will not, again just like what we do
in other cases involving privileges.
pg_auth_members now has an OID column, because that is needed in order
for dependencies to work. It also now has an index on the grantor
column, because otherwise dropping a role would require a sequential
scan of the entire table to see whether the role's OID is in use as
a grantor. That probably wouldn't be too large a problem in practice,
but it seems better to have an index just in case.
A follow-on patch is planned with the goal of more thoroughly
rationalizing the behavior of role grants. This patch is just trying
to do enough to make sure that the data we store in the catalogs is at
some basic level valid.
Patch by me, reviewed by Stephen Frost
Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com
The present implementations of adjust_appendrel_attrs_multilevel and
its sibling adjust_child_relids_multilevel are very messy, because
they work by reconstructing the relids of the child's immediate
parent and then seeing if that's bms_equal to the relids of the
target parent. Aside from being quite inefficient, this will not
work with planned future changes to make joinrels' relid sets
contain outer-join relids in addition to baserels.
The whole thing can be solved at a stroke by adding explicit parent
and top_parent links to child RelOptInfos, and making these functions
work with RelOptInfo pointers instead of relids. Doing that is
simpler for most callers, too.
In my original version of this patch, I got rid of
RelOptInfo.top_parent_relids on the grounds that it was now redundant.
However, that adds a lot of code churn in places that otherwise would
not need changing, and arguably the extra indirection needed to fetch
top_parent->relids in those places costs something. So this version
leaves that field in place.
Discussion: https://postgr.es/m/553080.1657481916@sss.pgh.pa.us
As written, if you use XLogBeginRead() to position an xlogreader at
the beginning of a WAL page and then try to read WAL, this assertion
will fail. However, the header comment for XLogBeginRead() claims
that positioning an xlogreader at the beginning of a page is valid,
and the code here is perfectly able to cope with it. It's only the
assertion that causes trouble. So relax it.
This is formally a bug in all supported branches, but as it doesn't
seem to have any consequences for current uses of the xlogreader
facility, no back-patch, at least for now.
Dilip Kumar and Robert Haas
Discussion: http://postgr.es/m/CA+TgmoaJSs2_7WHW2GzFYe9+zfPtxBKvT3GW47+x=ptUE=cULw@mail.gmail.com
When creating a partitioned index, DefineIndex tries to identify
any existing indexes on the partitions that match the partitioned
index, so that it can absorb those as child indexes instead of
building new ones. Part of the matching is to compare IndexInfo
structs --- but that wasn't done quite right. We're comparing
the IndexInfo built within DefineIndex itself to one made from
existing catalog contents by BuildIndexInfo. Notably, while
BuildIndexInfo will run index expressions and predicates through
expression preprocessing, that has not happened to DefineIndex's
struct. The result is failure to match and subsequent creation
of duplicate indexes.
The easiest and most bulletproof fix is to build a new IndexInfo
using BuildIndexInfo, thereby guaranteeing that the processing done
is identical.
While here, let's also extract the opfamily and collation data
from the new partitioned index, removing ad-hoc logic that
duplicated knowledge about how those are constructed.
Per report from Christophe Pettus. Back-patch to v11 where
we invented partitioned indexes.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
configure extracts TCL_SHLIB_LD_LIBS from tclConfig.sh, and puts the
value into Makefile.global, but then we never use it anywhere. It
looks like I removed the only usage in cd75f94da, but didn't notice
that it was the only usage. Might as well mop this up while we're
trying to get rid of unnecessary configure steps.
Discussion: https://postgr.es/m/2442359.1660835043@sss.pgh.pa.us
Commit 5579388d was confused about why gai_strerror() didn't work, and
used gai_strerrorA(). It turns out that we had explicitly undefined
Windows' own macro for that somewhere else. Get rid of all that, and
use the system headers' definition of gai_strerror() directly as
intended.
Discussion: https://postgr.es/m/CA+hUKGKErNfhmvb_H0UprEmp4LPzGN06yR2_0tYikjzB-2ECMw@mail.gmail.com
On BSD-family systems, header <sys/sockio.h> defines socket ioctl
numbers like SIOCGIFCONF. Only AIX is using those now, but it defines
them in <net/if.h> anyway.
Suppose some PostgreSQL hacker wants to test that AIX-only code path
on a more common development system by pretending not to have
getifaddrs(): it's enough to include <sys/ioctl.h>, at least on macOS,
FreeBSD and Linux, and we're already doing that.
We carried a special implementation of pg_foreach_ifaddr() using
Solaris's ioctl(SIOCGLIFCONF), but Solaris 11 and illumos adopted
getifaddrs() more than a decade ago, and we prefer to use that. Solaris
10 is EOL'd. Remove the dead code.
Adjust comment about which OSes have getifaddrs(), which also
incorrectly listed AIX. AIX is in fact the only Unix in the build farm
that *doesn't* have it today, so the implementation based on
ioctl(SIOCGIFCONF) (note, no 'L') is still live. All the others have
had it for at least one but mostly two decades.
The last-stop fallback at the bottom of the file is dead code in
practice, but it's hard to justify removing it because the better
options are all non-standard.
Discussion: https://postgr.es/m/CA+hUKGKErNfhmvb_H0UprEmp4LPzGN06yR2_0tYikjzB-2ECMw@mail.gmail.com
The table column that stores this is of type oid, but is actually limited
to uint16 and has a different path for creating new values. Some of
the documentation already referred to it as an ID, so let's standardize
on that.
While at it, most format strings already use %u, so for consistency
change the remaining stragglers that use %d.
Per suggestions from Tom Lane and Justin Pryzby
Discussion: https://www.postgresql.org/message-id/3437166.1659620465%40sss.pgh.pa.us
Backpatch to v15
1349d2790 changed things to make the planner request that the
query_pathkeys contain pathkeys for any ORDER BY / DISTINCT aggregates.
Some code added prior to that commit in db0d67db2 made it so the order
that the pathkeys appear in the group_pathkeys could be changed so that
the GROUP BY could be executed in a more optimal order which minimized
sort comparisons. 1349d2790 had to make sure that the pathkeys for any
ORDER BY / DISTINCT aggregates remained at the end of the group_pathkeys
and weren't reordered, so some code was added to
add_paths_to_grouping_rel() to first strip off any pathkeys belonging to
ORDER BY / DISTINCT aggregates before passing to the function to optimize
the order of the group_pathkeys.
It seems I dropped the ball in 1349d2790 and mistakenly used the untouched
PlannerInfo.group_pathkeys to pass to get_useful_group_keys_orderings()
instead of the version that had the aggregate pathkeys removed. It was
only the code path that was handling creating paths for
partially_grouped_rel which made this mistake. In practice, we'll never
have any extra pathkeys to strip off when processing
partially_grouped_rel as that's only used when considering partial
paths, which we never do when there are ORDER BY / DISTINCT aggregates.
So this is just a hypothetical bug, not a live bug. We already have the
correct pathkeys determined, so it's of no extra cost to pass the
correct variable.
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20220817015755.GB26426@telsasoft.com
Make build_joinrel_tlist() responsible for adding PHVs that were
already computed in one or the other input relation, and therefore
change add_placeholders_to_joinrel() to only add PHVs that will be
newly computed in this joinrel's output. This makes the handling
of PHVs in build_joinrel_tlist() more like its handling of plain
Vars, which seems like a good thing on intelligibility grounds
and will simplify planned future changes. There is a purely
cosmetic side-effect that the order of entries in the joinrel's
tlist may change; but since it becomes more like the order of
entries in the input tlists, that's not bad.
The reason it wasn't done like this originally was the potential
cost of looking up PlaceHolderInfo entries to consult ph_needed.
Now that that's O(1) it shouldn't hurt.
Discussion: https://postgr.es/m/1405792.1660677844@sss.pgh.pa.us
Up to now, callers of find_placeholder_info() were required to pass
a flag indicating if it's OK to make a new PlaceHolderInfo. That'd
be fine if the callers had free choice, but they do not. Once we
begin deconstruct_jointree() it's no longer OK to make more PHIs;
while callers before that always want to create a PHI if it's not
there already. So there's no freedom of action, only the opportunity
to cause bugs by creating PHIs too late. Let's get rid of that in
favor of adding a state flag PlannerInfo.placeholdersFrozen, which
we can set at the point where it's no longer OK to make more PHIs.
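The guard this enables looks roughly like the following (a sketch of
find_placeholder_info()'s not-found path):

    if (phinfo == NULL)         /* no existing PHI for this PHV */
    {
        if (root->placeholdersFrozen)
            elog(ERROR, "too late to create a new PlaceHolderInfo");
        /* ... otherwise build and register the new PHI as before ... */
    }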
This patch also simplifies a couple of call sites that were using
complicated logic to avoid calling find_placeholder_info() as much
as possible. Now that that lookup is O(1) thanks to the previous
commit, the extra bitmap manipulations are probably a net negative.
Discussion: https://postgr.es/m/1405792.1660677844@sss.pgh.pa.us
Up to now we've just searched the placeholder_list when we want to
find the PlaceHolderInfo with a given ID. While there's no evidence
of that being a problem in the field, an upcoming patch will add
find_placeholder_info() calls in build_joinrel_tlist(), which seems
likely to make it more of an issue: a joinrel emitting lots of
PlaceHolderVars would incur O(N^2) cost, and we might be building
a lot of joinrels in complex queries. Hence, add an array that
can be indexed directly by phid to make the lookups constant-time.
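Schematically, the lookup goes from a list scan to direct indexing (a
sketch; the field names are assumptions):

    /* before: O(N) scan of root->placeholder_list for a matching phid */
    /* after: */
    if (phid < root->placeholder_array_size &&
        root->placeholder_array[phid] != NULL)
        return root->placeholder_array[phid];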
Discussion: https://postgr.es/m/1405792.1660677844@sss.pgh.pa.us
The standard way to check for list emptiness is to compare the
List pointer to NIL; our list code goes out of its way to ensure
that that is the only representation of an empty list. (An
acceptable alternative is a plain boolean test for non-null
pointer, but explicit mention of NIL is usually preferable.)
Various places didn't get that memo and expressed the condition
with list_length(), which might not be so bad except that there
were such a variety of ways to check it exactly: equal to zero,
less than or equal to zero, less than one, yadda yadda. In the
name of code readability, let's standardize all those spellings
as "list == NIL" or "list != NIL". (There's probably some
microscopic efficiency gain too, though few of these look to be
at all performance-critical.)
A very small number of cases were left as-is because they seemed
more consistent with other adjacent list_length tests that way.
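The preferred and discouraged spellings, side by side:

    List   *tlist = NIL;

    if (tlist == NIL)               /* preferred */
        elog(DEBUG1, "empty");

    /* discouraged variants this commit standardizes away: */
    if (list_length(tlist) == 0)    /* equal to zero */
        ;
    if (list_length(tlist) <= 0)    /* less than or equal to zero */
        ;
    if (list_length(tlist) < 1)     /* less than one */
        ;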
Peter Smith, with bikeshedding from a number of us
Discussion: https://postgr.es/m/CAHut+PtQYe+ENX5KrONMfugf0q6NHg4hR5dAhqEXEc2eefFeig@mail.gmail.com
This event can happen when using SET ACCESS METHOD, as the data files of
the materialized view need a full refresh, but this command tag was not
updated to reflect that. The documentation is updated to track this
behavior.
Author: Onder Kalaci
Discussion: https://postgr.es/m/CACawEhXwHN3X34FiwoYG8vXR-oyUdrp7qcfRWSzS+NPahS5gSw@mail.gmail.com
Backpatch-through: 15
The assert, introduced by 9f1cf97bb5, is intended to check if the prefix
is terminated by a \0 byte, but it has two flaws. Firstly, prefix_size
includes the \0 byte, so prefix[prefix_size] points to the byte after
the null byte. Secondly, the check ensures the byte is not equal to \0,
while it should be checking the opposite.
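With both flaws fixed, the assertion becomes (as follows from the
description above):

    /* prefix_size counts the terminating \0, so the last byte is at
     * prefix_size - 1 and must be \0: */
    Assert(prefix[prefix_size - 1] == '\0');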
Backpatch-through: 14
Discussion: https://postgr.es/m/b99b6101-2f14-3796-3dfa-4a6cd7d4326d@enterprisedb.com
The current publisher code checks if UPDATE or DELETE can be executed with
the replica identity of the table even if it's a partitioned table. We can
skip checking the replica identity for partitioned tables because the
operations are actually performed on the leaf partitions (not the
partitioned table).
Reported-by: Brad Nicholson
Author: Hou Zhijie
Reviewed-by: Peter Smith, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CAMMnM%3D8i5DohH%3DYKzV0_wYuYSYvuOJoL9F5nzXTc%2ByzsG1f6rg%40mail.gmail.com
There's a convention that externally-visible libpq functions should
check for a NULL PGconn pointer, and fail gracefully instead of
crashing. PQflush() and PQisnonblocking() didn't get that memo
though. Also add a similar check to PQdefaultSSLKeyPassHook_OpenSSL;
while it's not clear that ordinary usage could reach that with a
null conn pointer, it's cheap enough to check, so let's be consistent.
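A sketch of the convention as applied to PQflush() (the other two
functions are analogous; -1 matches its documented failure return):

    int
    PQflush(PGconn *conn)
    {
        if (!conn)
            return -1;      /* fail gracefully rather than dereference NULL */
        return pqFlush(conn);
    }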
Daniele Varrazzo and Tom Lane
Discussion: https://postgr.es/m/CA+mi_8Zm_mVVyW1iNFgyMd9Oh0Nv8-F+7Y3-BqwMgTMHuo_h2Q@mail.gmail.com
Since WRITE_NODE_FIELD() output always starts with a space, we don't
need to go out of our way to print another space right before it.
This change is only for visual appearance; the tokenizer on the
reading side would read it the same way (but there is no read support
for A_Expr at this time anyway).
When enlarging the work buffers of a VarStringSortSupport object,
varstrfastcmp_locale was careful to keep them in the ssup_cxt
memory context; but varstr_abbrev_convert just used palloc().
The latter creates a hazard that the buffers could be freed out
from under the VarStringSortSupport object, resulting in stomping
on whatever gets allocated in that memory later.
In practice, because we only use this code for ICU collations
(cf. 3df9c374e), the problem is confined to use of ICU collations.
I believe it may have been unreachable before the introduction
of incremental sort, too, as traditional sorting usually just
uses one context for the duration of the sort.
We could fix this by making the broken stanzas in varstr_abbrev_convert
match the non-broken ones in varstrfastcmp_locale. However, it seems
like a better idea to dodge the issue altogether by replacing the
pfree-and-allocate-anew coding with repalloc, which automatically
preserves the chunk's memory context. This fix does add a few cycles
because repalloc will copy the chunk's content, which the existing
coding assumes is useless. However, we don't expect that these buffer
enlargement operations are performance-critical. Besides that, it's
far from obvious that copying the buffer contents isn't required, since
these stanzas make no effort to mark the buffers invalid by resetting
last_returned, cache_blob, etc. That seems to be safe upon examination,
but it's fragile and could easily get broken in future, which wouldn't
get revealed in testing with short-to-moderate-size strings.
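Schematically, the change replaces the pfree/palloc pair like so (a
sketch; buf1 is one of the affected buffers):

    /* before (broken): the new buffer lands in CurrentMemoryContext,
     * whatever that happens to be */
    pfree(sss->buf1);
    sss->buf1 = palloc(newlen);

    /* after: repalloc keeps the chunk in its original memory context */
    sss->buf1 = repalloc(sss->buf1, newlen);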
Per bug #17584 from James Inform. Whether or not the issue is
reachable in the older branches, this code has been broken on its
own terms from its introduction, so patch all the way back.
Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org
SUSv3, all targeted Unixes and modern Windows have getaddrinfo() and
related interfaces. Drop the replacement implementation, and adjust
some headers slightly to make sure that the APIs are visible everywhere
using standard POSIX headers and names.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
It's possible to reach this case when work_mem is very small and tupsize
is (relatively) very large. In that case ExecChooseHashTableSize would
get an assertion failure, or with asserts off it'd compute nbuckets = 0,
which'd likely cause misbehavior later (I've not checked). To fix,
clamp the number of buckets to be at least 1.
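The fix amounts to a one-line clamp (a sketch of the idea, not the exact
expression used in ExecChooseHashTableSize):

    /* guarantee at least one bucket even when work_mem is tiny
     * and tupsize is huge */
    nbuckets = Max(nbuckets, 1);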
This is due to faulty conversion of old my_log2() coding in 28d936031.
Back-patch to v13, as that was.
Zhang Mingli
Discussion: https://postgr.es/m/beb64ca0-91e2-44ac-bf4a-7ea36275ec02@Spark
Since HAVE_UNIX_SOCKETS is now defined unconditionally, remove the macro
and drop a small amount of dead code.
The last known systems not to have them (as far as I know at least) were
QNX, which we de-supported years ago, and Windows, which now has them.
If a new OS ever shows up with the POSIX sockets API but without working
AF_UNIX, it'll presumably still be able to compile the code, and fail at
runtime with an unsupported address family error. We might want to
consider adding a HINT that you should turn off the option to use it if
your network stack doesn't support it at that point, but it doesn't seem
worth making the relevant code conditional at compile time.
Also adjust a couple of places in the docs and comments that referred to
builds without Unix-domain sockets, since there aren't any. Windows
still gets a special mention in those places, though, because we don't
try to use them by default there yet.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
Most parts of the parser can expect that the stack overflow check
in transformExprRecurse() will trigger before things get desperate.
However, transformFromClauseItem() can recurse directly to self
without having analyzed any expressions, so it's possible to drive
it to a stack-overrun crash. Add a check to prevent that.
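The added guard is the usual one used elsewhere in the parser:

    /* transformFromClauseItem() can recurse to itself without passing
     * through transformExprRecurse(), so check the stack explicitly: */
    check_stack_depth();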
Per bug #17583 from Egor Chindyaskin. Back-patch to all supported
branches.
Richard Guo
Discussion: https://postgr.es/m/17583-33be55b9f981f75c@postgresql.org
Assume that we can use LWARX hint flags and the LWSYNC instruction
on any PPC machine. The check on the assembler's behavior was only
needed for Apple's old assembler, which is no longer of interest
now that we've de-supported all PPC-era versions of macOS (thanks
to them not having clock_gettime()). Also, given an up-to-date
assembler these instructions work even on Apple's old hardware.
It seems quite unlikely that anyone would be interested in running
current Postgres on PPC hardware that's so old as to not have
these instructions.
Hence, rip out associated configure test and manual configuration
options, and just use the modernized instructions all the time.
Also, update atomics/arch-ppc.h to use these instructions as well.
(It was already using LWSYNC unconditionally in another place,
providing further proof that nobody is using PG on hardware old
enough to have a problem with that.)
Discussion: https://postgr.es/m/166622.1660323391@sss.pgh.pa.us
<sys/resource.h> is in SUSv2 and is on all targeted Unix systems. We
have a replacement for getrusage() on Windows, so let's just move its
declarations into src/include/port/win32/sys/resource.h so that we can
use a standard-looking #include. Also remove an obsolete reference to
CLK_TCK. Also rename src/port/getrusage.c to win32getrusage.c,
following the convention for Windows-only fallback code.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
<sys/un.h> is in SUSv3 and every targeted Unix has it. Some Windows
tool chains may still lack the approximately equivalent header
<afunix.h>, so for now we keep defining struct sockaddr_un ourselves on
that OS.
header src/include/port/win32/sys/un.h.
HAVE_UNIX_SOCKETS is now defined unconditionally. We might remove that
in a separate commit, pending discussion.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
<sys/uio.h> is in SUSv2, and all targeted Unix systems have it, so we
might as well drop the probe (in fact we never really needed this one).
It's where struct iovec is defined, and as a common extension, it's also
where non-standard preadv() and pwritev() are declared on systems that
have them.
We should also be able to assume that IOV_MAX is defined on Unix.
To spell out what our pg_iovec.h header does for the OSes in the build
farm as of today:
Windows: our own struct and functions
Solaris, Cygwin: <sys/uio.h>'s struct, our own functions
Every other Unix: <sys/uio.h>'s struct and functions
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
As of 897795240c, check constraints can
be declared invalid. But that patch didn't update _outConstraint() to
also show the relevant struct fields (which were only applicable to
foreign keys before that). This currently only affects debugging
output, so no impact in practice.
96ef3b8ff1 accidentally copied an inapplicable
comment from the float8_pass_by_value code to the
data_checksums code. Remove it.
87d3b35a1c changed pg_upgrade to
check the checksum version rather than just the Boolean presence of
checksums, but didn't change the field type in its ControlData struct
from bool. So this would not work correctly if there ever is a
checksum version larger than 1.
If an error occurs before we close the fake relcache entry, the
fake relcache entry will be destroyed, but the SmgrRelation will
survive until end of transaction. Its smgr_owner pointer ends up
pointing to already-freed memory.
The original reason for using a fake relcache entry here was to try
to avoid reusing an SMgrRelation across a relevant invalidation. To
avoid that problem, just call smgropen() again each time we need a
reference to it. Hopefully someday we will come up with a more
elegant approach, but accessing uninitialized memory is bad so let's
do this for now.
Dilip Kumar, reviewed by Andres Freund and Tom Lane. Report by
Justin Pryzby.
Discussion: http://postgr.es/m/20220802175043.GA13682@telsasoft.com
Discussion: http://postgr.es/m/CAFiTN-vSFeE6_W9z698XNtFROOA_nSqUXWqLcG0emob_kJ+dEQ@mail.gmail.com
The grammar added for MERGE inadvertently made it accepted syntax in
places that were not prepared to deal with it -- namely COPY and inside
CTEs; invoking these things with MERGE currently causes assertion
failures or weird misbehavior in non-assertion builds. Protect those
places by checking for it explicitly until somebody decides to implement
it.
Reported-by: Alexey Borzov <borz_off@cs.msu.su>
Discussion: https://postgr.es/m/17579-82482cd7b267b862@postgresql.org
The set of fields printed by _outConstraint() in the CONSTR_IDENTITY
case didn't match the set of fields actually used in that case. (The
code was probably uncarefully copied from the CONSTR_DEFAULT case.)
Fix that by using the right set of fields. Since there is no read
support for this node type, this is really just for debugging output
right now, so it doesn't affect anything important.
We now require that compilers support this, and it makes the code easier
to trace, so change it. I'm fixated on this particular struct because
I've had to navigate around it a number of times, but there are others
elsewhere that could use the same treatment.
Discussion: https://postgr.es/m/20220810140300.ixhbmm4svo5yypv6@alvherre.pgsql
Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION
records to know if the transaction has modified the catalog, and that
information is not serialized to the snapshot. Therefore, after the restart,
if the logical decoding decodes only the commit record of the transaction
that has actually modified a catalog, we will miss adding its XID to the
snapshot. Thus, we will end up looking at catalogs with the wrong
snapshot.
To fix this problem, this change adds the list of transaction IDs and
sub-transaction IDs, that have modified catalogs and are running during
snapshot serialization, to the serialized snapshot. After restart or
otherwise, when we restore from such a serialized snapshot, the
corresponding list is restored in memory. Now, when decoding a COMMIT
record, we check both the list and the ReorderBuffer to see if the
transaction has modified catalogs.
Since this adds additional information to the serialized snapshot, we
cannot backpatch it. For back branches, we took another approach.
We remember the last-running-xacts list of the decoded RUNNING_XACTS
record after restoring the previously serialized snapshot. Then, we mark
the transaction as containing catalog changes if it's in the list of
initial running transactions and its commit record has
XACT_XINFO_HAS_INVALS. This doesn't require any file format changes but
the transaction will end up being added to the snapshot even if it has
only relcache invalidations. But that won't be a problem since we use
the snapshot built during decoding only to read system catalogs.
This commit bumps SNAPBUILD_VERSION because of a change in SnapBuild.
Reported-by: Mike Oh
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi
Backpatch-through: 10
Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
As reported by Yura Sokolov, scanning the snapshot->xip array has
a noticeable impact on scalability when there are a large number of
concurrent writers. Use the optimized (on x86-64) routine from b6ef16756
to speed up searches through the [sub]xip arrays. One benchmark showed
a 5% increase in transaction throughput with 128 concurrent writers,
and a 50% increase in a pathological case of 1024 writers. While a hash
table would have scaled even better, it was ultimately rejected because
of concerns around code complexity and memory allocation. Credit to Andres
Freund for the idea to optimize linear search using SIMD instructions.
Nathan Bossart
Reviewed by: Andres Freund, John Naylor, Bharath Rupireddy, Masahiko Sawada
Discussion: https://postgr.es/m/20220713170950.GA3116318%40nathanxps13
fmgr_sql must make expanded-datum arguments read-only, because
it's possible that the function body will pass the argument to
more than one callee function. If one of those functions takes
the datum's R/W property as license to scribble on it, then later
callees will see an unexpected value, leading to wrong answers.
From a performance standpoint, it'd be nice to skip this in the
common case that the argument value is passed to only one callee.
However, detecting that seems fairly hard, and certainly not
something that I care to attempt in a back-patched bug fix.
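Schematically, the fix makes each argument read-only before execution (a
sketch; the real change lives in fmgr_sql's argument handling):

    /* Substitute a read-only pointer for any R/W expanded datum, so one
     * callee cannot scribble on a value other callees will also see. */
    if (!fcinfo->args[i].isnull)
        fcinfo->args[i].value =
            MakeExpandedObjectReadOnly(fcinfo->args[i].value,
                                       false,
                                       get_typlen(argtypes[i]));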
Per report from Adam Mackler. This has been broken since we
invented expanded datums, so back-patch to all supported branches.
Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email
Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
Use SSE2 intrinsics to speed up the search, where available. Otherwise,
use a simple 'for' loop. The motivation to add this now is to speed up
XidInMVCCSnapshot(), which is the reason only unsigned 32-bit integer
arrays are optimized. Other types are left for future work, as is the
extension of this technique to non-x86 platforms.
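A self-contained sketch of the technique (the in-tree routine is
pg_lfind32 in src/include/port/pg_lfind.h; names here are illustrative):

    #include <emmintrin.h>      /* SSE2 intrinsics */
    #include <stdbool.h>
    #include <stdint.h>

    /* Return true if key appears in base[0 .. nelem-1]. */
    static bool
    lfind32_sse2(uint32_t key, const uint32_t *base, uint32_t nelem)
    {
        uint32_t    i = 0;
        const __m128i keys = _mm_set1_epi32((int) key); /* key in all 4 lanes */

        for (; i + 4 <= nelem; i += 4)
        {
            __m128i vals = _mm_loadu_si128((const __m128i *) &base[i]);
            __m128i cmp = _mm_cmpeq_epi32(keys, vals);  /* lanewise compare */

            if (_mm_movemask_epi8(cmp) != 0)            /* any lane matched? */
                return true;
        }
        for (; i < nelem; i++)                          /* leftover tail */
            if (base[i] == key)
                return true;
        return false;
    }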
Nathan Bossart
Reviewed by: Andres Freund, Bharath Rupireddy, Masahiko Sawada
Discussion: https://postgr.es/m/20220713170950.GA3116318%40nathanxps13
This commit addresses a few things around GUCs:
- The TCP-related parameters (the four tcp_keepalives_* and
client_connection_check_interval) are listed in postgresql.conf.sample in
a subsection called "TCP settings" of "CONNECTIONS AND AUTHENTICATION",
but they did not have their own group name in guc.c.
- enable_group_by_reordering, stats_fetch_consistency and
recovery_prefetch had an inconsistent description, missing a dot at the
end.
- In postgresql.conf.sample, "Process title" should not have a section
of its own, but it should be a subsection of "REPORTING AND LOGGING".
This impacts the contents of pg_settings, which could be seen as a
compatibility break, so no backpatch is done. This is similar to the
cleanup done in a55a984.
Author: Shinya Kato
Discussion: https://postgr.es/m/5e0c9c608624eafbba910c344282cb14@oss.nttdata.com
Commit 623cc673 removed gettimeofday(), and commits 24c3ce8f and
495ed0ef removed support for very old Windows releases with low accuracy
timers, but references to those things were left behind in comments.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/295419.1659918447%40sss.pgh.pa.us
Commit 964d01ae9 was a few bricks shy of a load here: the script
checked whether gen_node_support.pl itself had been updated since it
was last run, but not whether any of its input files had been updated.
Fix that. While here, scrape the list of input files from the
Makefiles rather than having a duplicate copy, as we do for most
other lists of source files.
In passing, improve gen_node_support.pl's error report for an
incorrect file list.
Per gripe from Amit Kapila.
Discussion: https://postgr.es/m/CAA4eK1KQk4vP-3mTAz26h-PRUZaGu8Fc=q-ZKSajsAthH0A15w@mail.gmail.com
Per buildfarm, the output order of \dx+ isn't consistent across
locales. Apply NO_LOCALE to force C locale. There might be a
more localized way, but I'm not seeing it offhand, and anyway
there is nothing in this test module that particularly cares
about locales.
Security: CVE-2022-2625
Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension. This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership. Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.
Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension. (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)
For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.
Our thanks to Sven Klemm for reporting this problem.
Security: CVE-2022-2625
Previously we fell back to __FUNCTION__ and then NULL. As __func__ is in C99,
that shouldn't be necessary anymore.
Solution.pm defined HAVE_FUNCNAME__FUNCTION instead of
HAVE_FUNCNAME__FUNC (originating in 4164e6636e), as at some point in the past
MSVC only supported __FUNCTION__. Our minimum version supports __func__.
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220807012914.ydz73yte6j3coulo@awork3.anarazel.de
strtof() is in C99 and all targeted systems have it. We can remove the
configure probe and some dead code, but we still need replacement code
for a couple of systems that have known buggy implementations selected
via platform template.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/152683.1659830125%40sss.pgh.pa.us
Previously we bothered to forward-declare struct timezone, following man
pages on typical systems, but POSIX actually says the argument (which we
ignore anyway) is void *. Drop a line.
While here, add an assertion that nobody actually uses the tzp argument.
Previously we did extra work to select between Windows APIs needed on
older releases, but now we can just use the higher resolution function
directly.
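A sketch of the resulting implementation (the constant is the 1601-to-1970
epoch shift in 100 ns FILETIME ticks; details simplified):

    int
    gettimeofday(struct timeval *tp, void *tzp)
    {
        const unsigned long long epoch_shift = 116444736000000000ULL;
        FILETIME        ft;
        ULARGE_INTEGER  t;

        Assert(tzp == NULL);        /* the tz argument is always ignored */

        GetSystemTimePreciseAsFileTime(&ft);
        t.LowPart = ft.dwLowDateTime;
        t.HighPart = ft.dwHighDateTime;
        t.QuadPart -= epoch_shift;  /* rebase 1601 -> 1970 */
        tp->tv_sec = (long) (t.QuadPart / 10000000);
        tp->tv_usec = (long) ((t.QuadPart % 10000000) / 10);
        return 0;
    }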
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGKwRpvGfcfq2qNVAQS2Wg1B9eA9QRhAmVSyJt1zsCN2sQ%40mail.gmail.com
Buildfarm member jacana (MinGW) has been complaining that
get_iso_localename is defined but not used. This is evidently
fallout from the recent removal of VS2013 support in pg_locale.c.
Rearrange the #ifs so that get_iso_localename and its subroutine
search_locale_enum won't get built on MinGW.
I also noticed that a comment in get_iso_localename cross-
referenced a comment in IsoLocaleName that isn't there anymore.
Put back what I think is the referenced material.
RelationCopyStorageUsingBuffer thought it could skip copying
empty pages, but of course that does not work at all, because
subsequent blocks will be out of place.
Also fix it to acquire share lock on the source buffer. It *might*
be safe to not do that, but it's not very certain, and I don't think
this code deserves any benefit of the doubt.
Dilip Kumar, per complaint from me
Discussion: https://postgr.es/m/3679800.1659654066@sss.pgh.pa.us
There's no known supported system needing 1-argument gettimeofday()
support. The test for it was added a long time ago (92c6bf9775). Remove it.
Until now we tested whether a gettimeofday() fallback is needed when
targeting Windows. That led to the odd result that HAVE_GETTIMEOFDAY was
only defined when targeting MinGW (which has had gettimeofday() since at
least 2007). As the fallback is specific to MSVC, remove the configure
code and rename src/port/gettimeofday.c to src/port/win32gettimeofday.c.
While at it, also remove the definition of struct timezone; a forward
declaration of the struct is sufficient.
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220806000311.ywx65iuchvj4qn2k@awork3.anarazel.de
Commit 59be1c942a already tried to make
src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't
enough. Try to improve on that by making this use of a replication slot
more like others. Also, don't drop the slot.
Make a few other stylistic changes while at it. It's still quite slow,
which is another thing that we need to fix in this script.
Backpatch to all supported branches.
Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us
Now that lstat() reports junction points with S_IFLNK/S_ISLINK(), and
unlink() can unlink them, there is no need for conditional code for
Windows in a few places. That was expressed by testing for WIN32 or
S_ISLNK, which we can now constant-fold.
The coding around pgwin32_is_junction() was a bit suspect anyway, as we
never checked for errors, and we also know that errors can be spuriously
reported because of transient sharing violations on this OS. The
lstat()-based code has handling for that.
This also reverts 4fc6b6ee on master only. That was done because
lstat() didn't previously work for symlinks (junction points), but now
it does.
Tested-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CA%2BhUKGLfOOeyZpm5ByVcAt7x5Pn-%3DxGRNCvgiUPVVzjFLtnY0w%40mail.gmail.com
Junction points will be reported with S_ISLNK(x.st_mode), simulating
POSIX lstat(). stat() will follow pseudo-symlinks, like in POSIX (but
only one level before giving up, unlike in POSIX).
This completes a TODO left by commit bed90759fc.
Tested-by: Andrew Dunstan <andrew@dunslane.net> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGLfOOeyZpm5ByVcAt7x5Pn-%3DxGRNCvgiUPVVzjFLtnY0w%40mail.gmail.com
strtoll was backfilled with either __strtoll or strtoq on systems without
strtoll. The last such system on the buildfarm was an ancient HP-UX animal. We
don't support HP-UX anymore, so remove.
On other systems strtoll was present, but did not have a declaration. The last
known instance on the buildfarm was running an ancient OSX and shut down in
2019.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220804013546.h65najrzig764jar@awork3.anarazel.de
nbtree deduplication passes add tuples from the original/target page to
a temp page, merging as necessary. The temp page is copied back to the
target permanent page in the critical section. This is similar to the
approach taken by nbtree page splits.
Adjust comments that referred to updating the original page in-place as
tuples were merged. These were left over from earlier versions of the
deduplication patch that didn't yet use a temp page.
On closer inspection, mcv.c isn't as broken for ScalarArrayOpExpr
as I thought. The Var-on-right issue is real enough, but actually
it does cope fine with a NULL array constant --- I was misled by
an XXX comment suggesting it didn't. Undo that part of the code
change, and replace the XXX comment with something less misleading.
Since v14, the extended stats machinery will try to estimate for
otherwise-unsupported boolean expressions if they match an expression
available from an extended stats object. mcv.c did not get the memo
about this, and would spit up with "unknown clause type". Fortunately
the case is easy to handle, since we can expect the expression yields
boolean.
While here, replace some not-terribly-on-point assertions with
simpler runtime tests for lookup failure. That seems appropriate
so that we get an elog not a crash if we somehow get to the new
it-should-be-a-bool-expression code with a subexpression that
doesn't match any stats column.
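The pattern being substituted looks roughly like this (the function and
message names here are hypothetical):

    /* before: Assert(idx >= 0); -- crashes on corruption or logic bugs */
    idx = match_expression_to_stats_column(expr, stat);
    if (idx < 0)
        elog(ERROR, "expression not found in extended statistics object");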
Per report from Danny Shemesh. Thanks to Justin Pryzby for
preliminary investigation.
Discussion: https://postgr.es/m/CAFZC=QqD6=27wQPOW1pbRa98KPyuyn+7cL_Ay_Ck-roZV84vHg@mail.gmail.com
statext_is_compatible_clause_internal() checked that the arguments
of a ScalarArrayOpExpr are one Var and one Const, but it would allow
cases where the Const was on the left. Subsequent uses of the clause
are not expecting that and would suffer assertion failures or core
dumps. mcv.c also had not bothered to cope with the case of a NULL
array constant, which seems really unacceptably sloppy of somebody.
(Although our tools failed us there too, since AFAIK neither Coverity
nor any compiler warned of the obvious use-of-uninitialized-variable
condition.) It seems best to handle that by having
statext_is_compatible_clause_internal() reject it.
Noted while fixing bug #17570. Back-patch to v13 where the
extended stats code grew some awareness of ScalarArrayOpExpr.
Commit a4d75c86b improved the extended-stats logic to allow extended
stats to be collected on expressions not just bare Vars. To apply
such stats, we first verify that the user has permissions to read all
columns used in the stats. (If not, the query will likely fail at
runtime, but the planner ought not do so.) That had to get extended
to check permissions of columns appearing within such expressions,
but the code for that was completely wrong: it applied pull_varattnos
to the wrong pointer, leading to "unrecognized node type" failures.
Furthermore, although you couldn't get to this because of that bug,
it failed to account for the attnum offset applied by pull_varattnos.
This escaped recognition so far because the code in question is not
reached when the user has whole-table SELECT privilege (which is the
common case), and because only subexpressions not specially handled
by statext_is_compatible_clause_internal() are at risk.
I think a large part of the reason for this bug is under-documentation
of what statext_is_compatible_clause() is doing and what its arguments
are, so do some work on the comments to try to improve that.
Per bug #17570 from Alexander Kozhemyakin. Patch by Richard Guo;
comments and other cosmetic improvements by me. (Thanks also to
Japin Li for diagnosis.) Back-patch to v14 where the bug came in.
Discussion: https://postgr.es/m/17570-f2f2e0f4bccf0965@postgresql.org
Having additional triggers in a test table made the ORDER BY clauses in
old queries underspecified. Add another column there for stability.
Per sporadic buildfarm pink.
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db did is
not correct, because ATPrepCmd() can't distinguish between triggers that
may be cloned and those that may not, and so would wrongly try to recurse
for the latter category of triggers.
So this commit restores the code in EnableDisableTrigger() that
86f575948c had added to do the recursion, which would do it only for
triggers that may be cloned, that is, row-level triggers. This also
changes tablecmds.c such that ATExecCmd() is able to pass the value of
the ONLY flag down to EnableDisableTrigger() using its new 'recurse'
parameter.
This also fixes what seems like an oversight of 86f575948c that the
recursion to partition triggers would only occur if EnableDisableTrigger()
had actually changed the trigger. It is more apt to recurse to inspect
partition triggers even if the parent's trigger didn't need to be
changed: only then can we be certain that all descendants share the same
state afterwards.
Backpatch all the way back to 11, like bbb927b4db. Care is taken not
to break ABI compatibility (and to ensure that no catversion bump is needed).
Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
clock_gettime() is in SUSv2 and all targeted Unix systems have it.
Remove a chunk of fallback code for old Unix that is no longer reachable
on modern systems and has been untested since the retirement of build farm
animal prairiedog.
There is no need to retain a HAVE_CLOCK_GETTIME macro here, because it
is already used in a context with Unix and Windows code paths.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
Commit dd299df818, which made heap TID a tiebreaker nbtree index
column, introduced new rules on page space management to make suffix
truncation safe for v4+ indexes. New pivot tuples (generated by suffix
truncation during leaf page splits) sometimes require dedicated extra
space at the end of a new leaf page high key/pivot to store a heap TID
using a special representation (a representation only used in pivot
tuples).
The definition of "1/3 of a page" was reduced by a single MAXALIGN()
quantum for v4 indexes to make sure that the final enlarged pivot tuple
always fit, even with a split point whose firstright tuple happened to
already be at the "1/3 of a page" limit (limit for non-pivot tuples).
Internal pages (which only contain pivot tuples) stuck with the original
"1/3 of a page" definition. This scheme made it impossible for any page
split to fail to free enough space for its newitem, which is never okay.
The macro that determines whether non-pivot tuples exceed their "1/3 of
a leaf page" restriction was structured as if space was needed for all
three tuples during a leaf page split (the new pivot plus two very large
adjoining non-pivots that are separated by the split). This was subtly
wrong, in that it accidentally relied on implementation details that
could (at least in theory) change in the future.
To fix, make the macro subtract a single MAXALIGN() quantum, once. The
macro evaluates to exactly the same value as before in practice. But it
no longer depends on the current layout of nbtree's special area struct.
No backpatch, since this isn't a live bug.
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Robert Haas <robertmhaas@gmail.com>
Diagnosed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+Tgmoa7UBxivM7f6Ocx_qbq4=ky3uXc+WZNOBcVX+kvJvWOEA@mail.gmail.com
preadv() and pwritev() are not standardized by POSIX, but appeared in
NetBSD in 1999 and were adopted by at least OpenBSD, FreeBSD,
DragonFlyBSD, Linux, AIX, illumos and macOS. We don't use them much
yet, but an active proposal uses them heavily.
In 15, we had two replacement implementations for other OSes: one based
on lseek() plus the corresponding -v function (readv()/writev()) where
available, for true vector I/O, and the other based on a loop over the
corresponding p- function (pread()/pwrite()).
The former would be an obstacle to hypothetical future multi-threaded
code sharing file descriptors, while the latter would not, since commit
cf112c12. Furthermore, the number of targeted systems that could
benefit from the former's potential upside has dwindled to just one
niche OS, since macOS added the functions and we de-supported HP-UX.
That doesn't seem like a good trade-off.
Therefore, drop the lseek()-based variant, and also the pg_ prefix now
that the file position portability hazard is gone.
At the time of writing, the only systems in our build farm that lack
native preadv/pwritev and thus use fallback code are:
* Solaris (but not illumos)
* macOS before release 11.0
* Windows
With this commit, the above systems will now use the *same* fallback
code, the version that loops over pread()/pwrite(). Windows already
used that (though a later proposal may include true vector I/O for
Windows), so this decision really only affects Solaris, until it gets
around to adding these system calls.
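For reference, a sketch of the surviving loop-based fallback
(partial-write handling simplified):

    #include <sys/types.h>
    #include <sys/uio.h>
    #include <unistd.h>

    ssize_t
    pwritev_fallback(int fd, const struct iovec *iov, int iovcnt, off_t offset)
    {
        ssize_t     sum = 0;

        for (int i = 0; i < iovcnt; i++)
        {
            ssize_t     part = pwrite(fd, iov[i].iov_base,
                                      iov[i].iov_len, offset);

            if (part < 0)
                return (sum > 0) ? sum : -1;    /* report progress, else fail */
            sum += part;
            offset += part;
            if ((size_t) part < iov[i].iov_len)
                break;                          /* short write: stop early */
        }
        return sum;
    }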
Also remove some useless includes while here.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
This commit adjusts two log messages:
- When a field in pg_ident.conf is not populated, report the line of the
configuration file in an error context message instead of the main
entry.
- When parsing pg_ident.conf and finding an invalid regexp, add some
information about the line of the configuration file involved within an
error context message.
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
This is particularly useful when log_min_messages is set to FATAL, so
that one can know which file was not getting loaded when hba_file or
ident_file is set to a non-default value. If the default values of these
GUC parameters are used, the same reports are generated.
This commit changes the load (startup) and reload (SIGHUP) messages.
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
pread() and pwrite() are in SUSv2, and all targeted Unix systems have
them.
Previously, we defined pg_pread and pg_pwrite to emulate these function
with lseek() on old Unixen. The names with a pg_ prefix were a reminder
of a portability hazard: they might change the current file position.
That hazard is gone, so we can drop the prefixes.
Since the remaining replacement code is Windows-only, move it into
src/port/win32p{read,write}.c, and move the declarations into
src/include/port/win32_port.h.
No need for vestigial HAVE_PREAD, HAVE_PWRITE macros as they were only
used for declarations in port.h which have now moved into win32_port.h.
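A sketch of the Windows replacement's shape (error mapping elided;
assumes ssize_t is available, as it is via win32_port.h):

    #include <windows.h>
    #include <io.h>
    #include <errno.h>

    ssize_t
    pread(int fd, void *buf, size_t size, off_t offset)
    {
        OVERLAPPED  ov = {0};
        HANDLE      h = (HANDLE) _get_osfhandle(fd);
        DWORD       nread;

        ov.Offset = (DWORD) offset;
        ov.OffsetHigh = (DWORD) ((unsigned long long) offset >> 32);
        if (!ReadFile(h, buf, (DWORD) size, &nread, &ov))
        {
            if (GetLastError() == ERROR_HANDLE_EOF)
                return 0;           /* reading past EOF returns 0 bytes */
            errno = EINVAL;         /* the real code maps GetLastError() */
            return -1;
        }
        return nread;
    }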
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
setenv() and unsetenv() are in SUSv3, and all targeted Unix systems have
them. We still need special code for these on Windows, but that doesn't
require a configure probe.
This marks the first time we require a SUSv3 (POSIX.1-2001) facility
(rather than SUSv2). The replacement code removed here was not needed
on any targeted system or any known non-EOL'd Unix system, and was
therefore dead and untested.
No need for vestigial HAVE_SETENV and HAVE_UNSETENV macros, because we
provide a replacement for Windows, and we didn't previously test the
macros.
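A sketch of the Windows replacement, assuming the MSVC CRT's _putenv_s()
(where an empty value removes the variable):

    #include <stdlib.h>

    int
    setenv(const char *name, const char *value, int overwrite)
    {
        if (!overwrite && getenv(name) != NULL)
            return 0;                       /* keep the existing value */
        return (_putenv_s(name, value) == 0) ? 0 : -1;
    }

    int
    unsetenv(const char *name)
    {
        return (_putenv_s(name, "") == 0) ? 0 : -1;
    }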
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
poll() and <poll.h> are in SUSv2 and all targeted Unix systems have
them.
Retain HAVE_POLL and HAVE_POLL_H macros for readability. There's an
error in latch.c that is now unreachable (since we always have one of
WIN32 or HAVE_POLL defined), but that falls out of a decision to keep
using defined(HAVE_POLL) instead of !defined(WIN32) to guard the poll()
code.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
link() is in SUSv2 and all targeted Unix systems have it. We have
replacement code for Windows that doesn't require a configure probe.
Since only Windows needs it, rename src/port/link.c to win32link.c like
other similar things.
There is no need for a vestigial HAVE_LINK macro, because we expect all
Unix systems and, with our replacement function, Windows to have it, and
we didn't have any tests around link() usage.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
symlink() and readlink() are in SUSv2 and all targeted Unix systems have
them. We have partial emulation on Windows. Code that raised runtime
errors on systems without it has been dead for years, so we can remove
that and also references to such systems in the documentation.
Define HAVE_READLINK and HAVE_SYMLINK macros on Unix. Our Windows
replacement functions based on junction points can't be used for
relative paths or for non-directories, so the macros can be used to
check for full symlink support. The places that deal with tablespaces
can just use symlink functions without checking the macros. (If they
did check the macros, they'd need to provide an #else branch with a
runtime or compile time error, and it'd be dead code.)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
setsid() is in SUSv2 and all targeted Unix systems have it. Retain a
HAVE_SETSID macro, defined on Unix only. That's easier to understand
than !defined(WIN32), for the optional code it governs.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
shm_open() is in SUSv2 and all targeted Unix systems have it.
We retain a HAVE_SHM_OPEN macro, because it's clearer to readers than
something like !defined(WIN32).
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
getrlimit() is in SUSv2 and all targeted Unix systems have it.
Windows doesn't have it. We could just use #ifndef WIN32, but for a
little more explanation about why we're making things conditional, let's
retain the HAVE_GETRLIMIT macro. It's defined in port.h for Unix systems.
On systems that have it, it's not necessary to test for RLIMIT_CORE,
RLIMIT_STACK or RLIMIT_NOFILE macros, since SUSv2 requires those and all
targeted systems have them. Also remove references to a prehistoric
alternative spelling of RLIMIT_NOFILE, and coding that seemed to believe
that Cygwin didn't have it.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
dlopen() is in SUSv2 and all targeted Unix systems have it. We still
need replacement functions for Windows, but we don't need a configure
probe for that.
Since it's no longer needed by other operating systems, rename dlopen.c
to win32dlopen.c and move the declarations into win32_port.h.
Likewise, the macros RTLD_NOW and RTLD_GLOBAL now only need to be
defined on Windows, since all targeted Unix systems have 'em.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
The test is proving to be unreliable in the buildfarm, and we neither
agree on how best to fix it nor have time to do so before the upcoming
release. So for now, put things back to the way they were before commit
d498e052b4.
Discussion: http://postgr.es/m/3628089.1659640252@sss.pgh.pa.us
Adjusting this function was overlooked in commit 94aa7cc5f. The only
visible symptom (so far) is that INSERT ... ON CONFLICT could go into
an endless loop when inserting a null that has a conflict.
Richard Guo and Tom Lane, per bug #17558 from Andrew Kesper
Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
Ordinarily the functions called in this loop ought to have plenty
of CFIs themselves; but we've now seen a case where no such CFI is
reached, making the loop uninterruptible. Even though that's from
a recently-introduced bug, it seems prudent to install a CFI at
the loop level in all branches.
Per discussion of bug #17558 from Andrew Kesper (an actual fix for
that bug will follow).
Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
Remove the test case added by commit fac1b470a, which never actually
worked to expose the problem it claimed to test. Replace it with
a case that does expose the problem, and also covers the SRF-not-
at-the-top deficiency repaired in 1aa8dad41.
Richard Guo, with some editorialization by me
Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org
The use of "we" when referring to the active backend might be
misunderstood, so rephrase to make it clearer who is performing
the actions discussed in the comment.
Author: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Erikjan Rijkers <er@xs4all.nl>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAEG8a3LRSMqkvjiURiJoSi4aGWORpiXUmUfQQK5PaD6WfPzu3w@mail.gmail.com
SSE2 vector instructions are part of the spec for the 64-bit x86
architecture. Until now we have relied on the compiler to autovectorize
in some limited situations, but some useful coding idioms can only be
expressed explicitly via compiler intrinsics. To this end, add a header
that defines USE_SSE2 where available. While x86-only for now, we can
add other architectures in the future. This will also be the intended
place for helper functions that use vector operations.
Reviewed by Nathan Bossart and Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CAFBsxsE2G_H_5Wbw%2BNOPm70-BK4xxKf86-mRzY%3DL2sLoQqM%2B-Q%40mail.gmail.com
Commit fac1b470a thought we could check for set-returning functions
by testing only the top-level node in an expression tree. This is
wrong in itself, and to make matters worse it encouraged others
to make the same mistake, by exporting tlist.c's special-purpose
IS_SRF_CALL() as a widely-visible macro. I can't find any evidence
that anyone's taken the bait, but it was only a matter of time.
Use expression_returns_set() instead, and stuff the IS_SRF_CALL()
genie back in its bottle, this time with a warning label. I also
added a couple of cross-reference comments.
After a fair amount of fooling around, I've despaired of making
a robust test case that exposes the bug reliably, so no test case
here. (Note that the test case added by fac1b470a is itself
broken, in that it doesn't notice if you remove the code change.
The repro given by the bug submitter currently doesn't fail either
in v15 or HEAD, though I suspect that may indicate an unrelated bug.)
Per bug #17564 from Martijn van Oosterhout. Back-patch to v13,
as the faulty patch was.
Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org
The sto_using_cursor and sto_using_select tests were coded to exercise
every permutation of their test steps, but AFAICS there is no value in
exercising more than one. This matters because each permutation costs
about six seconds, thanks to the "pg_sleep(6)". Perhaps we could
reduce that, but the useless permutations seem worth getting rid of
in any case. (Note that sto_using_hash_index got it right already.)
While here, clean up some other sloppiness such as an unused table.
This doesn't make too much difference in interactive testing, since the
wasted time is typically masked by parallelization with other tests.
However, the buildfarm runs this as a serial step, which means we can
expect to shave ~40 seconds from every buildfarm run. That makes it
worth back-patching.
Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us
The TAP tests for logical replication in src/test/subscription are using
the following code in many places to make sure that the subscription is
synchronized with the publisher:
$node_publisher->wait_for_catchup('tap_sub');
$node_subscriber->poll_query_until('postgres',
qq[SELECT count(1) = 0
FROM pg_subscription_rel
WHERE srsubstate NOT IN ('r', 's')]);
The new function wait_for_subscription_sync() can be used to replace the
above code. This eliminates duplicated code and makes it easier to write
future tests.
Author: Masahiko Sawada
Reviewed by: Amit Kapila, Shi yu
Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com
Previously, a byte with the high bit set was just transmitted
as-is by charin() and charout(). This is problematic if the
database encoding is multibyte, because the result of charout()
won't be validly encoded, which breaks various stuff that
expects all text strings to be validly encoded. We've
previously decided to enforce encoding validity rather than try
to individually harden each place that might have a problem with
such strings, so it's time to do something about "char".
To fix, represent high-bit-set characters as \ooo (backslash
and three octal digits), following the ancient "escape" format
for bytea. charin() will continue to accept the old way as well,
though that is only reachable in single-byte encodings.
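For example, a sketch of the new behavior (the result shown in the
comment is assumed, not taken verbatim from the regression tests):
-- a "char" whose high bit is set is now displayed in octal escape form
SELECT '\377'::"char";
-- returns: \377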
Add some test cases just so there is coverage for this code.
We'll otherwise leave this question undocumented as it was before,
because we don't really want to encourage end-user use of "char".
For the moment, back-patch into v15 so that this change appears
in 15beta3. If there's not great pushback we should consider
absorbing this change into the older branches.
Discussion: https://postgr.es/m/2318797.1638558730@sss.pgh.pa.us
ORDER BY / DISTINCT aggregates have, since they were implemented in Postgres, been
executed by always performing a sort in nodeAgg.c to sort the tuples in
the current group into the correct order before calling the transition
function on the sorted tuples. This was not great as often there might be
an index that could have provided pre-sorted input and allowed the
transition functions to be called as the rows come in, rather than having
to store them in a tuplestore in order to sort them once all the tuples
for the group have arrived.
Here we change the planner so it requests a path with a sort order which
supports the largest number of ORDER BY / DISTINCT aggregate functions and
add new code to the executor to allow it to support the processing of
ORDER BY / DISTINCT aggregates where the tuples are already sorted in the
correct order.
Since there can be many ORDER BY / DISTINCT aggregates in any given query
level, it's very possible that we can't find an order that suits all of
these aggregates. The sort order that the planner chooses is simply the
one that suits the most aggregate functions. We take the most strictly
sorted variation of each order and see how many aggregate functions can
use that, then we try again with the order of the remaining aggregates to
see if another order would suit more aggregate functions. For example:
SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ...
would request the sort order to be {a, b} because {a} is a subset of the
sort order of {a,b}, but:
SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ...
would just pick a plan ordered by {a} (we give precedence to aggregates
which are earlier in the targetlist).
SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ...
would choose to order by {b} since two aggregates suit that vs just one
that requires input ordered by {a}.
Author: David Rowley
Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane
Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com
The select_outer_pathkeys_for_merge function made an attempt to build the
merge join pathkeys in the same order as query_pathkeys. This was done as
it may have led to no sort being required for an ORDER BY or GROUP BY
clause in the upper planner. However, this restriction seems overly
strict as it required that we match the query_pathkeys entirely or we
don't bother putting the merge join pathkeys in that order.
Here we relax this rule so that we use a prefix of the query_pathkeys,
provided that prefix matches all of the join quals. This may provide the
upper planner with partially sorted input which will allow the use of
incremental sorts instead of full sorts.
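A sketch of the kind of query that benefits (names invented), where the
join quals match only a prefix of the ORDER BY:
-- the merge join on t1.a = t2.a can produce output sorted on {a}, a
-- prefix of the query_pathkeys {a, b}, so the upper planner may use an
-- incremental sort on b rather than a full sort
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.a
ORDER BY t1.a, t1.b;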
Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvrtZu0PHVfDPFM4Yx3jNR2Wuwosv+T2zqa7LrhhBr2rRg@mail.gmail.com
Here we add code which detects when ExecFindPartition() continually finds
the same partition and add a caching layer to improve partition lookup
performance for such cases.
Both RANGE and LIST partitioned tables traditionally require a binary
search to find the partition matching a given set of Datums. This
binary search is commonly visible in profiles when bulk loading into a
partitioned table. Here we aim to reduce the overhead of bulk-loading
into partitioned tables for cases where many consecutive tuples belong to
the same partition and make the performance of this operation closer to
what it is with a traditional non-partitioned table.
When we find the same partition 16 times in a row, the next search will
result in us simply just checking if the current set of values belongs to
the last found partition. For LIST partitioning we record the index into
the PartitionBoundInfo's datum array. This allows us to check if the
current Datum is the same as the Datum that was last looked up. This
means if any given LIST partition supports storing multiple different
Datum values, then the caching only works when we find the same value as
we did the last time. For RANGE partitioning we simply check if the given
Datums are in the same range as the previously found partition.
We store the details of the cached partition in PartitionDesc (i.e.
relcache) so that the cached values are maintained over multiple
statements.
No caching is done for HASH partitions. The majority of the cost in HASH
partition lookups is in the hashing function(s), which would also have to
be executed if we were to try to do caching for HASH partitioned tables.
Since most of the cost is already incurred, we just don't bother. We also
don't do any caching for LIST partitions when we continually find the
values being looked up belong to the DEFAULT partition. We've no
corresponding index in the PartitionBoundInfo's datum array for this case.
We also don't cache when we find the given values match to a LIST
partitioned table's NULL partition. This is so cheap that there's no
point in doing any caching for this. We also don't cache for a RANGE
partitioned table's DEFAULT partition.
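A sketch of the targeted case (names invented), where consecutive rows
land in the same partition:
CREATE TABLE measurements (ts timestamptz, v int)
    PARTITION BY RANGE (ts);
CREATE TABLE measurements_2022_07 PARTITION OF measurements
    FOR VALUES FROM ('2022-07-01') TO ('2022-08-01');
-- after 16 consecutive hits, ExecFindPartition() first checks the
-- cached partition instead of binary-searching the bounds for each row
INSERT INTO measurements
SELECT '2022-07-01'::timestamptz + i * interval '1 second', i
FROM generate_series(0, 999999) AS i;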
There have been a number of different patches submitted to improve
partition lookups. Hou, Zhijie submitted a patch to detect when the value
belonging to the partition key column(s) were constant and added code to
cache the partition in that case. Amit Langote then implemented an idea
suggested by me to remember the last found partition and start to check if
the current values work for that partition. The final patch here was
written by me and was done by taking many of the ideas I liked from the
patches in the thread and redesigning other aspects.
Discussion: https://postgr.es/m/OS0PR01MB571649B27E912EA6CC4EEF03942D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Author: Amit Langote, Hou Zhijie, David Rowley
Reviewed-by: Amit Langote, Hou Zhijie
I thought commit fd96d14d9 had plugged all the holes of this sort,
but no, function RTEs could produce oversize tuples too, either
via long coldeflists or just from multiple functions in one RTE.
(I'm pretty sure the other variants of base RTEs aren't a problem,
because they ultimately refer to either a table or a sub-SELECT,
whose widths are enforced elsewhere. But we explicitly allow join
RTEs to be overwidth, as long as you don't try to form their
tuple result.)
Per further discussion of bug #17561. As before, patch all branches.
Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
errno was not reported correctly after attempting to clone a file,
leading to incorrect error reports. While scanning through the code, I
have not noticed any similar mistakes.
Error introduced in 3a769d8.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220731134135.GY15006@telsasoft.com
Backpatch-through: 12
This commit adds some test coverage for ee79647 (prevent BASE_BACKUP
from running in the middle of another base backup) and b24b2be
(BASE_BACKUP cancellation followed by pg_backup_start), caused by the
interactions of replication and SQL commands in a logical replication
connection in a WAL sender.
The second test uses a design close to what has been introduced in
0475a97f, where BASE_BACKUP is throttled to give enough room for a
cancellation, though this time we rely on psql with multiple -c
switches to keep a connection around for the second query.
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/Ys/NCI4Eo9300GnQ@paquier.xyz
In the short time this function has existed, it's already proven to be
a nontrivial maintenance burden, since it has to be updated whenever a
node tag is added or removed. Although in principle we could now
automate that, I see little justification for having such functionality
here at all. The function is only being applied to utility statements,
for which we already have infrastructure for obtaining string names.
Moreover, that infrastructure produces already-familiar-to-users names,
unlike nodetag_to_string().
So, remove this function and use the existing infrastructure instead.
That saves over a thousand lines of largely-unreachable code.
Back-patch to v15 where this code came in. Although it seems unlikely
that v15's nodetag list will change anymore, we might as well keep the
two branches looking and acting alike; otherwise back-patching any
test-results changes in this area will be painful.
Discussion: https://postgr.es/m/843818.1659218928@sss.pgh.pa.us
These two new options can be used to either process all tables in
specific schemas or to skip processing all tables in specific
schemas. This change also refactors the handling of invalid
combinations of command-line options to a new helper function.
Author: Gilles Darold
Reviewed-by: Justin Pryzby, Nathan Bossart and Michael Paquier.
Discussion: https://postgr.es/m/929fbf3c-24b8-d454-811f-1d5898ab3e91%40migops.com
The code tried to access ARR_DIMS(v)[0] and ARR_LBOUND(v)[0]
whether or not those values exist. This made the range check
on the "n" argument unstable --- it might or might not fail, and
if it did it would report garbage for the allowed upper limit.
These bogus accesses would probably annoy Valgrind, and if you
were very unlucky even lead to SIGSEGV.
Report and fix by Martin Kalcher. Back-patch to v14 where this
function was added.
Discussion: https://postgr.es/m/baaeb413-b8a8-4656-5757-ef347e5ec11f@aboutsource.net
These flavors of ALTER TABLE were already shaped to report the
ObjectAddress of the partition attached or detached, but this data was
not added to what is collected for event triggers. The tests of
test_ddl_deparse are updated to show the modification in the data
reported.
Author: Hou Zhijie
Reviewed-by: Álvaro Herrera, Amit Kapila, Hayato Kuroda, Michael Paquier
Discussion: https://postgr.es/m/OS0PR01MB571626984BD099DADF53F38394899@OS0PR01MB5716.jpnprd01.prod.outlook.com
This module is expanded to track the description of the objects changed
in the subcommands of ALTER TABLE by reworking the function
get_altertable_subcmdtypes() (now named get_altertable_subcmdinfo) used
in the event trigger of the test. It now returns a set of rows made of
(subcommand type, object description) instead of a text array with only
the information about the subcommand type.
The tests have been lacking a lot of the subcommands added to
AlterTableType over the years. All the missing subcommands are added,
and the code is now structured so that the addition of a new subcommand
is detected, by removing the default clause used in the switch for the
subcommand types.
The coverage of the module is increased from roughly 30% to 50%. More
could be done but this is already a nice improvement.
Author: Michael Paquier, Hou Zhijie
Reviewed-by: Álvaro Herrera, Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/OS0PR01MB571626984BD099DADF53F38394899@OS0PR01MB5716.jpnprd01.prod.outlook.com
Two callers of generate_useful_gather_paths were testing the wrong
thing when deciding whether to call that function: they checked for
being at the top of the current join subproblem, rather than being at
the actual top join. This'd result in failing to construct parallel
paths for a sub-join for which they might be useful.
While set_rel_pathlist() isn't actively broken, it seems best to
make its identical-in-intention test for this be like the other two.
This has been wrong all along, but given the lack of field complaints
I'm hesitant to back-patch into stable branches; we usually prefer
to avoid non-bug-fix changes in plan choices in minor releases.
It seems not too late for v15 though.
Richard Guo, reviewed by Antonin Houska and Tom Lane
Discussion: https://postgr.es/m/CAMbWs4-mH8Zf87-w+3P2J=nJB+5OyicO28ia9q_9o=Lamf_VHg@mail.gmail.com
It's allowed for an installation to remove postgresql.auto.conf,
so don't rely on that being present. Instead probe whether we can
read postmaster.pid. (If you've removed that, you broke the data
directory's multiple-postmaster interlock, not to mention pg_ctl.)
Per gripe from Michael Paquier.
Discussion: https://postgr.es/m/YuSZTsoBMObyY+vT@paquier.xyz
Instead of using command_ok() to run psql, use safe_psql(). wrasse
isn't happy, and it may be because of failure to pass -X to the psql
invocation, which safe_psql() will do automatically.
Since safe_psql() returns standard output instead of writing it to
a file, this requires some changes to the incantation for running
'diff'.
Test against the 'regression' database rather than 'postgres' so
we test more than just one table. That also means we need to record
the horizons later, after the test does "VACUUM FULL pg_largeobject".
Add an ORDER BY clause to the horizon query for stability.
Patch by me, reviewed by Tom Lane.
Discussion: http://postgr.es/m/CA+TgmoaGBbpzgu3=du1f9zDUbkfycO0y=_uWrLFy=KKEqXWeLQ@mail.gmail.com
The new test is from commit 9e4f914b5e.
With this setting, messages include SQL error numbers, so the pattern
being looked for needs to allow for that.
We must issue the TRUNCATE command first and update relfrozenxid
and relminmxid afterward; otherwise, TRUNCATE overwrites the
previously-set values.
Add a test case like I should have done the first time.
Per buildfarm report from TestUpgradeXversion.pm, by way of Tom
Lane.
There wasn't an especially nice way to read all of a file while
passing missing_ok = true. Add an additional overloaded variant
to support that use-case.
While here, refactor the C code to avoid a rats-nest of PG_NARGS
checks, instead handling the argument collection in the outer
wrapper functions. It's a bit longer this way, but far more
straightforward.
(Upon looking at the code coverage report for genfile.c, I was
impelled to also add a test case for pg_stat_file() -- tgl)
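Assuming the new variant simply takes the filename plus missing_ok,
usage might look like:
-- read a whole file, returning NULL instead of erroring out when the
-- file does not exist
SELECT pg_read_file('nonexistent.conf', true);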
Kyotaro Horiguchi
Discussion: https://postgr.es/m/20220607.160520.1984541900138970018.horikyota.ntt@gmail.com
A RowExpr with more than MaxTupleAttributeNumber columns would fail at
execution anyway, since we cannot form a tuple datum with more than that
many columns. While heap_form_tuple() has a check for too many columns,
it emerges that there are some intermediate bits of code that don't
check and can be driven to failure with sufficiently many columns.
Checking this at parse time seems like the most appropriate place to
install a defense, since we already check SELECT list length there.
While at it, make the SELECT-list-length error use the same errcode
(TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic
PROGRAM_LIMIT_EXCEEDED.
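A sketch of the now-rejected case (MaxTupleAttributeNumber is 1664, so
the query text is best generated; psql's \gexec shown as one way to run
it):
-- build "SELECT ROW(1,1,...,1)" with 1665 columns; with this fix the
-- generated statement fails cleanly at parse time with
-- ERRCODE_TOO_MANY_COLUMNS instead of crashing later
SELECT format('SELECT ROW(%s)', string_agg('1', ','))
FROM generate_series(1, 1665) \gexec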
Per bug #17561 from Egor Chindyaskin. The given test case crashes
in all supported branches (and probably a lot further back),
so patch all.
Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
The earlier commit used pg_class.relfilenode where it should have
used pg_class.oid. This could lead to emitting an UPDATE statement
into the dump that would update nothing (or the wrong thing) when
executed in the new cluster, resulting in relfrozenxid and
relminmxid being improperly carried forward for pg_largeobject.
Noticed by Dilip Kumar.
Discussion: http://postgr.es/m/CAFiTN-ty1Gzs6stk2vt9BJiq0m0hzf=aPnh3a-4Z3Tk5GzoENw@mail.gmail.com
On FreeBSD, the new test fails due to a WAL file being removed before
the standby has had the chance to copy it. Fix by adding a replication
slot to prevent the removal until after the standby has connected.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com
Commit 9a974cbcba arranged to preserve
the relfilenode of user tables across pg_upgrade, but failed to notice
that pg_upgrade treats pg_largeobject as a user table and thus it needs
the same treatment. Otherwise, large objects will appear to vanish
after a pg_upgrade.
Commit d498e052b4 fixed this problem
by teaching pg_dump to UPDATE pg_class.relfilenode for pg_largeobject
and its index. However, because an UPDATE on the catalog rows doesn't
change anything on disk, this can leave stray files behind in the new
cluster. They will normally be empty, but it's a little bit untidy.
Hence, this commit arranges to do the same thing using DDL. Specifically,
it makes TRUNCATE work for the pg_largeobject catalog when in
binary-upgrade mode, and it then uses that command in binary-upgrade
dumps as a way of setting pg_class.relfilenode for pg_largeobject and
its index. That way, the old files are removed from the new cluster.
Discussion: http://postgr.es/m/CA+TgmoYYMXGUJO5GZk1-MByJGu_bB8CbOL6GJQC8=Bzt6x6vDg@mail.gmail.com
In the initial data sort, if the bucket numbers are the same then
next sort on the hash value. Because index pages are kept in
hash value order, this gains a little speed by allowing the
eventual tuple insertions to be done sequentially, avoiding repeated
data movement within PageAddItem. This seems to be good for an overall
speedup of 5%-9%, depending on the incoming data.
Simon Riggs, reviewed by Amit Kapila
Discussion: https://postgr.es/m/CANbhV-FG-1ZNMBuwhUF7AxxJz3u5137dYL-o6hchK1V_dMw86g@mail.gmail.com
Commit b0a55e4329 missed a few places
where we are referring to the number used as a part of the relation
filename as an "OID". We now want to call that a "RelFileNumber".
Some of these places actually made it sound like the OID in question
is pg_class.oid rather than pg_class.relfilenode, which is especially
good to clean up.
Dilip Kumar with some editing by me.
Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records. Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:
CREATE DATABASE
DROP DATABASE
DROP TABLESPACE
If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.
A fix for this problem was already attempted in 49d9cfc68b, but it
was reverted because of design issues. This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency. Tablespaces
are created as real directories, and should be deleted
by later replay. CheckRecoveryConsistency ensures
they have disappeared.
The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING. Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Author: Paul Guo <paulguo@gmail.com>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions)
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions)
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
On Windows with MSVC, get_dirent_type() was recently made to return
DT_LNK for junction points by commit 9d3444dc, which fixed some
defective dirent.c code.
On Windows with Cygwin, get_dirent_type() already worked for symlinks,
as it does on POSIX systems, because Cygwin has its own fake symlinks
that behave like POSIX (on closer inspection, Cygwin's dirent has the
BSD d_type extension but it's probably always DT_UNKNOWN, so we fall
back to lstat(), which understands Cygwin symlinks with S_ISLNK()).
On Windows with MinGW/MSYS, we need extra code, because the MinGW
runtime has its own readdir() without d_type, and the lstat()-based
fallback has no knowledge of our convention for treating junctions as
symlinks.
Back-patch to 14, where get_dirent_type() landed.
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/b9ddf605-6b36-f90d-7c30-7b3e95c46276%40dunslane.net
The catalog contents haven't changed, but it's good to make clear
that initdb is required. Changing RELMAPPER_FILEMAGIC would be more
appropriate, but that doesn't actually produce a useful diagnostic,
so cheat by doing this instead.
Discussion: http://postgr.es/m/20220727171939.6ixixqcjt5riil2o@alvherre.pgsql
Commit d8cd0c6c95 introduced a file
rename that could fail on Windows, probably due to other backends
having an open file handle to the old file of the same name.
Re-arrange the locking slightly to prevent that, by making sure the
open() and close() run while we hold the lock.
Thomas Munro. I added an explanatory comment.
Discussion: https://postgr.es/m/CA%2BhUKGLZtCTgp4NTWV-wGbR2Nyag71%3DEfYTKjDKnk%2BfkhuFMHw%40mail.gmail.com
GetSubscriptionRelations() and GetSubscriptionNotReadyRelations() share
mostly the same code, which scans pg_subscription_rel and fetches all
the relations of a given subscription. The only difference is that the
second routine looks for all the relations not in a ready state. This
commit refactors the code to use a single routine, shaving a bit of
code.
Author: Vignesh C
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Michael Paquier, Peter
Smith
Discussion: https://postgr.es/m/CALDaNm0eW-9g4G_EzHebnFT5zZoasWCS_EzZQ5BgnLZny9S=pg@mail.gmail.com
This commit puts the implementation of Tuple sort variants into the
separate file tuplesortvariants.c. That gives better separation of the
code and demonstrates that a Tuple sort variant can be defined outside
of tuplesort.c.
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
The new TuplesortPublic data structure contains the definition of
sort-variant-specific interface methods and the part of the Tuple sort
operation state required by their implementations. This will let Tuple
sort variants be defined without knowledge of Tuplesortstate, that is,
without knowledge of the generic sort implementation's guts.
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
This commit moves some generic work out of the sort-variant-specific
functions. In particular, tuplesort_put*() no longer needs to decrease
available memory and switch to the sort context before calling
puttuple_common(), and writetup() no longer needs to free
SortTuple.tuple and increase available memory.
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
The abbreviation code is very similar across the tuplesort_put*()
functions. This commit unifies that code and puts it into
puttuple_common(). The tuplesort_put*() functions differ in the
abbreviation condition, so that condition has been added as an argument
to the puttuple_common() function.
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
This commit is preparation for moving the abbreviation logic into
puttuple_common(). The new removeabbrev function turns the datum1
representation of SortTuples from the abbreviated key back into the
first column value. It thereby encapsulates the differing part of the
abbreviation-handling code in the tuplesort_put*() functions, making
those functions similar.
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
It's currently unclear how we split functionality between the
Tuplesortstate.copytup() function and the tuplesort_put*() functions.
For instance, copytup_index() and copytup_datum() raise errors while
tuplesort_putindextuplevalues() and tuplesort_putdatum() do the work.
This commit removes Tuplesortstate.copytup() altogether, putting the
corresponding code into tuplesort_put*().
Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent
Reviewed-by: Andres Freund, John Naylor
XLogRecordBlockHeader, the header holding the information for the data
related to a block, tracks the length of the data appended to the WAL
record with data_length (uint16). This limitation in size was not
enforced by the public routine in charge of registering the data
assembled later to form the WAL record inserted, XLogRegisterBufData().
Used incorrectly, it could lead to the generation of records with some
of their data overflowed. This commit adds some safeguards to prevent
that for the block data, complaining immediately if one attempts to add
block information larger than UINT16_MAX to a record, which is the
limit implied by the internal logic.
Note that this also adjusts XLogRegisterData() and XLogRegisterBufData()
so that the length of the WAL record data given by the caller is
unsigned, matching what gets stored in XLogRecData->len.
Extracted from a larger patch by the same author. The original patch
includes more protections when assembling a record in full that will be
looked at separately later.
Author: Matthias van de Meent
Reviewed-by: Andres Freund, Heikki Linnakangas, Michael Paquier, David
Zhang
Discussion: https://postgr.es/m/CAEze2WgGiw+LZt+vHf8tWqB_6VxeLsMeoAuod0N=ij1q17n5pw@mail.gmail.com
As before, we start by prepending one underscore (truncating the
base name if necessary). But if there is a conflict, then instead of
prepending more and more underscores, append an underscore and some
digits, in much the same way that ChooseRelationName does. While
the previous logic could be driven to fail by creating a lot of
types with long names differing only near the end, this version seems
certain enough to eventually succeed that we can remove the failure
code path that was there before.
While at it, undo 6df7a9698's decision to split this code out of
makeArrayTypeName. That wasn't actually accomplishing anything,
because no other function was using it --- and it would have been
wrong to do so. The convention that a prefix "_" means an array,
not something else, is too ancient to mess with.
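For illustration (type name invented): creating a composite type still
produces the underscore-prefixed array type; only the collision
handling changes.
CREATE TYPE foo AS (a int);
SELECT typname FROM pg_type WHERE typname LIKE '%foo';
-- returns "foo" and "_foo"; when "_foo" is already taken (e.g. via
-- truncation of long names), the new logic appends an underscore and
-- digits rather than prepending ever more underscores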
Andrey Lepikhov and Dmitry Koval, reviewed by Masahiko Sawada and myself
Discussion: https://postgr.es/m/b84cd82c-cc67-198a-8b1c-60f44e1259ad@postgrespro.ru
The BoolGetDatum() call ended up in the wrong place. It should be
applied when we, err, want to convert a bool to a datum.
Thanks to Tom Lane for noticing this.
Discussion: http://postgr.es/m/2511599.1658861964@sss.pgh.pa.us
A bootstrap user who is not a superuser will still own many
important system objects, such as the pg_catalog schema, that
will likely allow that user to regain superuser status. Therefore,
allowing the superuser property to be removed from the bootstrap user
creates a false perception of security where none exists.
Although removing superuser from the bootstrap user is also a bad idea
and should be considered unsupported in all released versions, no
back-patch, as this is a behavior change.
Discussion: http://postgr.es/m/CA+TgmoZirCwArJms_fgvLBFrC6b=HdxmG7iAhv+kt_=NBA7tEw@mail.gmail.com
We have a few commands that "can't run in a transaction block",
meaning that if they complete their processing but then we fail
to COMMIT, we'll be left with inconsistent on-disk state.
However, the existing defenses for this are only watertight for
simple query protocol. In extended protocol, we didn't commit
until receiving a Sync message. Since the client is allowed to
issue another command instead of Sync, we're in trouble if that
command fails or is an explicit ROLLBACK. In any case, sitting
in an inconsistent state while waiting for a client message
that might not come seems pretty risky.
This case wasn't reachable via libpq before we introduced pipeline
mode, but it's always been an intended aspect of extended query
protocol, and likely there are other clients that could reach it
before.
To fix, set a flag in PreventInTransactionBlock that tells
exec_execute_message to force an immediate commit. This seems
to be the approach that does least damage to existing working
cases while still preventing the undesirable outcomes.
While here, add some documentation to protocol.sgml that explicitly
says how to use pipelining. That's latent in the existing docs if
you know what to look for, but it's better to spell it out; and it
provides a place to document this new behavior.
Per bug #17434 from Yugo Nagata. It's been wrong for ages,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
Presently, archive status files are durably renamed from .ready to
.done to indicate that a file has been archived. Persisting this
rename to disk accounts for a significant amount of the overhead
associated with archiving. While durably renaming the file
prevents re-archiving in most cases, archive commands and libraries
must already gracefully handle attempts to re-archive the last
archived file after a crash (e.g., a crash immediately after
archive_command exits but before the server renames the status
file).
This change reduces the amount of overhead associated with
archiving by using rename() instead of durable_rename() to rename
the archive status files. As a consequence, the server is more
likely to attempt to re-archive files after a crash, but as noted
above, archive commands and modules are already expected to handle
this. It is also possible that the server will attempt to re-
archive files that have been removed or recycled, but the archiver
already handles this, too.
Author: Nathan Bossart
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/20220222011948.GA3850532@nathanxps13
Since a2c8499, HbaFileName (default pg_hba.conf) was getting used
instead of IdentFileName (default pg_ident.conf) as the reference parent
file when parsing the contents of pg_ident.conf to feed
pg_ident_file_mappings, even though pg_ident.conf itself was correctly
opened. This had two consequences:
- On an I/O error when reading pg_ident.conf, the user would get an
ERROR message referring to pg_hba.conf and not pg_ident.conf.
- When reading an external file with a relative path using '@' in
pg_ident.conf, the directory used to look at the file to load would be
the base directory of pg_hba.conf rather than the one of pg_ident.conf,
leading to errors in pg_ident_file_mappings inconsistent with what gets
loaded at startup when pg_ident.conf and pg_hba.conf are located in
different directories.
This error only impacted the SQL view pg_ident_file_mappings, which uses
logic new to v15 to fill the view with the parsed information, not the
code paths loading these authentication files at startup.
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20220726050402.vsr6fmz7rsgpmdz3@jrouhaud
Backpatch-through: 15
This addresses a couple of bugs in the REINDEX grammar, introduced by
83011ce:
- A name was never specified for DATABASE/SYSTEM, even if the query
included one. This caused such REINDEX queries to always work with any
object name, but we should complain if the object name specified does
not match the name of the database we are connected to. A test is added
for this case in the main regression test suite, provided by Álvaro.
- REINDEX SYSTEM CONCURRENTLY [name] was getting rejected in the
parser. Concurrent rebuilds are not supported for catalogs but the
error provided at execution time is more helpful for the user, and
allowing this flavor results in a simplification of the parsing logic.
- REINDEX DATABASE CONCURRENTLY was rebuilding the index in a
non-concurrent way, as the option was not being appended correctly in
the list of DefElems in ReindexStmt (REINDEX (CONCURRENTLY) DATABASE was
working fine.) A test is added in the TAP tests of reindexdb for this
case, where we already have a REINDEX DATABASE CONCURRENTLY query
running on a small-ish instance. This relies on the work done in
2cbc3c1 for SYSTEM, but here we check if the OIDs of the index relations
match or not after the concurrent rebuild. Note that in order to get
this part to work, I had to tweak the tests so that the index OIDs and
names are saved separately. This change does not affect the reliability
or the coverage of the existing tests.
While on it, I have implemented a tweak in the grammar to reduce the
parsing by one branch, simplifying things even more.
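For example, with these fixes in place (database names invented; assume
we are connected to database "postgres"):
REINDEX DATABASE other_db;              -- now fails: name must match
                                        -- the current database
REINDEX SYSTEM CONCURRENTLY postgres;   -- accepted by the parser, then
                                        -- rejected at execution time
REINDEX DATABASE CONCURRENTLY postgres; -- now actually concurrent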
Author: Michael Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/YttqI6O64wDxGn0K@paquier.xyz
The setting controls the maximum length of the header line in expanded
format output. Possible settings are full, column, page, or an integer.
The default is full, the current behaviour, and in this case the header
line is the length of the widest line of output. column causes the
header to be truncated to the width of the first column, page causes it
to be truncated to the width of the terminal page, and an integer causes
it to be truncated to that value. If the full value is less than the
page or integer value no truncation occurs. If given without an argument
this option prints its current setting.
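Assuming the pset option is named xheader_width (the name is not
spelled out above), usage in psql might look like:
\x on
\pset xheader_width column   -- truncate to the width of the first column
\pset xheader_width page     -- truncate to the terminal page width
\pset xheader_width 40       -- truncate to 40 characters
\pset xheader_width          -- print the current setting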
Platon Pronko, somewhat modified by me.
Discussion: https://postgr.es/m/f03d38a3-db96-a56e-d1bc-dbbc80bbde4d@gmail.com
Previously we did this after InitPostgres, at a somewhat randomly chosen
place within PostgresMain. However, since commit a0ffa885e doing this
outside a transaction can cause a crash, if we need to check permissions
while replacing a placeholder GUC. (Besides which, a preloaded library
could itself want to do database access within _PG_init.)
To avoid needing an additional transaction start/end in every session,
move the process_session_preload_libraries call to within InitPostgres's
transaction. That requires teaching the code not to call it when
InitPostgres is called from somewhere other than PostgresMain, since
we don't want session_preload_libraries to affect background workers.
The most future-proof solution here seems to be to add an additional
flag parameter to InitPostgres; fortunately, we're not yet very worried
about API stability for v15.
Doing this also exposed the fact that we're currently honoring
session_preload_libraries in walsenders, even those not connected to
any database. This seems, at minimum, a POLA violation: walsenders
are not interactive sessions. Let's stop doing that.
(All these comments also apply to local_preload_libraries, of course.)
Per report from Gurjeet Singh (thanks also to Nathan Bossart and Kyotaro
Horiguchi for review). Backpatch to v15 where a0ffa885e came in.
Discussion: https://postgr.es/m/CABwTF4VEpwTHhRQ+q5MiC5ucngN-whN-PdcKeufX7eLSoAfbZA@mail.gmail.com
It incorrectly used GetBufferDescriptor instead of
GetLocalBufferDescriptor, causing it to not find the correct buffer in
most cases, and performing an out-of-bounds memory read in the corner
case that temp_buffers > shared_buffers.
It also bumped the usage-count on the buffer, even if it was
previously pinned. That won't lead to crashes or incorrect results,
but it's different from what the shared-buffer case does, and
different from the usual code in LocalBufferAlloc. Fix that too, and
make the code ordering match LocalBufferAlloc() more closely, so that
it's easier to verify that it's doing the same thing.
Currently, ReadRecentBuffer() is only used with non-temp relations, in
WAL redo, so the broken code is currently dead code. However, it could
be used by extensions.
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/2d74b46f-27c9-fb31-7f99-327a87184cc0%40iki.fi
Reviewed-by: Thomas Munro, Zhang Mingli, Richard Guo
This commit removes two arguments "report" and "whichChkpt"
in ReadCheckpointRecord().
"report" is obviously useless because it's always true, i.e., there are
two callers of the function and they always specify true as "report".
Commit 1d919de5eb removed the only call with "report" = false.
"whichChkpt" indicated where the specified checkpoint location
came from, pg_control or backup_label. This information was used
to report different error messages depending on where the invalid
checkpoint record came from, when it was found.
But ReadCheckpointRecord() doesn't need to do that because
its callers already do that and users can still identify where
the invalid checkpoint record came from, by reading such log messages.
Also when "whichChkpt" was 0, the word "primary checkpoint" was used
in the log message and could confuse users because the concept of
primary and secondary checkpoints was already removed before.
These are the reasons why this commit removes the "whichChkpt" argument.
Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi
Discussion: https://postgr.es/m/fa2e12eb-81c3-0717-0272-755f8a81c8f2@oss.nttdata.com
sigwait() is in SUSv2 and all targeted Unix systems have it. An earlier
pre-standard function prototype existed on some older systems, but we
no longer need a workaround for that.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
getrusage() is in SUSv2 and all targeted Unix systems have it.
Note that POSIX only covers ru_utime and ru_stime and we rely on many
more fields without any kind of configure probe, but that predates this
commit.
The only supported system we need replacement code for now is Windows,
and that can be done without a configure probe.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
Commit e2f65f425 added contrib/pg_prewarm to the prerequisites for
running the src/test/recovery suite, but did not bother to update
the documentation about that.
We've long held the minimum at 3.80, but that's required more than
one workaround. Commit 0f39b70a6 broke it again, because it turns
out that exporting a target-specific variable didn't work in 3.80.
Considering that 3.81 is now old enough to get a driver's license,
and that the only remaining buildfarm member testing 3.80 (prairiedog)
is likely to be retired soon, let's just stop supporting 3.80.
Adjust docs and Makefile.global's minimum-version check to match.
There are a couple of comments in the Makefiles suggesting that
random things could be done differently after we desupport 3.80,
but I couldn't get excited about changing any of them right now.
Back-patch to v15, as 0f39b70a6 was.
Discussion: https://postgr.es/m/20220720172321.GL12702@telsasoft.com
The common recipe when TAP tests are disabled doesn't work, because the
libpq-specific recipe wants to define the PATH environment variable, so
the starting '@' is misinterpreted as part of the command instead of
silencing said command.
Fix by setting the environment variable in a way that doesn't interfere
with the recipe.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220720172321.GL12702@telsasoft.com
The dependency logic failed to register a column-level dependency
when a view or rule contains a reference to a specific column of
the result of a function-returning-composite. That meant you could
drop the column from the composite type, causing trouble for future
executions of the view. We've known about this for years, but never
summoned the energy to actually fix it, instead installing various
low-level defenses to prevent crashing on references to dropped columns.
We had to do that to plug the hole in stable branches, where there might
be pre-existing broken references; but let's fix the root cause today.
To do that, add some logic (borrowed from get_rte_attribute_is_dropped)
to find_expr_references_walker, to check whether a Var referencing an
RTE_FUNCTION RTE is referencing a column of a composite type, and if
so add the proper dependency.
However ... it seems mighty unwise to remove said low-level defenses,
since there could be other bugs now or in the future that allow
reaching them. By the same token, letting those defenses go untested
seems unwise. Hence, rather than just dropping the associated test
cases, hack them to continue working by the expedient of manually
dropping the pg_depend entries that this fix installs.
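A sketch of what the new dependency prevents (names invented):
CREATE TYPE two_ints AS (a int, b int);
CREATE FUNCTION get_two_ints() RETURNS two_ints
    LANGUAGE sql AS $$ SELECT ROW(1, 2)::two_ints $$;
CREATE VIEW v AS SELECT (f).b FROM get_two_ints() AS f;
-- the view's Var referencing column b now has a recorded dependency,
-- so the drop fails instead of leaving v silently broken
ALTER TYPE two_ints DROP ATTRIBUTE b;   -- now fails with a dependency error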
Back-patch into v15. I don't want to risk changing this behavior
in stable branches, but it seems not too late for v15. (Since
we have already forced initdb for beta3, we can be sure that all
production v15 installations will have these added dependencies.)
Discussion: https://postgr.es/m/182492.1658431155@sss.pgh.pa.us
Things like "opt_name" can well be shared by various commands rather
than there being multiple definitions of the same thing. Rename these
productions and move them to appear together in gram.y, which may
improve chances of reuse in the future.
Discussion: https://postgr.es/m/20220721174212.cmitjpuimx6ssyyj@alvherre.pgsql
Commit c6f2f016 added an explicit check for a Windows "junction point".
That turned out to be needed only because get_dirent_type() was busted
on Windows. It's been fixed by commit 9d3444dc, so remove it.
Add a TAP test to demonstrate that in-place tablespaces are copied by
pg_basebackup. This exercises the codepath that would fail before
c6f2f016 on Windows, and shows that it still doesn't fail now that we're
using get_dirent_type() on both Windows and Unix.
Back-patch to 15, where in-place tablespaces arrived and caused this
problem (ie directories where previously only symlinks were expected).
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGLzLK4PUPx0_AwXEWXOYAejU%3D7XpxnYE55Y%2Be7hB2N3FA%40mail.gmail.com
Commit 87e6ed7c8 added code that intended to report Windows "junction
points" as DT_LNK (the same way we report symlinks on Unix). Windows
junction points are *also* directories according to the Windows
attributes API, and we were reporting them as DT_DIR. Change the
order we check the attribute flags, to prioritize DT_LNK.
If at some point we start using Windows' recently added real symlinks
and need to distinguish them from junction points, we may need to
rethink this, but for now this continues the tradition of wrapper
functions that treat junction points as symlinks.
Back-patch to 14, where get_dirent_type() landed.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CA%2BhUKGLzLK4PUPx0_AwXEWXOYAejU%3D7XpxnYE55Y%2Be7hB2N3FA%40mail.gmail.com
Discussion: https://postgr.es/m/20220721111751.x7hod2xgrd76xr5c%40alvherre.pgsql
O_FSYNC was a pre-POSIX way of spelling O_SYNC, supported since commit
9d645fd84c for non-conforming operating systems of the time. It's not
needed on any modern system. We can just use standard O_SYNC directly
if it exists (= all targeted systems except Windows), and get rid of our
OPEN_SYNC_FLAG macro.
Similarly for standard O_DSYNC, we can just use that directly if it
exists (= all targeted systems except DragonFlyBSD), and get rid of our
OPEN_DATASYNC_FLAG macro.
We still avoid choosing open_datasync as a default value for
wal_sync_method if O_DSYNC has the same value as O_SYNC (= only
OpenBSD), so there is no change in default behavior.
Discussion: https://postgr.es/m/CA%2BhUKGJE7y92NY7FG2ftUbZUaqohBU65_Ys_7xF5mUHo4wirTQ%40mail.gmail.com
Commit 4f658dc8 provided the traditional BSD fls() function in
src/port/fls.c so it could be used in several places. Later we added a
bunch of similar facilities in pg_bitutils.h, based on compiler
builtins that map to hardware instructions. It's a bit confusing to
have both 1-based and 0-based variants of this operation in use in
different parts of the tree, and neither is blessed by a standard.
Let's drop fls.c and the configure probe, and reuse the newer code.
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKG%2B7dSX1XF8yFGmYk-%3D48dbjH2kmzZj16XvhbrWP-9BzRg%40mail.gmail.com
This allows users to omit the statistics name in a CREATE STATISTICS
command, letting the system auto-generate a sensible, unique name,
putting the statistics object in the same schema as the table.
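For example (table and column names invented):
CREATE TABLE orders (region int, product int);
-- no statistics name given: a unique name is generated automatically,
-- and the statistics object lands in the same schema as "orders"
CREATE STATISTICS ON region, product FROM orders;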
Simon Riggs, reviewed by Matthias van de Meent.
Discussion: https://postgr.es/m/CANbhV-FGD2d_C3zFTfT2aRfX_TaPSgOeKES58RLZx5XzQp5NhA@mail.gmail.com
Due to lack of concern for the case in the dependency code, it's
possible to drop a column of a composite type even though stored
queries have references to the dropped column via functions-in-FROM
that return the composite type. There are "soft" references,
namely FROM-clause aliases for such columns, and "hard" references,
that is actual Vars referring to them. The right fix for hard
references is to add dependencies preventing the drop; something
we've known for many years and not done (and this commit still doesn't
address it). A "soft" reference shouldn't prevent a drop though.
We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but
nobody had noticed that the current behavior can result in dump/reload
failures, because ruleutils.c can print more column aliases than the
underlying composite type now has. So we need to rejigger the
column-alias-handling code to treat such columns as dropped and not
print aliases for them.
Rather than writing new code for this, I used expandRTE() which already
knows how to figure out which function result columns are dropped.
I'd initially thought maybe we could use expandRTE() in all cases, but
that fails for EXPLAIN's purposes, because the planner strips a lot of
RTE infrastructure that expandRTE() needs. So this patch just uses it
for unplanned function RTEs and otherwise does things the old way.
If there is a hard reference (Var), then removing the column alias
causes us to fail to print the Var, since there's no longer a name
to print. Failing seems less desirable than printing a made-up
name, so I made it print "?dropped?column?" instead.
Per report from Timo Stolz. Back-patch to all supported branches.
Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
This patch adds a new SUBSCRIPTION parameter "origin". It specifies
whether the subscription will request the publisher to only send changes
that don't have an origin or send changes regardless of origin. Setting it
to "none" means that the subscription will request the publisher to only
send changes that have no origin associated. Setting it to "any" means
that the publisher sends changes regardless of their origin. The default
is "any".
Usage:
CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=9999'
PUBLICATION pub1 WITH (origin = none);
This can be used to avoid loops (infinite replication of the same data)
among replication nodes.
This feature allows filtering only the replication data originating from
WAL; for the initial sync (the initial copy of table data) we don't have
such a facility, since data can be distinguished based on origin only in
WAL. As a follow-up patch, we are planning to forbid the initial sync if
the origin is specified as none and we notice that the publication tables
were themselves replicated from other publishers, to avoid duplicate data
or loops.
We forbid creating replication origins with the names 'none' and 'any',
to avoid confusion with the options of the same names.
Author: Vignesh C, Amit Kapila
Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Shi yu, Ashutosh Bapat, Hayato Kuroda
Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubhjcQ@mail.gmail.com
This renames the relation storing the relfilenode state to something
more generic, as it also stores data for non-toast relations. While on
it, a restriction on the number of digits used for the OID number when
filtering toast relation names is removed, as there is no need for it.
Reported-by: Justin Pryzby
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/20220719022652.GE12702@telsasoft.com
Most of these have been introduced in d2d3547 with the new pattern
validation logic, and would leak one PQExpBuffer each time (at least
256 bytes, possibly more).
Most of the patch has been written by Tang Haiying, with a few tweaks
coming from Álvaro Herrera.
Reported-by: Tang Haiying
Author: Tang Haiying, Álvaro Herrera
Reviewed-by: Mark Dilger, Andres Freund, Álvaro Herrera, Tom Lane, Japin
Li, Michael Paquier, Junwang Zhao
Backpatch-through: 15
Commit 964d01ae9 marked a lot of fields as read_write_ignore
to stay consistent with what was dumped by the manually-maintained
outfuncs.c code. However, it seems that a pretty fair number
of those omissions were either flat-out oversights, or a shortcut
taken because hand-written code seemed like it'd be too much trouble.
Let's upgrade things where it seems to make sense to dump.
To do this, we need to add support to gen_node_support.pl and
outfuncs.c for variable-length arrays of Node pointers. That's
pretty straightforward given the model of the existing code
for arrays of scalars, but I found I needed to tighten the
type-recognizing regexes in gen_node_support.pl. (As they
stood, they mistook "foo **" for "foo *". Make sure they're
all fully anchored to prevent additional problems.)
The main thing left un-done here is that a lot of partitioning-related
structs are still not dumped, because they are bare structs not Nodes.
I'm not sure about the wisdom of that choice ... but changing it would
be fairly invasive, so it probably requires more justification than
just making planner node dumps more complete.
Discussion: https://postgr.es/m/1295668.1658258637@sss.pgh.pa.us
Without processing shared_preload_libraries, it's impossible to
recover if custom WAL resource managers are needed. It may also pose a
problem running VACUUM on a table with a custom AM, if the module
implementing the AM is expecting to be loaded by
shared_preload_libraries.
The reason this wasn't done before was just the general principle to
do fewer things in single-user mode. But it's easy enough to just set
shared_preload_libraries to empty, for the same effect.
Discussion: https://postgr.es/m/9decc18a42634f8a2f15c97a385a0f51a752f396.camel%40j-davis.com
Reviewed-by: Tom Lane, Andres Freund
Backpatch-through: 15
When the ability to print variable-length-array fields was first
added to outfuncs.c, there was no corresponding read capability,
as it was used only for debug dumps of planner-internal Nodes.
Not a lot of thought seems to have been put into the output format:
it's just the space-separated array elements and nothing else.
Later such fields appeared in Plan nodes, and still later we grew
read support so that Plans could be transferred to parallel workers,
but the original text format wasn't rethought. It seems inadequate
to me because (a) no cross-check is possible that we got the right
number of array entries, (b) we can't tell the difference between
a NULL pointer and a zero-length array, and (c) except for
WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified
when the pointer is NULL, a situation that can arise in some fields
that we currently conveniently avoid printing.
Since we're currently in a campaign to make the Node infrastructure
generally more it-just-works-without-thinking-about-it, now seems
like a good time to improve this.
Let's adopt a format similar to that used for Lists, that is "<>"
for a NULL pointer or "(item item item)" otherwise. Also retool
the code to not have so many copies of the identical logic.
I bumped catversion out of an abundance of caution, although I think
that we don't use any such array fields in Nodes that can get into
the catalogs.
Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
This allows aliases for sub-SELECTs and VALUES clauses in the FROM
clause to be omitted.
This is an extension of the SQL standard, supported by some other
database systems, and so eases the transition from such systems, as
well as removing the minor inconvenience caused by requiring these
aliases.
Patch by me, reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAEZATCUCGCf82=hxd9N5n6xGHPyYpQnxW8HneeH+uP7yNALkWA@mail.gmail.com
Windows 10 gained support for flushing NTFS files with fdatasync()
semantics. The main advantage over open_datasync (in Windows API terms
FILE_FLAG_WRITE_THROUGH) is that the latter does not flush SATA drive
caches. The default setting is not changed, so users have to opt in to
this.
Discussion: https://postgr.es/m/CA%2BhUKGJZJVO%3DiX%2Beb-PXi2_XS9ZRqnn_4URh0NUQOwt6-_51xQ%40mail.gmail.com
When a non-exclusive backup is canceled, do_pg_abort_backup() is called
and resets some variables set by pg_backup_start (pg_start_backup in v14
or before). But previously it forgot to reset the session state indicating
whether a non-exclusive backup is in progress or not in this session.
This issue could cause an assertion failure when the session running
BASE_BACKUP is terminated after it executed pg_backup_start and
pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause
a segmentation fault when pg_backup_stop is called after BASE_BACKUP
in the same session is canceled.
This commit fixes the issue by making do_pg_abort_backup reset
that session state.
Back-patch to all supported branches.
Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
Multiple non-exclusive backups can be run concurrently in different
sessions, but in the same session only one non-exclusive backup can be
running at a time. If pg_backup_start (pg_start_backup in v14 or before)
is called in the middle of another non-exclusive backup in the same
session, an error is thrown.
However, previously, in logical replication walsender mode, even if that
walsender session had already called pg_backup_start and started
a non-exclusive backup, it could execute the BASE_BACKUP command and
start another non-exclusive backup. This caused subsequent pg_backup_stop
to throw an error, because BASE_BACKUP unexpectedly reset the session
state marked by pg_backup_start.
This commit prevents BASE_BACKUP command in the middle of another
non-exclusive backup in the same session.
Back-patch to all supported branches.
Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
Detail and hint messages should be full sentences and should end with a
period, but some of the messages newly-introduced in v15 did not follow
that.
Author: Justin Pryzby
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com
Backpatch-through: 15
We allow users to set the values of not-yet-loaded extension GUCs,
remembering those values in "placeholder" GUC entries. When/if
the extension is loaded later in the session, we need to verify that
the user had permissions to set the GUC. That was done correctly
before commit a0ffa885e, but as of that commit, we'd check the
permissions of the active role when the LOAD happens, not the role
that had set the value. (This'd be a security bug if it had made it
into a released version.)
In principle this is simple enough to fix: we just need to remember
the exact role OID that set each GUC value, and use that, not
GetUserId(), when verifying permissions. Maintaining that data in
the guc.c data structures is slightly tedious, but fortunately it's
all basically just copy-n-paste of the logic for tracking the
GucSource of each setting, as we were already doing.
Another oversight is that validate_option_array_item() hadn't
been taught to check for granted GUC privileges. This appears
to manifest only in that ALTER ROLE/DATABASE RESET ALL will
fail to reset settings that the user should be allowed to reset.
Patch by myself and Nathan Bossart, per report from Nathan Bossart.
Back-patch to v15 where the faulty code came in.
Discussion: https://postgr.es/m/20220706224727.GA2158260@nathanxps13
This is mostly just to get outfuncs.c support for them, so that
the agginfos and aggtransinfos lists can be dumped when dumping
the contents of PlannerInfo.
While here, improve some related comments; notably, clean up
obsolete comments left over from when preprocess_minmax_aggregates
had to make its own scan of the query tree.
Discussion: https://postgr.es/m/742479.1658160504@sss.pgh.pa.us
setrefs.c contains logic to discard no-op SubqueryScan nodes, that is,
ones that have no qual to check and copy the input targetlist unchanged.
(Formally it's not very nice to be applying such optimizations so late
in the planner, but there are practical reasons for it; mostly that we
can't unify relids between the subquery and the parent query until we
flatten the rangetable during setrefs.c.) This behavior falsifies our
previous cost estimates, since we would've charged cpu_tuple_cost per
row just to pass data through the node. Most of the time that's little
enough to not matter, but there are cases where this effect visibly
changes the plan compared to what you would've gotten with no
sub-select.
To improve the situation, make the callers of cost_subqueryscan tell
it whether they think the targetlist is trivial. cost_subqueryscan
already has the qual list, so it can check the other half of the
condition easily. It could make its own determination of tlist
triviality too, but doing so would be repetitive (for callers that
may call it several times) or unnecessarily expensive (for callers
that can determine this more cheaply than a general test would do).
This isn't a 100% solution, because createplan.c also does things
that can falsify any earlier estimate of whether the tlist is
trivial. However, it fixes nearly all cases in practice, if results
for the regression tests are anything to go by.
setrefs.c also contains logic to discard no-op Append and MergeAppend
nodes. We did have knowledge of that behavior at costing time, but
somebody failed to update it when a check on parallel-awareness was
added to the setrefs.c logic. Fix that while we're here.
These changes result in two minor changes in query plans shown in
our regression tests. Neither is relevant to the purposes of its
test case AFAICT.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/2581077.1651703520@sss.pgh.pa.us
Per discussion, this commit includes a couple of changes to these two
flavors of REINDEX:
* The grammar is changed to make the name of the object optional, hence
one can rebuild all the indexes of the wanted area by specifying only
"REINDEX DATABASE;" or "REINDEX SYSTEM;". Previously, the object name
was mandatory and had to match the name of the database on which the
command is issued.
* REINDEX DATABASE is changed to ignore catalogs, making this task only
possible with REINDEX SYSTEM. This is a change of historical behavior:
previously there was no way to work only on the indexes of a database
without touching the catalogs. Other approaches were discussed, like
adding an option to skip the catalogs without changing the original
behavior, but we concluded that what we have here is for the best.
This builds on top of the TAP tests introduced in 5fb5b6c, showing the
change in behavior for REINDEX SYSTEM. reindexdb is updated so that we
do not issue an extra REINDEX SYSTEM when working on a database in the
non-concurrent case, something that was confusing when --concurrently
was introduced; this simplifies the code.
Author: Simon Riggs
Reviewed-by: Ashutosh Bapat, Bernd Helmle, Álvaro Herrera, Cary Huang,
Michael Paquier
Discussion: https://postgr.es/m/CANbhV-H=NH6Om4-X6cRjDWfH_Mu1usqwkuYVp-hwdB_PSHWRfg@mail.gmail.com
Adding such commands to the main regression test suite is not a good
approach performance-wise, as it impacts all the objects in the
regression database, so this additional coverage is added in the TAP
tests of reindexdb, where we already run a few REINDEX commands with
SYSTEM and DATABASE, so there is no runtime difference for the test.
This additional coverage checks which relations are rewritten with
relfilenode changes, as of:
- a toast index in a user table.
- a toast index in a catalog table.
- a catalog index.
- a user index.
This test suite is something I have implemented for a separate patch
that reworks a bit the way we handle these two REINDEX behaviors, but it
has enough value in itself to be in a separate commit. This also makes
it easier to follow what actually changes once the REINDEX logic is
reworked (currently, DATABASE rewrites both catalog and user tables, and
SYSTEM works only on catalogs).
Discussion: https://postgr.es/m/YtOqA7ldcJQADEE8@paquier.xyz
This fixes problems on windows when the logging collector is used in a
service, failing with:
FATAL: could not redirect stderr: Bad file descriptor
This is triggered by 76e38b37a5. The problem is that STDOUT/STDERR_FILENO
aren't defined on windows, which led us to use _fileno(stdout) etc., but
that doesn't work if stdout/stderr are closed.
Author: Andres Freund <andres@anarazel.de>
Reported-By: Sandeep Thakkar <sandeep.thakkar@enterprisedb.com>
Message-Id: 20220520164558.ozb7lm6unakqzezi@alap3.anarazel.de (on pgsql-packagers)
Backpatch: 15-, where 76e38b37a5 came in
Because STDIN/STDOUT/STDERR_FILENO are not available on windows, we've
used _fileno(stdin) in some places, but that doesn't reliably work,
because stdin might be closed. This is a prerequisite for the subsequent
commit, which fixes a failure introduced in 76e38b37a5.
Author: Andres Freund <andres@anarazel.de>
Reported-By: Sandeep Thakkar <sandeep.thakkar@enterprisedb.com>
Message-Id: 20220520164558.ozb7lm6unakqzezi@alap3.anarazel.de (on pgsql-packagers)
Backpatch: 15-, where 76e38b37a5 came in
parse.pl and check_rules.pl used "no warnings 'uninitialized'",
which doesn't seem like it measures up to current project standards.
Removing that shows that it was hiding various places that accessed
off the end of an array, which are easily protected by minor logic
adjustments. There's no change in the script results.
While here, improve the Makefile rule that invokes these scripts.
It neglected to depend on check_rules.pl, so that editing that file
didn't result in re-running the check; and it ran check_rules.pl
after building preproc.y, so that if check_rules.pl did fail the
next "make" attempt would just bypass it. check_rules.pl failures
are sufficiently unheard-of that I don't feel a need to back-patch
this.
Discussion: https://postgr.es/m/838180.1658181982@sss.pgh.pa.us
In db0a272d12 I used open(our $something, ...), which perlcritic doesn't
like. It looks like the warning is due to perlcritic knowing about 'my' but
not 'our' when checking for bareword file handles.
However, it's clearly unnecessary to use "our" here, change it to "my".
Via buildfarm member crake and discussion with Tom Lane.
Discussion: https://postgr.es/m/20220718215042.sl3hivoupdb7lkwv@awork3.anarazel.de
This is in preparation for building postgres with meson / ninja.
Move the dtrace postprocessing sed commands into a separate file so
that it can be shared by meson. Also split the rule into two for
proper dependency declaration.
Reviewed-by: Andres Freund <andres@anarazel.de>
Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place. This can be utilized
by src/tools/msvc/ for a minor simplification, which also provides some
coverage for the new option.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place.
Author: Andres Freund <andres@anarazel.de>
Author: Peter Eisentraut <peter@eisentraut.org>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
meson's 'capture' (redirecting stdout to a file) is a bit slower than programs
redirecting output themselves (mostly due to a python wrapper necessary for
windows). That doesn't matter for most things, but errcodes.h is a dependency
of nearly everything, so making it a bit faster seems worthwhile.
Medium term it might also be worth avoiding writing errcodes.h if its
contents haven't actually changed, to avoid unnecessary recompilations.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place. This can be utilized
by src/tools/msvc/ for a minor simplification, which also provides some
coverage for the new option.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
We already have duplicated code for this between the make and msvc
builds. Adding a third copy seems like a bad plan, thus move the generation
into a perl script.
As we don't want to rely on perl being available for builds from tarballs,
generate the file during distprep.
Author: Peter Eisentraut <peter@eisentraut.org>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place. This can be utilized
by src/tools/msvc/ for a minor simplification, which also provides some
coverage for the new option.
Add option to generate a timestamp for check_rules.pl, so that proper
dependencies on it having been run can be generated.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
This is in preparation for building postgres with meson / ninja.
When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place. This can be utilized
by src/tools/msvc/ for a minor simplification, which also provides some
coverage for the new option.
To deal with dependencies to the variable set of input files to this script,
add an option to generate a dependency file (which meson / ninja can consume).
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com
Commit e3fcca0d0d reverted modifications to HOT for BRIN, but it also
removed a couple unrelated tests from stats.sql. Reinstate those tests.
Reported-by: Peter Eisentraut
A code comment said that the standard does not define a number for
ERRCODE_SQL_JSON_ITEM_CANNOT_BE_CAST_TO_TARGET_TYPE, but this was
fixed in a later draft version of the standard, so use that number
now.
This is in preparation for defaulting to -fvisibility=hidden in extensions,
instead of relying on all symbols in extensions to be exported.
This should have been committed before 089480c077, but something in my commit
scripts went wrong.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de
Until now postgres built extension libraries with global visibility, i.e.
exporting all symbols. On the one platform where that behavior is not
natively available, namely windows, we emulate it by analyzing the input files
to the shared library and exporting all the symbols therein.
Not exporting all symbols is actually desirable, as it can improve
loading speed, reduce the likelihood of symbol conflicts and improve
intra-extension-library function call performance. It also makes the
non-windows builds more similar to windows builds.
Additionally, implementing the export-all-symbols behavior for windows
with meson turns out to be more verbose than desirable.
This patch adds support for hiding symbols by default and, to counteract that,
explicit symbol visibility annotation for compilers that support
__attribute__((visibility("default"))) and -fvisibility=hidden. That is
expected to be most, if not all, compilers except msvc (for which we already
support explicit symbol export annotations).
Now that extension library symbols are explicitly exported, we don't need to
export all symbols on windows anymore, hence remove that behavior from
src/tools/msvc. The supporting code can't be removed, as we still need to
export all symbols from the main postgres binary.
Author: Andres Freund <andres@anarazel.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de
This is in preparation for defaulting to -fvisibility=hidden in extensions,
instead of exporting all symbols. For that symbols intended to be exported
need to be tagged with PGDLLEXPORT. Most extensions only need to do so for
_PG_init() and functions defined with PG_FUNCTION_INFO_V1. Adding central
declarations avoids each extension having to add PGDLLEXPORT. Any existing
declarations in extensions will continue to work if fmgr.h is included before
them, otherwise compilation for Windows will fail.
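As a minimal sketch (the function name is made up), an extension built
with hidden visibility now needs nothing beyond the usual boilerplate,
since the macros emit the PGDLLEXPORT'ed declarations:
  #include "postgres.h"
  #include "fmgr.h"

  PG_MODULE_MAGIC;

  /* emits a PGDLLEXPORT'ed prototype for add_one */
  PG_FUNCTION_INFO_V1(add_one);

  Datum
  add_one(PG_FUNCTION_ARGS)
  {
      PG_RETURN_INT32(PG_GETARG_INT32(0) + 1);
  }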
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de
The patch that added regcollation doesn't seem to have been too
thorough about supporting it everywhere that other reg* types
are supported. Fix that. (The find_expr_references omission
is moderately serious, since it could result in missing expression
dependencies. The others are less exciting.)
Noted while fixing bug #17483. Back-patch to v13 where
regcollation was added.
Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
Some of the test cases added by commit 3a0e38504 are failing
intermittently in CI testing. It looks like, when a connection
attempt fails, it's possible for psql to exit and the test script
to slurp up the postmaster's log file before the connected backend
has managed to write the log entry we're expecting to see.
It's not clear whether that's fixable in any robust way. Pending
more thought, just comment out the log_like checks. The ones in
connect_ok tests should be fine, since surely the log entry should
be emitted before we complete the client auth sequence. I took
out all the ones in connect_fails tests though.
Discussion: https://postgr.es/m/E1oCNLk-000LCH-Af@gemulon.postgresql.org
reset_shared just invokes CreateSharedMemoryAndSemaphores, so let's
get rid of it and invoke that directly. This removes a confusing
seeming-inconsistency between the postmaster's startup sequence
and the startup sequence used in standalone mode.
Nathan Bossart, reviewed by Pavel Borisov
Discussion: https://postgr.es/m/20220329221702.GA559657@nathanxps13
This replaces all MemSet() calls with struct initialization where that
is easily and obviously possible. (For example, some cases have to
worry about padding bits, so I left those.)
(The same could be done with appropriate memset() calls, but this
patch is part of an effort to phase out MemSet(), so it doesn't touch
memset() calls.)
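As a sketch of the transformation (MyStruct and its field are
hypothetical):
  /* before */
  MyStruct    val;

  MemSet(&val, 0, sizeof(val));
  val.count = 1;

  /* after: members not mentioned are implicitly zeroed */
  MyStruct    val = {.count = 1};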
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/9847b13c-b785-f4e2-75c3-12ec77a3b05c@enterprisedb.com
Since commit a65e0864, we've required Unix systems to have
sigprocmask(). As noted in that commit's message, we were still
emulating the historical pre-standard sigsetmask() function in our
Windows support code. Emulate standard sigprocmask() instead, for
consistency.
The PG_SETMASK() abstraction is now redundant and all calls could in
theory be replaced by plain sigprocmask() calls, but that isn't done by
this commit.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3153247.1657834482%40sss.pgh.pa.us
Commit 4518c798 blocks signals for a short region of code, but it
assumed that whatever called it had the signal mask set to UnBlockSig on
entry. That may be true today (or may even not be, in extensions in the
wild), but it would be better not to make that assumption. We should
save-and-restore the caller's signal mask.
The PG_SETMASK() portability macro couldn't be used for that, which is
why it wasn't done before. But... considering that commit a65e0864
established back in 9.6 that supported POSIX systems have sigprocmask(),
and that this is POSIX-only code, there is no reason not to use standard
sigprocmask() directly to achieve that.
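A sketch of the save-and-restore pattern (error checks elided):
  sigset_t    save_sigmask;

  /* block signals, remembering whatever mask the caller had */
  sigprocmask(SIG_SETMASK, &BlockSig, &save_sigmask);

  /* ... work that must not be interrupted by handlers ... */

  /* restore the caller's mask, rather than assuming UnBlockSig */
  sigprocmask(SIG_SETMASK, &save_sigmask, NULL);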
Back-patch to all supported releases, like 4518c798 and 80845b7c.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com
Currently, debugging client certificate verification failures is
mostly limited to looking at the TLS alert code on the client side.
For simple deployments, sometimes it's enough to see "sslv3 alert
certificate revoked" and know exactly what needs to be fixed, but if
you add any more complexity (multiple CA layers, misconfigured CA
certificates, etc.), trying to debug what happened based on the TLS
alert alone can be an exercise in frustration.
Luckily, the server has more information about exactly what failed in
the chain, and we already have the requisite callback implemented as a
stub. We fill that in, collect the data, and pass the constructed
error message back to the main code via a static variable. This lets
us add our error details directly to the final "could not accept SSL
connection" log message, as opposed to issuing intermediate LOGs.
It ends up looking like
LOG: connection received: host=localhost port=43112
LOG: could not accept SSL connection: certificate verify failed
DETAIL: Client certificate verification failed at depth 1: unable to get local issuer certificate.
Failed certificate data (unverified): subject "/CN=Test CA for PostgreSQL SSL regression test client certs", serial number 2315134995201656577, issuer "/CN=Test root CA for PostgreSQL SSL regression test suite".
The length of the Subject and Issuer strings is limited to prevent
malicious client certs from spamming the logs. In case the truncation
makes things ambiguous, the certificate's serial number is also
logged.
Author: Jacob Champion <pchampion@vmware.com>
Discussion: https://www.postgresql.org/message-id/flat/d13c4a5787c2a3f83705124f0391e0738c796751.camel@vmware.com
For some systems, we need to avoid unsatisfied-external-reference
errors in static inlines. See
27d2693187 for example. In order to
test that on other systems, the gcc option -fkeep-inline-functions can
be used. But it actually is a bit stricter than what we currently
have in place, so fix up a few more places along the lines of the
above commit. (This undoes part of commit
2cd2569c72b8920048e35c31c9be30a6170e1410.)
(Note, this does not add that gcc option anywhere to the build system,
it just makes it possible to use it successfully manually.)
Discussion: https://www.postgresql.org/message-id/flat/E1oBgIW-002ehP-VJ%40gemulon.postgresql.org
Noticed while working in this area. This code was introduced in PG15,
which is still in beta, so backpatch to there for consistency.
Backpatch-through: 15
Teach this script to handle function pointer fields honestly.
Previously they were just silently ignored, but that's not likely to
be a behavior we can accept indefinitely. This mostly entails fixing
it so that a field declaration spanning multiple lines can be parsed,
because we have a bunch of such fields that're laid out that way.
But that's a good improvement in its own right.
With that change and a minor regex adjustment, the only struct it
fails to parse in the node-defining headers is A_Const, because
of the embedded union. The path of least resistance is to move
that union declaration outside the struct.
Having done those things, we can make it error out if it finds
any within-struct syntax it doesn't understand, which seems like
a pretty important property for robustness.
This commit doesn't change the output files at all; it's just in
the way of future-proofing.
Discussion: https://postgr.es/m/2593369.1657759779@sss.pgh.pa.us
Previously we displayed "DSMFillZeroWrite" while in posix_fallocate(),
because we shared the same wait event for "mmap" and "posix" DSM types.
Let's introduce a new wait event "DSMAllocate", to be more accurate.
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220711174518.yldckniicknsxgzl%40awork3.anarazel.de
In early releases of the DSM infrastructure, it was possible to resize
segments. That was removed in release 12 by commit 3c60d0fa. Now the
ftruncate() + posix_fallocate() sequence during DSM segment creation has
a redundant step: we're always extending from zero to the desired size,
so we might as well just call posix_fallocate().
Let's also include the remaining ftruncate() call (non-Linux POSIX
systems) in the wait event reporting, for good measure.
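In POSIX terms, the simplification is (sketch, error handling elided;
note that posix_fallocate() returns an error number instead of setting
errno):
  /* before: extend the brand-new segment to 'size' in two steps */
  ftruncate(fd, size);
  posix_fallocate(fd, 0, size);

  /* after: allocating from offset 0 already extends the file */
  posix_fallocate(fd, 0, size);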
Discussion: https://postgr.es/m/CA%2BhUKGJSm-nq8s%2B_59zb7NbFQF-OS%3DxTnTAiGLrQpuSmU2y_1A%40mail.gmail.com
On Linux, we call posix_fallocate() on shm_open()'d memory to avoid
later potential SIGBUS (see commit 899bd785).
Based on field reports of systems stuck in an EINTR retry loop there,
we made it possible to break out of that loop via slightly odd coding
where the CHECK_FOR_INTERRUPTS() call was somewhat removed from the
loop (see commit 422952ee).
On further reflection, that was not a great choice for at least two
reasons:
1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing
and the EINTR error would be surfaced to the user.
2. If EINTR was reported but neither QueryCancelPending nor
ProcDiePending was set, then we'd dutifully retry, but with a bit more
understanding of how posix_fallocate() works, it's now clear that you
can get into a loop that never terminates. posix_fallocate() is not a
function that can do some of the job and tell you about progress if it's
interrupted, it has to undo what it's done so far and report EINTR, and
if signals keep arriving faster than it can complete (cf recovery
conflict signals), you're stuck.
Therefore, for now, we'll simply block most signals to guarantee
progress. SIGQUIT is not blocked (see InitPostmasterChild()), because
its expected handler doesn't return, and unblockable signals like
SIGCONT are not expected to arrive at a high rate. For good measure,
we'll include the ftruncate() call in the blocked region, and add a
retry loop.
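A simplified sketch of the resulting pattern:
  sigset_t    save_sigmask;
  int         rc;

  /* block most signals so posix_fallocate() can make progress */
  sigprocmask(SIG_SETMASK, &BlockSig, &save_sigmask);
  do
  {
      rc = posix_fallocate(fd, 0, size);
  } while (rc == EINTR);
  sigprocmask(SIG_SETMASK, &save_sigmask, NULL);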
Back-patch to all supported releases.
Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Nicola Contu <nicola.contu@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
No members of the buildfarm are using this version of Visual Studio,
so all the code cleaned up here is mostly dead, and VS2017 is the
oldest version still supported.
More versions could be cut, but the gain would be minimal, while
removing only VS2013 has the advantage of removing from the core code
all the dependencies on the value defined by _MSC_VER, where
compatibility tweaks have accumulated across the years, mostly around
locales and strtof(), so that's a nice isolated cleanup.
Note that this commit additionally allows a revert of 3154e16. The
versions of Visual Studio now supported range from 2015 to 2022.
Author: Michael Paquier
Reviewed-by: Juan José Santamaría Flecha, Tom Lane, Thomas Munro, Justin
Pryzby
Discussion: https://postgr.es/m/YoH2IMtxcS3ncWn+@paquier.xyz
This reverts commit 617d691412.
While I still think the basic idea is attractive, we need to sort
out what happens with built .c files, and there also seem to be
VPATH issues.
Commit 9c727360b neglected the lesson we've learned before:
protect references to backend global variables with #ifndef FRONTEND.
Since there's already a place for static inlines in this file,
move the just-converted functions to that stanza. Undo the
entirely gratuitous de-macroization of RelationGetNumberOfBlocks
(that one may be okay, since it has no global variable references,
but it's also pointless).
Per buildfarm.
The backend already used a mechanically-generated list of *.c files,
but everywhere else we had a manually-written-out list of files in
which to seek translatable messages. Commit b0a55e432 contains the
latest in a long line of failures to update those lists. Rather than
manually fix its oversight, let's change to using "$(wildcard *.c)"
in all these nls.mk files.
Many of these files also have manual references to some *.c files
in other directories, most often src/common/. Perhaps we should try
to improve that situation too; but it's a bit less clear how, so for
now just fix the local file references.
Kyotaro Horiguchi and Tom Lane
Discussion: https://postgr.es/m/20220713.160853.453362706160476128.horikyota.ntt@gmail.com
The initial version of gen_node_support.pl manually excluded most
utility statement node types from having out/read support, and
also some raw-parse-tree-only node types. That was mostly to keep
the output comparable to the old hand-maintained code. We'd like
to have out/read support for utility statements, for debugging
purposes and so that they can be included in new-style SQL functions;
so it's time to lift that restriction.
Most if not all of the previously-excluded raw-parse-tree-only node
types can appear in expression subtrees of utility statements, so
they have to be handled too.
We don't quite have full read support yet; certain custom_read_write
node types need to have their handwritten read functions implemented
before that will work.
Doing this allows us to drop the previous hack in _outQuery to not
dump the utilityStmt field in most cases, which means we no longer
need manually-maintained out/read functions for Query, so get rid
of those in favor of auto-generating them.
Fix a couple of omissions in gen_node_support.pl that are exposed
through having to handle more node types.
catversion bump forced because somebody was sloppy about the field
order in the manually-maintained Query out/read functions.
(Committers should note that almost all changes in parsenodes.h
are now grounds for a catversion bump.)
Commit 054325c5ee created a memory leak in PQsendQueryInternal in case
an error occurs while sending the message. Repair.
Backpatch to 14, like that commit. Reported by Coverity.
In what must have been a copy-and-paste mistake, all the flag tests use
the same flag rather than a different flag each. The bug is not
surprising, considering that it's dead code; add a minimal, testimonial
line to cover it.
This is all pretty inconsequential, because this is just example code,
but it had better be correct.
Discussion: https://postgr.es/m/20220712152059.fwli2majwgzdmh4r@alvherre.pgsql
Previously, the STORAGE specification was only available in ALTER
TABLE. This makes it available in CREATE TABLE as well.
Also make the code and the documentation for STORAGE and COMPRESSION
attributes consistent.
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: wenjing zeng <wjzeng2012@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/de83407a-ae3d-a8e1-a788-920eb334f25b@sigaev.ru
Read/out support in 5ca0fe5c8a was missing/incomplete, per Tom Lane.
Again, as far as core is concerned, this is not only dead code but also
untested; however, third parties may come to rely on it, so the standard
features should work.
Discussion: https://postgr.es/m/1548311.1657636605@sss.pgh.pa.us
88dad06b47 contains a make $(shell)
construct that apparently confuses older GNU make versions (possibly
because of the # inside the shell command?). This construct, which
would allow # comments inside LINGUAS files, was adapted from gettext
recommendations, but we don't actually need that functionality, so
sidestep this whole issue by just using plain "cat".
In passing, make this code work with vpath.
This moves the list of available languages from nls.mk into a separate
file called po/LINGUAS. Advantages:
- It keeps the parts notionally managed by programmers (nls.mk)
separate from the parts notionally managed by translators (LINGUAS).
- It's the standard practice recommended by the Gettext manual
nowadays.
- The Meson build system also supports this layout (and of course
doesn't know anything about our custom nls.mk), so this would enable
sharing the list of languages between the two build systems.
(The MSVC build system currently finds all po files by globbing, so it
is not affected by this change.)
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/557a9f5c-e871-edc7-2f58-a4140fb65b7b@enterprisedb.com
When checking for interleaved partitions, we mark the partition as
interleaved when:
1. we find an earlier partition index when looping over the
sorted-by-Datum indexes[] array, or
2. we find that the NULL partition allows some non-NULL Datum value.
In the code as it was written in db632fbca, we'll continue to check for
case 2 when we've already marked the partition as interleaved for case 1.
Here we make it so we don't bother marking the partition as interleaved
for case 2 when it's already been marked due to case 1.
Really all this saves is a useless call to bms_add_member(), but since
this code is new to PG15, it seems worth fixing now, before PG15 is out,
to save anyone the trouble of complaining in the future. It might also
ease some future back-patching pain.
Per report and patch by Zhihong Yu. However, I slightly revised the
comments and altered the bms_add_member() code to match in both locations.
We already know that index is equal to boundinfo->null_index from the if
condition.
Author: Zhihong Yu
Discussion: https://postgr.es/m/CALNJ-vQbZR0pYxz9zQ5bqXVcwtGgNgVupeEpNT65HZ+yWZnc4g@mail.gmail.com
Backpatch-through: 15, same as db632fbca.
The following options are added to createuser:
* --valid-until to generate a VALID UNTIL clause for the role created.
* --bypassrls/--no-bypassrls for BYPASSRLS/NOBYPASSRLS.
* -m/--member to make the new role a member of an existing role, with an
extra ROLE clause generated. The clause generated overlaps with
-g/--role, but per discussion this was the most popular choice as option
name.
* -a/--admin for the addition of an ADMIN clause.
These option names are chosen to be completely new, so that they do not
impact anybody relying on the existing option set. Tests are added for
the new options and, while on it, extended a bit to cover more patterns
where quotes are added to various elements of the generated query.
Author: Shinya Kato
Reviewed-by: Nathan Bossart, Daniel Gustafsson, Robert Haas, Kyotaro
Horiguchi, David G. Johnston, Przemysław Sztoch
Discussion: https://postgr.es/m/69a9851035cf0f0477bcc5d742b031a3@oss.nttdata.com
Truncating off the end of a freshly copied List is not a very efficient
way of copying the first N elements of a List.
In many of the cases that are updated here, the pattern was only being
used to remove the final element of a List. That's about the best case
for it, but there were many instances where the truncate was trimming
the List down much further.
4cc832f94 added list_copy_head(), so let's use it in cases where it's
useful.
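A sketch of the rewrite (oldlist and n are hypothetical):
  /* before: copies every cell, then throws the tail away */
  newlist = list_truncate(list_copy(oldlist), n);

  /* after: copies only the first n cells */
  newlist = list_copy_head(oldlist, n);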
Author: David Rowley
Discussion: https://postgr.es/m/1986787.1657666922%40sss.pgh.pa.us
This utility supports 23 options that are not really ordered in the
code, making the addition of new things more complicated than necessary.
This cleanup is in preparation for a patch to add even more options.
Discussion: https://postgr.es/m/69a9851035cf0f0477bcc5d742b031a3@oss.nttdata.com
There are a few things that we could do a little better within
get_cheapest_group_keys_order():
1. We should be using list_free() rather than pfree() on a List.
2. We should use for_each_from() instead of manually coding a for loop to
skip the first n elements of a List
3. list_truncate(list_copy(...), n) is not a great way to copy the first n
elements of a list. Let's invent list_copy_head() for that. That way we
don't need to copy the entire list just to truncate it directly
afterwards.
4. We can simplify finding the cheapest cost by setting the cheapest cost
variable to DBL_MAX. That allows us to skip special-casing the initial
iteration of the loop.
Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvrGyL3ft8waEkncG9y5HDMu5TFFJB1paoTC8zi9YK97Nw@mail.gmail.com
Backpatch-through: 15, where get_cheapest_group_keys_order was added.
Previously, ECPG could only cope with variable declarations whose
type names either weren't any SQL keyword, or were at least partially
reserved. If you tried to use something in the unreserved_keyword
category, you got a syntax error.
This is pretty awful, not only because it says right on the tin that
those words are not reserved, but because the set of such keywords
tends to grow over time. Thus, an ECPG program that was just fine
last year could fail when recompiled with a newer SQL grammar.
We had to work around this recently when STRING became a keyword,
but it's time for an actual fix instead of a band-aid.
To fix, borrow a trick from C parsers and make the lexer's behavior
change when it sees a word that is known as a typedef. This is not
free of downsides: if you try to use such a name as a SQL keyword
in EXEC SQL later in the program, it won't be recognized as a SQL
keyword, leading to a syntax error there instead. So in a real
sense this is just trading one hazard for another. But there is an
important difference: with this, whether your ECPG program works
depends only on what typedef names and SQL commands are used in the
program text. If it compiles today it'll still compile next year,
even if more words have become SQL keywords.
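For example (a sketch), something like this used to fail with a syntax
error once "string" became an unreserved SQL keyword, but is accepted
now that the lexer knows it as a typedef:
  EXEC SQL BEGIN DECLARE SECTION;
  typedef char string[11];    /* "string" is an unreserved SQL keyword */
  string      s;              /* previously a syntax error */
  EXEC SQL END DECLARE SECTION;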
Discussion: https://postgr.es/m/3661437.1653855582@sss.pgh.pa.us
Justin Pryzby reported that some scenarios could cause gathering
of extended statistics to spend many seconds in an un-cancelable
qsort() operation. To fix, invent qsort_interruptible(), which is
just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
every so often. This bloats the backend by a couple of kB, which
seems like a good investment. (We considered just enabling
CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
but there are some callers for which that'd demonstrably be unsafe.
Opt-in seems like a better way.)
For now, just apply qsort_interruptible() in statistics collection.
There's probably more places where it could be useful, but we can
always change other call sites as we find problems.
Back-patch to v14. Before that we didn't have extended stats on
expressions, so that the problem was less severe. Also, this patch
depends on the sort_template infrastructure introduced in v14.
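Call sites change only in the function name; the signature matches
qsort_arg() (sketch, names hypothetical):
  /* before */
  qsort_arg(items, nitems, sizeof(SortItem), compare_items, &cxt);

  /* after: same, but does CHECK_FOR_INTERRUPTS() every so often */
  qsort_interruptible(items, nitems, sizeof(SortItem), compare_items, &cxt);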
Tom Lane and Justin Pryzby
Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
validate_exec() didn't guarantee to set errno to something appropriate
after a failure, leading to callers not being able to print an on-point
message. Improve that.
Noted by Kyotaro Horiguchi, though this isn't exactly his proposal.
Discussion: https://postgr.es/m/20220615.131403.1791191615590453058.horikyota.ntt@gmail.com
pg_upgrade does not use common/logging.c, which is unfortunate
but changing it to do so seems like more work than is justified.
However, we really need to make it work more like common/logging.c
in one respect: the latter expects supplied message strings to not
end with a newline, instead adding one internally. As it stands,
pg_upgrade's logging facilities expect a caller-supplied newline
in some cases and not others, which is already an invitation to bugs,
but the inconsistency with our other frontend code makes it worse.
There are already several places with missing or extra newlines,
and it's inevitable that there won't be more if we let this stand.
Hence, run around and get rid of all trailing newlines in message
strings, and add an Assert that there's not one, similar to the
existing Assert in common/logging.c. Adjust the logging functions
to supply a newline at the right places.
(Some of these strings also have a *leading* newline, which would
be a good thing to get rid of too; but this patch doesn't attempt
that.)
There are some consequent minor changes in output. The ones that
aren't outright bug fixes are generally removal of extra blank
lines that the original coding intentionally inserted. It didn't
seem worth being bug-compatible with that.
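The new Assert is along the lines of the one in common/logging.c
(sketch):
  /* caller-supplied messages must not end in a newline; we add one */
  Assert(fmt[strlen(fmt) - 1] != '\n');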
Patch by me, reviewed by Kyotaro Horiguchi and Peter Eisentraut
Discussion: https://postgr.es/m/113191.1655233060@sss.pgh.pa.us
Having different build systems producing different contents of the
NodeTag enum would be catastrophic for extension ABI stability.
But that ordering depends on the order in which gen_node_support.pl
processes its input files. It seems too fragile to let the Makefiles,
MSVC build scripts, and soon meson build scripts all set this order
independently. As a klugy but serviceable solution, put a canonical
copy of the file list into gen_node_support.pl itself, and check that
against the files given on the command line.
Also, while it's fine to add and delete node tags during development,
we must not let the assigned NodeTag values change unexpectedly in
stable branches. Add a cross-check that can be enabled when a branch
is forked off (or later, but that is a time when we're unlikely to
miss doing it). It just checks that the last auto-assigned number
doesn't change, which is simplistic but will catch the most likely
sorts of mistakes.
From time to time we do need to add a node tag in a stable branch.
To support doing that without changing the branch's auto-assigned
tag numbers, invent pg_node_attr(nodetag_number(VALUE)) which can
be used to give such a node a hand-assigned tag above the last
auto-assigned one.
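Usage looks roughly like this (node name and tag number are made up):
  typedef struct MyBackpatchedNode
  {
      pg_node_attr(nodetag_number(430))

      NodeTag     type;
      /* ... fields ... */
  } MyBackpatchedNode;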
Discussion: https://postgr.es/m/1249010.1657574337@sss.pgh.pa.us
This allows explaining gen_node_support.pl's handling of execnodes.h
and some other input files as being a shortcut for explicit marking
of all their node declarations as pg_node_attr(nodetag_only).
I foresee that someday we might need to be more fine-grained about
that, and this change provides the infrastructure needed to do so.
For now, it just allows removal of the script's klugy special case
for CallContext and InlineCodeBlock.
Discussion: https://postgr.es/m/75063.1657410615@sss.pgh.pa.us
Commit f10a025cfe added support for List to store Xids, but didn't
handle the new type in all cases. Add some obviously necessary pieces.
As far as I am aware, this is all dead code as far as core code is
concerned, but it seems unacceptable not to have it in case third-party
code wants to rely on this type of list. (Some parts of the List API
remain unimplemented, but that can be fixed as and when needed -- see
lack of list_intersection_oid, list_deduplicate_int as precedents.)
Discussion: https://postgr.es/m/20220708164534.nbejhgt4ajz35p65@alvherre.pgsql
Commit 3838fa269 added a lookahead loop to allow building strings multiple
bytes at a time. This loop could exit because it reached the end of input,
yet did not check for that before checking if we reached the end of a
valid string. To fix, put the end of string check back in the outer loop.
Per Valgrind-running buildfarm animal skink
Now that some foreign data wrappers support the TRUNCATE command, it's
useful to support TRUNCATE triggers on foreign tables, for audit logging
or for preventing undesired truncation.
Author: Yugo Nagata
Reviewed-by: Fujii Masao, Ian Lawrence Barwick
Discussion: https://postgr.es/m/20220630193848.5b02e0d6076b86617a915682@sraoss.co.jp
Further to commit 92d70b77, let's drop the code we carry for the
following untested architectures: M68K, M88K, M32R, SuperH. We have no
idea if anything actually works there, and surely as vintage hardware
and microcontrollers they would be underpowered for modern purposes.
We could always consider re-adding SuperH based on evidence of usage and
build farm support, if someone shows up to provide it.
While here, SPARC is usually written in all caps.
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (the idea, not the patch)
Discussion: https://postgr.es/m/959917.1657522169%40sss.pgh.pa.us
Refactor so that log_line_prefix() is a thin wrapper over a new
function log_status_format(), and move the implementation to the
latter. Export log_status_format() so that it can be used by an
emit_log_hook.
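A hook can then reuse the prefix formatting, roughly like this (a
sketch, assuming the exported function takes a StringInfo, the format
string, and the ErrorData):
  static void
  my_emit_log_hook(ErrorData *edata)
  {
      StringInfoData buf;

      initStringInfo(&buf);
      /* format edata according to the log_line_prefix setting */
      log_status_format(&buf, Log_line_prefix, edata);
      /* ... hand buf.data off somewhere ... */
      pfree(buf.data);
  }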
Discussion: https://postgr.es/m/39c8197652f4d3050aedafae79fa5af31096505f.camel%40j-davis.com
Reviewed-by: Michael Paquier, Alvaro Herrera
Remove PageIsValid() and PageSizeIsValid(), which weren't used and
seem unnecessary.
Some code using these formerly-macros needs some adjustments because
it was previously playing loose with the Page vs. PageHeader types,
which is no longer possible with the functions instead of macros.
Reviewed-by: Amul Sul <sulamul@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com
dshash.c previously maintained flags to be able to assert that you
didn't hold any partition lock. These flags could get out of sync with
reality in error scenarios.
Get rid of all that, and make assertions about the locks themselves
instead. Since LWLockHeldByMe() loops internally, we don't want to put
that inside another loop over all partition locks. Introduce a new
debugging-only interface LWLockAnyHeldByMe() to avoid that.
This problem was noted by Tom and Andres while reviewing changes to
support the new shared memory stats system, and later showed up in
reality while working on commit 389869af.
Back-patch to 11, where dshash.c arrived.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de
Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
This addresses two issues in the tests of test_oat_hooks:
- The role regress_test_user was being left behind, preventing the test
from succeeding on repeated runs. It makes sense to leave some objects
behind to have more coverage for pg_upgrade (as does test_pg_dump), but
the role dropped here does not own any objects so there is no reason to
keep it.
- GRANT SET ON PARAMETER is issued, creating an entry in
pg_parameter_acl without cleaning up the entry created. This causes
an overlap with unsafe_tests as both use work_mem, making the latter
fail. This commit adds an extra REVOKE SET ON PARAMETER to clean the
contents of pg_parameter_acl, switching to maintenance_work_mem rather
than work_mem to avoid an overlap between both tests.
The tests of test_oat_hooks cannot use installcheck yet as these are
proving to be unstable with caching and the namespace search hooks, so
the issues fixed here cannot be reached yet, but they would be once the
hook issue is addressed and installcheck is allowed again in
test_oat_hooks.
Discussion: https://postgr.es/m/YrpVkADAY0knF6vM@paquier.xyz
Backpatch-through: 15
The original comments mentioned a "parameter" as something not defined
in a fast-exit path to assume a true status. This is rather confusing,
as the parameter DefElem is defined; the intention is to check whether
its value is defined. This improves both comments to mention the value
assigned to the DefElem instead, so that future patches carry the
correct wording along if this code pattern gets copied around more.
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv0yWynWTmp4o34s0d98xVubys9fy=p0YXsZ5_sUcNnMw@mail.gmail.com
* Remove arbitrary mention of certain endianness and bitness variants;
it's enough to say that applicable variants are expected to work.
* List RISC-V (known to work, being tested).
* List SuperH and M88K (code exists, unknown status, like M68K).
* De-list VAX and remove code (known not to work).
* Remove stray trace of Alpha (support was removed years ago).
* List illumos, DragonFlyBSD (known to work, being tested).
* No need to single Windows out by listing a specific version, when we
don't do that for other OSes; it's enough to say that we support
current versions of the listed OSes (when 16 ships, that'll be
Windows 10+).
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Discussion: https://postgr.es/m/CA%2BhUKGKk7NZO1UnJM0PyixcZPpCGqjBXW_0bzFZpJBGAf84XKg%40mail.gmail.com
When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as
well as sending an interrupt signal (depending on stty settings). At
least libedit (but maybe also libreadline) is then confused about the
current cursor location, and corrupts the display if you try to scroll
back. Fix, by moving to a new line before the next prompt is displayed.
Back-patch to all supported releases.
Author: Pavel Stehule <pavel.stehule@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us
Fix incorrect reporting of the location of errors (such as bogus
node attributes). Add header comments to the generated files,
containing copyright notices and reminders that they are generated
files, as we do in other file-generating scripts. Arrange to not
leave a clutter of temporary files when the script detects an error.
Discussion: https://postgr.es/m/3843645.1657385930@sss.pgh.pa.us
copyfuncs.c and friends no longer seem like great places to put
high-level remarks about what's covered and what isn't. Move that
material to backend/nodes/README and other more-prominent places.
Add back (versions of) some remarks that disappeared in 2be87f092.
Discussion: https://postgr.es/m/3843645.1657385930@sss.pgh.pa.us
In the same vein as commit 251154beb, make it clear that we never
instantiate PlanState.
Also mark MemoryContextData as abstract. This has no effect right now,
since memnodes.h isn't one of the files fed to gen_node_support.pl.
But it seems like good documentation and future-proofing.
Add a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.
For each of the four node support files, it creates two include files,
e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main
file. All the scaffolding of the main file stays in place.
I have tried to mostly make the coverage of the output match what is
currently there. For example, one could now do out/read coverage of
utility statement nodes, but I have manually excluded those for now.
The reason is mainly that it's easier to diff the before and after,
and adding a bunch of stuff like this might require a separate
analysis and review.
Subtyping (TidScan -> Scan) is supported.
For the hard cases, you can just write a manual function and exclude
generating one. For the not so hard cases, there is a way of
annotating struct fields to get special behaviors. For example,
pg_node_attr(equal_ignore) has the field ignored in equal functions.
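As an illustration, an annotated field looks roughly like this (an
abridged sketch of a node struct, not a complete definition):

    typedef struct Var
    {
        Expr        xpr;
        int         varno;      /* compared normally by equal() */
        /* ignored by the generated equal function */
        Index       varnosyn pg_node_attr(equal_ignore);
    } Var;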
(In this patch, I have only ifdef'ed out the code that could be removed,
mainly so that it won't constantly have merge conflicts. It will be
deleted in a separate patch. All the code comments that are worth
keeping from those sections have already been moved to the header
files where the structs are defined.)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
PostgreSQL contains an implementation of the red-black tree. The red-black
tree is an ordered data structure, and one of its advantages is the ability
to do inequality searches. This commit adds the rbt_find_less() and
rbt_find_great() functions implementing these searches. While these searches
aren't yet used in the core code, they might be useful for extensions.
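A sketch of how an extension might use them, assuming that, like
rbt_find(), the lookups take a caller struct embedding an RBTNode and
that an equal_match flag controls whether exact matches qualify:

    RBTNode    *node;

    /* largest entry less than (or, with equal_match, equal to) key */
    node = rbt_find_less(tree, &key.rbtnode, true);

    /* smallest entry strictly greater than key */
    node = rbt_find_great(tree, &key.rbtnode, false);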
Discussion: https://postgr.es/m/CAGRrpzYE8-7GCoaPjOiL9T_HY605MRax-2jgTtLq236uksZ1Sw%40mail.gmail.com
Author: Steve Chavez, Alexander Korotkov
Reviewed-by: Alexander Korotkov
Commit 9a974cbcba did this for user
tables, but pg_upgrade treats pg_largeobject as a user table, and so
needs the same treatment. Without this fix, if you rewrite the
pg_largeobject table and then perform an upgrade with pg_upgrade, the
table will apparently be empty on the new cluster, while all of your
objects will end up with an orphaned file.
With this fix, instead of the old cluster's pg_largeobject files ending
up orphaned, the original files from the new cluster do. That's mostly
harmless because we expect the table to be empty, but we might want
to arrange to remove them as part of the upgrade. Since we're still
debating the best way of doing that, I (rhaas) have decided to postpone
dealing with that problem and get the basic fix committed.
Justin Pryzby, reviewed by Shruthi Gowda and by me.
Discussion: http://postgr.es/m/20220701231413.GI13040@telsasoft.com
This CPU architecture has been discontinued. We already removed HP-UX
support, we never supported Windows/Itanium, and the open source
operating systems that a vintage hardware owner might hope to run have
all either ended Itanium support or never fully released support (NetBSD
may eventually). The extra code we carry for this rare ISA is now
untested. It seems like a good time to remove it.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/1415825.1656893299%40sss.pgh.pa.us
HP-UX hardware is no longer produced, build farm coverage recently
ended, and there are no known active maintainers targeting this OS.
Since there is a major rewrite of the build system in the pipeline for
PostgreSQL 16, and that requires development, testing and maintenance
for each OS and tool chain, it seems like a good time to drop support
for:
* HP-UX, the operating system.
* HP aCC, the HP-UX native compiler.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/1415825.1656893299%40sss.pgh.pa.us
These are documented to be the allowed types for the RETURNING clause,
but the restriction was not being enforced, which caused a segfault if
another type was specified. Add some testing for this.
Per report from a.kozhemyakin
Backpatch to release 15.
The general convention in the executor is to refer to child plans
and planstates via the outerPlan[State] and innerPlan[State]
macros, but a few places didn't do it like that. For consistency
and readability, convert all the stragglers to use the macros.
(See also commit 40f42d2a3, which did some similar cleanup a few
years ago, but missed these cases.)
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-vYhh1xsa_veah4PUed2Xq=Ed_YH3=Mqt5A3Y=EgfCEg@mail.gmail.com
It is useful for debugging purposes to report the checkpoint LSN and
REDO LSN in log_checkpoints message. It can give more context while
analyzing checkpoint-related issues. pg_controldata reports the last
checkpoint LSN and REDO LSN, but having this information in the log
message helps analyze issues that happened previously, connect the
dots, and identify the root cause.
Author: Bharath Rupireddy, Kyotaro Horiguchi
Reviewed-by: Michael Paquier, Julien Rouhaud, Nathan Bossart, Fujii Masao, Greg Stark
Discussion: https://postgr.es/m/CALj2ACWt6kqriAHrO+AJj+OmP=suwbktHT5JoYAn-nqZe2gd2g@mail.gmail.com
When locking a specific named relation for a FOR [KEY] UPDATE/SHARE
clause, transformLockingClause() finds the relation to lock by
scanning the rangetable for an RTE with a matching eref->aliasname.
However, it failed to account for the visibility rules of a join RTE.
If a join RTE doesn't have a user-supplied alias, it will have a
generated eref->aliasname of "unnamed_join" that is not visible as a
relation name in the parse namespace. Such an RTE needs to be skipped,
otherwise it might be found in preference to a regular base relation
with a user-supplied alias of "unnamed_join", preventing it from being
locked.
In addition, if a join RTE doesn't have a user-supplied alias, but
does have a join_using_alias, then the RTE needs to be matched using
that alias rather than the generated eref->aliasname, otherwise a
misleading "relation not found" error will be reported rather than a
"join cannot be locked" error.
Backpatch all the way, except for the second part which only goes back
to 14, where JOIN USING aliases were added.
Dean Rasheed, reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
This commit bumps the runtime value of _WIN32_WINNT to be 0x0A00 for any
builds on Windows. Hence, this makes Windows 10 the minimal requirement
when running PostgreSQL under WIN32, be it for builds of Cygwin, MinGW
or Visual Studio.
The previous minimal runtime version was either Windows Vista when
building with at least Visual Studio 2015 or Windows XP for the rest.
Windows 10 is the most modern version supported by Microsoft, and per
discussion, as we don't have buildfarm members that run older versions
anymore, it is the minimal supported version that best suits our needs.
This will actually make the development of some patches easier, two
being async I/O and large page handling, by avoiding a lot of
compatibility gotchas on platforms that most likely have few users
anyway.
It is possible to remove MIN_WINNT in win32.h and the macros
IsWindowsXXXOrGreater() that were used in the code at runtime to check
which version of Windows was getting used. The change in pg_locale.c
comes from Juan. Note that all my tests passed, and that the CI is
green. The buildfarm will quickly tell if this needs more adjustments.
Author: Michael Paquier, Juan José Santamaría Flecha
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Yo7tHKD8VCkeNi71@paquier.xyz
By convention, the tab-complete subscription parameters are listed in the
COMPLETE_WITH lists in alphabetical order, but when the "disable_on_error"
parameter was introduced this was not done.
This patch just tidies that up.
Reported-by: Peter Smith
Author: Peter Smith
Reviewed-by: Euler Taveira, Takamichi Osumi
Backpatch-through: 15, where it was introduced
Discussion: https://postgr.es/m/CAHut+PucvKZgg_eJzUW--iL6DXHg1Jwj6F09tQziE3kUF67uLg@mail.gmail.com
That comment might have been true at some point during development, but
definitely isn't anymore.
Reported-By: Melanie Plageman <melanieplageman@gmail.com>
Backpatch: 15-
We hadn't noticed this because it's dead code: there is no
situation where we read raw parse trees from text format.
So maybe the right fix is to remove the function altogether,
but I'll forbear for now; it's not the only dead code in
readfuncs.c, I think.
Noted while comparing existing code to the results of
Peter's auto-generation script.
40af10b57 changed things so we make use of a generation memory context for
storing tuples to be sorted by tuplesort.c. That change does not play
nicely with the changes made in 9f03ca915 (back in 2014). That commit
changed things so that index_form_tuple() is called while switched into
the tuplesort's tuplecontext. In order to form the index tuple,
index_form_tuple() must do various memory allocations which are unrelated
to the storage of the final returned tuple. Although all of these
allocations are pfree'd, the fact that we now use a generation context
means that the memory for these pfree'd allocations won't be used again by
any other allocation due to generation.c's lack of freelists. This could
result in sorts used for building indexes exceeding maintenance_work_mem
by a very large amount.
Here we fix it so we no longer allocate anything apart from the tuple
itself into the generation context by adding a new version of
index_form_tuple named index_form_tuple_context, which can be called to
specify the MemoryContext to allocate the tuple into.
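A sketch of the intended call pattern (the context variable name is
illustrative):

    IndexTuple  itup;

    /*
     * Scratch allocations stay in CurrentMemoryContext; only the
     * returned tuple is allocated in the context passed in.
     */
    itup = index_form_tuple_context(tupleDescriptor, values, isnull,
                                    tuplecontext);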
Discussion: https://postgr.es/m/CAApHDvrHQkiFRHiGiAS-LMOvJN-eK-s762=tVzBz8ZqUea-a_A@mail.gmail.com
Backpatch-through: 15, where 40af10b57 was added.
There's no reason anymore to only drop subscription stats if associated with a
slot, now that stats drops are transactional. And since there's now no other
cleanup of stats, this would lead to stats for slot-less subscriptions
getting leaked (however, most slot-less subs won't have stats).
Additionally, the comment referring to autovacuum cleaning up stats was
clearly outdated.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoAwiby3HeJE7vJe16Gr75RFfJ640dyHqvsiUhyKJTXPtw@mail.gmail.com
Backpatch: 15-
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from their
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
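For reference, the result looks roughly like this (abridged):

    typedef Oid RelFileNumber;      /* same width as Oid, for now */

    typedef struct RelFileLocator
    {
        Oid             spcOid;    /* tablespace */
        Oid             dbOid;     /* database */
        RelFileNumber   relNumber; /* relation */
    } RelFileLocator;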
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
50e17ad28 increased the size of ExprEvalStep from 64 bytes up to 88 bytes.
Lots of effort was spent during the development of the current expression
evaluation code to make an instance of this struct as small as possible.
Making this struct larger than needed reduces CPU cache efficiency during
expression evaluation which causes noticeable slowdowns during query
execution.
In order to reduce the size of the struct, here we remove the fn_addr
field. The values from this field can be obtained via fcinfo, just with
some extra pointer dereferencing. The extra indirection does not seem to
cause any noticeable slowdowns.
Various other fields have been moved into the ScalarArrayOpExprHashTable
struct. These fields are only used when the ScalarArrayOpExprHashTable
pointer has already been dereferenced, so no additional pointer
dereferences occur for these. Here we also make hash_fcinfo_data the last
field in ScalarArrayOpExprHashTable so that we can avoid a further pointer
dereference to get the FunctionCallInfoBaseData. This also saves a call to
palloc().
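The layout idea, sketched rather than quoted exactly:

    typedef struct ScalarArrayOpExprHashTable
    {
        saophash_hash *hashtab;     /* underlying hash table */
        struct ExprEvalStep *op;    /* owning expression step */
        FmgrInfo    hash_finfo;     /* hash function lookup info */
        /*
         * Variable-size member kept last: reachable without a further
         * pointer dereference, and sized in the same palloc().
         */
        FunctionCallInfoBaseData hash_fcinfo_data;
    } ScalarArrayOpExprHashTable;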
50e17ad28 was added in 14, but it's too late to adjust the size of the
ExprEvalStep in that version, so here we just backpatch to 15, which is
currently in beta.
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Backpatch-through: 15
Some routines open-coded the construction of DataRow messages. Use
TupOutputState struct and associated functions instead, which was
already done in some places.
SendTimeLineHistory() is a bit more complicated and isn't converted by
this commit.
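The pattern being standardized on looks roughly like this (values
illustrative):

    TupOutputState *tstate;
    Datum       values[1];
    bool        nulls[1] = {false};

    tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
    values[0] = CStringGetTextDatum("example");
    do_tup_output(tstate, values, nulls);   /* one DataRow message */
    end_tup_output(tstate);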
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com
macOS has traditionally used extension .dylib for shared libraries
(used at build time) and .so for dynamically loaded modules (used by
dlopen()). This complicates the build system a bit. Also, Meson uses
.dylib for both, so it would be worth unifying this in order to be
able to get equal build output.
There doesn't appear to be any reason to use any particular extension
for dlopened modules, since dlopen() will accept anything and
PostgreSQL is well-factored to be able to deal with any extension.
Other software packages that I have handy appear to be about 50/50
split on which extension they use for their plugins. So it seems
possible to change this safely.
Discussion: https://www.postgresql.org/message-id/flat/bcc45f78-e3c3-8fb3-7c42-5371b48b5266%40enterprisedb.com
auto_explain.log_parameter_max_length is a new GUC part of the
extension, similar to the corresponding core setting, that controls the
inclusion of query parameters in the logged explain output.
More tests are added to check the behavior of this new parameter: when
parameters are logged in full (the default of -1), when disabled (value of
0), and when partially truncated (any other value).
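Presumably the parameter is registered in _PG_init() along these lines
(descriptions and flags illustrative, not the exact code):

    static int  auto_explain_log_parameter_max_length = -1;

    DefineCustomIntVariable("auto_explain.log_parameter_max_length",
                            "Sets the maximum length of query parameters to log.",
                            "-1 logs the parameters in full, 0 disables logging.",
                            &auto_explain_log_parameter_max_length,
                            -1, -1, INT_MAX,
                            PGC_SUSET, 0,
                            NULL, NULL, NULL);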
Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87ee09mohb.fsf@wibble.ilmari.org
Previously the timer was enabled whenever there were any pending stats after
executing a statement, just to then be disabled again when not idle
anymore. That led to an increase in GetCurrentTimestamp() calls from within
timeout.c compared to 14.
To avoid that increase, leave the timer enabled until stats are reported,
rather than until idle. The timer is only disabled once the pending stats have
been reported.
For me this fixes the increase in GetCurrentTimestamp() calls; in the
previously slowed-down workload there are now fewer calls in 15 than in 14.
While at it, also update assertion in pgstat_report_stat() to be more precise.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Backpatch: 15-
The new expression step types increased the size of ExprEvalStep by ~4 bytes for all
types of expression steps, slowing down expression evaluation noticeably. Move
them out of line.
There's other issues with these expression steps, but addressing them is
largely independent of this aspect.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Backpatch: 15-
This reverts most of 91c0570a79, f28bf667f6, fe0972ee5e, afdeff1052. The
only thing left is the retry loop in 019_replslot_limit.pl that avoids
spurious failures by retrying a couple times.
We haven't seen any hard evidence that this is caused by anything but slow
process shutdown. We did not find any cases where walsenders did not vanish
after waiting for longer. Therefore there's no reason for this debugging code
to remain.
Discussion: https://postgr.es/m/20220530190155.47wr3x2prdwyciah@alap3.anarazel.de
Backpatch: 15-
When we changed some built-in functions to use anycompatiblearray
instead of anyarray, we created a dump/restore hazard for user-defined
operators and aggregates relying on those functions: the user objects
have to be modified to change their signatures similarly. This causes
pg_upgrade to fail partway through if the source installation contains
such objects. We generally try to have pg_upgrade detect such hazards
and fail before it does anything exciting, so add logic to detect
this case too.
Back-patch to v14 where the change was made.
Justin Pryzby, reviewed by Andrey Borodin
Discussion: https://postgr.es/m/3383880.QJadu78ljV@vejsadalnx
Noted while comparing existing code to the output of the proposed
patch to automate creation of these functions. Some of the changes
are just cosmetic, but others represent real bugs. I've not
attempted to analyze the user-visible impact.
Back-patch to v15 where this code came in.
Discussion: https://postgr.es/m/1794155.1656984188@sss.pgh.pa.us
We were going into IDLE state too soon when executing queries via
PQsendQuery in pipeline mode, causing several scenarios to misbehave in
different ways -- most notably, as reported by Daniele Varrazzo, that a
warning message is produced by libpq:
    message type 0x33 arrived from server while idle
But it is also possible, if queries are sent and results consumed not in
lockstep, for the expected mediating NULL result values from PQgetResult
to be lost (a problem which has not been reported, but which is more
serious).
Fix this by introducing two new concepts: one is a command queue element
PGQUERY_CLOSE to tell libpq to wait for the CloseComplete server
response to the Close message that is sent by PQsendQuery. Because the
application is not expecting any PGresult from this, the mechanism to
consume it is a bit hackish.
The other concept, authored by Horiguchi-san, is a PGASYNC_PIPELINE_IDLE
state for libpq's state machine to differentiate "really idle" from
merely "the idle state that occurs in between reading results from the
server for elements in the pipeline". This makes libpq not go fully
IDLE when the libpq command queue contains entries; in normal cases, we
only go IDLE once at the end of the pipeline, when the server response
to the final SYNC message is received. (However, there are corner cases
it doesn't fix, such as terminating the query sequence by
PQsendFlushRequest instead of PQpipelineSync; this sort of scenario is
what requires the PGQUERY_CLOSE bit above.)
This last bit helps make the libpq state machine clearer; in particular
we can get rid of an ugly hack in pqParseInput3 to avoid considering
IDLE as such when the command queue contains entries.
A new test mode is added to libpq_pipeline.c to tickle some related
problematic cases.
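A minimal sketch of the affected usage pattern:

    PGresult   *res;

    PQenterPipelineMode(conn);
    PQsendQuery(conn, "SELECT 1");
    PQsendQuery(conn, "SELECT 2");
    PQpipelineSync(conn);

    /* each query's results are terminated by a mediating NULL */
    for (int i = 0; i < 2; i++)
        while ((res = PQgetResult(conn)) != NULL)
            PQclear(res);

    /* a PGRES_PIPELINE_SYNC result then answers the sync message */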
Reported-by: Daniele Varrazzo <daniele.varrazzo@gmail.com>
Co-authored-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA+mi_8bvD0_CW3sumgwPvWdNzXY32itoG_16tDYRu_1S2gV2iw@mail.gmail.com
Amendment to 84ad713cf8: Not all
prepared statements have a result descriptor. As currently coded,
this would crash when reading pg_prepared_statements. Make those
cases return null for result_types instead. Also add a test case for
it.
A previous commit replaced all the calls to this function with
durable_rename() as of dac1ff3, making it used nowhere in the tree.
Using it in extension code is also risky based on the issues described
in this previous commit, so let's remove it. This makes possible the
removal of HAVE_WORKING_LINK.
Author: Nathan Bossart
Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
durable_rename_excl() attempts to avoid overwriting any existing files
by using link() and unlink(), and it falls back to rename() on some
platforms (aka WIN32), which offers no such overwrite protection. Most
callers use durable_rename_excl() just in case there is an existing
file, but in practice there shouldn't be one (see below for more
details).
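In essence, the function behaves like this (a simplified gist; fsync
and error reporting omitted):

    #ifdef HAVE_WORKING_LINK
        if (link(oldfile, newfile) < 0)
            return -1;          /* EEXIST if newfile already exists */
        unlink(oldfile);        /* a crash here leaves two hard links */
    #else
        if (rename(oldfile, newfile) < 0)   /* silently overwrites */
            return -1;
    #endif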
Furthermore, failures during durable_rename_excl() can result in
multiple hard links to the same file. As per Nathan's tests, it is
possible to end up with two links to the same file in pg_wal after a
crash just before unlink() during WAL recycling. Specifically, the test
produced links to the same file for the current WAL file and the next
one because the half-recycled WAL file was re-recycled upon restarting,
leading to WAL corruption.
This change replaces all the calls of durable_rename_excl() to
durable_rename(). This removes the protection against accidentally
overwriting an existing file, but some platforms are already living
without it and ordinarily there shouldn't be one. The function itself
is left around in case any extensions are using it. It will be removed
on HEAD via a follow-up commit.
Here is a summary of the existing callers of durable_rename_excl() (see
second discussion link at the bottom), replaced by this commit. First,
basic_archive used it to avoid overwriting an archive concurrently
created by another server, but as mentioned above, it will still
overwrite files on some platforms. Second, xlog.c uses it to recycle
past WAL segments, where an overwrite should not happen (origin of the
change at f0e37a8) because there are protections around the WAL segment
to select when recycling an entry. The third and last area is related
to the write of timeline history files. writeTimeLineHistory() will
write a new timeline history file at the end of recovery on promotion,
so there should be no such files for the same timeline.
What remains is writeTimeLineHistoryFile(), that can be used in parallel
by a WAL receiver and the startup process, and some digging of the
buildfarm shows that EEXIST from a WAL receiver can happen with an error
of "could not link file \"pg_wal/xlogtemp.NN\" to \"pg_wal/MM.history\",
which would cause an automatic restart of the WAL receiver as it is
promoted to FATAL, hence this should improve the stability of the WAL
receiver as rename() would overwrite an existing TLI history file
already fetched by the startup process at recovery.
This is a bug fix, but knowing the unlikeliness of the problem involving
one or more crashes at an exceptionally bad moment, no backpatch is
done. Also, I want to be careful with such changes (aaa3aed did the
opposite of this change by removing HAVE_WORKING_LINK so that Windows
would do a link() rather than a rename(), but this was not
concurrent-safe). A backpatch could be revisited in the future. This
is the second time this change is attempted, ccfbd92 being the first
one, but this time no assertions are added for the case of a TLI history
file written concurrently by the WAL receiver or the startup process
because we can expect one to exist (some of the TAP tests are able to
trigger this with the proper timing).
Author: Nathan Bossart
Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
Discussion: https://postgr.es/m/Ym6GZbqQdlalSKSG@paquier.xyz
Use it for RelationSyncEntry->streamed_txns, which is currently using an
integer list.
The API support is not complete, not because it is hard to write but
because it's unclear that it's worth the code space, there being so
little use of XID lists.
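A minimal sketch of the new usage, assuming the helpers follow the
conventions of the existing OID-list functions:

    List       *streamed_txns = NIL;

    streamed_txns = lappend_xid(streamed_txns, xid);

    if (list_member_xid(streamed_txns, xid))
        return true;            /* transaction already tracked */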
Discussion: https://postgr.es/m/202205130830.g5ntonhztspb@alvherre.pgsql
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Attempting such an operation would already fail, but in various and
confusing ways. For example, while in recovery, some elog() messages
would be reported, but these should never be user-facing. This commit
restricts any write operations done on large objects in a read-only
context, so that the errors generated are more user-friendly. This is per
the discussion done with Tom Lane and Robert Haas.
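One plausible shape for such a guard is the helper already used for
DML; whether the commit uses exactly this call is an assumption here:

    /* at the top of each large-object write entry point */
    PreventCommandIfReadOnly("lo_unlink()");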
Some regression tests are added to check the case of all the SQL
functions working on large objects (including an update of the test's
alternate output).
Author: Yugo Nagata
Discussion: https://postgr.es/m/20220527153028.61a4608f66abcd026fd3806f@sraoss.co.jp
Amendment to ec40f34224: We also need to
change the way the datum is supplied to int8. Otherwise, the value is
still cut off as an int4, and it will crash on 32-bit platforms.
Several buildfarm members are failing the pg_upgrade test in
REL_15_STABLE, though the identical test is fine in HEAD.
On thorntail it's possible to see that the problem is an
overlength socket path name, and I bet the same is true
on the others.
The normally-started postmasters used in the test are already
set up with short socket directory paths, but we neglected to
tell pg_upgrade itself to do likewise when starting child
postmasters, and indeed it seems to be explicitly selecting
the test directory instead.
Back-patch to v15 where the current test script was introduced.
(The previous script might have the same issue, because I don't
see any use of -s/--socketdir in it either; but we've had no
complaints, so leave it alone for now.)
Discussion: https://postgr.es/m/1410025.1656890531@sss.pgh.pa.us
None of the other bison parsers contains this directive, and it gives
rise to some unfortunate and impenetrable messages, so just remove it.
Backpatch to release 12, where it was introduced.
Per gripe from Erik Rijkers
Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl
Interpret its privileges argument as a comma-separated list of
privilege names, as in has_table_privilege and other functions.
This is actually net less code, since the support routine to
parse that already exists, and we can drop convert_priv_string()
which had no other use-case.
Robins Tharakan
Discussion: https://postgr.es/m/e5a05dc54ba64408b3dd260171c1abaf@EX13D05UWC001.ant.amazon.com
After commit 662dbe2c8, psql tab completion didn't conveniently
support the case of "ALTER EXTENSION foo UPDATE". It'd always
add "TO", which is fine if you want to specify a target version
but not if you don't ... and surely the latter is the much more
common case.
To fix, remove "TO" from the initially offered completion; you now
need to press TAB one additional time to get that. We won't try to
duplicate the old behavior of attempting initial completion on the
target version along with TO. It's too squirrelly to get the quoting
right, and this is such an infrequent usage that it doesn't seem worth
expending a lot of effort and special code on.
Noted by Noah Misch. Back-patch to v15.
Discussion: https://postgr.es/m/20220703083217.GB2476530@rfd.leadboat.com
Per buildfarm member prairiedog, this platform rejects uninitialized
global variables in shared libraries. Back-patch to v10, like the
addition of the variable.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20220703030619.GB2378460@rfd.leadboat.com
ecpglib has been calling it once per SQL query and once per EXEC SQL GET
DESCRIPTOR. Instead, if newlocale() has not succeeded before, call it
while establishing a connection. This mitigates three problems:
- If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
proceeded without the intended locale change.
- On AIX, each newlocale()+freelocale() cycle leaked memory.
- newlocale() CPU usage may have been nontrivial.
Fail the connection attempt if newlocale() fails. Rearrange
ecpg_do_prologue() to validate the connection before its uselocale().
The sort of program that may regress is one running in an environment
where newlocale() fails. If that program establishes connections
without running SQL statements, it will stop working in response to this
change. I'm betting against the importance of such an ECPG use case.
Most SQL execution (any using ECPGdo()) has long required newlocale()
success, so there's little a connection could do without newlocale().
Back-patch to v10 (all supported versions).
Reviewed by Tom Lane. Reported by Guillaume Lelarge.
Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
POSIX shm_open() can sleep for a long time and fail spuriously because
of contention on an internal lock file on Solaris (and presumably
illumos). Commit 389869af fixed the main problem with this, namely that
we could crash, but it's now clear that "posix" is not a good default.
Therefore, choose "sysv" at initdb time on Solaris and illumos. Other
choices are still available by editing the postgresql.conf file.
Back-patch only to 15, because contention is much less likely further
back, and it doesn't seem like a good idea to change this in released
branches. This should clear up the failures on build farm animal
margay.
Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
pg_attribute_nonnull(...) can be used to generate compiler warnings
when a function is called with the specified arguments set to NULL, as
per an idea from Andres Freund. An empty argument list indicates that
no pointer arguments can be NULL. pg_attribute_nonnull() only works for
compilers that support the nonnull function attribute. If nonnull is
not supported, pg_attribute_nonnull() has no effect.
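Its definition presumably reduces to something like this (a sketch; the
real feature test in c.h may differ, and some_function is hypothetical):

    #if __has_attribute(nonnull)
    #define pg_attribute_nonnull(...) __attribute__((nonnull(__VA_ARGS__)))
    #else
    #define pg_attribute_nonnull(...)
    #endif

    /* warn when argument 1 is a constant NULL */
    extern void some_function(const char *name) pg_attribute_nonnull(1);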
As a beginning, this commit uses it for the DefineCustomXXXVariable()
functions to generate warnings when the "name" and "value" arguments are
set to NULL. This will likely be expanded to other places in the
future, where it makes sense.
Author: Nathan Bossart
Reviewed by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20220525061739.ur7x535vtzyzkmqo@alap3.anarazel.de
There were many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcoded the type attributes necessary to pass to these
functions.
To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge. This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.
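For example, building a text[] value (a sketch of a typical call site
before and after):

    /* before: every caller repeated text's type attributes */
    array = construct_array(datums, ndatums, TEXTOID,
                            -1, false, TYPALIGN_INT);

    /* after: the wrapper supplies them from the element type OID */
    array = construct_array_builtin(datums, ndatums, TEXTOID);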
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2914356f-9e5f-8c59-2995-5997fc48bcba%40enterprisedb.com
Previously, we trusted the OS not to report EEXIST unless we'd passed in
IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate. Solaris's
shm_open() can in fact do that, causing us to crash because we didn't
ereport and then we blithely assumed the mapping was successful.
Let's treat EEXIST just like any other error, unless we're actually
trying to create a new segment. This applies to shm_open(), where this
behavior has been seen, and also to the equivalent operations for our
sysv and mmap modes just on principle.
Based on the underlying reason for the error, namely contention on a
lock file managed by Solaris librt for each distinct name, this problem
is only likely to happen on 15 and later, because the new shared memory
stats system produces shm_open() calls for the same path from
potentially large numbers of backends concurrently during
authentication. Earlier releases only shared memory segments between a
small number of parallel workers under one Gather node. You could
probably hit it if you tried hard enough though, and we should have been
more defensive in the first place. Therefore, back-patch to all
supported releases.
Per build farm animal margay. This isn't the end of the story, though:
it just changes random crashes into random "File exists" errors; more
work is needed for a green build farm.
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
pg_start_backup() and pg_stop_backup() have been respectively renamed to
pg_backup_start() and pg_backup_stop() as of 39969e2, but a few comments
did not get the call.
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/YrqGlj1+4DF3dbZ/@paquier.xyz
MemSet() with a value other than 0 just falls back to memset(), so the
indirection is unnecessary if the value is constant and not 0. Since
there is some interest in getting rid of MemSet(), this gets some easy
cases out of the way. (There are a few MemSet() calls that I didn't
change to maintain the consistency with their surrounding code.)
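An illustrative instance of the rewrite (variable names assumed):

    /* before: with a non-zero constant, MemSet() is just memset() */
    MemSet(flags, true, numCols);

    /* after */
    memset(flags, true, numCols);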
Discussion: https://www.postgresql.org/message-id/flat/CAEudQApCeq4JjW1BdnwU=m=-DvG5WyUik0Yfn3p6UNphiHjj+w@mail.gmail.com
plpython_unicode_3.out was already removed a long time ago, so it
being listed here was very out of date.
plpython_types_3.out was removed with the Python 2 removal.
TransactionIdIsInProgress had a fast path to return 'false' if the
single-item CLOG cache said that the transaction was known to be
committed. However, that was wrong, because a transaction is first
marked as committed in the CLOG but doesn't become visible to others
until it has removed its XID from the proc array. That could lead to an
error:
    ERROR: t_xmin is uncommitted in tuple to be updated
or for an UPDATE to go ahead without blocking, before the previous
UPDATE on the same row was made visible.
The window is usually very short, but synchronous replication makes it
much wider, because the wait for synchronous replica happens in that
window.
Another thing that makes it hard to hit is that it's hard to get such
a commit-in-progress transaction into the single item CLOG cache.
Normally, if you call TransactionIdIsInProgress on such a transaction,
it determines that the XID is in progress without checking the CLOG
and without populating the cache. One way to prime the cache is to
explicitly call pg_xact_status() on the XID. Another way is to use a
lot of subtransactions, so that the subxid cache in the proc array is
overflowed, making TransactionIdIsInProgress rely on pg_subtrans and
CLOG checks.
This has been broken ever since it was introduced in 2008, but the race
condition is very hard to hit, especially without synchronous
replication. There were a couple of reports of the error starting from
summer 2021, but no one was able to find the root cause then.
TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
but I left it in place in backbranches in case it's used by extensions.
Also change pg_xact_status() to check TransactionIdIsInProgress().
Previously, it only checked the CLOG, and returned "committed" before
the transaction was actually made visible to other queries. Note that
this also means that you cannot use pg_xact_status() to reproduce the
bug anymore, even if the code wasn't fixed.
Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
the pg_xact_status() change added by me.
Author: Simon Riggs
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
Previously, we encoded both NULL and the first byte at the base address
as 0. That confusion led to the assertion in commit e07d4ddc, which
failed when min_dynamic_shared_memory was used. Give them distinct
encodings, by switching to 1-based offsets for non-NULL pointers. Also
improve macro hygiene in passing (missing/misplaced parentheses), and
remove open-coded access to the raw offset value from freepage.c/h.
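The encoding idea, expressed as standalone helpers rather than the
actual macros:

    /* 0 encodes NULL; pointer p is stored as its offset plus one */
    static inline size_t
    relptr_encode(char *base, char *p)
    {
        return (p == NULL) ? 0 : (size_t) (p - base) + 1;
    }

    static inline char *
    relptr_decode(char *base, size_t off)
    {
        return (off == 0) ? NULL : base + off - 1;
    }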
Although e07d4ddc was back-patched to 10, the only code that actually
makes use of relptr at the base address arrived in 84b1c63a, so no need
to back-patch further than 14 for now.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20220519193839.GT19626%40telsasoft.com
Commit 64919aaab made pull_up_simple_subquery set rte->subquery = NULL
after doing the deed, so that we don't waste cycles copying a
now-useless subquery tree around. This turns out to create a core dump
hazard in range_table_mutator, which supposes that that field is never
NULL. Apparently none of our own code invokes query_tree_mutator or
range_table_mutator on the top Query after subquery pullup; but it
wouldn't be surprising if outside code does, and anyway I'm working
on a v16 patch that will need it.
We can fix this cleanly by just getting rid of the special-case
handling of this field and treating it more like all the rest.
I think the special case might be left over from a time when
QTW_DONT_COPY_QUERY was the default behavior, but that was eons ago.
Thanks to Dean Rasheed for review.
Discussion: https://postgr.es/m/545569.1656107045@sss.pgh.pa.us
Since commit 6a2a70a02, we've used signalfd() to receive latch wakeups
when building with WAIT_USE_EPOLL (default for Linux and illumos), and
our traditional self-pipe when falling back to WAIT_USE_POLL (default
for other Unixes with neither epoll() nor kqueue()).
Unexplained hangs and kernel panics have been reported on illumos
systems, apparently linked to this use of signalfd(), leading illumos
users and build farm members to have to define WAIT_USE_POLL explicitly
as a work-around. A bug report exists at
https://www.illumos.org/issues/13700 but no fix is available yet.
Let's provide a way for illumos users to go back to self-pipes with
epoll(), like releases before 14, and choose that by default. No change
for Linux users. To help with development/debugging, macros
WAIT_USE_{EPOLL,POLL} and WAIT_USE_{SIGNALFD,SELF_PIPE} can be defined
explicitly to override the defaults.
Back-patch to 14, where we started using signalfd().
Reported-by: Japin Li <japinli@hotmail.com>
Reported-by: Olaf Bohlen <olbohlen@eenfach.de> (off-list)
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/MEYP282MB1669C8D88F0997354C2313C1B6CA9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Commit a117cebd63 used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended. That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.
Nathan Bossart and Noah Misch
Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
The ssl test "IPv4 host with CIDR mask does not match" apparently has
a portability problem. Some operating systems don't reject the host
name specification "192.0.2.1/32" as an IP address, and that is then
later rejected when the SNI is set, which results in a different error
message from the one the test is supposed to verify.
The value of the test has been questioned in the discussion, and it
was suggested that removing it would be an acceptable fix, so that's
what this is doing.
Reported-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Bug: #17522
Discussion: https://www.postgresql.org/message-id/flat/17522-bfcd5c603b5f4daa%40postgresql.org
The test was not waiting for the subscriber's data synchronization to
happen after refreshing the publication on the subscriber side. This leads
the subscriber's apply worker to skip applying the changes on the
corresponding relation, which results in a test failure.
Reported-by: Hou Zhijie, as per buildfarm
Author: Hou Zhijie
Reviewed-by: Masahiko Sawada, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716A69496A8E2F2E155DB8D94B59@OS0PR01MB5716.jpnprd01.prod.outlook.com
It has been reported that PL/Tcl built on macOS with GCC >=11 crashes.
The reason is that there is a hash_search() function in the operating
system's libraries, and that ends up being called instead of the one
in postgres. This has something to do with how the linker resolves
references between the various possibilities it has been given, and
somehow something changed that it is now picking that one in this
configuration.
We found that removing the -lc from the link command line fixes this
problem. The -lc was introduced a long time ago in commit
e3909672f1, and we think the reasons
might be obsolete, so we decided that we'll try to just remove it and
see if any problems arise.
Discussion: https://www.postgresql.org/message-id/flat/a78c847a-4f79-9286-be99-e819e9e4139e%40enterprisedb.com
When rebuilding the relation mapping on subscribers, we were not releasing
the attribute mapping's memory which was no longer required.
The attribute mapping used in logical tuple conversion was refactored in
PG13 (by commit e1551f96e6) but we forgot to update the related code that
frees the attribute map.
Author: Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila, Shi yu
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
072132f0 used the attnum offset to access the raw_fields array when
checking that the attribute names of the header and of the relation
match, leading to incorrect results or even crashes if the attribute
numbers of a relation are changed, like on a dropped attribute. This
fixes the logic to use the correct attribute names for the header
matching requirements.
Also, this commit disallows HEADER MATCH in COPY TO as there is no
validation that can be done in this case.
The tests are expanded for HEADER MATCH with COPY FROM and dropped
columns, with cases where a relation has a dropped and re-added column,
as well as a reduced set of columns.
Author: Julien Rouhaud
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/20220607154744.vvmitnqhyxrne5ms@jrouhaud
Second thoughts about 9cd43f6cb: given that we're staying bug-compatible
with the old behavior of using double not single quotes for extension
versions, we can simplify this completion code by pretending that
extension versions *are* identifiers, and not using VERBATIM. Then
_complete_from_query() will think that the query results are identifiers
in need of quoting, and we end up with the same behavior as before.
This doesn't work for Query_for_list_of_available_extension_versions_with_TO,
but let's just drop that: there is no other place where we handle
multi-keyword phrases that way, and it doesn't seem very desirable here
either. Handle completion of "UPDATE TO" in our more usual pattern.
Discussion: https://postgr.es/m/CAMkU=1yV+egSYrzWvbDY8VZ6bKEMrKbzxr-HTuiHi+wDgSUMgA@mail.gmail.com
We build the partition map entries on subscribers while applying the
changes for update/delete on partitions. The component relation in each
entry is closed after its use, so we need to update it on each successive
use of the cache entry.
This problem was there since the original commit f1ac27bfda that
introduced this code but we didn't notice it till the recent commit
26b3455afa started to use the component relation of partition map cache
entry.
Reported-by: Tom Lane, as per buildfarm
Author: Amit Langote, Hou Zhijie
Reviewed-by: Amit Kapila, Shi Yu
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
In logical replication, we will check if the target table on the
subscriber is updatable by comparing the replica identity of the table on
the publisher with the table on the subscriber. When the target table is a
partitioned table, we only check its replica identity but not that of the
partition tables. This leads to an assertion failure while applying changes
for update/delete as we expect those to succeed only when the
corresponding partition table has a primary key or has a replica
identity defined.
Fix it by checking the replica identity of the partition table while
applying changes.
Reported-by: Shi Yu
Author: Shi Yu, Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
In 02b8048ba I (tgl) got rid of the need for most tab-completion queries
to return pre-quoted identifiers. But I over-hastily removed the
quote_ident call from Query_for_list_of_available_extension_versions*
too; those still need it, because what is returned isn't an identifier
at all and will (almost?) always need quoting.
Arguably we should use quote_literal here instead. But quote_ident
works too and people may be used to that behavior, so stick with it.
In passing, fix inconsistent omission of schema-qualification in
Query_for_list_of_encodings. That's not a security issue per our
current guidelines, but it ought to be like the rest.
Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1yV+egSYrzWvbDY8VZ6bKEMrKbzxr-HTuiHi+wDgSUMgA@mail.gmail.com
This reverts commits 5753d4ee32 and fe60b67250 that modified HOT to
ignore BRIN indexes. The commit message for 5753d4ee32 claims that:
    When determining whether an index update may be skipped by using
    HOT, we can ignore attributes indexed only by BRIN indexes. There
    are no index pointers to individual tuples in BRIN, and the page
    range summary will be updated anyway as it relies on visibility
    info.
This is partially incorrect - it's true BRIN indexes don't point to
individual tuples, so HOT chains are not an issue, but the visibility
info is not sufficient to keep the index up to date. This can easily
result in corrupted indexes, as demonstrated in the hackers thread.
This does not mean relaxing the HOT restrictions for BRIN is a lost
cause, but it needs to handle the two aspects (allowing HOT chains and
updating the page range summaries) separately. But that requires
major changes, and it's too late for that in the current dev cycle.
Reported-by: Tomas Vondra
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
We were not updating the partition map cache in the subscriber even when
the corresponding remote rel is changed. Due to this, data was getting
incorrectly replicated for partition tables after the publisher had
changed the table schema.
Fix it by resetting the required entries in the partition map cache after
receiving a new relation mapping from the publisher.
Reported-by: Shi Yu
Author: Shi Yu, Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
This reverts the changes to pg_upgrade's Makefile and .gitignore done in
15b6d21. The TAP tests run in isolation, executing pg_upgrade in
tmp_check/ in the build directory, so that any files created in the
execution path (reindex_hash.sql and delete_old_cluster.{sh,bat}) are
never in the tree; hence entries are not necessary in this case. However,
not having these impacts the cleanliness of the code tree when running
./pg_upgrade directly from src/bin/pg_upgrade/.
This commit adds back to .gitignore all the files generated in the
execution path, and the Makefile rule to clean them up if they exist.
Per gripe from Tom Lane.
Discussion: https://postgr.es/m/90595.1655227384@sss.pgh.pa.us
While building a new attrmap which maps partition attribute numbers to
remoterel's, we incorrectly updated the map for dropped column attributes.
Later, this caused a cache look-up failure when we tried to use the map to
fetch information about the attributes.
This also fixes the partition map cache invalidation which was using the
wrong type cast to fetch the entry. We were using a stale partition map
entry after invalidation, which led to an assertion or cache look-up
failure.
Reported-by: Shi Yu
Author: Hou Zhijie, Shi Yu
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
This commit, in completion of 157f873, forces a ROLLBACK for
--single-transaction only when ON_ERROR_STOP is used when one of the
steps defined by -f/-c fails. Hence, COMMIT is always used when
ON_ERROR_STOP is not set, ignoring the status code of the last action
taken in the set of switches specified by -c/-f (previously ROLLBACK
would have been issued even without ON_ERROR_STOP if the last step
failed, while COMMIT was issued if a step in-between failed as long as
the last step succeeded, leading to more inconsistency).
While on it, this adds much more test coverage in this area when not
using ON_ERROR_STOP with multiple switch patterns involving -c and -f
for query files, single queries and slash commands.
The behavior of ON_ERROR_STOP is arguably a bug, but there was not much
support for a backpatch to force a ROLLBACK on a step failure, so this
change is done only on HEAD for now.
Per discussion with Tom Lane and Kyotaro Horiguchi.
Discussion: https://postgr.es/m/Yqbc8bAdwnP02na4@paquier.xyz
If an application executed operations like EXEC SQL PREPARE
without having first established a database connection, it could
get a core dump instead of the expected clean failure. This
occurred because we did "pthread_getspecific(actual_connection_key)"
without ever having initialized the TSD key actual_connection_key.
The results of that are probably platform-specific, but at least
on Linux it often leads to a crash.
To fix, add calls to ecpg_pthreads_init() in the code paths that
might use actual_connection_key uninitialized. It's harmless
(and hopefully inexpensive) to do that more than once.
Per bug #17514 from Okano Naoki. The problem's ancient, so
back-patch to all supported branches.
Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org
Use the same error message for all cases of pathname overrun,
since users aren't going to much care which one was too long.
Add missing newline to said error (as pg_upgrade's version
of pg_fatal requires that).
Add pathname overrun checks for the individual log files,
not just the directories.
Remove initial newline in log files; the new scheme here
guarantees that we'll never be appending to an old file.
Kyotaro Horiguchi and Tom Lane
Discussion: https://postgr.es/m/20220613.120551.729848632120189555.horikyota.ntt@gmail.com
Recent additions to the subscription tests check for log entries, but
fail to account for the possible presence of an SQL error code, which
happens if log_error_verbosity is set to 'verbose'. Add this into the
regular expressions that are checked for.
In commit ec62cb0aa, I foolishly replaced ExecEvalWholeRowVar's
lookup_rowtype_tupdesc_domain call with just lookup_rowtype_tupdesc,
because I didn't see how a domain could be involved there, and
there were no regression test cases to jog my memory. But the
existing code was correct, so revert that change and add a test
case showing why it's necessary. (Note: per comment in struct
DatumTupleFields, it is correct to produce an output tuple that's
labeled with the base composite type, not the domain; hence just
blindly looking through the domain is correct here.)
Per bug #17515 from Dan Kubb. Back-patch to v11 where domains over
composites became a thing.
Discussion: https://postgr.es/m/17515-a24737438363aca0@postgresql.org
This function can be called from mark_async_capable_plan(), a helper
function for create_append_plan(), before set_subqueryscan_references(),
to determine the triviality of a SubqueryScan that is a child of an
Append plan node. That check happens before doing finalize_plan() on the
SubqueryScan (if necessary) and set_plan_references() on the subplan,
unlike when the function is called from set_subqueryscan_references().
The reason why this is safe isn't that obvious, so add comments
explaining it.
Follow-up for commit c2bb02bc2.
Reviewed by Zhihong Yu.
Discussion: https://postgr.es/m/CAPmGK17%2BGiJBthC6va7%2B9n6t75e-M1N0U18YB2G1B%2BE5OdrNTA%40mail.gmail.com
The new show-all-results feature in psql (7844c9918) went out of its
way to show notices next to the results of the statements (in a
multi-statement string) that caused them. This also had the
consequence that notices for a single statement were not shown until
after the statement had executed, instead of right away. After some
discussion, it seems very difficult to satisfy both of these goals, so
here we are giving up on the first goal and just show the notices as
we get them. This restores the pre-7844c9918 behavior for notices.
Reported-by: Alastair McKinley <a.mckinley@analyticsengines.com>
Author: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/flat/PAXPR02MB760039506C87A2083AD85575E3DA9%40PAXPR02MB7600.eurprd02.prod.outlook.com
The original advice for hard-wired SetConfigOption calls was to use
PGC_S_OVERRIDE, particularly for PGC_INTERNAL GUCs. However,
that's really overkill for PGC_INTERNAL GUCs, since there is no
possibility that we need to override a user-provided setting.
Instead use PGC_S_DYNAMIC_DEFAULT in most places, so that the
value will appear with source = 'default' in pg_settings and thereby
not be shown by psql's new \dconfig command. The one exception is
that when changing in_hot_standby in a hot-standby session, we still
use PGC_S_OVERRIDE, because people felt that seeing that in \dconfig
would be a good thing.
Similarly use PGC_S_DYNAMIC_DEFAULT for the auto-tune value of
wal_buffers (if possible, that is if wal_buffers wasn't explicitly
set to -1), and for the typical 2MB value of max_stack_depth.
In combination these changes remove four not-very-interesting
entries from the typical output of \dconfig, all of which people
fingered as "why is that showing up?" in the discussion thread.
Discussion: https://postgr.es/m/3118455.1649267333@sss.pgh.pa.us
Some locales use a comma as decimal separator (like Czech or French),
and psql's 001_basic.pl for \timing was not able to handle that
properly. This fixes the matching regexes to be able to handle both
comma and dot as possible decimal separators, as per a suggestion from
Andrew Dunstan.
psql tests were the only place with such a portability issue
(check-world passed here with a forced LANG/LANGUAGE). These tests are
new as of c0280bc, so there is no need for a backpatch.
Reported-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRBz8iQmd2aOaCLvO-rJY6vZr-h6Q0qvV0J+yb78J7uiaA@mail.gmail.com
38bfae3 has moved the contents written to files by pg_upgrade under a
new directory called pg_upgrade_output.d/ located in the new cluster's
data folder, and it used a simple structure made of two subdirectories
leading to a fixed structure: log/ and dump/. This design has made
pg_upgrade weaker on repeated calls, as we could get failures when
creating one or more of those directories, while potentially losing the
logs of a previous run (logs are retained automatically on failure, and
cleaned up on success unless --retain is specified). So a user would
need to clean up pg_upgrade_output.d/ as an extra step for any repeated
calls of pg_upgrade. The most common scenario here is --check followed
by the actual upgrade, but one could see a failure when specifying an
incorrect input argument value. Removing the logs entirely would have
the disadvantage of removing all the past information, even if --retain
was specified at some past step.
This result is annoying for a lot of users and automated upgrade flows.
So, rather than requiring a manual removal of pg_upgrade_output.d/, this
redesigns the set of output directories in a more dynamic way, based on
a suggestion from Tom Lane and Daniel Gustafsson. pg_upgrade_output.d/
is still the base path, but a second directory level is added, named
after an ISO-8601-formatted timestamp (human-readable, with
milliseconds appended to the name to avoid any conflicts). The
logs and dumps are saved within the same subdirectories as previously,
namely log/ and dump/, but these are now located inside the subdirectory
named after the timestamp.
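As a hedged sketch of how such a timestamped path can be built (the
exact format and helper used by pg_upgrade may differ):
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>
    /* build "<base>/pg_upgrade_output.d/YYYYmmddTHHMMSS.mmm" */
    static void
    make_output_path(char *path, size_t len, const char *base)
    {
        struct timeval tv;
        time_t      now;
        char        timebuf[32];
        gettimeofday(&tv, NULL);
        now = (time_t) tv.tv_sec;
        strftime(timebuf, sizeof(timebuf), "%Y%m%dT%H%M%S",
                 localtime(&now));
        snprintf(path, len, "%s/pg_upgrade_output.d/%s.%03d",
                 base, timebuf, (int) (tv.tv_usec / 1000));
    }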
The logs of a given run are removed only after a successful run if
--retain is not used, and pg_upgrade_output.d/ is kept if there are any
logs from a previous run. Note that previously, pg_upgrade would have
kept the logs even after a successful --check but that was inconsistent
compared to the case without --check when using --retain. The code in
charge of the removal of the output directories is now refactored into a
single routine.
Two TAP tests are added with some --check commands (one failure case and
one success case), to cover the issue fixed here. Note that the tests
had to be tweaked a bit to fit the new directory structure so that they
can find any logs generated on failure. This is still going to require a
change in the buildfarm client for the case where pg_upgrade is tested
without the TAP test, but I'll tackle that with a separate patch where
needed.
Reported-by: Tushar Ahuja
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson, Justin Pryzby
Discussion: https://postgr.es/m/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com
Bug #17512 highlighted that a suitably broken data type could cause the
backend to crash if either the hash function or equality function were in
some way non-deterministic based on their input values. Such a data type
could cause a crash of the backend due to some code which assumes that
we'll always find a hash table entry corresponding to an item in the
Memoize LRU list.
Here we remove the assumption that we'll always find the entry
corresponding to the given LRU list item and add run-time checks to verify
we have found the given item in the cache.
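In essence the hardening looks like this (a sketch; the lookup helper
name is hypothetical, the message wording illustrative):
    MemoizeEntry *entry = memoize_lookup(mstate, key);  /* hypothetical helper */
    if (unlikely(entry == NULL))
        elog(ERROR, "could not find memoization table entry");
so a misbehaving type now raises an internal ERROR instead of
dereferencing a NULL entry.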
This is not a fix for bug #17512, but it will turn the crash reported by
that bug report into an internal ERROR.
Reported-by: Ales Zeleny
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAApHDvpxFSTwvoYWT7kmFVSZ9zLAeHb=S9vrz=RExMgSkQNWqw@mail.gmail.com
Backpatch-through: 14, where Memoize was added.
pg_stat_get_subscription scanned one more LogicalRepWorker array entry
than is really allocated. In the worst case this could lead to SIGSEGV,
if the LogicalRepCtx data structure is near the end of shared memory.
That seems quite unlikely though (thanks to the ordering of calls in
CreateSharedMemoryAndSemaphores) and we've heard no field reports of it.
A more likely misbehavior is one row of garbage data in the function's
result, but even that is not very likely because of the check that the
pid field matches some live backend.
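The fix boils down to tightening a loop bound, roughly (a simplified
sketch of pg_stat_get_subscription):
    for (int i = 0; i < max_logical_replication_workers; i++)  /* was "<=" */
    {
        LogicalRepWorker worker = LogicalRepCtx->workers[i];
        /* copy out the fields, then skip entries whose pid is not live */
    }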
Report and fix by Kuntal Ghosh. This bug is old, so back-patch
to all supported branches.
Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com
An error PGresult generated by libpq itself, such as a report of
connection loss, won't have broken-down error fields.
should_processing_continue() blithely assumed that
PG_DIAG_SEVERITY_NONLOCALIZED would always be present, and would
dump core if it wasn't.
Per grepping to see if 6d157e7cb's mistake was repeated elsewhere.
An error PGresult generated by libpq itself, such as a report of
connection loss, won't have broken-down error fields.
ecpg_raise_backend() blithely assumed that PG_DIAG_MESSAGE_PRIMARY
would always be present, and would end up passing a NULL string
pointer to snprintf when it isn't. That would typically crash
before 3779ac62d, and it would fail to provide a useful error report
in any case. Best practice is to substitute PQerrorMessage(conn)
in such cases, so do that.
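The pattern, roughly (a sketch; variable names illustrative):
    const char *message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
    if (message == NULL)
        message = PQerrorMessage(conn);   /* libpq-generated error */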
Per bug #17421 from Masayuki Hirose. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17421-790ff887e3188874@postgresql.org
psql --single-transaction is able to handle multiple -c and -f switches
in a single transaction since d5563d7d, but this had the surprising
behavior of forcing a transaction COMMIT even if psql failed with an
error in the client (for example incorrect path given to \copy), which
would generate an error, but still commit any changes that were already
applied in the backend. This commit makes the behavior more consistent,
by enforcing a transaction ROLLBACK if any command fails, whether
client-side or backend-side, so that no changes are applied if an error
happens in any of them.
Some tests are added on HEAD to provide coverage for this behavior.
Backend-side errors are unreliable as IPC::Run can complain on SIGPIPE
if psql quits before reading a query result, but that should work
properly in the case where any errors come from psql itself, which is
what the original report is about.
Reported-by: Christoph Berg
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org
Backpatch-through: 10
The hard-wired PageOutput arguments in usage() and sibling functions
have been a perennial maintenance gotcha, and there's no reason to
think we'll ever get any better about that. Let's get rid of those
magic constants by constructing the output in a buffer where we can
count the newlines before calling PageOutput. (Perhaps this is
microscopically slower; but none of these functions are performance
critical, and anyway we might well be buying back all the cost by
avoiding having to pass most of the data through snprintf.c. I could
not detect any speed difference in a desultory check.) This also
gets rid of the need to assume that platform-specific variations in
the output are insignificant.
While at it, make the code shorter and more abstract by inventing
helper macros HELP0() and HELPN() to encapsulate the specific
output actions being invoked.
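A hedged sketch of the macros and the handoff (PQExpBuffer comes from
pqexpbuffer.h; the newline counting shown is illustrative):
    #define HELP0(str)      appendPQExpBufferStr(&buf, _(str))
    #define HELPN(str, ...) appendPQExpBuffer(&buf, _(str), __VA_ARGS__)
    /* once the text is built, count lines and hand off to the pager */
    int         nlcount = 0;
    for (const char *p = buf.data; *p; p++)
        if (*p == '\n')
            nlcount++;
    output = PageOutput(nlcount, pager ? &(pset.popt.topt) : NULL);
    fputs(buf.data, output);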
Discussion: https://postgr.es/m/365160.1654289490@sss.pgh.pa.us
TAP tests are run from their own directory in the source tree, and in a
VPATH build the execution of the pg_upgrade command was leaving behind a
file in the source tree, which should be left untouched. In order to
avoid this issue, the test now switches to
PostgreSQL::Test::Utils::tmp_check, so that any files generated by
pg_upgrade land in the build tree rather than the source tree. This has
the nice side effect of making the entries for such files in
pg_upgrade's .gitignore and Makefile unnecessary. This strategy is
similar to psql's test 010_tab_completion.pl, though the reasons behind
this choice are different.
In passing, fix one misleading test name that was added by 99f6f19.
Per discussion with Peter Eisentraut, Andrew Dunstan, Tom Lane, Andres
Freund and myself.
Discussion: https://postgr.es/m/f80ace33-11fb-1cd3-20f8-98f51d151088@enterprisedb.com
Provide a gloss of which command does what, as all other backslash
commands have. Put the large-object command section into a more
considered spot in the list.
In passing, update the output-lines count in helpVariables()
(oversight in 7844c9918, looks like).
Thibaud Walkowiak, reviewed by Nathan Bossart and myself
Discussion: https://postgr.es/m/43f0439c-df3e-a045-ac99-af33523cc2d4@dalibo.com
Currently, we simply combine the column lists when publishing tables on
multiple publications and that can sometimes lead to unexpected behavior.
Say, if a column is published in any row-filtered publication, then the
values for that column are sent to the subscriber even for rows that don't
match the row filter, as long as the row matches the row filter for any
other publication, even if that other publication doesn't include the
column.
The main purpose of introducing a column list is to have statically
different shapes on publisher and subscriber or hide sensitive column
data. In both cases, it doesn't seem to make sense to combine column
lists.
So, we disallow the cases where the column list is different for the same
table when combining publications. It can be later extended to combine the
column lists for selective cases where required.
Reported-by: Alvaro Herrera
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
Since a117cebd6, some older gcc versions issue "variable may be used
uninitialized in this function" complaints for brin_summarize_range.
Silence that using the same coding pattern as in bt_index_check_internal;
arguably, a117cebd6 had too narrow a view of which compilers might give
trouble.
Nathan Bossart and Tom Lane. Back-patch as the previous commit was.
Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13
Perl 5.36 has reclassified the warning condition that this test
case used, so that the expected error fails to appear. Tweak
the test so it instead exercises a case that's handled the same
way in all Perl versions of interest.
This appears to meet our standards for back-patching into
out-of-support branches: it changes no user-visible behavior
but enables testing of old branches with newer tools.
Hence, back-patch as far as 9.2.
Dagfinn Ilmari Mannsåker, per report from Jitka Plesníková.
Discussion: https://postgr.es/m/564579.1654093326@sss.pgh.pa.us
This reverts commit d9d076222f "VACUUM: ignore indexing operations
with CONCURRENTLY".
These changes caused indexes created with the CONCURRENTLY option to
miss heap tuples that were HOT-updated and HOT-pruned during the index
creation. Before these changes, HOT pruning would have been prevented
by the Xmin of the transaction creating the index, but because this
change was precisely to allow the Xmin to move forward ignoring that
backend, now other backends scanning the table can prune them. This is
not a problem for VACUUM (which requires a lock that conflicts with a
CREATE INDEX CONCURRENTLY operation), but HOT-prune can definitely
occur. In other words, Xmin advancement was sped up, but at the cost of
corrupting the resulting index.
Regrettably, this means that the new feature in PG14 that RIC/CIC on
very large tables no longer force VACUUM to retain very old tuples goes
away. We might try to implement it again in a later release, but for
now the risk of indexes missing tuples is too high and there's no easy
fix.
Backpatch to 14, where this change appeared.
Reported-by: Peter Slavov <pet.slavov@gmail.com>
Diagnosed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Diagnosed-by: Michael Paquier <michael@paquier.xyz>
Diagnosed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/17485-396609c6925b982d%40postgresql.org
We hadn't noticed this because (a) few people feed invalid
timezone abbreviation files to the server, and (b) in typical
scenarios guc.c would throw ereport(ERROR) and then transaction
abort handling would silently clean up the leaked file reference.
However, it was possible to observe file leakage warnings if one
breaks an already-active abbreviation file, because guc.c does
not throw ERROR when loading supposedly-validated settings during
session start or SIGHUP processing.
Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me)
Discussion: https://postgr.es/m/20220530.173740.748502979257582392.horikyota.ntt@gmail.com
With the old logic, when the receiver had not yet attached, we would
never call shm_mq_inc_bytes_written(), even if force_flush = true
was specified. That could result in a situation where data that the
sender believes it has sent is never received.
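Conceptually, the fix ensures the shared counter is bumped in the
force_flush case too (a simplified sketch of shm_mq.c internals; the
flush condition shown is illustrative):
    if (force_flush || mqh->mqh_send_pending > mq->mq_ring_size / 4)
    {
        shm_mq_inc_bytes_written(mq, mqh->mqh_send_pending);
        mqh->mqh_send_pending = 0;
    }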
Along the way, remove a useless function prototype for a nonexistent
function from shm_mq.h.
Commit 46846433a0 introduced these
problems.
Pavan Deolasee, with a few changes by me.
Discussion: https://postgr.es/m/CABOikdPkwtLLCTnzzmpSMXo3QZa2yXq0J7Q61ssdLFAJYrOVvQ@mail.gmail.com
foreign_data has kept around a set of tests for TRUNCATE to cover the
case of foreign tables, with[out] inheritance and with[out] partitions,
assuming that the command is not supported for this relkind.
However, TRUNCATE is supported on foreign tables if the FDW involved is
able to handle the command, like postgres_fdw.
Note that postgres_fdw includes tests to cover all the cases removed by
this commit (which had misleading comments), so these did not provide
any additional coverage anyway.
Author: Yugo Nagata
Discussion: https://postgr.es/m/20220527172543.0a2fdb469cf048b81c0967d3@sraoss.co.jp
Commit 1a36bc9db (SQL/JSON query functions) introduced STRING as a
type_func_name_keyword, thereby breaking applications that use
"string" as a table name, column name, function parameter name, etc.
That seems like a pretty bad thing, not least because the SQL spec
says that STRING is an unreserved keyword.
This is easy enough to fix so far as the core grammar is concerned.
However, doing so causes some ECPG test cases to fail, specifically
those that use "string" as a typedef name. It turns out this is
because portions of the ECPG grammar allow type_func_name_keywords
but not unreserved_keywords as typedef names. That's pretty horrid,
and it's mildly astonishing that we've not heard complaints about it
before. We can fix two of those uses trivially, but the ones in the
var_type production are less easy. As a stopgap, hard-code STRING as
an allowed alternative in var_type.
Per report from Alastair McKinley.
Discussion: https://postgr.es/m/3661437.1653855582@sss.pgh.pa.us
In the codepath when no encoding conversion is required, the check for
incomplete character at the end of input incorrectly used server
encoding's max character length, instead of the client's. Usually the
server and client encodings are the same when we're not performing
encoding conversion, but SQL_ASCII is an exception.
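The fix is essentially a one-liner (a sketch; names follow the COPY code
and mb/pg_wchar.h):
    /* was: pg_database_encoding_max_length(), i.e. the server encoding */
    int         mblen = pg_encoding_max_length(cstate->file_encoding);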
In passing, also fix some outdated comments that still talked about
the old COPY protocol, which was removed in v14.
Per bug #17501 from Vitaly Voronov. Backpatch to v14 where this was
introduced.
Discussion: https://www.postgresql.org/message-id/17501-128b1dd039362ae6@postgresql.org
This reverts commit 06f5295 due to issues with this approach, both in
terms of efficiency impact and stability. First, contrary to the
single-item cache for transaction IDs in transam.c, the cache may end up
not being hit for a long time, and without an invalidation mechanism to
clear it, it would cause inconsistent results on wraparound, for
example. Second, the use of SubTransGetTopmostTransaction() for the
caching has a limited impact on performance. SubTransGetParent() could
have more impact, though the benefit of the single-item approach still
needs to be proven, particularly under conditions where SLRU lookups are
stressed in parallel with overflowed snapshots (aka more than 64 subxids
generated, for example).
After discussion with Andres Freund.
Discussion: https://postgr.es/m/20220524235250.gtt3uu5zktfkr4hv@alap3.anarazel.de
If a short description is specified as NULL in one of the various
DefineCustomXXXVariable() functions available to external modules to
define a custom parameter, SHOW ALL would crash. This change teaches
SHOW ALL to properly handle NULL short descriptions, as well as any code
paths that manipulate them, to gain in flexibility. Note that
help_config.c was already able to do that, when describing a set of GUCs
for postgres --describe-config.
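The defensive pattern is simply (a sketch; the result-column handling is
illustrative):
    if (conf->short_desc != NULL)
        values[2] = _(conf->short_desc);
    else
        values[2] = NULL;   /* surfaces as a NULL description column */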
Author: Steve Chavez
Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com
Backpatch-through: 10
9d9c02ccd added code to allow the executor to take shortcuts when quals
on monotonic window functions guaranteed that once the qual became false
it could never become true again. When possible, baserestrictinfo quals
are converted to become these quals, which we call run conditions.
Unfortunately, in 9d9c02ccd, I forgot to update
remove_unused_subquery_outputs to teach it about these run conditions.
This could cause a WindowFunc column which was unused in the target list
but referenced by an upper-level WHERE clause to be removed from the
subquery when the qual in the WHERE clause was converted into a window run
condition. Because of this, the entire WindowClause would be removed from
the query resulting in additional rows making it into the resultset when
they should have been filtered out by the WHERE clause.
Here we fix this by recording which target list items in the subquery have
run conditions. That gets passed along to remove_unused_subquery_outputs
to tell it not to remove these items from the target list.
Bug: #17495
Reported-by: Jeremy Evans
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17495-7ffe2fa0b261b9fa@postgresql.org
Commits a59c79564 et al. tried to sync libpq's SSL key file
permissions checks with what we've used for years in the backend.
We did not intend to create any new failure cases, but it turns out
we did: restricting the key file's ownership breaks cases where the
client is allowed to read a key file despite not having the identical
UID. In particular a client running as root used to be able to read
someone else's key file; and having seen that I suspect that there are
other, less-dubious use cases that this restriction breaks on some
platforms.
We don't really need an ownership check, since if we can read the key
file despite its having restricted permissions, it must have the right
ownership --- under normal conditions anyway, and the point of this
patch is that any additional corner cases where that works should be
deemed allowable, as they have been historically. Hence, just drop
the ownership check, and rearrange the permissions check to get rid
of its faulty assumption that geteuid() can't be zero. (Note that the
comparable backend-side code doesn't have to cater for geteuid() == 0,
since the server rejects that very early on.)
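A hedged sketch of the revised check, written as a standalone helper
(simplified; the real code in fe-secure-openssl.c reports specific
errors rather than returning a boolean):
    #include <stdbool.h>
    #include <sys/stat.h>
    #include <unistd.h>
    /* true if the key file's permissions are acceptable */
    static bool
    key_file_perms_ok(const struct stat *st)
    {
        mode_t      forbidden = (st->st_uid == geteuid()) ?
            (S_IRWXG | S_IRWXO) :           /* our own key: 0600 at most */
            (S_IWGRP | S_IXGRP | S_IRWXO);  /* someone else's: group read OK */
        return (st->st_mode & forbidden) == 0;
    }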
This does have the end result that the permissions safety check used
for a root user's private key file is weaker than that used for
anyone else's. While odd, root really ought to know what she's doing
with file permissions, so I think this is acceptable.
Per report from Yogendra Suralkar. Like the previous patch,
back-patch to all supported branches.
Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com
repeat() checked for integer overflow during its calculation of the
required output space, but it just passed the resulting integer to
palloc(). This meant that result sizes between 1GB and 2GB led to
ERRCODE_INTERNAL_ERROR, "invalid memory alloc request size" rather
than ERRCODE_PROGRAM_LIMIT_EXCEEDED, "requested length too large".
That seems like a bit of a wart, so add an explicit AllocSizeIsValid
check to make these error cases uniform.
Do likewise in the sibling functions lpad() etc. While we're here,
also modernize their overflow checks to use pg_mul_s32_overflow() etc
instead of expensive divisions.
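The shape of the uniform check (a sketch; the helpers are from
common/int.h and utils/memutils.h):
    int32       tlen;
    if (unlikely(pg_mul_s32_overflow(count, slen, &tlen)) ||
        unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, &tlen)) ||
        unlikely(!AllocSizeIsValid(tlen)))
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("requested length too large")));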
Per complaint from Japin Li. This is basically cosmetic, so I don't
feel a need to back-patch.
Discussion: https://postgr.es/m/ME3P282MB16676ED32167189CB0462173B6D69@ME3P282MB1667.AUSP282.PROD.OUTLOOK.COM
Mistake in 5891c7a8ed, likely made when switching the default value from none
to fetch during development.
Reported-By: Nathan Bossart <nathandbossart@gmail.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/20220524220147.GA1298892@nathanxps13
"\r" (for progress output) must not be inside a translatable string
(gettext gets upset).
In passing, move the minimum supported version number to a separate
argument, so that we don't have to retranslate this string every year
now.
The changes to show all query results (7844c9918) broke \timing output
in case of an error; it didn't update the timing result and showed
0.000 ms.
Fix by updating the timing result also in the error case. Also, for
robustness, update the timing result any time a result is obtained,
not only for the last, so a sensible value is always available.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Author: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/3813350.1652111765%40sss.pgh.pa.us
On slow machines the modified test could end up switching the order in which
transactional stats are reported in one session and non-transactional stats in
another session. As stats handling of truncate is implemented by setting
live/dead rows to 0, the order in which a truncate's stats changes are
applied, relative to normal stats updates, matters. The handling of
stats for truncate hasn't changed due to shared memory stats; this is
longstanding behavior.
We might want to improve truncate's stats handling in the future, but
for now just change the order of forced flushes to make the test stable.
Reported-By: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/YoZf7U/WmfmFYFEx@msg.df7cb.de
ruleutils.c was coded to suppress the AS label for a SELECT output
expression if the column name is "?column?", which is the parser's
fallback if it can't think of something better. This is fine, and
avoids ugly clutter, so long as (1) nothing further up in the parse
tree relies on that column name or (2) the same fallback would be
assigned when the rule or view definition is reloaded. Unfortunately
(2) is far from certain, both because ruleutils.c might print the
expression in a different form from how it was originally written
and because FigureColname's rules might change in future releases.
So we shouldn't rely on that.
Detecting exactly whether there is any outer-level use of a SELECT
column name would be rather expensive. This patch takes the simpler
approach of just passing down a flag indicating whether there *could*
be any outer use; for example, the output column names of a SubLink
are not referenceable, and we also do not care about the names exposed
by the right-hand side of a setop. This is sufficient to suppress
unwanted clutter in all but one case in the regression tests. That
seems like reasonable evidence that it won't be too much in users'
faces, while still fixing the cases we need to fix.
Per bug #17486 from Nicolas Lutic. This issue is ancient, so
back-patch to all supported branches.
Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
Several places in the planner tried to clamp a double value to fit
in a "long" by doing
(long) Min(x, (double) LONG_MAX);
This is subtly incorrect, because it casts LONG_MAX to double and
potentially back again. If long is 64 bits then the double value
is inexact, and the platform might round it up to LONG_MAX+1
resulting in an overflow and an undesirably negative output.
While it's not hard to rewrite the expression into a safe form,
let's put it into a common function to reduce the risk of someone
doing it wrong in future.
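A sketch of such a helper (naming follows the planner's conventions;
details may differ from the committed version):
    long
    clamp_cardinality_to_long(double x)
    {
        if (x <= 0)
            return 0;
        /*
         * Any double strictly below (double) LONG_MAX converts safely;
         * otherwise return LONG_MAX rather than risking overflow.
         */
        return (x < (double) LONG_MAX) ? (long) x : LONG_MAX;
    }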
In principle this is a bug fix, but since the problem could only
manifest with group count estimates exceeding 2^63, it seems unlikely
that anyone has actually hit this or will do so anytime soon. We're
fixing it mainly to satisfy fuzzer-type tools. That being the case,
a HEAD-only fix seems sufficient.
Andrey Lepikhov
Discussion: https://postgr.es/m/ebbc2efb-7ef9-bf2f-1ada-d6ec48f70e58@postgrespro.ru
This is based on a set of suggestions from Noah, with the following
changes made:
- The set of databases created in the tests is now prefixed with
"regression" so as not to trigger any warnings about name restrictions
when compiling the code with
-DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, and now only the first name
exercises the Windows case of double quotes mixed with backslashes.
- Fix an issue with EXTRA_REGRESS_OPTS, which was not processed in a
way consistent with 027_stream_regress.pl (missing space between the
option string and pg_regress). This got introduced in 7dd3ee5.
- Add a check on the exit code of the pg_regress command, to catch
failures after running the regression tests.
Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/YoHhWD5vQzb2mmiF@paquier.xyz
This new-in-v15 test case assumed it could set max_stack_depth as high
as 2MB. You might think that'd be true on any modern platform but
you'd be wrong, as I found out while experimenting with NetBSD/hppa.
This test is about privileges not platform capabilities, so there seems
no need to use any value greater than the 100kB setting already used
in a couple of places in the core regression tests. There's certainly
no call to expect people to raise their platform's default ulimit just
to run this test.
When an implicit operator family is created, it wasn't getting reported.
Make it do so.
This has always been missing. Backpatch to 10.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
_pg_version (version number based on PostgreSQL::Version) is a field
private to Cluster.pm but there was no helper routine to retrieve it
from a Cluster's node. The same is done for install_path, for example,
and the version object becomes handy when writing tests that need
version-specific handling.
Reviewed-by: Andrew Dunstan, Daniel Gustafsson
Discussion: https://postgr.es/m/YoWfoJTc987tsxpV@paquier.xyz
I rephrased the error messages to be more in the style of
option_parse_int(), and also made use of the new "detail" message
facility. I didn't actually use option_parse_int() (which could be
used for -n) because libpgfeutils wasn't used here yet and I wanted to
keep this just to string changes. But it could be done in the future.
This is a slight, convenient semantics change from what commit
0f0cfb4940 ("Fix parallel operations that prevent oldest xmin from
advancing") introduced that lets us simplify the coding in the one place
where it is used.
Backpatch to 13. This is related to commit 6fea65508a ("Tighten
ComputeXidHorizons' handling of walsenders") rewriting the code site
where this is used, which has not yet been backpatched, but it may well
be in the future.
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/202204191637.eldwa2exvguw@alvherre.pgsql
Commit 923def9a53 and 52e4f0cd47 allowed to specify column lists and row
filters for publication tables. This commit extends the
pg_publication_tables view and pg_get_publication_tables function to
display that information.
This information will be useful to users and we also need this for the
later commit that prohibits combining multiple publications with different
column lists for the same table.
Author: Hou Zhijie
Reviewed By: Amit Kapila, Alvaro Herrera, Shi Yu, Takamichi Osumi
Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
We weren't checking the length of the column list in the alias clause of
an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was
possible to make the server crash by passing an overly long one. Fix it
by throwing an error in that case, like the other places that deal with
alias lists.
In passing, modify the equivalent test used for join RTEs to look like
the other ones, which was different for no apparent reason.
This bug came in when XMLTABLE was born in version 10; backpatch to all
stable versions.
Reported-by: Wang Ke <krking@zju.edu.cn>
Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org
We can use a single line to print all tuple counts that MERGE processed,
for conciseness, and elide those that are zeroes. Non-text formats
report all numbers, as is typical.
Per comment from Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220511163350.GL19626@telsasoft.com
In order to estimate the cache hit ratio of a Memoize node, one of the
inputs we require is the estimated number of times the Memoize node will
be rescanned. The higher this number, the larger the cache hit ratio is
likely to become. Unfortunately, the value being passed as the number of
"calls" to the Memoize was incorrectly using the Nested Loop's
outer_path->parent->rows instead of outer_path->rows. This failed to
account for the fact that the outer_path might be parameterized by some
upper-level Nested Loop.
This problem could lead to Memoize plans appearing more favorable than
they might actually be. It could also lead to extended executor startup
times when work_mem values were large due to the planner setting overly
large MemoizePath->est_entries resulting in the Memoize hash table being
initially made much larger than might be required.
Fix this simply by passing outer_path->rows rather than
outer_path->parent->rows. Also, adjust the expected regression test
output for a plan change.
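The gist of the change in get_memoize_path(), with the argument list
abbreviated (a sketch):
    return (Path *) create_memoize_path(root, innerrel, inner_path,
                                        param_exprs, hash_operators,
                                        extra->inner_unique, binary_mode,
                                        outer_path->rows);
                                        /* was outer_path->parent->rows */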
Reported-by: Pavel Stehule
Author: David Rowley
Discussion: https://postgr.es/m/CAFj8pRAMp%3DQsMi6sPQJ4W3hczoFJRvyXHJV3AZAZaMyTVM312Q%40mail.gmail.com
Backpatch-through: 14, where Memoize was introduced
Per BF animal chipmunk: CREATE DATABASE could apparently fail due to an
AV process being in the template database and not quitting fast enough
for the 5 second timeout in CountOtherDBBackends(). The test script had
autovacuum_naptime=1s to encourage more activity and opening of fds, but
that wasn't strictly necessary for this test. Take it out.
Per BF animal skink: there was a 300s timeout for all tests in the
script, but apparently that was not enough under valgrind. Let's use
the standard timeout $PostgreSQL::Test::Utils::timeout_default, but
restart it for each query we run.
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKa8HNJaA24gqiiFoGy0ysndeVoJsHvX_q1-DVLFaGAmw%40mail.gmail.com
I started out with the intention to rename value_type to item_type to
avoid a collision with a typedef name that appears on some platforms.
Along the way, I noticed that the adjacent field "format" was not being
correctly handled by the backend/nodes/ infrastructure functions:
copyfuncs.c erroneously treated it as a scalar, while equalfuncs,
outfuncs, and readfuncs omitted handling it at all. This looks like
it might be cosmetic at the moment because the field is always NULL
after parse analysis; but that's likely a bug in itself, and the code's
certainly not very future-proof. Let's fix it while we can still do so
without forcing an initdb on beta testers.
Further study found a few other inconsistencies in the backend/nodes/
infrastructure for the recently-added JSON node types, so fix those too.
catversion bumped because of potential change in stored rules.
Discussion: https://postgr.es/m/526703.1652385613@sss.pgh.pa.us
Currently, preloaded libraries are expected to request additional
shared memory and LWLocks in _PG_init(). However, it is not unusual
for such requests to depend on MaxBackends, which won't be
initialized at that time. Such requests could also depend on GUCs
that other modules might change. This introduces a new hook where
modules can safely use MaxBackends and GUCs to request additional
shared memory and LWLocks.
Furthermore, this change restricts requests for shared memory and
LWLocks to this hook. Previously, libraries could make requests
until the size of the main shared memory segment was calculated.
Unlike before, we no longer silently ignore requests received at
invalid times. Instead, we FATAL if someone tries to request
additional shared memory or LWLocks outside of the hook.
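A minimal sketch of how a module adopts the hook (MyBackendState is a
hypothetical per-backend struct; the hook and request functions are the
real core APIs):
    static shmem_request_hook_type prev_shmem_request_hook = NULL;
    static void
    my_shmem_request(void)
    {
        if (prev_shmem_request_hook)
            prev_shmem_request_hook();
        /* MaxBackends is valid here, unlike in _PG_init() */
        RequestAddinShmemSpace(mul_size(MaxBackends, sizeof(MyBackendState)));
        RequestNamedLWLockTranche("my_module", 1);
    }
    void
    _PG_init(void)
    {
        prev_shmem_request_hook = shmem_request_hook;
        shmem_request_hook = my_shmem_request;
    }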
Nathan Bossart and Julien Rouhaud
Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13
Discussion: https://postgr.es/m/Yn2jE/lmDhKtkUdr@paquier.xyz
pgstat_report_stat() is only supposed to be called outside of transactions. In
5891c7a8ed I added a pgstat_report_stat() call into LogicalRepApplyLoop()'s
timeout branch. While not commonly reached inside a transaction, it is
reachable (e.g. due to network bottlenecks or the sender being stalled / slow
for some reason).
To fix, add a !IsTransactionState() check.
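That is, in the timeout branch (a sketch):
    if (!IsTransactionState())
        pgstat_report_stat(false);   /* force flag as in the surrounding code */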
No test added because there's no easy way to reproduce this case without
patching the code.
Reported-By: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/b3463b8c-2328-dcac-0136-af95715493c1@xs4all.nl
A few places used binary_upgrade_* variables without including the header,
which worked without warnings because the variables are defined in those
places. However that can cause linker complaints with MSVC - except that we
don't see them right now, due to the use of a symbol export file.
Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES. For reference, the command was
./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6205
The original coding did this in pqDropServerData(), which seems
fairly backwards. Pending commands are more like already-queued
output data, which is dropped in pqDropConnection(). Moving the
operation means that we clear the command queue immediately upon
detecting connection drop, which improves the sanity of subsequent
behavior. In particular this eliminates duplicated error message
text as a consequence of code added in b15f25446, which supposed
that a nonempty command queue must mean the prior operation is
still active.
There might be an argument for backpatching this to v14; but as with
b15f25446, I'm unsure about interactions with 618c16707. For now,
given the lack of complaints about v14's behavior, leave it alone.
Per report from Peter Eisentraut.
Discussion: https://postgr.es/m/de57761c-b99b-3435-b0a6-474c72b1149a@enterprisedb.com
This follows in the footsteps of commit 2591ee8ec by removing one more
ill-advised shortcut from planning of GroupingFuncs. It's true that
we don't intend to execute the argument expression(s) at runtime, but
we still have to process any Vars appearing within them, or we risk
failure at setrefs.c time (or more fundamentally, in EXPLAIN trying
to print such an expression). Vars in upper plan nodes have to have
referents in the next plan level, whether we ever execute 'em or not.
Per bug #17479 from Michael J. Sullivan. Back-patch to all supported
branches.
Richard Guo
Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org
Three variables in pqsignal.h (UnBlockSig, BlockSig and StartupBlockSig)
were not marked with PGDLLIMPORT, as they are declared in a way that
prevents mark_pgdllimport.pl from detecting them. These variables are
redefined in a style more consistent with the other headers, allowing
the script to find and mark them.
PGDLLIMPORT was missing for __pg_log_level in logging.h, so add it
back. The marking got accidentally removed in 9a374b77, just after its
addition in 8ec5694.
While on it, add a comment in mark_pgdllimport.pl explaining what
arguments the script needs (aka a list of header paths).
Reported-by: Andres Freund
Discussion: https://postgr.es/m/20220506234924.6mxxotl3xl63db3l@alap3.anarazel.de
The code for unloading a library has been commented-out for over 12
years, ever since commit 602a9ef5a7, and we're
no closer to supporting it now than we were back then.
Nathan Bossart, reviewed by Michael Paquier and by me.
Discussion: http://postgr.es/m/Ynsc9bRL1caUSBSE@paquier.xyz
To enable diagnosis of systems that are not processing ProcSignalBarrier
requests promptly, add a LOG message every 5 seconds if we seem to be
wedged. Although you could already see this state as a wait event in
pg_stat_activity, the log message also shows the PID of the process that
is preventing progress.
Also add DEBUG1 logging around the whole wait loop.
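A hedged sketch of the waiting loop (the progress check is hypothetical
and the message wording illustrative; the latch APIs are real):
    for (;;)
    {
        int     rc = WaitLatch(MyLatch,
                               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                               5000L /* ms */ ,
                               WAIT_EVENT_PROC_SIGNAL_BARRIER);
        ResetLatch(MyLatch);
        if (barrier_done(generation))   /* hypothetical progress check */
            break;
        if (rc & WL_TIMEOUT)
            elog(LOG, "still waiting for pid %d to accept ProcSignalBarrier",
                 (int) pid);
    }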
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BTgmoYJ03r5359gQutRGP9BtigYCg3_UskcmnVjBf-QO3-0pQ%40mail.gmail.com
The problem is that we don't send keep-alive messages for a long time
while processing large transactions during logical replication where we
don't send any data of such transactions. This can happen when the table
modified in the transaction is not published or because all the changes
got filtered. We do try to send a keepalive if necessary at the end of
the transaction (via WalSndWriteData()) but by that time the
subscriber side can time out and exit.
To fix this, we try to send the keepalive message if required after
processing a certain threshold of changes.
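Roughly (a sketch; the threshold value and the use of
WalSndKeepaliveIfNecessary() follow walsender.c, simplified):
    #define CHANGES_THRESHOLD 100
    if (++changes_count >= CHANGES_THRESHOLD)
    {
        WalSndKeepaliveIfNecessary();
        changes_count = 0;
    }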
Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
Presently, the server may emit a variety of log messages when inspecting
a runtime-computed GUC, mostly in the shape of one LOG message with the
default configuration, related to the startup sequence launched because
such GUCs require loading the control file and external shared
libraries.
For example, the server will always emit a "database system is shut
down" LOG (unless the user has set log_min_messages higher than LOG),
which is an annoying behavior as "postgres -C" is expected to only emit
in its output the parameter value we are looking for. The parameter
value is sent to stdout, while the logs are sent to stderr, so we could
recommend using a redirection, but there was not much love for this
workaround either.
To avoid such extra log messages, per discussion, this change sets
log_min_messages to FATAL internally when -C is used on a
runtime-computed GUC (even if set to PANIC in postgresql.conf). At
FATAL, the user will still receive messages explaining why a GUC value
cannot be inspected, and will know if the command is attempted on a
server already running, something not supported yet for a
runtime-computed GUC.
Reported-by: Magnus Hagander, Bruce Momjian
Author: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/Yni6ZHkGotUU+RSf@paquier.xyz
The current setup assumes that commands for lz4, zstd and gzip always
exist by default if not enforced by a user's environment. However,
vcpkg, as one example, installs libraries but no binaries, so this
default setup to assume that a command should always be present would
cause failures. This commit improves the detection of such external
commands as follows:
* If an ENV value is available, trust the environment/user and use it.
* If an ENV value is not available, check that the command can be run by
looking in the current PATH, launching a simple "$command --version"
(that should be portable enough).
** On execution failure, ignore ENV{command}.
** On execution success, set ENV{command} = "$command".
Note that this new rule applies to gzip, lz4 and zstd but not tar that
we assume will always exist. Those commands are set up in the
environment only when using bincheck and taptest. The CI includes all
those commands and I have checked that their setup is correct there. I
have also tested this change in a MSVC environment where we have none of
those commands.
While on it, remove the references to lz4 from the documentation and
vcregress.pl in ~v13. --with-lz4 has been added in v14~, so there is no
point in having this information in these older branches.
Reported-by: Andrew Dunstan
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/14402151-376b-a57a-6d0c-10ad12608e12@dunslane.net
Backpatch-through: 10
697492434 added 3 new quicksort specialization functions for common
datatypes.
That commit was not very consistent in how it would determine if we're
compiling for 32-bit or 64-bit machines. It would sometimes use
USE_FLOAT8_BYVAL and at other times check if SIZEOF_DATUM == 8. This
could cause theoretical problems due to the way USE_FLOAT8_BYVAL is now
defined based on SIZEOF_VOID_P >= 8. If pointers for some reason were
ever larger than 8 bytes then we'd end up doing 32-bit comparisons
mistakenly. Let's just always check SIZEOF_DATUM >= 8.
It also seems that ssup_datum_signed_cmp is just never used on 32-bit
builds, so let's just ifdef that out to make sure we never accidentally
use that comparison function on such machines. This also allows us to
ifdef out 1 of the 3 new specialization quicksort functions in 32-bit
builds which seems to shrink down the binary by over 4KB on my machine.
In passing, also add the missing DatumGetInt32() / DatumGetInt64() macros
in the comparison functions.
Discussion: https://postgr.es/m/CAApHDvqcQExRhtRa9hJrJB_5egs3SUfOcutP3m+3HO8A+fZTPA@mail.gmail.com
Reviewed-by: John Naylor
This commit addresses the following issues in the TAP tests of
pg_upgrade, introduced in 322becb:
- Remove --port and --host for commands that already rely on a node's
environment PGHOST and PGPORT.
- Switch from run_log() to command_ok(), as all the commands executed in
the tests should succeed.
- Change EXTRA_REGRESS_OPTS to make it count as a shell fragment (fixing
s/OPT/OPTS along the way), to be compatible with the various Makefiles
using it as well as 027_stream_regress.pl in the recovery tests. The
command built for the execution of pg_regress is reformatted, while on
it, to match the recovery test doing the same thing (we should refactor
and consolidate that in the future, perhaps).
- Re-add the test for database names stressing the behavior of
backslashes with double quotes, mostly here for Windows.
Tests doable with the upgrade across different major versions still work
the same way.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20220502042718.GB1565149@rfd.leadboat.com
The parser code that transformed VALUES from row-oriented to
column-oriented lists failed if there were zero columns.
You can't write that straightforwardly (though probably you
should be able to), but the case can be reached by expanding
a "tab.*" reference to a zero-column table.
Per bug #17477 from Wang Ke. Back-patch to all supported branches.
Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org
This reverts commit eafdf9de06
and its back-branch counterparts. Corey Huinker pointed out that
we'd discussed this exact change back in 2016 and rejected it,
on the grounds that there's at least one usage pattern with LIMIT
where an infinite endpoint can usefully be used. Perhaps that
argument needs to be re-litigated, but there's no time left before
our back-branch releases. To keep our options open, restore the
status quo ante; if we do end up deciding to change things, waiting
one more quarter won't hurt anything.
Rather than just doing a straight revert, I added a new test case
demonstrating the usage with LIMIT. That'll at least remind us of
the issue if we forget again.
Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us
Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com
It intended to, but did not, achieve this. Adopt the new standard of
setting user ID just after locking the relation. Back-patch to v10 (all
supported versions).
Reviewed by Simon Riggs. Reported by Alvaro Herrera.
Security: CVE-2022-1552
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects. BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER. An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser. CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late. This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).
Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.
Security: CVE-2022-1552
If a cluster is promoted (aka the control file shows a state different
than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still
processing, this function could miss an update of the control file for
"checkPoint" and "checkPointCopy" but still do the recycling and/or
removal of the past WAL segments, assuming that the to-be-updated LSN
values should be used as reference points for the cleanup. This causes
a follow-up restart attempting crash recovery to fail with a PANIC on a
missing checkpoint record if the end-of-recovery checkpoint triggered by
the promotion did not complete while the cluster abruptly stopped or
crashed before the completion of this checkpoint. The PANIC would be
caused by the redo LSN referred in the control file as located in a
segment already gone, recycled by the previous restartpoint with
"checkPoint" out-of-sync in the control file.
This commit fixes the update of the control file during restartpoints so
that "checkPoint" and "checkPointCopy" are updated even if the cluster
has been promoted while a restartpoint is running, to be on par with the
set of WAL segments actually recycled at the end of CreateRestartPoint().
This problem exists in all the stable branches. However, commit
7ff23c6, by removing the last call of CreateCheckPoint() from the
startup process, has made this bug much easier to reason about as
concurrent checkpoints are not possible anymore. No backpatch is done
yet, mostly out of caution from me as a point release is close by, and
we need to think harder about the case of concurrent checkpoints at
promotion when the bgwriter is not considered as running by the startup
process in ~v14, so this change is done only on HEAD for the moment.
Reported-by: Fujii Masao, Rui Zhao
Author: Kyotaro Horiguchi
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com
Commits 4eb21763 and b74e94dc introduced a way to force every backend to
close all relation files, to fix an ancient Windows-only bug.
This commit extends that behavior to all operating systems and adds
a couple of extra barrier points, to fix a totally different class of
bug: the reuse of relfilenodes in scenarios that have no other kind of
cache invalidation to prevent file descriptor mix-ups.
In all releases, data corruption could occur when you moved a database
to another tablespace and then back again. Despite that, no back-patch
for now as the infrastructure required is too new and invasive. In
master only, since commit aa010514, it could also happen when using
CREATE DATABASE with a user-supplied OID or via pg_upgrade.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
With sufficiently bad luck, it was possible for IssuePendingWritebacks()
to reopen a file after we'd processed PROCSIGNAL_BARRIER_SMGRRELEASE and
before the file was unlinked by some other backend. That left a small
hole in commit 4eb21763's plan to fix all spurious errors from DROP
TABLESPACE and similar on Windows.
Fix by closing md.c's segments, instead of just closing fd.c's
descriptors, and then teaching smgrwriteback() not to open files that
aren't already open.
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
Since 6bc8ef0b7f, the maximum number
of backends can't change as background workers are registered, but
these comments still reflect the way things worked prior to that.
Also, per recent discussion, some modules call SetConfigOption()
from _PG_init(). It's not entirely clear to me whether we want to
regard that as a fully supported operation, but since we know it's
a thing that happens, it at least deserves a mention in the comments,
so add that.
Nathan Bossart, reviewed by Anton A. Melnikov
Discussion: http://postgr.es/m/20220419154658.GA2487941@nathanxps13
Setting up an EVP context for ciphers banned under FIPS generates
two OpenSSL errors in the queue, and as we only consume one from
the queue, the other is left at the head for the next invocation:
postgres=# select md5('foo');
ERROR: could not compute MD5 hash: unsupported
postgres=# select md5('foo');
ERROR: could not compute MD5 hash: initialization error
Clearing the error queue when creating the context ensures that
we don't pull in an error from an earlier operation.
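The fix amounts to draining the queue up front (ERR_clear_error() is the
standard OpenSSL API for this; context shown as in cryptohash code):
    ERR_clear_error();          /* discard errors from earlier operations */
    ctx = EVP_MD_CTX_create();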
Discussion: https://postgr.es/m/C89D932C-501E-4473-9750-638CFCD9095E@yesql.se