This is needed so that renumber_oids.pl can handle renumbering
shared catalog declarations, which need to provide C macros for
the OIDs of the shared toast table and index. The previous
method of writing a C macro separately was error-prone anyway.
Also teach renumber_oids.pl about DECLARE_UNIQUE_INDEX_PKEY,
as we missed doing when inventing that macro.
There are no changes to postgres.bki here, so no need for a
catversion bump.
Discussion: https://postgr.es/m/2995325.1650487527@sss.pgh.pa.us
CLUSTER sort won't use the datum1 SortTuple field when clustering
against an index whose leading key is an expression. This makes it
unsafe to use the abbreviated keys optimization, which was missed by the
logic that sets up SortSupport state. Affected tuplesorts output tuples
in a completely bogus order as a result (the wrong SortSupport based
comparator was used for the leading attribute).
This issue is similar to the bug fixed on the master branch by recent
commit cc58eecc5d. But it's a far older issue, that dates back to the
introduction of the abbreviated keys optimization by commit 4ea51cdfe8.
Backpatch to all supported versions.
Author: Peter Geoghegan <pg@bowt.ie>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com
Backpatch: 10-
Such cases will lead to infinite loops, so they're of no practical
value. The numeric variant of generate_series() already threw error
for this, so borrow its message wording.
Per report from Richard Wesley. Back-patch to all supported branches.
Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com
Coverity complains that dpns->outer_plan is deferenced (to obtain
->targetlist) when possibly NULL. We can avoid this by using
dpns->outer_tlist instead, which was already obtained a few lines up.
The fact that we end up with
dpns->inner_tlist = dpns->outer_tlist
is a bit suspicious-looking and maybe worthy of more investigation, but
I'll leave that for another day.
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
Should have been done this way to start with, but I failed to notice
This way we avoid some pointless initialization, and better contains the
variable to exist in the scope where it is really used.
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
There's no reason to keep a separate local variable when we have a place
for it elsewhere. This allows to simplify some code.
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one. Carelessness in 7aea8e4f2.
(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)
Per report from Bryn Llewellyn. Back-patch to all supported branches.
Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
The new numeric code for extract epoch from interval accidentally
truncated the DAYS_PER_YEAR value to an integer, leading to results
that mismatched the floating-point interval_part calculations.
The commit a2da77cdb4 that introduced
this actually contains the regression test change that this reverts.
I suppose this was missed at the time.
Reported-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAAvxfHd5n%3D13NYA2q_tUq%3D3%3DSuWU-CufmTf-Ozj%3DfrEgt7pXwQ%40mail.gmail.com
Don't try to look at the attidentity field of system attributes,
because they're not there in the TupleDescAttr array. Sometimes
this is harmless because we accidentally pick up a zero, but
otherwise we'll report "no owned sequence found" from an attempt
to alter a system attribute. (It seems possible that a SIGSEGV
could occur, too, though I've not seen it in testing.)
It's not in this function's charter to complain that you can't
alter a system column, so instead just hard-wire an assumption
that system attributes aren't identities. I didn't bother with
a regression test because the appearance of the bug is very
erratic.
Per bug #17465 from Roman Zharkov. Back-patch to all supported
branches. (There's not actually a live bug before v12, because
before that get_attidentity() did the right thing anyway.
But for consistency I changed the test in the older branches too.)
Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org
Suppress further attempts to read ahead in the WAL if we run out of
data, until the records already decoded have been replayed. This
restores the traditional behavior for continuous archive recovery, which
is to retry the failing restore_command only every 5 seconds. With the
coding in 5dc0418f, we would start retrying every time through the
recovery loop when our WAL decoding window hit the end of the current
segment and we tried to look ahead into a not-yet-available next file.
That was very slow.
Also change the no_readahead_until mechanism to use <= rather than <,
which seems more useful. Otherwise we'd either get one extra unwanted
retry of restore_command, or we'd need to add 1 to an LSN.
No change in behavior for regular streaming. That was already limited
by the flushedUpto variable, which won't be updated until we replay what
we have already.
Reported by Andres Freund while analyzing the failure of a TAP test on
build farm animal skink (investigation ongoing but probably due to
otherwise unrelated timing bugs triggered by this slowness magnified by
valgrind).
Discussion: https://postgr.es/m/20220409005910.alw46xqmmgny2sgr%40alap3.anarazel.de
Previously we didn't, which lead to an assertion failure when resetting
partially loaded statistics. This was encountered on the buildfarm, for
as-of-yet unknown reasons.
Ttighten up a validity check when reading the stats file, verifying 'E'
signals the end of the file (rather than just stopping reading). That's then
used in a test appending to the stats file that crashed before the fix in
pgstat_drop_all_entries().
Reported by buildfarm animals mylodon and kestrel, via Tom Lane.
Discussion: https://postgr.es/m/1656446.1650043715@sss.pgh.pa.us
This function gave the wrong answer when there's more than one
RegisteredSnapshots entry, whether or not any of them is the
CatalogSnapshot. This leads to assertion failure in some scenarios
involving fetching toasted data using a cursor. (As per discussion,
I'm dubious that this is the right contract to be enforcing at all;
but it surely doesn't help to be enforcing it incorrectly.)
Fetching toasted data using a cursor is evidently under-tested,
so add a test case too.
Per report from Erik Rijkers. This is new code, so no need for
back-patch.
Discussion: https://postgr.es/m/dc9dd229-ed30-6c62-4c41-d733ffff776b@xs4all.nl
Per-backend global variables like VacuumPageHit are initialized once per
VACUUM command. This was missed by commit 49c9d9fc, which unified
VACUUM VERBOSE and autovacuum logging. As a result of that oversight,
incorrect values were shown when multiple relations were processed by a
single VACUUM VERBOSE command.
Relations that happened to be processed later on would show "buffer
usage:" values that incorrectly included buffer accesses made while
processing earlier unrelated relations. The same accesses were counted
multiple times.
To fix, take initial values for the tracker variables at the start of
heap_vacuum_rel(), and report delta values later on.
ComputeXidHorizons (nee GetOldestXmin) thought that it could identify
walsenders by checking for proc->databaseId == 0. Perhaps that was
safe when the code was written, but it's been wrong at least since
autovacuum was invented. Background processes that aren't connected
to any particular database, such as the autovacuum launcher and
logical replication launcher, look like that too.
This imprecision is harmful because when such a process advertises an
xmin, the result is to hold back dead-tuple cleanup in all databases,
though it'd be sufficient to hold it back in shared catalogs (which
are the only relations such a process can access). Aside from being
generally inefficient, this has recently been seen to cause regression
test failures in the buildfarm, as a consequence of the logical
replication launcher's startup transaction preventing VACUUM from
marking pages of a user table as all-visible.
We only want that global hold-back effect for the case where a
walsender is advertising a hot standby feedback xmin. Therefore,
invent a new PGPROC flag that says that a process' xmin should be
considered globally, and check that instead of using the incorrect
databaseId == 0 test. Currently only a walsender sets that flag,
and only if it is not connected to any particular database. (This is
for bug-compatibility with the undocumented behavior of the existing
code, namely that feedback sent by a client who has connected to a
particular database would not be applied globally. I'm not sure this
is a great definition; however, such a client is capable of issuing
plain SQL commands, and I don't think we want xmins advertised for
such commands to be applied globally. Perhaps this could do with
refinement later.)
While at it, I rewrote the comment in ComputeXidHorizons, and
re-ordered the commented-upon if-tests, to make them match up
for intelligibility's sake.
This is arguably a back-patchable bug fix, but given the lack of
complaints I think it prudent to let it age awhile in HEAD first.
Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us
Be consistent about the lines that VACUUM VERBOSE outputs by including
an "index scan not needed: " line for completely empty tables. This
makes the output more readable, especially with multiple distinct VACUUM
operations processed by the same VACUUM command. It's also more
consistent; even empty tables can use the failsafe, which wasn't
reported in the standard way until now.
Follow-up to commit 6e20f460, which taught VACUUM VERBOSE to be more
consistent about reporting on scanned pages with empty tables.
The age of OldestXmin (a.k.a. "removable cutoff") when VACUUM ends often
indicates the approximate number of XIDs consumed while VACUUM ran.
However, there is at least one important exception: the cutoff could be
held back by a snapshot that was acquired before our VACUUM even began.
Successive VACUUM operations may even use exactly the same old cutoff in
extreme cases involving long held snapshots.
The log messages that described how removable cutoff aged (which were
added by commit 872770fd) created the impression that we were reporting
on how VACUUM's usable cutoff advanced while VACUUM ran, which was
misleading in these extreme cases. Fix by using a more general wording.
Per gripe from Tom Lane.
In passing, relocate related instrumentation code for clarity.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/1643035.1650035653@sss.pgh.pa.us
When not loading stats at startup (i.e. pgstat_discard_stats() getting
called), reset timestamps of fixed numbered stats would be left at
0. Oversight in 5891c7a8ed.
Instead use pgstat_reset_after_failure() and add tests verifying that
fixed-numbered reset timestamps are set appropriately.
Reported-By: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwamFuaQHKdhcMt4Gbw5+Hca2UE741B8gOOXoA=TtAd2Yw@mail.gmail.com
If a partitioned table has partitions owned by roles other than the
owner of the partitioned table, don't include them in the to-be-
clustered list. This is similar to what VACUUM FULL does (except we do
it sooner, because there is no reason to postpone it). Add a simple
test to verify that only owned partitions are clustered.
While at it, change memory context switch-and-back to occur once per
partition instead of outside of the loop.
Author: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
Commit 1a36bc9dba conained some logic that was a little opaque and
could have involved a NULL dereference, as complained about by Coverity.
Make the logic more transparent and in doing so avoid the NULL
dereference.
Only 1 of 3 of these changes appear to be handled by pgindent. That change
is new to v15. The remaining two appear to be left alone by pgindent. The
exact reason for that is not 100% clear to me. It seems related to the
fact that it's a line that contains *only* a single line comment and no
actual code. It does not seem worth investigating this in too much
detail. In any case, these do not conform to our usual practices, so fix
them.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
heap_fetch() used to have a "keep_buf" parameter that told it to return
ownership of the buffer pin to the caller after finding that the
requested tuple TID exists but is invisible to the specified snapshot.
This was thoughtlessly removed in commit 5db6df0c0, which broke
heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function
needs to do more accesses to the tuple even if it's invisible. The net
effect is that we would continue to touch the page for a microsecond or
two after releasing pin on the buffer. Usually no harm would result;
but if a different session decided to defragment the page concurrently,
we could see garbage data and mistakenly conclude that there's no newer
tuple version to chain up to. (It's hard to say whether this has
happened in the field. The bug was actually found thanks to a later
change that allowed valgrind to detect accesses to non-pinned buffers.)
The most reasonable way to fix this is to reintroduce keep_buf,
although I made it behave slightly differently: buffer ownership
is passed back only if there is a valid tuple at the requested TID.
In HEAD, we can just add the parameter back to heap_fetch().
To avoid an API break in the back branches, introduce an additional
function heap_fetch_extended() in those branches.
In HEAD there is an additional, less obvious API change: tuple->t_data
will be set to NULL in all cases where buffer ownership is not returned,
in particular when the tuple exists but fails the time qual (and
!keep_buf). This is to defend against any other callers attempting to
access non-pinned buffers. We concluded that making that change in back
branches would be more likely to introduce problems than cure any.
In passing, remove a comment about heap_fetch that was obsoleted by
9a8ee1dc6.
Per bug #17462 from Daniil Anisimov. Back-patch to v12 where the bug
was introduced.
Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org
There was a small buglet in commit 52e4f0cd47 whereby a tuple acquired
from cache was not released, giving rise to WARNING messages; fix that.
While at it, restructure the code a bit on stylistic grounds.
Author: Hou zj <houzj.fnst@fujitsu.com>
Reported-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvKTyhTBtYCQsP6Ph7=o-oWRSX+v+PXXLXp81-o2bazig@mail.gmail.com
Commit f4fb45d15c misguidedly tried to free some state during aggregate
finalization for json_objectagg. This resulted in attempts to access
freed memory, especially when the function is used as a window function.
Commit 4eb9798879 attempted to ameliorate that, but in fact it should
just be ripped out, which is done here. Also add some regression tests
for json_objectagg in various flavors as a window function.
Original report from Jaime Casanova, diagnosis by Andres Freund.
Discussion: https://postgr.es/m/YkfeMNYRCGhySKyg@ahch-to
The last usage of this argument in this routine can be tracked down to
7e2f9062, aka 11 years ago. Getting rid of this argument can also be an
advantage for extensions calling check_index_is_clusterable(), as it
removes any need to worry about the meaning of what a recheck would be
at this level.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
This reverts commits 0147fc7, 4567596, aa64f23, and 5ecd018.
There is no longer agreement that introducing this function
was the right way to address the problem. The consensus now
seems to favor trying to make a correct value for MaxBackends
available to mdules executing their _PG_init() functions.
Nathan Bossart
Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud
We were setting MERGE source subplan's targetlist by expanding the
individual attributes of the source relation completely, early in the
parse analysis phase. This failed to work when the condition of an
action included a whole-row reference, causing setrefs.c to error out
with
ERROR: variable not found in subplan target lists
because at that point there is nothing to resolve the whole-row
reference with. We can fix this by having preprocess_targetlist expand
the source targetlist for Vars required from the source rel by all
actions. Moreover, by using this expansion mechanism we can do away
with the targetlist expansion in transformMergeStmt, which is good
because then we no longer pull in columns that aren't needed for
anything.
Add a test case for the problem.
While at it, remove some redundant code in preprocess_targetlist():
MERGE was doing separately what is already being done for UPDATE/DELETE,
so we can just rely on the latter and remove the former. (The handling
of inherited rels was different for MERGE, but that was a no-longer-
necessary hack.)
Fix outdated, related comments for fix_join_expr also.
Author: Richard Guo <guofenglinux@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Joe Wildish <joe@lateraljoin.com>
Discussion: https://postgr.es/m/fab3b90a-914d-46a9-beb0-df011ee39ee5@www.fastmail.com
Compression option handling (level, algorithm or even workers) can be
used across several parts of the system and not only base backups.
Structures, objects and routines are renamed in consequence, to remove
the concept of base backups from this part of the code making this
change straight-forward.
pg_receivewal, that has gained support for LZ4 since babbbb5, will make
use of this infrastructure for its set of compression options, bringing
more consistency with pg_basebackup. This cleanup needs to be done
before releasing a beta of 15. pg_dump is a potential future target, as
well, and adding more compression options to it may happen in 16~.
Author: Michael Paquier
Reviewed-by: Robert Haas, Georgios Kokolatos
Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
All but a few existing callers assume without checking that this
function succeeds. While it probably will, that's a poor excuse for
not checking. Let's make it return void and instead throw an error
if it doesn't find the block reference. Callers that actually need
to handle the no-such-block case must now use the underlying function
XLogRecGetBlockTagExtended.
In addition to being a bit less error-prone, this should also serve
to suppress some Coverity complaints about XLogRecGetBlockRefInfo.
While at it, clean up some inconsistency about use of the
XLogRecHasBlockRef macro: make XLogRecGetBlockTagExtended use
that instead of open-coding the same condition, and avoid calling
XLogRecHasBlockRef twice in relevant code paths. (That is,
calling XLogRecHasBlockRef followed by XLogRecGetBlockTag is now
deprecated: use XLogRecGetBlockTagExtended instead.)
Patch HEAD only; this doesn't seem to have enough value to consider
a back-branch API break.
Discussion: https://postgr.es/m/425039.1649701221@sss.pgh.pa.us
Remove comment block about how heap page vacuuming used to set tuples
with storage to LP_UNUSED in a rare edge case that can no longer happen
following commit 8523492d4e. The comments seem unnecessary now, since
it's now generally clear that heap vacuuming only applies to LP_DEAD
items from VACUUM's first heap pass following more recent work from
commits 12b5ade902 and 4f8d9d1217.
As of commit 39969e2a1, no caller of do_pg_backup_start() passes NULL
for labelfile or tblspcmapfile, nor is it plausible that any would
do so in the future. Remove the code that coped with that case,
as (a) it's dead and (b) it causes Coverity to bleat about possibly
leaked storage.
While here, do some janitorial work on the function's header comment.
With nowait passed as false, pgstat_lock_entry() must return true
so there's no need to check its result. Coverity seems unconvinced
of this, so whack it upside the head with a (void) cast.
Before commit 412ad7a556, delayChkpt
was a Boolean. Now it's an integer. Extensions using it need to be
appropriately updated, so let's rename the field to make sure that
a hard compilation failure occurs.
Replacing delayChkpt with delayChkptFlags made a few comments extend
past 80 characters, so I reflowed them and changed some wording very
slightly.
The back-branches will need a different change to restore compatibility
with existing minor releases; this is just for master.
Per suggestion from Tom Lane.
Discussion: http://postgr.es/m/a7880f4d-1d74-582a-ada7-dad168d046d1@enterprisedb.com
These are usually not useful since users will use packaged
distributions and won't be interested in rebuilding their installation
from source. Also, we have only used these kinds of hints for some
features and in some places, not consistently throughout.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O
timing spent reading and writing shared and local buffers. This commit
adds on top of that the I/O timing for temporary buffers in the output
of EXPLAIN (for spilled external sorts, hashes, materialization. etc).
This can be helpful for users in cases where the I/O related to
temporary buffers is the bottleneck.
Like its cousin, this information is available only when track_io_timing
is enabled. Playing the patch, this is showing an extra overhead of up
to 1% even when using gettimeofday() as implementation for interval
timings, which is slightly within the usual range noise still that's
measurable.
Author: Masahiko Sawada
Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud,
Ranier Vilela
Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com