createplan.c tries to save a runtime projection step by specifying
a scan plan node's output as being exactly the table's columns, or
index's columns in the case of an index-only scan, if there is not a
reason to do otherwise. This logic did not previously pay attention
to whether an index's columns are returnable. That worked, sort of
accidentally, until commit 9a3ddeb51 taught setrefs.c to reject plans
that try to read a non-returnable column. I have no desire to loosen
setrefs.c's new check, so instead adjust use_physical_tlist() to not
try to optimize this way when there are non-returnable column(s).
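As a rough illustration of the adjusted test, here is a minimal sketch. The struct and function names are hypothetical stand-ins, not the planner's real data structures; the only idea taken from the text above is "skip the physical-tlist shortcut when any index column is non-returnable".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, pared-down stand-in for the planner's per-index info. */
typedef struct IndexInfoSketch
{
    int         ncolumns;   /* number of index columns */
    const bool *canreturn;  /* canreturn[i]: can column i be read back? */
} IndexInfoSketch;

/*
 * Return true only if every index column is returnable, i.e. only then
 * may an index-only scan emit the index's columns as-is and skip the
 * projection step.
 */
static bool
can_use_physical_index_tlist(const IndexInfoSketch *index)
{
    assert(index->ncolumns >= 0);

    for (int i = 0; i < index->ncolumns; i++)
    {
        if (!index->canreturn[i])
            return false;   /* one non-returnable column disables it */
    }
    return true;
}
```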
Per report from Ryan Kelly. Like the previous patch, back-patch
to all supported branches.
Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com
"pg_ctl stop/restart" checked that the postmaster PID is valid just
once, as a side-effect of sending the stop signal, and then would
wait-till-timeout for the postmaster.pid file to go away. This
neglects the case wherein the postmaster dies uncleanly after we
signal it. Similarly, once "pg_ctl promote" has sent the signal,
it'd wait for the corresponding on-disk state change to occur
even if the postmaster dies.
I'm not sure how we've managed not to notice this problem, but it
seems to explain slow execution of the 017_shm.pl test script on AIX
since commit 4fdbf9af5, which added a speculative "pg_ctl stop" with
the idea of making real sure that the postmaster isn't there. In the
test steps that kill-9 and then restart the postmaster, it's possible
to get past the initial signal attempt before kill() stops working
for the doomed postmaster. If that happens, pg_ctl waited till
PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members
have that set very high.
To fix, include a "kill(pid, 0)" test (similar to what
postmaster_is_alive uses) in these wait loops, so that we'll
give up immediately if the postmaster PID disappears.
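A minimal sketch of such a wait loop follows. It is not pg_ctl's code: the names wait_for_state, process_exists, and file_is_gone, the result enum, and the choice to treat EPERM as "still alive" are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

typedef enum { WAIT_DONE, WAIT_PROCESS_GONE, WAIT_TIMEOUT } WaitResult;

/* Caller-supplied check for the awaited on-disk state change. */
typedef bool (*state_reached_fn) (const char *path);

/* kill(pid, 0) delivers no signal; it only checks that the PID exists. */
static bool
process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM;  /* assumption: exists, but not ours to signal */
}

/* Example predicate: the awaited change is "this file has gone away". */
static bool
file_is_gone(const char *path)
{
    return access(path, F_OK) != 0;
}

static WaitResult
wait_for_state(pid_t pid, state_reached_fn reached, const char *path,
               int max_polls, long poll_interval_ms)
{
    struct timespec delay;

    assert(max_polls >= 0 && poll_interval_ms >= 0);
    delay.tv_sec = poll_interval_ms / 1000;
    delay.tv_nsec = (poll_interval_ms % 1000) * 1000000L;

    for (int i = 0; i < max_polls; i++)
    {
        if (reached(path))
            return WAIT_DONE;
        if (!process_exists(pid))
            return WAIT_PROCESS_GONE;   /* give up now, not at timeout */
        nanosleep(&delay, NULL);
    }
    return WAIT_TIMEOUT;
}
```

The state check comes before the liveness check so that a clean shutdown, where the file vanishes and the process exits at about the same time, is still reported as success.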
While here, I chose to refactor those loops out of where they were.
do_stop() and do_restart() can perfectly well share one copy of the
wait-for-stop loop, and it seems desirable to put a similar function
beside that for wait-for-promote.
Back-patch to all supported versions, since pg_ctl's wait logic
is substantially identical in all, and we're seeing the slow test
behavior in all branches.
Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com
Some pre-2017 Test::More versions need perfect $Test::Builder::Level
maintenance to find the variable. Buildfarm member snapper reported an
overall failure that the file intended to hide via the TODO construct.
That trouble was reachable in v11 and v10. For later branches, this
serves as defense in depth. Back-patch to v10 (all supported versions).
Discussion: https://postgr.es/m/20220202055556.GB2745933@rfd.leadboat.com
The back-patch of commit fdd965d074 broke
CLOBBER_CACHE_ALWAYS for v9.6 through v13. It updated the
InvalidateSystemCaches() call for CLOBBER_CACHE_RECURSIVELY, neglecting
the one for CLOBBER_CACHE_ALWAYS. Back-patch to v13, v12, v11, and v10.
Reviewed by Tomas Vondra. Reported by Tomas Vondra.
Discussion: https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a716@enterprisedb.com
Recent versions of Devel::PPPort try to redefine eval_pv() to
dodge a bug in pre-5.31 Perl versions. Unfortunately the redefinition
fails on compilers that don't support statements nested within
expressions. However, we aren't actually interested in this bug fix,
since we always call eval_pv() with croak_on_error = FALSE.
So, until there's an upstream fix for this breakage, just comment
out the macro to revert to the older behavior.
Per report from Wei Sun, as well as previous buildfarm failure
on pademelon (which I'd unfortunately not looked at carefully
enough to understand the cause). Back-patch to all supported
versions, since we're using the same ppport.h in all.
Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58
A very well-informed user might deduce this from what we said already,
but I'd bet against it. Lay it out explicitly.
While here, rewrite the comment about tuple routing to be more
intelligible to an average SQL user.
Per bug #17395 from Alexander Lakhin. Back-patch to v11. (The text
in this area is different in v10 and I'm not sufficiently excited
about this point to adapt the patch.)
Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
There are two Asserts in nodeMergejoin.c that are reachable if
the input data is not in the expected order. This seems way too
fragile. Alexander Lakhin reported a case where the assertions
could be triggered with misconfigured foreign-table partitions,
and bitter experience with unstable operating system collation
definitions suggests another easy route to hitting them. Neither
Assert is in a place where we can't afford one more test-and-branch,
so replace 'em with plain test-and-elog logic.
Per bug #17395. While the reported symptom is relatively recent,
collation changes could happen anytime, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12. Change the uses in configure to use the module sysconfig
instead. The logic stays largely the same, although we have to
rely on INCLUDEPY instead of the deprecated get_python_inc function.
Note that sysconfig has existed since Python 2.7, so this moves the
minimum required version up from Python 2.6 (or 2.4, before v13).
Also, sysconfig didn't exist in Python 3.1, so the minimum 3.x
version is now 3.2.
Back-patch of commit bd233bdd8 into all supported branches.
In v10, this also includes back-patching v11's beff4bb9c, primarily
because this opinion is clearly out-of-date:
While at it, get rid of the code's assumption that both the major and
minor numbers contain exactly one digit. That will foreseeably be
broken by Python 3.10 in perhaps four or five years. That's far enough
out that we probably don't need to back-patch this.
Peter Eisentraut, Tom Lane, Andres Freund
Discussion: https://postgr.es/m/c74add3c-09c4-a9dd-1a03-a846e5b2fc52@enterprisedb.com
ppport.h was only updated in 05798c9f7f (master). Unfortunately my commit
c89f409749 uses PERL_VERSION_LT, which came in with that update, breaking most
buildfarm animals.
I should have noticed that...
We might want to backpatch the ppport update instead, but for now let's get the
buildfarm green again.
Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
Backpatch: 10-14, master doesn't need it
For older versions we need our own copy of perl's setlocale(), because it was
not exposed (why we need the setlocale in the first place is explained in
plperl_init_interp). The copy stopped working in 5.28, as some of the used
macros are not public anymore. But Perl_setlocale is available in 5.28, so
use that.
Author: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-By: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/20200501134711.08750c5f@antares.wagner.home
Backpatch: all versions
Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.
Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:
ERROR: out-of-order XID insertion in KnownAssignedXids
This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.
Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).
Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.
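The disagreement between the two orderings can be reproduced with a small standalone sketch. The comparator names are illustrative, not the server's functions; the logical comparison uses the usual modulo-2^32 signed-difference rule for normal XIDs, and the constant 3 mirrors the first normal XID.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Xid;

#define FIRST_NORMAL_XID ((Xid) 3)

/* Raw ordering: compare the XIDs as plain unsigned 32-bit integers. */
static int
xid_raw_cmp(const void *a, const void *b)
{
    Xid x = *(const Xid *) a;
    Xid y = *(const Xid *) b;

    return (x > y) - (x < y);
}

/*
 * Logical ordering: x precedes y if the signed 32-bit difference is
 * negative.  Only meaningful for normal XIDs within half the XID space
 * of each other, which holds for concurrently running transactions.
 */
static int
xid_logical_cmp(const void *a, const void *b)
{
    Xid     x = *(const Xid *) a;
    Xid     y = *(const Xid *) b;
    int32_t diff;

    assert(x >= FIRST_NORMAL_XID && y >= FIRST_NORMAL_XID);
    diff = (int32_t) (x - y);
    return (diff > 0) - (diff < 0);
}
```

Sorting {1000, 4294967290} with xid_raw_cmp leaves 1000 first, while xid_logical_cmp puts 4294967290 first, which is the order an insertion routine that checks logical ordering would expect.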
Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.
This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.
Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
This reverts commits 6051857fc and ed52c3707, but only in the back
branches. Further testing has shown that while those changes do fix
some things, they also break others; in particular, it looks like
walreceivers fail to detect walsender-initiated connection close
reliably if the walsender shuts down this way. We'll keep trying to
improve matters in HEAD, but it now seems unwise to push these changes
into stable releases.
Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com
In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands. Poor design of the replication-command
parser caused it to fail in various cases, notably:
* semicolons embedded in a command, or multiple SQL commands
sent in a single message;
* dollar-quoted literals containing odd numbers of single
or double quote marks;
* commands starting with a comment.
The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command. Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.
We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command. That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token. If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing. Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.
(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)
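The first-token idea can be sketched outside of flex as follows. This is a simplified illustration, not repl_scanner.l: the keyword list is a partial subset, the function names are made up, and case-insensitive keyword matching is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of replication command keywords. */
static const char *const repl_keywords[] = {
    "IDENTIFY_SYSTEM", "BASE_BACKUP", "START_REPLICATION",
    "CREATE_REPLICATION_SLOT", "DROP_REPLICATION_SLOT", "TIMELINE_HISTORY"
};

/* Whitespace set, including \r and \f. */
static bool
is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

static bool
token_equals(const char *tok, size_t len, const char *keyword)
{
    if (strlen(keyword) != len)
        return false;
    for (size_t i = 0; i < len; i++)
    {
        if (toupper((unsigned char) tok[i]) != keyword[i])
            return false;
    }
    return true;
}

/*
 * Look only at the first token.  Anything else (SQL keywords, comments,
 * dollar quotes, embedded semicolons) is left for the SQL parser.
 */
static bool
first_token_is_replication_command(const char *input)
{
    size_t len = 0;

    assert(input != NULL);
    while (is_space(*input))
        input++;
    while (isalnum((unsigned char) input[len]) || input[len] == '_')
        len++;

    for (size_t i = 0; i < sizeof(repl_keywords) / sizeof(repl_keywords[0]); i++)
    {
        if (token_equals(input, len, repl_keywords[i]))
            return true;
    }
    return false;
}
```

Because nothing after the first token is examined, input such as "SELECT 1; SELECT 2" or a command starting with "--" is simply reported as "not a replication command" instead of producing a lexer error.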
In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type. (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)
I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it. The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.
Per bug #17379 from Greg Rychlewski. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
Without this, we get odd behavior when the previous cycle of
lexing exited in a non-default exclusive state. Every other
copy of this code is aware that it has to do BEGIN(INITIAL),
but repl_scanner.l did not get that memo.
The real-world impact of this is probably limited, since most
replication clients would abandon their connection after getting
a syntax error. Still, it's a bug.
This mistake is old, so back-patch to all supported branches.
Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us
In the normal configuration where GEQO_DEBUG isn't defined,
recent clang versions have started to complain that geqo_main.c
accumulates the edge_failures count but never does anything
with it. As a minimal back-patchable fix, insert a void cast
to silence this warning. (I'd speculated about ripping out the
GEQO_DEBUG logic altogether, but I don't think we'd wish to
back-patch that.)
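The pattern looks roughly like this. The function and the SKETCH_DEBUG macro are invented for illustration and are not the code in geqo_main.c; only the "(void) cast on a counter that is otherwise write-only" technique comes from the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Count usable edges; failures are tallied only for debug output. */
static int
count_valid_edges(const int *edge_ok, int n)
{
    int valid = 0;
    int edge_failures = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
    {
        if (edge_ok[i])
            valid++;
        else
            edge_failures++;
    }

#ifdef SKETCH_DEBUG
    fprintf(stderr, "edge failures: %d\n", edge_failures);
#else
    /* Counts as a use, so "set but not used" warnings go away. */
    (void) edge_failures;
#endif

    return valid;
}
```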
Per recently-established project policy, this is a candidate
for back-patching into out-of-support branches: it suppresses
an annoying compiler warning but changes no behavior. Hence,
back-patch all the way to 9.2.
Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com
In sort_inner_and_outer we iterate a list of PathKey elements, but the
variable is declared as (List *). This mistake is benign, because we
only pass the pointer to lcons() and never dereference it.
This has existed since ~2004, but it's confusing. So fix and backpatch to all
supported branches.
Backpatch-through: 10
Discussion: https://postgr.es/m/bf3a6ea1-a7d8-7211-0669-189d5c169374%40enterprisedb.com
Previously, unless we had to add a NOT NULL constraint to the column,
this command resulted in updating only the index's relcache entry.
That's problematic when replication behavior is being driven off the
existence of a primary key: other sessions (and ours too for that
matter) failed to recalculate their opinion of whether the table can
be replicated. Add a relcache invalidation to fix it.
This has been broken since pg_class.relhaspkey was removed in v11.
Before that, updating the table's relhaspkey value sufficed to cause
a cache flush. Hence, backpatch to v11.
Report and patch by Hou Zhijie
Discussion: https://postgr.es/m/OS0PR01MB5716EBE01F112C62F8F9B786947B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
In libpq and ecpglib, multiple threads can concurrently enter the
initialization logic for message localization. Since we set the
its-done flag before actually doing the work, it'd be possible
for some threads to reach gettext() before anyone has called
bindtextdomain(). Barring bugs in libintl itself, this would not
result in anything worse than failure to localize some early
messages. Nonetheless, it's a bug, and an easy one to fix.
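The ordering problem and one way to avoid it can be sketched as below. This is an assumption-laden illustration, not the libpq or ecpglib code: bind_message_domain() is a hypothetical stand-in for the real bindtextdomain() call, and the sketch takes the mutex on every call for simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool init_done = false;
static int  bind_calls = 0;

/* Hypothetical stand-in for the one-time bindtextdomain() work. */
static void
bind_message_domain(void)
{
    bind_calls++;
}

/*
 * Every caller returns only after the binding has been done.  The flag
 * is set after the work, and both happen under the mutex, so no thread
 * can see "done" while the work is still pending.
 */
static void
ensure_localization_ready(void)
{
    pthread_mutex_lock(&init_mutex);
    if (!init_done)
    {
        bind_message_domain();
        init_done = true;   /* set the flag last, not first */
        assert(bind_calls == 1);
    }
    pthread_mutex_unlock(&init_mutex);
}
```

A production version would likely add an unlocked fast-path check of the flag; that is left out here to keep the ordering point obvious.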
Noted while investigating bug #17299 from Clemens Zeidler
(much thanks to Liam Bowen for followup investigation on that).
It currently appears that that actually *is* a bug in libintl itself,
but that doesn't let us off the hook for this bit.
Back-patch to all supported versions.
Discussion: https://postgr.es/m/17299-7270741958c0b1ab@postgresql.org
Discussion: https://postgr.es/m/CAE7q7Eit4Eq2=bxce=Fm8HAStECjaXUE=WBQc-sDDcgJQ7s7eg@mail.gmail.com
While individual logical rewrite files were synced to disk, the directory was
not. On some filesystems that could lead to losing directory entries after a
crash.
Reported-By: Tom Lane <tgl@sss.pgh.pa.us>
Author: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/867F2E29-2782-4869-970E-B984C6D35A8F@amazon.com
Backpatch: 10-
The logic in charge of writing commit timestamps (enabled with
track_commit_timestamp) for subtransactions had an off-by-one bug,
which made it possible for the commit timestamp of the last
subtransaction committed to go missing.
While at it, simplify the iteration logic in the loop writing the
commit timestamps a bit, per suggestions from Kyotaro Horiguchi and Tom
Lane, so that some variable initializations are no longer part of the
loop itself.
Issue introduced in 73c986a.
Analyzed-by: Alex Kingsborough
Author: Alex Kingsborough, Kyotaro Horiguchi
Discussion: https://postgr.es/m/73A66172-4050-4F2A-B7F1-13508EDA2144@amazon.com
Backpatch-through: 10
Commits 6c4a8903b et al. had a couple of deficiencies:
* The logic I added to Cluster::start to see if a PID file is present
could be fooled by a stale PID file left over from a previous
postmaster. To fix, if we're not sure whether we expect to find a
running postmaster or not, validate the PID using "kill 0".
* 017_shm.pl has a loop in which it just issues repeated Cluster::start
calls; this will fail if some invocation fails but leaves self->_pid
set. Per buildfarm results, the above fix is not enough to make this
safe: we might have "validated" a PID for a postmaster that exits
immediately after we look. Hence, match each failed start call with
a stop call that will get us back to the self->_pid == undef state.
Add a fail_ok option to Cluster::stop to make this work.
Discussion: https://postgr.es/m/CA+hUKGKV6fOHvfiPt8=dOKzvswjAyLoFoJF1iQXMNpi7+hD1JQ@mail.gmail.com
"pg_ctl start" might start a new postmaster and then return failure
anyway, for example if PGCTLTIMEOUT is exceeded. If there is a
postmaster there, it's still incumbent on us to shut it down at
script end, so check for the PID file even though we are about
to fail.
This has been broken all along, so back-patch to all supported branches.
Discussion: https://postgr.es/m/647439.1642622744@sss.pgh.pa.us
It seems highly unlikely that gettext() can be relied on to be
async-signal-safe. psql used to understand that, but someone got
it wrong long ago in the src/bin/scripts/ version of handle_sigint,
and then the bad idea was perpetuated when those two versions were
unified into src/fe_utils/cancel.c.
I'm unsure why there have not been field complaints about this
... maybe gettext() is signal-safe once it's translated at least
one message? But we have no business assuming any such thing.
In cancel.c (v13 and up), I preserved our ability to localize
"Cancel request sent" messages by invoking gettext() before
the signal handler is set up. In earlier branches I just made
src/bin/scripts/ not localize those messages, as psql did then.
(Just for extra unsafety, the src/bin/scripts/ version was
invoking fprintf() from a signal handler. Sigh.)
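A minimal sketch of the "translate first, then install the handler" approach follows. It is not cancel.c: the names are illustrative, and the caller is assumed to pass in a string that has already been run through gettext().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static const char *cancel_msg = NULL;
static size_t cancel_msg_len = 0;
static volatile sig_atomic_t cancel_requested = 0;

/* Handler restricted to async-signal-safe operations: no gettext(),
 * no fprintf(), just write() of a message prepared in advance. */
static void
handle_sigint(int signo)
{
    int save_errno = errno;

    (void) signo;
    cancel_requested = 1;
    if (cancel_msg != NULL)
    {
        ssize_t rc = write(STDERR_FILENO, cancel_msg, cancel_msg_len);

        (void) rc;          /* nothing useful to do on failure */
    }
    errno = save_errno;     /* don't disturb the interrupted code */
}

/* translated_msg: already localized by the caller, must outlive the handler. */
static int
setup_cancel_handler(const char *translated_msg)
{
    struct sigaction sa;

    assert(translated_msg != NULL);
    cancel_msg = translated_msg;
    cancel_msg_len = strlen(translated_msg);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}
```

The message length is computed at setup time so the handler itself needs nothing beyond write().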
Noted while fixing signal-safety issues in PQcancel() itself.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
PQcancel() is supposed to be safe to call from a signal handler,
and indeed psql uses it that way. All of the library functions
it uses are specified to be async-signal-safe by POSIX ...
except for strerror. Neither plain strerror nor strerror_r
is considered safe. When this code was written, back in the
dark ages, we probably figured "oh, strerror will just index
into a constant array of strings" ... but in any locale except C,
that's unlikely to be true. Probably the reason we've not heard
complaints is that (a) this error-handling code is unlikely to be
reached in normal use, and (b) in many scenarios, localized error
strings would already have been loaded, after which maybe it's
safe to call strerror here. Still, this is clearly unacceptable.
The best we can do without relying on strerror is to print the
decimal value of errno, so make it do that instead. (This is
probably not much loss of user-friendliness, given that it is
hard to get a failure here.)
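Formatting the number without strerror or the printf family can be done with a few lines of plain C. The helper below is a hypothetical sketch, not libpq's code; the function name and the prefix text used with it are made up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write prefix followed by the decimal value of errnum into buf,
 * truncating if needed and always NUL-terminating.  Uses only plain
 * loops, so it makes no library calls at all.
 */
static void
format_error_number(char *buf, size_t bufsize, const char *prefix, int errnum)
{
    char         digits[12];    /* 10 digits + sign fits easily */
    size_t       ndigits = 0;
    size_t       pos = 0;
    unsigned int val;

    assert(bufsize > 0);

    val = (errnum < 0) ? 0u - (unsigned int) errnum : (unsigned int) errnum;

    /* Generate digits least-significant first. */
    do
    {
        digits[ndigits++] = (char) ('0' + val % 10);
        val /= 10;
    } while (val != 0);
    if (errnum < 0)
        digits[ndigits++] = '-';

    while (*prefix != '\0' && pos + 1 < bufsize)
        buf[pos++] = *prefix++;
    while (ndigits > 0 && pos + 1 < bufsize)
        buf[pos++] = digits[--ndigits];
    buf[pos] = '\0';
}
```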
Back-patch to all supported branches.
Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.
That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the inheritance tree
as a whole (which includes e.g. GROUP BY queries).
But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.
It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should be no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built before 859b3003de), contradicting the current code.
Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).
Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
Since commit 859b3003de we only build extended statistics for individual
relations, ignoring the child relations. This resolved the issue with
updating catalog tuple twice, but we still tried to use the statistics
when calculating estimates for the whole inheritance tree. When the
relations contain very distinct data, it may produce bogus estimates.
This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we
fix it the same way - by ignoring extended statistics when calculating
estimates for the inheritance tree as a whole. We still consider
extended statistics when calculating estimates for individual child
relations, of course.
This may result in plan changes due to different estimates, but if the
old statistics were not describing the inheritance tree particularly
well it's quite likely the new plans are actually better.
Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).
Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
Commit 7745bc352 intended to ensure that whole-row Vars would be
printed with "::type" decoration in all contexts where plain
"var.*" notation would result in star-expansion, notably in
ROW() and VALUES() constructs. However, it missed the case of
INSERT with a single-row VALUES, as reported by Timur Khanjanov.
Nosing around ruleutils.c, I found a second oversight: the
code for RowCompareExpr generates ROW() notation without benefit
of an actual RowExpr, and naturally it wasn't in sync :-(.
(The code for FieldStore also does this, but we don't expect that
to generate strictly parsable SQL anyway, so I left it alone.)
Back-patch to all supported branches.
Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
I had a brain fade in commit d32899157, and used 2:30AM as the
example timestamp for both spring-forward and fall-back cases.
But it's not actually ambiguous at all in the fall-back case,
because that transition is from 2AM to 1AM under USA rules.
Fix the example to use 1:30AM, which *is* ambiguous.
Noted while answering a question from Aleksander Alekseev.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/2191355.1641828552@sss.pgh.pa.us
If contrib/btree_gist is used to make a GIST index on a char(N)
(bpchar) column, and that column is retrieved via an index-only
scan, what came out had all trailing spaces removed. Since
that doesn't happen in any other kind of table scan, this is
clearly a bug. The cause is that gbt_bpchar_compress() strips
trailing spaces (using rtrim1) before a new index entry is made.
That was probably a good idea when this code was first written,
but since we invented index-only scans, it's not so good.
One answer could be to mark this opclass as incapable of index-only
scans. But to do so, we'd need an extension module version bump,
followed by manual action by DBAs to install the updated version
of btree_gist. And it's not really a desirable place to end up,
anyway.
Instead, let's fix the code by removing the unwanted space-stripping
action and adjusting the opclass's comparison logic to ignore
trailing spaces as bpchar normally does. This will not hinder
cases that work today, since index searches with this logic will
act the same whether trailing spaces are stored or not. It will
not by itself fix the problem of getting space-stripped results
from index-only scans, of course. Users who care about that can
REINDEX affected indexes after installing this update, to immediately
replace all improperly-truncated index entries. Otherwise, it can
be expected that the index's behavior will change incrementally as
old entries are replaced by new ones.
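The comparison rule can be sketched as below. This is a simplified, byte-wise illustration with made-up function names; it ignores collation, which the real bpchar comparison has to honor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of s with trailing spaces excluded. */
static size_t
true_length(const char *s, size_t len)
{
    while (len > 0 && s[len - 1] == ' ')
        len--;
    return len;
}

/*
 * Compare two char(N)-style values, ignoring trailing spaces, so that
 * entries stored with and without padding sort and match identically.
 */
static int
bpchar_like_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t ta;
    size_t tb;
    size_t minlen;
    int    cmp;

    assert(a != NULL && b != NULL);
    ta = true_length(a, alen);
    tb = true_length(b, blen);
    minlen = (ta < tb) ? ta : tb;

    cmp = memcmp(a, b, minlen);
    if (cmp != 0)
        return cmp;
    return (ta > tb) - (ta < tb);
}
```

With a comparison like this, an old space-stripped entry "abc" and a new padded entry "abc  " are treated as equal, which is why searches keep working while the index still holds a mix of both.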
Per report from Alexander Lakhin. Back-patch to all supported branches.
Discussion: https://postgr.es/m/696c995b-b37f-5526-f45d-04abe713179f@gmail.com
Instead of using a hardcoded or default path to the Perl file that the
.bat file wraps, use a path that resolves to the directory containing
the .bat file itself.
Patch by Anton Voloshin, slightly tweaked by me.
Backpatch to all live branches
Discussion: https://postgr.es/m/2b7a674b-5fb0-d264-75ef-ecc7a31e54f8@postgrespro.ru
We disallow altering a column datatype within a regular table,
if the table's rowtype is used as a column type elsewhere,
because we lack code to go around and rewrite the other tables.
This restriction should apply to partitioned tables as well, but it
was not checked because ATRewriteTables and ATPrepAlterColumnType
were not on the same page about who should do it for which relkinds.
Per bug #17351 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
Under concurrency, it is possible for two sessions to be merrily locking
and releasing a tuple and marking it again as HEAP_XMAX_INVALID all the
while a third session attempts to lock it, miserably fails at it, and
then contemplates life, the universe and everything only to eventually
fail an assertion that said bit is not set. Before SKIP LOCKED that was
indeed a reasonable expectation, but alas! commit df630b0dd5 falsified
it.
This bug is as old as time itself, and even older, if you think time
begins with the oldest supported branch. Therefore, backpatch to all
supported branches.
Author: Simon Riggs <simon.riggs@enterprisedb.com>
Discussion: https://postgr.es/m/CANbhV-FeEwMnN8yuMyss7if1ZKjOKfjcgqB26n8pqu1e=q0ebg@mail.gmail.com
Commit 4ace45677 failed to fix the problem fully, because the
same issue of attempting to fetch a non-returnable index column
can occur when rechecking the indexqual after using a lossy index
operator. Moreover, it broke EXPLAIN for such indexquals (which
indicates a gap in our test cases :-().
Revert the code changes of 4ace45677 in favor of adding a new field
to struct IndexOnlyScan, containing a version of the indexqual that
can be executed against the index-returned tuple without using any
non-returnable columns. (The restrictions imposed by check_index_only
guarantee this is possible, although we may have to recompute indexed
expressions.) Support construction of that during setrefs.c
processing by marking IndexOnlyScan.indextlist entries as resjunk
if they can't be returned, rather than removing them entirely.
(We could alternatively require setrefs.c to look up the IndexOptInfo
again, but abusing resjunk this way seems like a reasonably safe way
to avoid needing to do that.)
This solution isn't great from an API-stability standpoint: if there
are any extensions out there that build IndexOnlyScan structs directly,
they'll be broken in the next minor releases. However, only a very
invasive extension would be likely to do such a thing. There's no
change in the Path representation, so typical planner extensions
shouldn't have a problem.
As before, back-patch to all supported branches.
Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org