Commit Graph

52322 Commits

Author SHA1 Message Date
Michael Paquier 27d3844418 pageinspect: Add more sanity checks to prevent out-of-bound reads
A couple of code paths use the special area on the page passed by the
function caller, expecting to find some data in it.  However, feeding
an incorrect page can lead to out-of-bound reads when trying to access
the page special area (like a heap page that has no special area,
leading PageGetSpecialPointer() to grab a pointer outside the allocated
page).

The functions used for hash and btree indexes have some protection
already against that, while some other functions using a relation OID
as argument would make sure that the access method involved is correct,
but functions taking in input a raw page without knowing the relation
the page is attached to would run into problems.

This commit improves the set of checks used in the code paths of BRIN,
btree (including one check if a leaf page is found with a non-zero
level), GIN and GiST to verify that the page given in input has a
special area size that fits with each access method, which is done
though PageGetSpecialSize(), becore calling PageGetSpecialPointer().

The scope of the checks done is limited to work with pages that one
would pass after getting a block with get_raw_page(), as it is possible
to craft byteas that could bypass existing code paths.  Having too many
checks would also impact the usability of pageinspect, as the existing
code is very useful to look at the content details in a corrupted page,
so the focus is really to avoid out-of-bound reads as this is never a
good thing even with functions whose execution is limited to
superusers.

The safest approach could be to rework the functions so as these fetch a
block using a relation OID and a block number, but there are also cases
where using a raw page is useful.

Tests are added to cover all the code paths that needed such checks, and
an error message for hash indexes is reworded to fit better with what
this commit adds.

Reported-By: Alexander Lakhin
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
2022-03-27 17:53:51 +09:00
Tom Lane 0144c9c7e7 Suppress compiler warning in relptr_store().
clang 13 with -Wextra warns that "performing pointer subtraction with
a null pointer has undefined behavior" in the places where freepage.c
tries to set a relptr variable to constant NULL.  This appears to be
a compiler bug, but it's unlikely to get fixed instantly.  Fortunately,
we can work around it by introducing an inline support function, which
seems like a good change anyway because it removes the macro's existing
double-evaluation hazard.

Backpatch to v10 where this code was introduced.

Patch by me, based on an idea of Andres Freund's.

Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us
2022-03-26 14:29:40 -04:00
Tom Lane 579cef5faf Harden TAP tests that intentionally corrupt page checksums.
The previous method for doing that was to write zeroes into a
predetermined set of page locations.  However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.

Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.

Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.

In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against.  We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.

Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.

Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
2022-03-25 14:23:26 -04:00
Alvaro Herrera ffd28516e6
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the standby
would fail to recover in such a case.  However, the directories could be
legitimately missing.  Consider a sequence of WAL records as follows:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery.  If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.

Backpatch to 13, at least for now.  The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522, which appeared in 13.

A new TAP test file is added to verify the condition.  However, because
it depends on commit d6d317dbf6, it can only be added to branch
master.  I (Álvaro) manually verified that the code behaves as expected
in branch 14.  It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.

Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-03-25 13:16:21 +01:00
Robert Haas bbace5697d Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.
If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
2022-03-24 14:32:48 -04:00
Andres Freund 81045e1e1c Don't try to translate NULL in GetConfigOptionByNum().
Noticed via -fsanitize=undefined. Introduced when a few columns in
GetConfigOptionByNum() / pg_settings started to be translated in 72be8c29a /
PG 12.

Backpatch to all affected branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 12-
2022-03-23 13:18:02 -07:00
Andres Freund 89a94c24aa Don't call fwrite() with len == 0 when writing out relcache init file.
Noticed via -fsanitize=undefined.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 13:13:18 -07:00
Andres Freund e52e9bd57f configure: check for dlsym instead of dlopen.
When building with sanitizers the sanitizer library provides dlopen, but not
dlsym(), making configure think that -ldl isn't needed. Just checking for
dlsym() ought to suffice, hard to see dlsym() being provided without dlopen()
also being provided.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 12:43:41 -07:00
Alvaro Herrera 9814c708c6
pg_upgrade: Upgrade an Assert to a real 'if' test
It seems possible for the condition being tested to be true in
production, and nobody would never know (except when some data
eventually becomes corrupt?).

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql
2022-03-23 19:23:51 +01:00
Alvaro Herrera caaeb88ff7
Fix "missing continuation record" after standby promotion
Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.

Backpatch to 10 (all stable versions).

Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com
2022-03-23 18:22:10 +01:00
Thomas Munro cd3a5055f9 Try to stabilize vacuum test.
As commits b700f96c and 3414099c did for the reloptions test, make
sure VACUUM can always truncate the table as expected.

Back-patch to 12, where vacuum_truncate arrived.

Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com
2022-03-23 15:06:59 +13:00
Andres Freund 2d608c9607 Add missing dependency of pg_dumpall to WIN32RES.
When cross-building to windows, or building with mingw on windows, the build
could fail with
  x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director
because pg_dumpall didn't depend on WIN32RES, but it's recipe references
it. The build nevertheless succeeded most of the time, due to
pg_dump/pg_restore having the required dependency, causing win32ver.o to be
built.

Reported-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com
Backpatch: 10-, omission present on all live branches
2022-03-22 08:28:51 -07:00
Michael Paquier fdb1be4962 Fix failures in SSL tests caused by out-of-tree keys and certificates
This issue is environment-sensitive, where the SSL tests could fail in
various way by feeding on defaults provided by sslcert, sslkey,
sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup,
as of ~/.postgresql/ by default.  Horiguchi-san has reported two
failures, but more advanced testing from me (aka inclusion of garbage
SSL configuration in ~/.postgresql/ for all the configuration
parameters) has showed dozens of failures that can be triggered in the
whole test suite.

History has showed that we are not good when it comes to address such
issues, fixing them locally like in dd87799, and such problems keep
appearing.  This commit strengthens the entire test suite to put an end
to this set of problems by embedding invalid default values in all the
connection strings used in the tests.  The invalid values are prefixed
in each connection string, relying on the follow-up values passed in the
connection string to enforce any invalid value previously set.  Note
that two tests related to CRLs are required to fail with certain pre-set
configurations, but we can rely on enforcing an empty value instead
after the invalid set of values.

Reported-by: Kyotaro Horiguchi
Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com
backpatch-through: 10
2022-03-22 13:21:33 +09:00
Tom Lane 48b6035f0f Fix assorted missing logic for GroupingFunc nodes.
The planner needs to treat GroupingFunc like Aggref for many purposes,
in particular with respect to processing of the argument expressions,
which are not to be evaluated at runtime.  A few places hadn't gotten
that memo, notably including subselect.c's processing of outer-level
aggregates.  This resulted in assertion failures or wrong plans for
cases in which a GROUPING() construct references an outer aggregation
level.

Also fix missing special cases for GroupingFunc in cost_qual_eval
(resulting in wrong cost estimates for GROUPING(), although it's
not clear that that would affect plan shapes in practice) and in
ruleutils.c (resulting in excess parentheses in pretty-print mode).

Per bug #17088 from Yaoguang Chen.  Back-patch to all supported
branches.

Richard Guo, Tom Lane

Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org
2022-03-21 17:44:29 -04:00
Tom Lane 05ccf974cd Fix risk of deadlock failure while dropping a partitioned index.
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order.  This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes.  We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.

While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.

Back-patch to v11 where partitioned indexes were added.

Jimmy Yih, Gaurab Dey, Tom Lane

Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-03-21 12:22:13 -04:00
Tom Lane a1efc8f8c8 Doc: fix our example systemd script.
The example used "TimeoutSec=0", but systemd's documented way to get
the desired effect is "TimeoutSec=infinity".

Discussion: https://postgr.es/m/164770078557.670.5467111518383664377@wrigleys.postgresql.org
2022-03-20 12:39:48 -04:00
Michael Paquier c422e8da7e doc: Mention SET TABLESPACE clause for ALTER MATERIALIZED VIEW
This command flavor is supported, but there was nothing in the
documentation about it.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20220316133337.5dc9740abfa24c25ec9f67f5@sraoss.co.jp
Backpatch-through: 10
2022-03-19 16:37:39 +09:00
Tom Lane ae8ec7feba Fix incorrect xmlschema output for types timetz and timestamptz.
The output of table_to_xmlschema() and allied functions includes
a regex describing valid values for these types ... but the regex
was itself invalid, as it failed to escape a literal "+" sign.

Report and fix by Renan Soares Lopes.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/7f6fabaa-3f8f-49ab-89ca-59fbfe633105@me.com
2022-03-18 16:01:42 -04:00
Tom Lane 1d072bd203 Revert applying column aliases to the output of whole-row Vars.
In commit bf7ca1587, I had the bright idea that we could make the
result of a whole-row Var (that is, foo.*) track any column aliases
that had been applied to the FROM entry the Var refers to.  However,
that's not terribly logically consistent, because now the output of
the Var is no longer of the named composite type that the Var claims
to emit.  bf7ca1587 tried to handle that by changing the output
tuple values to be labeled with a blessed RECORD type, but that's
really pretty disastrous: we can wind up storing such tuples onto
disk, whereupon they're not readable by other sessions.

The only practical fix I can see is to give up on what bf7ca1587
tried to do, and say that the column names of tuples produced by
a whole-row Var are always those of the underlying named composite
type, query aliases or no.  While this introduces some inconsistencies,
it removes others, so it's not that awful in the abstract.  What *is*
kind of awful is to make such a behavioral change in a back-patched
bug fix.  But corrupt data is worse, so back-patched it will be.

(A workaround available to anyone who's unhappy about this is to
introduce an extra level of sub-SELECT, so that the whole-row Var is
referring to the sub-SELECT's output and not to a named table type.
Then the Var is of type RECORD to begin with and there's no issue.)

Per report from Miles Delahunty.  The faulty commit dates to 9.5,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us
2022-03-17 18:18:05 -04:00
Tomas Vondra 677a1dc0ca Fix publish_as_relid with multiple publications
Commit 83fd4532a7 allowed publishing of changes via ancestors, for
publications defined with publish_via_partition_root. But the way
the ancestor was determined in get_rel_sync_entry() was incorrect,
simply updating the same variable. So with multiple publications,
replicating different ancestors, the outcome depended on the order
of publications in the list - the value from the last loop was used,
even if it wasn't the top-most ancestor.

This is a probably rare situation, as in most cases publications do
not overlap, so each partition has exactly one candidate ancestor
to replicate as and there's no ambiguity.

Fixed by tracking the "ancestor level" for each publication, and
picking the top-most ancestor. Adds a test case, verifying the
correct ancestor is used for publishing the changes and that this
does not depend on order of publications in the list.

Older releases have another bug in this loop - once all actions are
replicated, the loop is terminated, on the assumption that inspecting
additional publications is unecessary. But that misses the fact that
those additional applications may replicate different ancestors.

Fixed by removal of this break condition. We might still terminate the
loop in some cases (e.g. when replicating all actions and the ancestor
is the partition root).

Backpatch to 13, where publish_via_partition_root was introduced.

Initial report and fix by me, test added by Hou zj. Reviews and
improvements by Amit Kapila.

Author: Tomas Vondra, Hou zj, Amit Kapila
Reviewed-by: Amit Kapila, Hou zj
Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com
2022-03-16 18:06:27 +01:00
Alexander Korotkov 7d30f59da7 Fix default signature length for gist_ltree_ops
911e702077 implemented operator class parameters including the signature length
in ltree.  Previously, the signature length for gist_ltree_ops was 8.  Because
of bug 911e702077 the default signature length for gist_ltree_ops became 28 for
ltree 1.1 (where options method is NOT provided) and 8 for ltree 1.2 (where
options method is provided).  This commit changes the default signature length
for ltree 1.1 to 8.

Existing gist_ltree_ops indexes might be corrupted in various scenarios.
Thus, we have to recommend reindexing all the gist_ltree_ops indexes after
the upgrade.

Reported-by: Victor Yegorov
Reviewed-by: Tomas Vondra, Tom Lane, Andres Freund, Nikita Glukhov
Reviewed-by: Andrew Dunstan
Author: Tomas Vondra, Alexander Korotkov
Discussion: https://postgr.es/m/17406-71e02820ae79bb40%40postgresql.org
Discussion: https://postgr.es/m/d80e0a55-6c3e-5b26-53e3-3c4f973f737c%40enterprisedb.com
2022-03-16 11:41:30 +03:00
Thomas Munro 26e0079398 Fix race between DROP TABLESPACE and checkpointing.
Commands like ALTER TABLE SET TABLESPACE may leave files for the next
checkpoint to clean up.  If such files are not removed by the time DROP
TABLESPACE is called, we request a checkpoint so that they are deleted.
However, there is presently a window before checkpoint start where new
unlink requests won't be scheduled until the following checkpoint.  This
means that the checkpoint forced by DROP TABLESPACE might not remove the
files we expect it to remove, and the following ERROR will be emitted:

	ERROR:  tablespace "mytblspc" is not empty

To fix, add a call to AbsorbSyncRequests() just before advancing the
unlink cycle counter.  This ensures that any unlink requests forwarded
prior to checkpoint start (i.e., when ckpt_started is incremented) will
be processed by the current checkpoint.  Since AbsorbSyncRequests()
performs memory allocations, it cannot be called within a critical
section, so we also need to move SyncPreCheckpoint() to before
CreateCheckPoint()'s critical section.

This is an old bug, so back-patch to all supported versions.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13
2022-03-16 17:20:50 +13:00
Michael Paquier dc5b3bdae3 pageinspect: Fix memory context allocation of page in brin_revmap_data()
This caused the function to fail, as the aligned copy of the raw page
given by the function caller was not saved in the correct memory
context, which needs to be multi_call_memory_ctx in this case.

Issue introduced by 076f4d9.

Per buildfarm members sifika, mylodon and longfin.  I have reproduced
that locally with macos.

Discussion: https://postgr.es/m/YjFPOtfCW6yLXUeM@paquier.xyz
Backpatch-through: 10
2022-03-16 12:29:47 +09:00
Thomas Munro ea70f69450 Fix documentation typo in commit 5e6368b4.
Back-patch to 14.
2022-03-16 15:56:41 +13:00
Thomas Munro 1396b5c6ed Fix waiting in RegisterSyncRequest().
If we run out of space in the checkpointer sync request queue (which is
hopefully rare on real systems, but common with very small buffer pool),
we wait for it to drain.  While waiting, we should report that as a wait
event so that users know what is going on, and also handle postmaster
death, since otherwise the loop might never terminate if the
checkpointer has exited.

Back-patch to 12.  Although the problem exists in earlier releases too,
the code is structured differently before 12 so I haven't gone any
further for now, in the absence of field complaints.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de
2022-03-16 15:35:42 +13:00
Michael Paquier b61e6214cf pageinspect: Fix handling of page sizes and AM types
This commit fixes a set of issues related to the use of the SQL
functions in this module when the caller is able to pass down raw page
data as input argument:
- The page size check was fuzzy in a couple of places, sometimes
looking after only a sub-range, but what we are looking for is an exact
match on BLCKSZ.  After considering a few options here, I have settled
down to do a generalization of get_page_from_raw().  Most of the SQL
functions already used that, and this is not strictly required if not
accessing an 8-byte-wide value from a raw page, but this feels safer in
the long run for alignment-picky environment, particularly if a code
path begins to access such values.  This also reduces the number of
strings that need to be translated.
- The BRIN function brin_page_items() uses a Relation but it did not
check the access method of the opened index, potentially leading to
crashes.  All the other functions in need of a Relation already did
that.
- Some code paths could fail on elog(), but we should to use ereport()
for failures that can be triggered by the user.

Tests are added to stress all the cases that are fixed as of this
commit, with some junk raw pages (\set VERBOSITY ensures that this works
across all page sizes) and unexpected index types when functions open
relations.

Author: Michael Paquier, Justin Prysby
Discussion: https://postgr.es/m/20220218030020.GA1137@telsasoft.com
Backpatch-through: 10
2022-03-16 11:20:47 +09:00
Thomas Munro 78c0f85e43 Wake up for latches in CheckpointWriteDelay().
The checkpointer shouldn't ignore its latch.  Other backends may be
waiting for it to drain the request queue.  Hopefully real systems don't
have a full queue often, but the condition is reached easily when
shared_buffers is small.

This involves defining a new wait event, which will appear in the
pg_stat_activity view often due to spread checkpoints.

Back-patch only to 14.  Even though the problem exists in earlier
branches too, it's hard to hit there.  In 14 we stopped using signal
handlers for latches on Linux, *BSD and macOS, which were previously
hiding this problem by interrupting the sleep (though not reliably, as
the signal could arrive before the sleep begins; precisely the problem
latches address).

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de
2022-03-16 13:57:07 +13:00
Thomas Munro d9f7ad54e5 Back-patch LLVM 14 API changes.
Since LLVM 14 has stopped changing and is about to be released,
back-patch the following changes from the master branch:

  e6a7600202
  807fee1a39
  a56e7b6601

Back-patch to 11, where LLVM JIT support came in.
2022-03-16 11:42:00 +13:00
Tom Lane 8dcd1c3564 Restore the previous semantics of get_constraint_index().
Commit 8b069ef5d changed this function to look at pg_constraint.conindid
rather than searching pg_depend.  That was a good performance improvement,
but it failed to preserve the exact semantics.  The old code would only
return an index that was "owned by" (internally dependent on) the
specified constraint, whereas the new code will also return indexes that
are just referenced by foreign key constraints.  This confuses ALTER
TABLE, which was implicitly expecting the previous semantics, into
failing with errors like
    ERROR:  relation 146621 has multiple clustered indexes
or
    ERROR:  "pk_attbl" is not an index for table "atref"

We can fix this without reverting the performance improvement by adding
a contype check in get_constraint_index().  Another way could be to
make ALTER TABLE check it, but I'm worried that extension code could
also have subtle dependencies on the old semantics.

Tom Lane and Japin Li, per bug #17409 from Holly Roberts.
Back-patch to v14 where the error crept in.

Discussion: https://postgr.es/m/17409-52871dda8b5741cb@postgresql.org
2022-03-11 13:47:26 -05:00
Michael Paquier 8f09157287 doc: Add ALTER/DROP ROUTINE to the event trigger matrix
ALTER ROUTINE triggers the events ddl_command_start and ddl_command_end,
and DROP ROUTINE triggers sql_drop, ddl_command_start and
ddl_command_end, but this was not mention on the matrix table.

Reported-by: Leslie Lemaire
Discussion: https://postgr.es/m/164647533363.646.5802968483136493025@wrigleys.postgresql.org
Backpatch-through: 11
2022-03-09 14:59:18 +09:00
Noah Misch f60bb3e0a9 Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts.
Slow hosts may avoid load-induced, spurious failures by setting
environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds
greater than 180.  Developers may see faster failures by setting that
environment variable to some lesser number of seconds.  In tests, write
$PostgreSQL::Test::Utils::timeout_default wherever the convention has
been to write 180.  This change raises the default for some briefer
timeouts.  Back-patch to v10 (all supported versions).

Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com
2022-03-04 18:53:17 -08:00
Tom Lane a008c03dd7 Fix pg_regress to print the correct postmaster address on Windows.
pg_regress reported "Unix socket" as the default location whenever
HAVE_UNIX_SOCKETS is defined.  However, that's not been accurate
on Windows since 8f3ec75de.  Update this logic to match what libpq
actually does now.

This is just cosmetic, but still it's potentially misleading.
Back-patch to v13 where 8f3ec75de came in.

Discussion: https://postgr.es/m/3894060.1646415641@sss.pgh.pa.us
2022-03-04 13:23:58 -05:00
Tom Lane 5c9d17e94c Fix bogus casting in BlockIdGetBlockNumber().
This macro cast the result to BlockNumber after shifting, not before,
which is the wrong thing.  Per the C spec, the uint16 fields would
promote to int not unsigned int, so that (for 32-bit int) the shift
potentially shifts a nonzero bit into the sign position.  I doubt
there are any production systems where this would actually end with
the wrong answer, but it is undefined behavior per the C spec, and
clang's -fsanitize=undefined option reputedly warns about it on some
platforms.  (I can't reproduce that right now, but the code is
undeniably wrong per spec.)  It's easy to fix by casting to
BlockNumber (uint32) in the proper places.

It's been wrong for ages, so back-patch to all supported branches.

Report and patch by Zhihong Yu (cosmetic tweaking by me)

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 19:03:35 -05:00
Tom Lane b0bc196e52 Clean up assorted failures under clang's -fsanitize=undefined checks.
Most of these are cases where we could call memcpy() or other libc
functions with a NULL pointer and a zero count, which is forbidden
by POSIX even though every production version of libc allows it.
We've fixed such things before in a piecemeal way, but apparently
never made an effort to try to get them all.  I don't claim that
this patch does so either, but it gets every failure I observe in
check-world, using clang 12.0.1 on current RHEL8.

numeric.c has a different issue that the sanitizer doesn't like:
"ln(-1.0)" will compute log10(0) and then try to assign the
resulting -Inf to an integer variable.  We don't actually use the
result in such a case, so there's no live bug.

Back-patch to all supported branches, with the idea that we might
start running a buildfarm member that tests this case.  This includes
back-patching c1132aae3 (Check the size in COPY_POINTER_FIELD),
which previously silenced some of these issues in copyfuncs.c.

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 18:13:24 -05:00
Tom Lane 2a1f84636d Allow root-owned SSL private keys in libpq, not only the backend.
This change makes libpq apply the same private-key-file ownership
and permissions checks that we have used in the backend since commit
9a83564c5.  Namely, that the private key can be owned by either the
current user or root (with different file permissions allowed in the
two cases).  This allows system-wide management of key files, which
is just as sensible on the client side as the server, particularly
when the client is itself some application daemon.

Sync the comments about this between libpq and the backend, too.

Back-patch of a59c79564 and 50f03473e into all supported branches.

David Steele

Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net
2022-03-02 11:57:02 -05:00
Tom Lane ac910bb232 Disallow execution of SPI functions during plperl function compilation.
Perl can be convinced to execute user-defined code during compilation
of a plperl function (or at least a plperlu function).  That's not
such a big problem as long as the activity is confined within the
Perl interpreter, and it's not clear we could do anything about that
anyway.  However, if such code tries to use plperl's SPI functions,
we have a bigger problem.  In the first place, those functions are
likely to crash because current_call_data->prodesc isn't set up yet.
In the second place, because it isn't set up, we lack critical info
such as whether the function is supposed to be read-only.  And in
the third place, this path allows code execution during function
validation, which is strongly discouraged because of the potential
for security exploits.  Hence, reject execution of the SPI functions
until compilation is finished.

While here, add check_spi_usage_allowed() calls to various functions
that hadn't gotten the memo about checking that.  I think that perhaps
plperl_sv_to_literal may have been intentionally omitted on the grounds
that it was safe at the time; but if so, the addition of transforms
functionality changed that.  The others are more recently added and
seem to be flat-out oversights.

Per report from Mark Murawski.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/9acdf918-7fff-4f40-f750-2ffa84f083d2@intellasoft.net
2022-02-25 17:40:43 -05:00
Andres Freund 9ff7fd9063 pg_waldump: Fix error message for WAL files smaller than XLOG_BLCKSZ.
When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while
determining the wal_segment_size, pg_waldump checked errno, despite errno not
being set by the short read. Resulting in a bogus error message.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com
Backpatch: 11-, the bug was introducedin fc49e24fa
2022-02-25 10:31:16 -08:00
Heikki Linnakangas 7d80e93fb1 Fix data loss on crash after sorted GiST index build.
If a checkpoint happens during the index build, and the system crashes
after the checkpoint and the index build have finished, the data written
to the index before the checkpoint started could be lost. The checkpoint
won't have fsync'd it, and it won't be replayed at crash recovery either.
Fix by calling smgrimmedsync() after the index build, just like in B-tree
index build.

Backpatch to v14 where the sorted GiST index build was introduced.

Reported-by: Melanie Plageman
Discussion: https://www.postgresql.org/message-id/CAAKRu_ZJJynimxKj5xYBSziL62-iEtPE+fx-B=JzR=jUtP92mw@mail.gmail.com
2022-02-24 14:34:09 +02:00
Tom Lane dd7c059791 Re-allow underscore as first character of custom GUC names.
Commit 3db826bd5 intended that valid_custom_variable_name's
rules for valid identifiers match those of scan.l.  However,
I (tgl) had some kind of brain fade and put "_" in the wrong
list.

Fix by Japin Li, per bug #17415 from Daniel Polski.

Discussion: https://postgr.es/m/17415-ebdb683d7e09a51c@postgresql.org
2022-02-23 11:10:46 -05:00
Michael Paquier 627c79a1e8 Add compute_query_id = regress
"regress" is a new mode added to compute_query_id aimed at facilitating
regression testing when a module computing query IDs is loaded into the
backend, like pg_stat_statements.  It works the same way as "auto",
meaning that query IDs are computed if a module enables it, except that
query IDs are hidden in EXPLAIN outputs to ensure regression output
stability.

Like any GUCs of the kind (force_parallel_mode, etc.), this new
configuration can be added to an instance's postgresql.conf, or just
passed down with PGOPTIONS at command level.  compute_query_id uses an
enum for its set of option values, meaning that this addition ensures
ABI compatibility.

Using this new configuration mode allows installcheck-world to pass when
running the tests on an instance with pg_stat_statements enabled,
stabilizing the test output while checking the paths doing query ID
computations.

Reported-by: Anton Melnikov
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/1634283396.372373993@f75.i.mail.ru
Discussion: https://postgr.es/m/YgHlxgc/OimuPYhH@paquier.xyz
Backpatch-through: 14
2022-02-22 10:23:49 +09:00
Andres Freund 7bbfe59941 Fix temporary object cleanup failing due to toast access without snapshot.
When cleaning up temporary objects during process exit the cleanup could fail
with:
  FATAL: cannot fetch toast data without an active snapshot

The bug is caused by RemoveTempRelationsCallback() not setting up a
snapshot. If an object with toasted catalog data needs to be cleaned up,
init_toast_snapshot() could fail with the above error.

Most of the time however the the problem is masked due to cached catalog
snapshots being returned by GetOldestSnapshot(). But dropping an object can
cause catalog invalidations to be emitted. If no further catalog accesses are
necessary between the invalidation processing and the next toast datum
deletion, the bug becomes visible.

It's easy to miss this bug because it typically happens after clients
disconnect and the FATAL error just ends up in the log.

Luckily temporary table cleanup at the next use of the same temporary schema
or during DISCARD ALL does not have the same problem.

Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add
isolation tests for temporary object cleanup, including objects with toasted
catalog data.

A future HEAD only commit will add more assertions.

Reported-By: Miles Delahunty
Author: Andres Freund
Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com
Backpatch: 10-
2022-02-21 09:57:05 -08:00
Andrew Dunstan 8b5cd373ba
Remove most msys special processing in TAP tests
Following migration of Windows buildfarm members running TAP tests to
use of ucrt64 perl for those tests, special processing for msys perl is
no longer necessary and so is removed.

Backpatch to release 10

Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net
2022-02-20 11:48:45 -05:00
Andrew Dunstan 652ff988fb
Remove PostgreSQL::Test::Utils::perl2host completely
Commit f1ac4a74de disabled this processing, and as nothing has broken (as
expected) here we proceed to remove the routine and adjust all the call
sites.

Backpatch to release 10

Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net
Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de
2022-02-20 08:55:06 -05:00
Tom Lane 2e30d77a19 Suppress warning about stack_base_ptr with late-model GCC.
GCC 12 complains that set_stack_base is storing the address of
a local variable in a long-lived pointer.  This is an entirely
reasonable warning (indeed, it just helped us find a bug);
but that behavior is intentional here.  We can work around it
by using __builtin_frame_address(0) instead of a specific local
variable; that produces an address a dozen or so bytes different,
in my testing, but we don't care about such a small difference.
Maybe someday a compiler lacking that function will start to issue
a similar warning, but we'll worry about that when it happens.

Patch by me, per a suggestion from Andres Freund.  Back-patch to
v12, which is as far back as the patch will go without some pain.
(Recently-established project policy would permit a back-patch as
far as 9.2, but I'm disinclined to expend the work until GCC 12
is much more widespread.)

Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us
2022-02-17 22:45:34 -05:00
Etsuro Fujita a9e186da81 Doc: Update documentation for modifying postgres_fdw foreign tables.
Document that they can be modified using COPY as well.

Back-patch to v11 where commit 3d956d956 added support for COPY in
postgres_fdw.
2022-02-16 15:15:02 +09:00
Amit Kapila 04645bbcae WAL log unchanged toasted replica identity key attributes.
Currently, during UPDATE, the unchanged replica identity key attributes
are not logged separately because they are getting logged as part of the
new tuple. But if they are stored externally then the untoasted values are
not getting logged as part of the new tuple and logical replication won't
be able to replicate such UPDATEs. So we need to log such attributes as
part of the old_key_tuple during UPDATE.

Reported-by: Haiying Tang
Author: Dilip Kumar and Amit Kapila
Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
2022-02-14 08:07:46 +05:30
Alexander Korotkov c76665edce Fix memory leak in IndexScan node with reordering
Fix ExecReScanIndexScan() to free the referenced tuples while emptying the
priority queue.  Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com
Author: Aliaksandr Kalenik
Reviewed-by: Tom Lane, Alexander Korotkov
Backpatch-through: 10
2022-02-14 03:32:31 +03:00
Tom Lane ae27b1acc4 Fix thinko in PQisBusy().
In commit 1f39a1c06 I made PQisBusy consider conn->write_failed, but
that is now looking like complete brain fade.  In the first place, the
logic is quite wrong: it ought to be like "and not" rather than "or".
This meant that once we'd gotten into a write_failed state, PQisBusy
would always return true, probably causing the calling application to
iterate its loop until PQconsumeInput returns a hard failure thanks
to connection loss.  That's not what we want: the intended behavior
is to return an error PGresult, which the application probably has
much cleaner support for.

But in the second place, checking write_failed here seems like the
wrong thing anyway.  The idea of the write_failed mechanism is to
postpone handling of a write failure until we've read all we can from
the server; so that flag should not interfere with input-processing
behavior.  (Compare 7247e243a.)  What we *should* check for is
status = CONNECTION_BAD, ie, socket already closed.  (Most places that
close the socket don't touch asyncStatus, but they do reset status.)
This primarily ensures that if PQisBusy() returns true then there is
an open socket, which is assumed by several call sites in our own
code, and probably other applications too.

While at it, fix a nearby thinko in libpq's my_sock_write: we should
only consult errno for res < 0, not res == 0.  This is harmless since
pqsecure_raw_write would force errno to zero in such a case, but it
still could confuse readers.

Noted by Andres Freund.  Backpatch to v12 where 1f39a1c06 came in.

Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de
2022-02-12 13:23:20 -05:00
Tom Lane 277e744ae1 Don't use_physical_tlist for an IOS with non-returnable columns.
createplan.c tries to save a runtime projection step by specifying
a scan plan node's output as being exactly the table's columns, or
index's columns in the case of an index-only scan, if there is not a
reason to do otherwise.  This logic did not previously pay attention
to whether an index's columns are returnable.  That worked, sort of
accidentally, until commit 9a3ddeb51 taught setrefs.c to reject plans
that try to read a non-returnable column.  I have no desire to loosen
setrefs.c's new check, so instead adjust use_physical_tlist() to not
try to optimize this way when there are non-returnable column(s).

Per report from Ryan Kelly.  Like the previous patch, back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com
2022-02-11 15:23:52 -05:00
Tom Lane 1e8c5cf7c6 Make pg_ctl stop/restart/promote recheck postmaster aliveness.
"pg_ctl stop/restart" checked that the postmaster PID is valid just
once, as a side-effect of sending the stop signal, and then would
wait-till-timeout for the postmaster.pid file to go away.  This
neglects the case wherein the postmaster dies uncleanly after we
signal it.  Similarly, once "pg_ctl promote" has sent the signal,
it'd wait for the corresponding on-disk state change to occur
even if the postmaster dies.

I'm not sure how we've managed not to notice this problem, but it
seems to explain slow execution of the 017_shm.pl test script on AIX
since commit 4fdbf9af5, which added a speculative "pg_ctl stop" with
the idea of making real sure that the postmaster isn't there.  In the
test steps that kill-9 and then restart the postmaster, it's possible
to get past the initial signal attempt before kill() stops working
for the doomed postmaster.  If that happens, pg_ctl waited till
PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members
have that set very high.

To fix, include a "kill(pid, 0)" test (similar to what
postmaster_is_alive uses) in these wait loops, so that we'll
give up immediately if the postmaster PID disappears.

While here, I chose to refactor those loops out of where they were.
do_stop() and do_restart() can perfectly well share one copy of the
wait-for-stop loop, and it seems desirable to put a similar function
beside that for wait-for-promote.

Back-patch to all supported versions, since pg_ctl's wait logic
is substantially identical in all, and we're seeing the slow test
behavior in all branches.

Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com
2022-02-10 16:49:39 -05:00