Commit Graph

4220 Commits

Author SHA1 Message Date
Tom Lane f6e1ee3cfa Ensure we have a snapshot while dropping ON COMMIT DROP temp tables.
Dropping a temp table could entail TOAST table access to clean out
toasted catalog entries, such as large pg_constraint.conbin strings
for complex CHECK constraints.  If we did that via ON COMMIT DROP,
we triggered the assertion in init_toast_snapshot(), because
there was no provision for setting up a snapshot for the drop
actions.  Fix that.

(I assume here that the adjacent truncation actions for ON COMMIT
DELETE ROWS don't have a similar problem: it doesn't seem like
nontransactional truncations would need to touch any toasted fields.
If that proves wrong, we could refactor a bit to have the same
snapshot acquisition cover that too.)

The test case added here does not fail before v15, because that
assertion was added in 277692220 which was not back-patched.
However, the race condition the assertion warns of surely
exists further back, so back-patch to all supported branches.

Per report from Richard Guo.

Discussion: https://postgr.es/m/CAMbWs4-x26=_QxxgdJyNbiCDzvtr2WV5ZDso_v-CukKEe6cBZw@mail.gmail.com
2023-10-16 14:06:11 -04:00
Tom Lane a715c02122 In COPY FROM, fail cleanly when unsupported encoding conversion is needed.
In recent releases, such cases fail with "cache lookup failed for
function 0" rather than complaining that the conversion function
doesn't exist as prior versions did.  Seems to be a consequence of
sloppy refactoring in commit f82de5c46.  Add the missing error check.

Per report from Pierre Fortin.  Back-patch to v14 where the
oversight crept in.

Discussion: https://postgr.es/m/20230929163739.3bea46e5.pfortin@pfortin.com
2023-10-01 12:09:26 -04:00
Heikki Linnakangas 2d4f99bc69 Fix briefly showing old progress stats for ANALYZE on inherited tables.
ANALYZE on a table with inheritance children analyzes all the child
tables in a loop. When stepping to next child table, it updated the
child rel ID value in the command progress stats, but did not reset
the 'sample_blks_total' and 'sample_blks_scanned' counters.
acquire_sample_rows() updates 'sample_blks_total' as soon as the scan
starts and 'sample_blks_scanned' after processing the first block, but
until then, pg_stat_progress_analyze would display a bogus combination
of the new child table relid with old counter values from the
previously processed child table. Fix by resetting 'sample_blks_total'
and 'sample_blks_scanned' to zero at the same time that
'current_child_table_relid' is updated.

Backpatch to v13, where pg_stat_progress_analyze view was introduced.

Reported-by: Justin Pryzby
Discussion: https://www.postgresql.org/message-id/20230122162345.GP13860%40telsasoft.com
2023-09-30 17:07:37 +03:00
Noah Misch d4648a74be Reject substituting extension schemas or owners matching ["$'\].
Substituting such values in extension scripts facilitated SQL injection
when @extowner@, @extschema@, or @extschema:...@ appeared inside a
quoting construct (dollar quoting, '', or "").  No bundled extension was
vulnerable.  Vulnerable uses do appear in a documentation example and in
non-bundled extensions.  Hence, the attack prerequisite was an
administrator having installed files of a vulnerable, trusted,
non-bundled extension.  Subject to that prerequisite, this enabled an
attacker having database-level CREATE privilege to execute arbitrary
code as the bootstrap superuser.  By blocking this attack in the core
server, there's no need to modify individual extensions.  Back-patch to
v11 (all supported versions).

Reported by Micah Gate, Valerie Woolard, Tim Carey-Smith, and Christoph
Berg.

Security: CVE-2023-39417
2023-08-07 06:06:00 -07:00
Michael Paquier 954cc2139c Fix updates of indisvalid for partitioned indexes
indisvalid is switched to true for partitioned indexes when all its
partitions have valid indexes when attaching a new partition, up to the
top-most parent if all its leaves are themselves valid when dealing with
multiple layers of partitions.

The copy of the tuple from pg_index used to switch indisvalid to true
came from the relation cache, which is incorrect.  Particularly, in the
case reported by Shruthi Gowda, executing a series of commands in a
single transaction would cause the validation of partitioned indexes to
use an incorrect version of a pg_index tuple, as indexes are reloaded
after an invalidation request with RelationReloadIndexInfo(), a much
faster version than a full index cache rebuild.  In this case, the
limited information updated in the cache leads to an incorrect version
of the tuple used.  One of the symptoms reported was the following
error, with a replica identity update, for instance:
"ERROR: attempted to update invisible tuple"

This is incorrect since 8b08f7d, so backpatch all the way down.

Reported-by: Shruthi Gowda
Author: Michael Paquier
Reviewed-by: Shruthi Gowda, Dilip Kumar
Discussion: https://postgr.es/m/CAASxf_PBcxax0wW-3gErUyftZ0XrCs3Lrpuhq4-Z3Fak1DoW7Q@mail.gmail.com
Backpatch-through: 11
2023-07-14 10:13:17 +09:00
Andres Freund d11efe8303 Handle DROP DATABASE getting interrupted
Until now, when DROP DATABASE got interrupted in the wrong moment, the removal
of the pg_database row would also roll back, even though some irreversible
steps have already been taken. E.g. DropDatabaseBuffers() might have thrown
out dirty buffers, or files could have been unlinked. But we continued to
allow connections to such a corrupted database.

To fix this, mark databases invalid with an in-place update, just before
starting to perform irreversible steps. As we can't add a new column in the
back branches, we use pg_database.datconnlimit = -2 for this purpose.

An invalid database cannot be connected to anymore, but can still be
dropped.

Unfortunately we can't easily add output to psql's \l to indicate that some
database is invalid, it doesn't fit in any of the existing columns.

Add tests verifying that a interrupted DROP DATABASE is handled correctly in
the backend and in various tools.

Reported-by: Evgeny Morozov <postgresql3@realityexists.net>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de
Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de
Backpatch: 11-, bug present in all supported versions
2023-07-13 13:03:33 -07:00
Andres Freund e246fd4236 Release lock after encountering bogs row in vac_truncate_clog()
When vac_truncate_clog() encounters bogus datfrozenxid / datminmxid values, it
returns early. Unfortunately, until now, it did not release
WrapLimitsVacuumLock. If the backend later tries to acquire
WrapLimitsVacuumLock, the session / autovacuum worker hangs in an
uncancellable way. Similarly, other sessions will hang waiting for the
lock. However, if the backend holding the lock exited or errored out for some
reason, the lock was released.

The bug was introduced as a side effect of 566372b3d6.

It is interesting that there are no production reports of this problem. That
is likely due to a mix of bugs leading to bogus values having gotten less
common, process exit releasing locks and instances of hangs being hard to
debug for "normal" users.

Discussion: https://postgr.es/m/20230621221208.vhsqgduwfpzwxnpg@awork3.anarazel.de
2023-07-13 13:03:33 -07:00
Michael Paquier 235e716bc2 Fix ALTER EXTENSION SET SCHEMA with objects outside an extension's schema
As coded, the code would use as a base comparison the namespace OID from
the first object scanned in pg_depend when switching its namespace
dependency entry to the new one, and use it as a base of comparison for
any follow-up checks.  It would also be used as the old namespace OID to
switch *from* for the extension's pg_depend entry.  Hence, if the first
object scanned has a namespace different than the one stored in the
extension, we would finish by:
- Not checking that the extension objects map with the extension's
schema.
- Not switching the extension -> namespace dependency entry to the new
namespace provided by the user, making ALTER EXTENSION ineffective.

This issue exists since this command has been introduced in d9572c4 for
relocatable extension, so backpatch all the way down to 11.  The test
case has been provided by Heikki, that I have tweaked a bit to show the
effects on pg_depend for the extension.

Reported-by: Heikki Linnakangas
Author: Michael Paquier, Heikki Linnakangas
Discussion: https://postgr.es/m/20eea594-a05b-4c31-491b-007b6fceef28@iki.fi
Backpatch-through: 11
2023-07-10 09:40:15 +09:00
Tomas Vondra 260dbf19a5 Fix oversight in handling of modifiedCols since f24523672d
Commit f24523672d fixed a memory leak by moving the modifiedCols bitmap
into the per-row memory context. In the case of AFTER UPDATE triggers,
the bitmap is however referenced from an event kept until the end of the
query, resulting in a use-after-free bug.

Fixed by copying the bitmap into the AfterTriggerEvents memory context,
which is the one where we keep the trigger events. There's only one
place that needs to do the copy, but the memory context may not exist
yet. Doing that in a separate function seems more readable.

Report by Alexander Pyhalov, fix by me. Backpatch to 13, where the
bitmap was added to the event by commit 71d60e2aa0.

Reported-by: Alexander Pyhalov
Backpatch-through: 13
Discussion: https://postgr.es/m/acddb17c89b0d6cb940eaeda18c08bbe@postgrespro.ru
2023-07-02 22:23:04 +02:00
Michael Paquier 663b35f2df Fix marking of indisvalid for partitioned indexes at creation
The logic that introduced partitioned indexes missed a few things when
invalidating a partitioned index when these are created, still the code
is written to handle recursions:
1) If created from scratch because a mapping index could not be found,
the new index created could be itself invalid, if for example it was a
partitioned index with one of its leaves invalid.
2) A CCI was missing when indisvalid is set for a parent index, leading
to inconsistent trees when recursing across more than one level for a
partitioned index creation if an invalidation of the parent was
required.

This could lead to the creation of a partition index tree where some of
the partitioned indexes are marked as invalid, but some of the parents
are marked valid, which is not something that should happen (as
validatePartitionedIndex() defines, indisvalid is switched to true for a
partitioned index iff all its partitions are themselves valid).

This patch makes sure that indisvalid is set to false on a partitioned
index if at least one of its partition is invalid.  The flag is set to
true if *all* its partitions are valid.

The regression test added in this commit abuses of a failed concurrent
index creation, marked as invalid, that maps with an index created on
its partitioned table afterwards.

Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin
Discussion: https://postgr.es/m/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
2023-06-30 13:54:56 +09:00
Michael Paquier 6160e221d5 Ignore invalid indexes when enforcing index rules in ALTER TABLE ATTACH PARTITION
A portion of ALTER TABLE .. ATTACH PARTITION is to ensure that the
partition being attached to the partitioned table has a correct set of
indexes, so as there is a consistent index mapping between the
partitioned table and its new-to-be partition.  However, as introduced
in 8b08f7d, the current logic could choose an invalid index as a match,
which is something that can exist when dealing with more than two levels
of partitioning, like attaching a partitioned table (that has
partitions, with an index created by CREATE INDEX ON ONLY) to another
partitioned table.

A partitioned index with indisvalid set to false is equivalent to an
incomplete partition tree, meaning that an invalid partitioned index
does not have indexes defined in all its partitions.  Hence, choosing an
invalid partitioned index can create inconsistent partition index trees,
where the parent attaching to is valid, but its partition may be
invalid.

In the report from Alexander Lakhin, this showed up as an assertion
failure when validating an index.  Without assertions enabled, the
partition index tree would be actually broken, as indisvalid should
be switched to true for a partitioned index once all its partitions are
themselves valid.  With two levels of partitioning, the top partitioned
table used a valid index and was able to link to an invalid index stored
on its partition, itself a partitioned table.

I have studied a few options here (like the possibility to switch
indisvalid to false for the parent), but came down to the conclusion
that we'd better rely on a simple rule: invalid indexes had better never
be chosen, so as the partition attached uses and creates indexes that
the parent expects.  Some regression tests are added to provide some
coverage.  Note that the existing coverage is not impacted.

This is a problem since partitioned indexes exist, so backpatch all the
way down to v11.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
2023-06-28 15:57:48 +09:00
Amit Kapila 0b79042701 Fix the errhint message and docs for drop subscription failure.
The existing errhint message and docs were missing the fact that we can't
disassociate from the slot unless the subscription is disabled.

Author: Robert Sjöblom, Peter Smith
Reviewed-by: Peter Eisentraut, Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/807bdf85-61ea-88e2-5712-6d9fcd4eabff@fortnox.se
2023-06-21 10:23:09 +05:30
Noah Misch 01e8182c73 Replace last PushOverrideSearchPath() call with set_config_option().
The two methods don't cooperate, so set_config_option("search_path",
...) has been ineffective under non-empty overrideStack.  This defect
enabled an attacker having database-level CREATE privilege to execute
arbitrary code as the bootstrap superuser.  While that particular attack
requires v13+ for the trusted extension attribute, other attacks are
feasible in all supported versions.

Standardize on the combination of NewGUCNestLevel() and
set_config_option("search_path", ...).  It is newer than
PushOverrideSearchPath(), more-prevalent, and has no known
disadvantages.  The "override" mechanism remains for now, for
compatibility with out-of-tree code.  Users should update such code,
which likely suffers from the same sort of vulnerability closed here.
Back-patch to v11 (all supported versions).

Alexander Lakhin.  Reported by Alexander Lakhin.

Security: CVE-2023-2454
2023-05-08 06:14:11 -07:00
Michael Paquier d29eba1987 Fix crashes with CREATE SCHEMA AUTHORIZATION and schema elements
CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to
crashes when comparing the schema name of the query with the schemas
used in the qualification of some clauses in the elements' queries.

The origin of the problem is that the transformation routine for the
elements listed in a CREATE SCHEMA query uses as new, expected, schema
name the one listed in CreateSchemaStmt itself.  However, depending on
the query, CreateSchemaStmt.schemaname may be NULL, being computed
instead from the role specification of the query given by the
AUTHORIZATION clause, that could be either:
- A user name string, with the new schema name being set to the same
value as the role given.
- Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new
schema name computed from the security context where CREATE SCHEMA is
running.

Regression tests are added for CREATE SCHEMA with some appended elements
(some of them with schema qualifications), covering also some role
specification patterns.

While on it, this simplifies the context structure used during the
transformation of the elements listed in a CREATE SCHEMA query by
removing the fields for the role specification and the role type.  They
were not used, and for the role specification this could be confusing as
the schema name may by extracted from that at the beginning of
CreateSchemaCommand().

This issue exists for a long time, so backpatch down to all the versions
supported.

Reported-by: Song Hongyu
Author: Michael Paquier
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org
Backpatch-through: 11
2023-04-28 19:29:38 +09:00
Tom Lane 0a6aaf0116 Fix another issue with ENABLE/DISABLE TRIGGER on partitioned tables.
In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned
on cloned triggers, failing to find the clones because it thought they
were system triggers.  Other variants of ENABLE/DISABLE TRIGGER would
improperly apply a superuserness check.  Fix by adjusting the is-it-
a-system-trigger check to match reality in those branches.  (As far
as I can find, this is the only place that got it wrong.)

There's no such bug in v15/HEAD, because we revised the catalog
representation of system triggers to be what this code was expecting.
However, add the test case to these branches anyway, because this area
is visibly pretty fragile.  Also remove an obsoleted comment.

The recent v15/HEAD commit 6949b921d fixed a nearby bug.  I now see
that my commit message for that was inaccurate: the behavior of
recursing to clone triggers is older than v15, but it didn't apply
to the case in v13/v14 because in those branches parent partitioned
tables have no pg_trigger entries for foreign-key triggers.  But add
the test case from that commit to v13/v14, just to show what is
happening there.

Per bug #17886 from DzmitryH.

Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
2023-04-05 12:56:30 -04:00
Tom Lane b0b55d8b80 Reject system columns as elements of foreign keys.
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key.  Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys.  The lack of complaints about that seems
like good evidence that no one is trying to do it.  Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin.  Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
2023-03-31 11:18:49 -04:00
Tom Lane 2dcd92e97d Ensure acquire_inherited_sample_rows sets its output parameters.
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table.  This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage.  It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
2023-03-31 10:08:40 -04:00
Tom Lane 334cc4c966 Reject attempts to alter composite types used in indexes.
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not.  Teach it to detect such cases and error out.  We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.

This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure.  That seems of
much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
2023-03-27 15:04:02 -04:00
Thomas Munro 2bef57ee8b Fix fractional vacuum_cost_delay.
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.

Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12.  WaitLatch() works in
whole milliseconds.

For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death.  That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again.  (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)

Back-patch to 14, where commit 4753ef37 arrived.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
2023-03-15 14:05:27 +13:00
Tom Lane 53a53ea332 Ensure COPY TO on an RLS-enabled table copies no more than it should.
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have.  However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached.  Fix that.

Report and patch by Antonin Houska (comment adjustments and test case
by me).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/3472.1675251957@antos
2023-03-10 13:52:28 -05:00
Tom Lane 70ec756b01 Allow REPLICA IDENTITY to be set on an index that's not (yet) valid.
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail.  We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.

Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes.  I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid.  Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.

The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way.  I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.

Per bug #17756 from Sergey Belyashov.

Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
2023-01-21 13:10:29 -05:00
Tom Lane 32d5a4974c Replace RelationOpenSmgr() with RelationGetSmgr().
This is a back-patch of the v15-era commit f10f0ae42 into older
supported branches.  The idea is to design out bugs in which an
ill-timed relcache flush clears rel->rd_smgr partway through
some code sequence that wasn't expecting that.  We had another
report today of a corner case that reliably crashes v14 under
debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore
would crash once in a blue moon in the field.  We're unlikely
to get rid of all such code paths unless we adopt the more
rigorous coding rules instituted by f10f0ae42.  Therefore,
even though this is a bit invasive, it's time to back-patch.
Some comfort can be taken in the fact that f10f0ae42 has been
in v15 for 16 months without problems.

I left the RelationOpenSmgr macro present in the back branches,
even though no core code should use it anymore, in order to not break
third-party extensions in minor releases.  Such extensions might opt
to start using RelationGetSmgr instead, to reduce their code
differential between v15 and earlier branches.  This carries a hazard
of failing to compile against headers from existing minor releases.
However, once compiled the extension should work fine even with such
releases, because RelationGetSmgr is a "static inline" function so
it creates no link-time dependency.  So depending on distribution
practices, that might be an OK tradeoff.

Per report from Spyridon Dimitrios Agathos.  Original patch
by Amul Sul.

Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com
Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
2022-11-17 16:54:30 -05:00
Etsuro Fujita 1699125ca2 Correct error message for row-level triggers with transition tables on partitioned tables.
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.

This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
2022-11-04 19:15:03 +09:00
Alvaro Herrera 18865f4df9
Create FKs properly when attaching table as partition
Commit f56f8f8da6 added some code in CloneFkReferencing that's way too
lax about a Constraint node it manufactures, not initializing enough
struct members -- initially_valid in particular was forgotten.  This
causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to
be marked as not validated.  Set initially_valid true, which fixes the
bug.

While at it, make the struct initialization more complete.  Very similar
code was added in two other places by the same commit; make them all
follow the same pattern for consistency, though no bugs are apparent
there.

This bug has never been reported: I only happened to notice while
working on commit 614a406b4f.  The test case that was added there with
the improper result is repaired.

Backpatch to 12.

Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
2022-11-03 20:40:21 +01:00
Alvaro Herrera 483d26930b
Fix self-referencing foreign keys with partitioned tables
There are a number of bugs in this area.  Two of them are fixed here,
namely:
1. get_relation_idx_constraint_oid does not restrict the type of
   constraint that's returned, so with sufficient bad luck it can
   return the OID of a foreign key constraint.  This has the effect that
   a primary key in a partition can end up as a child of a foreign key,
   which makes no sense (it needs to be the child of the equivalent
   primary key.)
   Change the API contract so that only index-backed constraints are
   returned, mimicking get_constraint_index().

2. Both CloneFkReferenced and CloneFkReferencing clone a
   self-referencing foreign key, so the partition ends up with
   a duplicate foreign key.  Change the former function to ignore such
   constraints.

Add some tests to verify that things are better now.  (However, these
new tests show some additional misbehavior that will be fixed later --
namely that there's a constraint marked NOT VALID.)

Backpatch to 12, where these constraints are possible at all.

Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
2022-10-07 19:37:48 +02:00
Alvaro Herrera f1e7f25b5a
Change some errdetail() to errdetail_internal()
This prevents marking the argument string for translation for gettext,
and it also prevents the given string (which is already translated) from
being translated at runtime.

Also, mark the strings used as arguments to check_rolespec_name for
translation.

Backpatch all the way back as appropriate.  None of this is caught by
any tests (necessarily so), so I verified it manually.
2022-09-28 17:14:53 +02:00
Tom Lane 9923764614 Fix tupdesc lifespan bug with AfterTriggersTableData.storeslot.
Commit 25936fd46 adjusted things so that the "storeslot" we use
for remapping trigger tuples would have adequate lifespan, but it
neglected to consider the lifespan of the tuple descriptor that
the slot depends on.  It turns out that in at least some cases, the
tupdesc we are passing is a refcounted tupdesc, and the refcount for
the slot's reference can get assigned to a resource owner having
different lifespan than the slot does.  That leads to an error like
"tupdesc reference 0x7fdef236a1b8 is not owned by resource owner
SubTransaction".  Worse, because of a second oversight in the same
commit, we'd try to free the same tupdesc refcount again while
cleaning up after that error, leading to recursive errors and an
"ERRORDATA_STACK_SIZE exceeded" PANIC.

To fix the initial problem, let's just make a non-refcounted copy
of the tupdesc we're supposed to use.  That seems likely to guard
against additional problems, since there's no strong reason for
this code to assume that what it's given is a refcounted tupdesc;
in which case there's an independent hazard of the tupdesc having
shorter lifespan than the slot does.  (I didn't bother trying to
free said copy, since it should go away anyway when the (sub)
transaction context is cleaned up.)

The other issue can be fixed by making the code added to
AfterTriggerFreeQuery work like the rest of that function, ie be
sure that it doesn't try to free the same slot twice in the event
of recursive error cleanup.

While here, also clean up minor stylistic issues in the test case
added by 25936fd46: don't use "create or replace function", as any
name collision within the tests is likely to have ill effects
that that won't mask; and don't use function names as generic as
trigger_function1, especially if you're not going to drop them
at the end of the test stanza.

Per bug #17607 from Thomas Mc Kay.  Back-patch to v12, as the
previous fix was.

Discussion: https://postgr.es/m/17607-bd8ccc81226f7f80@postgresql.org
2022-09-25 17:10:58 -04:00
Etsuro Fujita e4514aafad Fix thinko in comment.
This comment has been wrong since its introduction in commit 0d5f05cde;
backpatch to v12 where that came in.

Discussion: https://postgr.es/m/CAPmGK14VGf-xQjGQN4o1QyAbXAaxugU5%3DqfcmTDh1iufUDnV_w%40mail.gmail.com
2022-09-22 15:55:03 +09:00
Alvaro Herrera 640c20d626
Choose FK name correctly during partition attachment
During ALTER TABLE ATTACH PARTITION, if the name of a parent's foreign
key constraint is already used on the partition, the code tries to
choose another one before the FK attributes list has been populated,
so the resulting constraint name was "<relname>__fkey" instead of
"<relname>_<attrs>_fkey".  Repair, and add a test case.

Backpatch to 12.  In 11, the code to attach a partition was not smart
enough to cope with conflicting constraint names, so the problem doesn't
exist there.

Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220901184156.738ebee5@karst
2022-09-08 13:17:02 +02:00
Etsuro Fujita 28d351c9ff Fix typo in comment. 2022-08-26 16:55:02 +09:00
Tom Lane 3bfea5cbba Fix subtly-incorrect matching of parent and child partitioned indexes.
When creating a partitioned index, DefineIndex tries to identify
any existing indexes on the partitions that match the partitioned
index, so that it can absorb those as child indexes instead of
building new ones.  Part of the matching is to compare IndexInfo
structs --- but that wasn't done quite right.  We're comparing
the IndexInfo built within DefineIndex itself to one made from
existing catalog contents by BuildIndexInfo.  Notably, while
BuildIndexInfo will run index expressions and predicates through
expression preprocessing, that has not happened to DefineIndex's
struct.  The result is failure to match and subsequent creation
of duplicate indexes.

The easiest and most bulletproof fix is to build a new IndexInfo
using BuildIndexInfo, thereby guaranteeing that the processing done
is identical.

While here, let's also extract the opfamily and collation data
from the new partitioned index, removing ad-hoc logic that
duplicated knowledge about how those are constructed.

Per report from Christophe Pettus.  Back-patch to v11 where
we invented partitioned indexes.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
2022-08-18 12:11:47 -04:00
Tom Lane 5721da7e41 In extensions, don't replace objects not belonging to the extension.
Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension.  This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership.  Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.

Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension.  (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)

For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.

Our thanks to Sven Klemm for reporting this problem.

Security: CVE-2022-2625
2022-08-08 11:12:31 -04:00
Alvaro Herrera 731d514ae5
Fix ENABLE/DISABLE TRIGGER to handle recursion correctly
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db did is
not correct, because ATPrepCmd() can't distinguish between triggers that
may be cloned and those that may not, so would wrongly try to recurse
for the latter category of triggers.

So this commit restores the code in EnableDisableTrigger() that
86f575948c had added to do the recursion, which would do it only for
triggers that may be cloned, that is, row-level triggers.  This also
changes tablecmds.c such that ATExecCmd() is able to pass the value of
ONLY flag down to EnableDisableTrigger() using its new 'recurse'
parameter.

This also fixes what seems like an oversight of 86f575948c that the
recursion to partition triggers would only occur if EnableDisableTrigger()
had actually changed the trigger.  It is more apt to recurse to inspect
partition triggers even if the parent's trigger didn't need to be
changed: only then can we be certain that all descendants share the same
state afterwards.

Backpatch all the way back to 11, like bbb927b4db.  Care is taken not
to break ABI compatibility (and that no catversion bump is needed.)

Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
2022-08-05 09:47:11 +02:00
Alvaro Herrera a3aacb7cbf
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records.  Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.

A fix for this problem was already attempted in 49d9cfc68b, but it
was reverted because of design issues.  This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency.  Tablespaces
are created as real directories, and should be deleted
by later replay.  CheckRecoveryConsistency ensures
they have disappeared.

The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING.  Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Author: Paul Guo <paulguo@gmail.com>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions)
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions)
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-07-28 08:26:05 +02:00
Alvaro Herrera 961cab0a5a
Allow "in place" tablespaces.
This is a backpatch to branches 10-14 of the following commits:

7170f2159f Allow "in place" tablespaces.
c6f2f01611 Fix pg_basebackup with in-place tablespaces.
f6f0db4d62 Fix pg_tablespace_location() with in-place tablespaces
7a7cd84893 doc: Remove mention to in-place tablespaces for pg_tablespace_location()
5344723755 Remove unnecessary Windows-specific basebackup code.

In-place tablespaces were introduced as a testing helper mechanism, but
they are going to be used for a bugfix in WAL replay to be backpatched
to all stable branches.

I (Álvaro) had to adjust some code to account for lack of
get_dirent_type() in branches prior to 14.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Michaël Paquier <michael@paquier.xyz>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
2022-07-27 07:55:13 +02:00
Tom Lane af72b08894 Invent qsort_interruptible().
Justin Pryzby reported that some scenarios could cause gathering
of extended statistics to spend many seconds in an un-cancelable
qsort() operation.  To fix, invent qsort_interruptible(), which is
just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
every so often.  This bloats the backend by a couple of kB, which
seems like a good investment.  (We considered just enabling
CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
but there are some callers for which that'd demonstrably be unsafe.
Opt-in seems like a better way.)

For now, just apply qsort_interruptible() in statistics collection.
There's probably more places where it could be useful, but we can
always change other call sites as we find problems.

Back-patch to v14.  Before that we didn't have extended stats on
expressions, so that the problem was less severe.  Also, this patch
depends on the sort_template infrastructure introduced in v14.

Tom Lane and Justin Pryzby

Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
2022-07-12 16:30:36 -04:00
Noah Misch ace9973867 CREATE INDEX: use the original userid for more ACL checks.
Commit a117cebd63 used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended.  That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.

Nathan Bossart and Noah Misch

Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
2022-06-25 09:07:44 -07:00
Heikki Linnakangas f82595ac90 Fix COPY FROM when database encoding is SQL_ASCII.
In the codepath when no encoding conversion is required, the check for
incomplete character at the end of input incorrectly used server
encoding's max character length, instead of the client's. Usually the
server and client encodings are the same when we're not performing
encoding conversion, but SQL_ASCII is an exception.

In the passing, also fix some outdated comments that still talked about
the old COPY protocol. It was removed in v14.

Per bug #17501 from Vitaly Voronov. Backpatch to v14 where this was
introduced.

Discussion: https://www.postgresql.org/message-id/17501-128b1dd039362ae6@postgresql.org
2022-05-29 23:57:16 +03:00
Alvaro Herrera 58b088a9b3
Fix DDL deparse of CREATE OPERATOR CLASS
When an implicit operator family is created, it wasn't getting reported.
Make it do so.

This has always been missing.  Backpatch to 10.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
2022-05-20 18:52:55 +02:00
Noah Misch 677a494789 In REFRESH MATERIALIZED VIEW, set user ID before running user code.
It intended to, but did not, achieve this.  Adopt the new standard of
setting user ID just after locking the relation.  Back-patch to v10 (all
supported versions).

Reviewed by Simon Riggs.  Reported by Alvaro Herrera.

Security: CVE-2022-1552
2022-05-09 08:35:12 -07:00
Noah Misch ab49ce7c34 Make relation-enumerating operations be security-restricted operations.
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects.  BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER.  An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser.  CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late.  This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).

Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.

Security: CVE-2022-1552
2022-05-09 08:35:12 -07:00
Tom Lane 08a9e7a8c7 Fix breakage in AlterFunction().
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one.  Carelessness in 7aea8e4f2.

(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)

Per report from Bryn Llewellyn.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
2022-04-19 23:03:59 -04:00
Etsuro Fujita 637afee327 Fix typo in comment. 2022-03-30 19:00:02 +09:00
Alvaro Herrera adc943b4e1
Revert "Fix replay of create database records on standby"
This reverts commit 49d9cfc68b.  The approach taken by this patch has
problems, so we'll come up with a radically different fix.

Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
2022-03-29 15:36:21 +02:00
Alvaro Herrera ffd28516e6
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the standby
would fail to recover in such a case.  However, the directories could be
legitimately missing.  Consider a sequence of WAL records as follows:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery.  If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.

Backpatch to 13, at least for now.  The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522, which appeared in 13.

A new TAP test file is added to verify the condition.  However, because
it depends on commit d6d317dbf6, it can only be added to branch
master.  I (Álvaro) manually verified that the code behaves as expected
in branch 14.  It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.

Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-03-25 13:16:21 +01:00
Tom Lane 05ccf974cd Fix risk of deadlock failure while dropping a partitioned index.
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order.  This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes.  We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.

While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.

Back-patch to v11 where partitioned indexes were added.

Jimmy Yih, Gaurab Dey, Tom Lane

Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-03-21 12:22:13 -04:00
Michael Paquier 627c79a1e8 Add compute_query_id = regress
"regress" is a new mode added to compute_query_id aimed at facilitating
regression testing when a module computing query IDs is loaded into the
backend, like pg_stat_statements.  It works the same way as "auto",
meaning that query IDs are computed if a module enables it, except that
query IDs are hidden in EXPLAIN outputs to ensure regression output
stability.

Like any GUCs of the kind (force_parallel_mode, etc.), this new
configuration can be added to an instance's postgresql.conf, or just
passed down with PGOPTIONS at command level.  compute_query_id uses an
enum for its set of option values, meaning that this addition ensures
ABI compatibility.

Using this new configuration mode allows installcheck-world to pass when
running the tests on an instance with pg_stat_statements enabled,
stabilizing the test output while checking the paths doing query ID
computations.

Reported-by: Anton Melnikov
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/1634283396.372373993@f75.i.mail.ru
Discussion: https://postgr.es/m/YgHlxgc/OimuPYhH@paquier.xyz
Backpatch-through: 14
2022-02-22 10:23:49 +09:00
Michael Paquier b30282fccf Fix incorrect memory context switch in COPY TO execution
c532d15 has split the logic of COPY commands into multiple files, one
change being to move the internals of BeginCopy() to BeginCopyTo().
Originally the code was written so as we'd switch back-and-forth between
the current execution memory context and the dedicated memory context
for the COPY command, and this refactoring has introduced an extra
switch to the current memory context from the COPY context once
BeginCopyTo() is done with the past logic coming from BeginCopy().

The code was correctly doing the analyze, rewrite and planning phases in
the COPY context, but it was not assigning "copy_file" (FILE* used when
copying to a source file) and "filename" in the COPY context, making the
COPY status data inconsistent.

Author: Bharath Rupireddy
Reviewed-by: Japin Li
Discussion: https://postgr.es/m/CALj2ACWvVa69foi9jhHFY=2BuHxAoYboyE+vXQTARwxZcJnVrQ@mail.gmail.com
Backpatch-through: 14
2022-01-29 10:23:17 +09:00
Tomas Vondra a192243c75 Check syscache result in AlterStatistics
The syscache lookup may return NULL even for valid OID, for example due
to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary.
Without it, it may fail with a segfault.

Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER
STATISTICS ... SET STATISTICS was introduced.

Backpatch-through: 13
Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
2022-01-23 03:18:02 +01:00
Tomas Vondra ea212bd95f Build inherited extended stats on partitioned tables
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.

That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-15 19:05:22 +01:00