Commit Graph

4574 Commits

Author SHA1 Message Date
Jeff Davis e39d512f3e Comment fix for 60684dd834.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/766f3799-0269-162f-ba63-4cae34a5534f@enterprisedb.com
2023-04-17 13:45:50 -07:00
Alvaro Herrera 9ce04b50e1
Revert "Catalog NOT NULL constraints" and fallout
This reverts commit e056c557ae and minor later fixes thereof.

There's a few problems in this new feature -- most notably regarding
pg_upgrade behavior, but others as well.  This new feature is not in any
way critical on its own, so instead of scrambling to fix it we revert it
and try again in early 17 with these issues in mind.

Discussion: https://postgr.es/m/3801207.1681057430@sss.pgh.pa.us
2023-04-12 19:29:21 +02:00
David Rowley 68a2a437f4 Improve ereports for VACUUM's BUFFER_USAGE_LIMIT option
There's no need to check if opt->arg is NULL since defGetString() already
does that and raises an ERROR if it is.  Let's just remove that check.

Also, combine the two remaining ERRORs into a single check.  It seems
better to give an indication about what sort of values we're looking for
rather than just to state that the value given isn't valid.  Make
BUFFER_USAGE_LIMIT uppercase in this ERROR message too.  It's already
upper case in one other error message, so make that consistent.

Reported-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230411.102335.1643720544536884844.horikyota.ntt@gmail.com
2023-04-11 19:36:34 +12:00
Alvaro Herrera e056c557ae
Catalog NOT NULL constraints
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.

We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables.  We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number.  This means we don't require the
constraint names to be identical across a hierarchy.

For now, we omit them from system catalogs.  Maybe this is worth
reconsidering.  We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)

This has been very long in the making.  The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints.  However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.

In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>

Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
2023-04-07 19:59:57 +02:00
David Rowley ae78cae3be Add --buffer-usage-limit option to vacuumdb
1cbbee033 added BUFFER_USAGE_LIMIT to the VACUUM and ANALYZE commands, so
here we permit that option to be specified in vacuumdb.

In passing, adjust the documents for vacuum_buffer_usage_limit and the
BUFFER_USAGE_LIMIT VACUUM option to mention "kB" rather than "KB".  Do the
same for the ERROR message in ExecVacuum() and
check_vacuum_buffer_usage_limit().  Without that we might tell a user that
the valid minimum value is 128 KB only to reject that because we accept
only "kB" and not "KB".

Also, add a small reminder comment in vacuum.h to try to trigger the
memory of anyone adding new fields to VacuumParams that they might want to
consider if vacuumdb needs to grow a new option too.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZAzTg3iEnubscvbf@telsasoft.com
2023-04-07 12:47:10 +12:00
David Rowley 1cbbee0338 Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option
Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers.  Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.

Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command.  This defaults to 256KB, which is the same size as
the access strategy was prior to this change.  This setting also
controls how large to make the buffer access strategy for autovacuum.

Per idea by Andres Freund.

Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
2023-04-07 11:40:31 +12:00
Daniel Gustafsson 7d71d3dd08 Refresh cost-based delay params more frequently in autovacuum
Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() was called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 01:00:21 +02:00
Daniel Gustafsson a85c60a945 Separate vacuum cost variables from GUCs
Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay,
which are the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay.

Autovacuum workers needed to override these variables with their
own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic.
This led to confusing code which, in some cases, both derived and
set a new value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these GUC values more often, introduce
new, independent global variables and add a function to update them
using the GUCs and existing logic.

Per suggestion by Kyotaro Horiguchi

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 00:54:53 +02:00
Daniel Gustafsson 71a825194f Make vacuum failsafe_active globally visible
While vacuuming a table in failsafe mode, VacuumCostActive should
not be re-enabled.  This currently isn't a problem because vacuum
cost parameters are only refreshed in between vacuuming tables and
failsafe status is reset for every table.

In preparation for allowing vacuum cost parameters to be updated
more frequently, elevate LVRelState->failsafe_active to a global,
VacuumFailsafeActive, which will be checked when determining whether
or not to re-enable vacuum cost-related delays.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 00:54:08 +02:00
David Rowley b9b125b9c1 Move various prechecks from vacuum() into ExecVacuum()
vacuum() is used for both the VACUUM command and for autovacuum. There
were many prechecks being done inside vacuum() that were just not relevant
to autovacuum.  Let's move the bulk of these into ExecVacuum() so that
they're only executed when running the VACUUM command.  This removes a
small amount of overhead when autovacuum vacuums a table.

While we are at it, allocate VACUUM's BufferAccessStrategy in ExecVacuum()
and pass it into vacuum() instead of expecting vacuum() to make it if it's
not already made by the calling function.  To make this work, we need to
create the vacuum memory context slightly earlier, so we now need to pass
that down to vacuum() so that it's available for use in other memory
allocations.

Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20230405211534.4skgskbilnxqrmxg@awork3.anarazel.de
2023-04-06 15:44:52 +12:00
Andres Freund acab1b0914 Convert many uses of ReadBuffer[Extended](P_NEW) to ExtendBufferedRel()
A few places are not converted. Some because they are tackled in later
commits (e.g. hio.c, xlogutils.c), some because they are more
complicated (e.g. brin_pageops.c).  Having a few users of ReadBuffer(P_NEW) is
good anyway, to ensure the backward compat path stays working.

Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 18:57:29 -07:00
David Rowley bccd6908ca Always make a BufferAccessStrategy for ANALYZE
32fbe0239 changed things so we didn't bother allocating the
BufferAccessStrategy during VACUUM (ONLY_DATABASE_STATS); and VACUUM
(FULL), however, it forgot to consider that VACUUM (FULL, ANALYZE) is a
possible combination.  That change would have resulted in such a command
allowing ANALYZE to make full use of shared buffers, which wasn't
intended, so fix that.

Reported-by: Melanie Plageman
Discussion: https://postgr.es/m/CAAKRu_bJRKe+v_=OqwC+5sA3j5qv8rqdAwy3+yHaO3wmtfrCRg@mail.gmail.com
2023-04-06 12:37:03 +12:00
Tom Lane 16dc2703c5 Support "Right Anti Join" plan shapes.
Merge and hash joins can support antijoin with the non-nullable input
on the right, using very simple combinations of their existing logic
for right join and anti join.  This gives the planner more freedom
about how to order the join.  It's particularly useful for hash join,
since we may now have the option to hash the smaller table instead
of the larger.

Richard Guo, reviewed by Ronan Dunklau and myself

Discussion: https://postgr.es/m/CAMbWs48xh9hMzXzSy3VaPzGAz+fkxXXTUbCLohX1_L8THFRm2Q@mail.gmail.com
2023-04-05 16:59:09 -04:00
Tom Lane 4766eef317 Fix another issue with ENABLE/DISABLE TRIGGER on partitioned tables.
In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned
on cloned triggers, failing to find the clones because it thought they
were system triggers.  Other variants of ENABLE/DISABLE TRIGGER would
improperly apply a superuserness check.  Fix by adjusting the is-it-
a-system-trigger check to match reality in those branches.  (As far
as I can find, this is the only place that got it wrong.)

There's no such bug in v15/HEAD, because we revised the catalog
representation of system triggers to be what this code was expecting.
However, add the test case to these branches anyway, because this area
is visibly pretty fragile.  Also remove an obsoleted comment.

The recent v15/HEAD commit 6949b921d fixed a nearby bug.  I now see
that my commit message for that was inaccurate: the behavior of
recursing to clone triggers is older than v15, but it didn't apply
to the case in v13/v14 because in those branches parent partitioned
tables have no pg_trigger entries for foreign-key triggers.  But add
the test case from that commit to v13/v14, just to show what is
happening there.

Per bug #17886 from DzmitryH.

Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
2023-04-05 12:56:32 -04:00
Andres Freund 3f695b3117 sequences: Lock buffer before initializing page
fill_seq_fork_with_data(), used to initialize a new sequence relation, only
locked the buffer after calling PageInit(), even though PageInit() modifies
page contents.

This is unlikely to cause real-world issues, as the relation is exclusively
locked at that point, and the buffer not yet marked dirty, so other processes
should not access the buffer.

This issue looks to have been present since the introduction of sequences in
e8647c45d6.

Given the low risk, it does not seem worth backpatching the fix.

Discussion: https://postgr.es/m/20230404185501.wdkmo3k7bedlx6qk@awork3.anarazel.de
2023-04-04 16:42:52 -07:00
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored; effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
Robert Haas 482675987b Add a run_as_owner option to subscriptions.
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.

On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.

Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
2023-04-04 12:03:03 -04:00
Robert Haas 1e10d49b65 Perform logical replication actions as the table owner.
Up until now, logical replication actions have been performed as the
subscription owner, who will generally be a superuser.  Commit
cec57b1a0f documented hazards
associated with that situation, namely, that any user who owns a
table on the subscriber side could assume the privileges of the
subscription owner by attaching a trigger, expression index, or
some other kind of executable code to it. As a remedy, it suggested
not creating configurations where users who are not fully trusted
own tables on the subscriber.

Although that will work, it basically precludes using logical
replication in the way that people typically want to use it,
namely, to replicate a database from one node to another
without necessarily having any restrictions on which database
users can own tables. So, instead, change logical replication to
execute INSERT, UPDATE, DELETE, and TRUNCATE operations as the
table owner when they are replicated.

Since this involves switching the active user frequently within
a session that is authenticated as the subscription user, also
impose SECURITY_RESTRICTED_OPERATION restrictions on logical
replication code. As an exception, if the table owner can SET
ROLE to the subscription owner, these restrictions have no
security value, so don't impose them in that case.

Subscription owners are now required to have the ability to
SET ROLE to every role that owns a table that the subscription
is replicating. If they don't, replication will fail. Superusers,
who normally own subscriptions, satisfy this property by default.
Non-superusers users who own subscriptions will need to be
granted the roles that own relevant tables.

Patch by me, reviewed (but not necessarily in its entirety) by
Jelte Fennema, Jeff Davis, and Noah Misch.

Discussion: http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
2023-04-04 11:25:23 -04:00
Peter Geoghegan a349b86603 Move heaprel struct field next to index rel field.
Commit 61b313e4 added a heaprel struct member to IndexVacuumInfo, but
placed it last.  Move the heaprel struct member next to the index struct
member to improve the code's readability.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznG=TV6S9d3VA=y0vBHbXwnLs9_LLdiML=aNJuHeriwxg@mail.gmail.com
2023-04-03 11:01:11 -07:00
David Rowley 4830f10243 Disable vacuum's use of a buffer access strategy during failsafe
Traditionally, vacuum always makes use of a buffer access strategy 32
buffers in size.  This means that running vacuums tend not to cause too
many shared buffers to become dirty, however, this can cause vacuums to
run much more slowly than they otherwise could as WAL flushes will occur
more frequently due to having to flush WAL out to the LSN of the dirty
page before that page can be written to disk.

When we are performing failsafe VACUUMs (as added in 1e55e7d17), we really
want to make the vacuum work go as quickly as possible, so here we disable
the buffer access strategy when entering failsafe mode while vacuuming a
relation.

Per idea and analyis from Andres Freund.

In passing, also include some changes I had intended for 32fbe0239.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de
2023-04-03 23:05:58 +12:00
David Rowley 32fbe0239b Only make buffer strategy for vacuum when it's likely needed
VACUUM FULL and VACUUM ONLY_DATABASE_STATS will not use the vacuum
strategy ring created in vacuum(), so don't waste effort making it in
those cases.

There are other conceivable cases where the buffer strategy also won't be
used, but those are probably less common and not worth troubling over too
much.  For example VACUUM (PROCESS_MAIN false, PROCESS_TOAST false).
There are other cases too, but many of these are only discovered once
inside vacuum_rel().

Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_ZLRuzkM3zKogiZAz2hUony37yLY4aaLb8fPf8fgqs5Og@mail.gmail.com
2023-04-03 19:19:45 +12:00
David Rowley 3f476c9534 Remove some global variables from vacuum.c
Using global variables because we don't want to pass these values around
as parameters does not really seem like a great idea, so let's remove
these two global variables and adjust a few functions to accept these
values as parameters instead.

This is part of a wider patch which intends to allow the size of the
buffer access strategy that vacuum uses to be adjusted.

Author: Melanie Plageman
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/CAAKRu_b1q_07uquUtAvLqTM%3DW9nzee7QbtzHwA4XdUo7KX_Cnw%40mail.gmail.com
2023-04-03 15:07:25 +12:00
Andres Freund 61b313e47e Pass down table relation into more index relation functions
This is done in preparation for logical decoding on standby, which needs to
include whether visibility affecting WAL records are about a (user) catalog
table. Which is only known for the table, not the indexes.

It's also nice to be able to pass the heap relation to GlobalVisTestFor() in
vacuumRedirectAndPlaceholder().

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com
2023-04-01 20:18:29 -07:00
Tom Lane f0d65c0eaf Reject system columns as elements of foreign keys.
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key.  Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys.  The lack of complaints about that seems
like good evidence that no one is trying to do it.  Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin.  Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
2023-03-31 11:18:49 -04:00
Tom Lane c2d7d679c1 Ensure acquire_inherited_sample_rows sets its output parameters.
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table.  This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage.  It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
2023-03-31 10:08:40 -04:00
Robert Haas c3afe8cf5a Add new predefined role pg_create_subscription.
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink.  A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property.  In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
2023-03-30 11:37:19 -04:00
Peter Eisentraut 0d15afc875 Simplify useless 0L constants
In ancient times, these belonged to arguments or fields that were
actually of type long, but now they are not anymore, so this "L"
decoration is just confusing.  (Some other 0L and other "L" constants
remain, where they are actually associated with a long type.)
2023-03-29 08:25:12 +02:00
Amit Kapila 062a844424 Avoid syncing data twice for the 'publish_via_partition_root' option.
When there are multiple publications for a subscription and one of those
publishes via the parent table by using publish_via_partition_root and the
other one directly publishes the child table, we end up copying the same
data twice during initial synchronization. The reason for this was that we
get both the parent and child tables from the publisher and try to copy
the data for both of them.

This patch extends the function pg_get_publication_tables() to take a
publication list as its input parameter. This allows us to exclude a
partition table whose ancestor is published by the same publication list.

This problem does exist in back-branches but we decide to fix it there in
a separate commit if required. The fix for back-branches requires quite
complicated changes to fetch the required table information from the
publisher as we can't update the function pg_get_publication_tables() in
back-branches. We are not sure whether we want to deviate and complicate
the code in back-branches for this problem as there are no field reports
yet.

Author: Wang wei
Reviewed-by: Peter Smith, Jacob Champion, Kuroda Hayato, Vignesh C, Osumi Takamichi, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-03-29 10:46:58 +05:30
Jeff Davis 1671f990dd Validate ICU locales.
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-28 16:34:29 -07:00
Peter Eisentraut 90189eefc1 Save a few bytes in pg_attribute
Change the columns attndims, attstattarget, and attinhcount from int32
to int16, and reorder a bit.  This saves some space (currently 4
bytes) in pg_attribute and tuple descriptors, which translates into
small performance benefits and/or room for new columns in pg_attribute
needed by future features.

attndims and attinhcount are never realistically used with values
larger than int16.  Just to be sure, add some overflow checks.
attstattarget is currently limited explicitly to 10000.

For consistency, pg_constraint.coninhcount is also changed like
attinhcount.

Discussion: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com
2023-03-28 10:05:56 +02:00
Tom Lane a3c9d35ae1 Reject attempts to alter composite types used in indexes.
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not.  Teach it to detect such cases and error out.  We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.

This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure.  That seems of
much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
2023-03-27 15:04:15 -04:00
Daniel Gustafsson d435f15fff Add SysCacheGetAttrNotNull for guaranteed not-null attrs
When extracting an attr from a cached tuple in the syscache with
SysCacheGetAttr the isnull parameter must be checked in case the
attr cannot be NULL.  For cases when this is known beforehand, a
wrapper is introduced which perform the errorhandling internally
on behalf of the caller, invoking an elog in case of a NULL attr.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
2023-03-25 22:49:33 +01:00
Tom Lane 27f5c712b2 Fix CREATE INDEX progress reporting for multi-level partitioning.
The "partitions_total" and "partitions_done" fields were updated
as though the current level of partitioning was the only one.
In multi-level cases, not only could partitions_total change
over the course of the command, but partitions_done could go
backwards or exceed the currently-reported partitions_total.

Fix by setting partitions_total to the total number of direct
and indirect children once at command start, and then just
incrementing partitions_done at appropriate points.  Invent
a new progress monitoring function "pgstat_progress_incr_param"
to simplify doing the latter.  We can avoid adding cost for the
former when doing CREATE INDEX, because ProcessUtility already
enumerates the children and it's pretty easy to pass the count
down to DefineIndex.  In principle the same could be done in
ALTER TABLE, but that's structurally difficult; for now, just
eat the cost of an extra find_all_inheritors scan in that case.

Ilya Gladyshev and Justin Pryzby

Discussion: https://postgr.es/m/a15f904a70924ffa4ca25c3c744cff31e0e6e143.camel@gmail.com
2023-03-25 15:34:03 -04:00
Tom Lane 3c05284d83 Invent GENERIC_PLAN option for EXPLAIN.
This provides a very simple way to see the generic plan for a
parameterized query.  Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.

One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given.  That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.

Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself

Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
2023-03-24 17:07:22 -04:00
Jeff Davis 869650fa86 Support language tags in older ICU versions (53 and earlier).
By calling uloc_canonicalize() before parsing the attributes, the
existing locale attribute parsing logic works on language tags as
well.

Fix a small memory leak, too.

Discussion: http://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-21 16:12:37 -07:00
Tom Lane 72a5b1fc88 Add @extschema:name@ and no_relocate options to extensions.
@extschema:name@ extends the existing @extschema@ feature so that
we can also insert the schema name of some required extension,
thus making cross-extension references robust even if they are in
different schemas.

However, this has the same hazard as @extschema@: if the schema
name is embedded literally in an installed object, rather than being
looked up once during extension script execution, then it's no longer
safe to relocate the other extension to another schema.  To deal with
that without restricting things unnecessarily, add a "no_relocate"
option to extension control files.  This allows an extension to
specify that it cannot handle relocation of some of its required
extensions, even if in themselves those extensions are relocatable.
We detect "no_relocate" requests of dependent extensions during
ALTER EXTENSION SET SCHEMA.

Regina Obe, reviewed by Sandro Santilli and myself

Discussion: https://postgr.es/m/003001d8f4ae$402282c0$c0678840$@pcorp.us
2023-03-20 18:37:11 -04:00
Tomas Vondra 19d8e2308b Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.

A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.

Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.

Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
2023-03-20 11:02:42 +01:00
Peter Eisentraut cc1392d4aa Fix typo
Introduced in de4d456b40.

Reported-by: Erik Rijkers <er@xs4all.nl>
2023-03-17 21:40:25 +01:00
Peter Eisentraut de4d456b40 Improve several permission-related error messages.
Mainly move some detail from errmsg to errdetail, remove explicit
mention of superuser where appropriate, since that is implied in most
permission checks, and make messages more uniform.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230316234701.GA903298@nathanxps13
2023-03-17 10:33:09 +01:00
Peter Eisentraut 3b7cd8c690 Small code simplification
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230310000313.GA3992372%40nathanxps13
2023-03-16 15:33:43 +01:00
Andrew Dunstan b85e91023b Don't try to read default for a non-existent attribute
Oversight in commit 9f8377f7a2 for COPY .. DEFAULT

per report from Alexander Lakhin
2023-03-15 17:20:42 -04:00
Thomas Munro 720de00af4 Fix fractional vacuum_cost_delay.
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.

Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12.  WaitLatch() works in
whole milliseconds.

For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death.  That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again.  (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)

Back-patch to 14, where commit 4753ef37 arrived.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
2023-03-15 13:58:18 +13:00
Andrew Dunstan 9f8377f7a2 Add a DEFAULT option to COPY FROM
This allows for a string which if an input field matches causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows and not others, for which
non-default data is available.

The file_fdw extension is also modified to take allow use of this
option.

Israel Barth Rubio

Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
2023-03-13 10:01:56 -04:00
Dean Rasheed 9321c79c86 Fix concurrent update issues with MERGE.
If MERGE attempts an UPDATE or DELETE on a table with BEFORE ROW
triggers, or a cross-partition UPDATE (with or without triggers), and
a concurrent UPDATE or DELETE happens, the merge code would fail.

In some cases this would lead to a crash, while in others it would
cause the wrong merge action to be executed, or no action at all. The
immediate cause of the crash was the trigger code calling
ExecGetUpdateNewTuple() as part of the EPQ mechanism, which fails
because during a merge ri_projectNew is NULL, since merge has its own
per-action projection information, which ExecGetUpdateNewTuple() knows
nothing about.

Fix by arranging for the trigger code to exit early, returning the
TM_Result and TM_FailureData information, if a concurrent modification
is detected, allowing the merge code to do the necessary EPQ handling
in its own way. Similarly, prevent the cross-partition update code
from doing any EPQ processing for a merge, allowing the merge code to
work out what it needs to do.

This leads to a number of simplifications in nodeModifyTable.c. Most
notably, the ModifyTableContext->GetUpdateNewTuple() callback is no
longer needed, and mergeGetUpdateNewTuple() can be deleted, since
there is no longer any requirement for get-update-new-tuple during a
merge. Similarly, ModifyTableContext->cpUpdateRetrySlot is no longer
needed. Thus ExecGetUpdateNewTuple() and the retry_slot handling of
ExecCrossPartitionUpdate() can be restored to how they were in v14,
before the merge code was added, and ExecMergeMatched() no longer
needs any special-case handling for cross-partition updates.

While at it, tidy up ExecUpdateEpilogue() a bit, making it handle
recheckIndexes locally, rather than passing it in as a parameter,
ensuring that it is freed properly. This dates back to when it was
split off from ExecUpdate() to support merge.

Per bug #17809 from Alexander Lakhin, and follow-up investigation of
bug #17792, also from Alexander Lakhin.

Back-patch to v15, where MERGE was introduced, taking care to preserve
backwards-compatibility of the trigger API in v15 for any extensions
that might use it.

Discussion:
  https://postgr.es/m/17809-9e6650bef133f0fe%40postgresql.org
  https://postgr.es/m/17792-0f89452029662c36%40postgresql.org
2023-03-13 10:22:22 +00:00
Tom Lane d66bb048c3 Ensure COPY TO on an RLS-enabled table copies no more than it should.
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have.  However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached.  Fix that.

Report and patch by Antonin Houska (comment adjustments and test case
by me).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/3472.1675251957@antos
2023-03-10 13:52:44 -05:00
Peter Eisentraut 544b452a5a Disallow specifying ICU rules unless locale provider is ICU
Follow-up for 30a53b7929; this was not checked in all cases.

Reported-by: Jeff Davis <pgsql@j-davis.com>
2023-03-09 08:09:40 +01:00
Peter Eisentraut 30a53b7929 Allow tailoring of ICU locales with custom rules
This exposes the ICU facility to add custom collation rules to a
standard collation.

New options are added to CREATE COLLATION, CREATE DATABASE, createdb,
and initdb to set the rules.

Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com
2023-03-08 16:56:37 +01:00
Peter Eisentraut b1534ed99d Clean up comments
Reformat some of the comments in MergeAttributes().  A lot of code has
been added here over time, and the comments could use a bit of editing
to make the code flow read better.
2023-03-08 15:56:32 +01:00
Michael Paquier ee56048b0e Improve readability of code PROCESS_MAIN in vacuum_rel()
4211fbd has been handling PROCESS_MAIN in vacuum_rel() with an "if/else
if" structure to avoid an extra level of indentation, but this has been
found as being rather parse to read.  This commit updates the code so as
we check for PROCESS_MAIN in a single place and then handle its
subpaths, FULL or non-FULL vacuums.  Some comments are added to make
that clearer for the reader.

Reported-by: Melanie Plageman
Author: Nathan Bossart
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/20230306194009.5cn6sp3wjotd36nu@liskov
2023-03-08 09:16:44 +09:00
Michael Paquier e20b1ea157 Make get_extension_schema() available
This routine is able to retrieve the OID of the schema used with an
extension (pg_extension.extnamespace), or InvalidOid if this information
is not available.  plpgsql_check embeds a copy of this code when
performing checks on functions, as one out-of-core example.

Author: Pavel Stehule
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/CAFj8pRD+9x55hjDoi285jCcjPc8uuY_D+FLn5RpXggdz+4O2sQ@mail.gmail.com
2023-03-07 14:18:20 +09:00
Tom Lane b803b7d132 Fill EState.es_rteperminfos more systematically.
While testing a fix for bug #17823, I discovered that EvalPlanQualStart
failed to copy es_rteperminfos from the parent EState, resulting in
failure if anything in EPQ execution wanted to consult that information.

This led me to conclude that commit a61b1f748 had been too haphazard
about where to fill es_rteperminfos, and that we need to be sure that
that happens exactly where es_range_table gets filled.  So I changed the
signature of ExecInitRangeTable to help ensure that this new requirement
doesn't get missed.  (Indeed, pgoutput.c was also failing to fill it.
Maybe we don't ever need it there, but I wouldn't bet on that.)

No test case yet; one will arrive with the fix for #17823.
But that needs to be back-patched, while this fix is HEAD-only.

Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
2023-03-06 13:10:57 -05:00
Michael Paquier 4211fbd841 Add PROCESS_MAIN to VACUUM
Disabling this option is useful to run VACUUM (with or without FULL) on
only the toast table of a relation, bypassing the main relation.  This
option is enabled by default.

Running directly VACUUM on a toast table was already possible without
this feature, by using the non-deterministic name of a toast relation
(as of pg_toast.pg_toast_N, where N would be the OID of the parent
relation) in the VACUUM command, and it required a scan of pg_class to
know the name of the toast table.  So this feature is basically a
shortcut to be able to run VACUUM or VACUUM FULL on a toast relation,
using only the name of the parent relation.

A new switch called --no-process-main is added to vacuumdb, to work as
an equivalent of PROCESS_MAIN.

Regression tests are added to cover VACUUM and VACUUM FULL, looking at
pg_stat_all_tables.vacuum_count to see how many vacuums have run on
each table, main or toast.

Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
2023-03-06 16:41:05 +09:00
Tom Lane 6949b921d5 Avoid failure when altering state of partitioned foreign-key triggers.
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s).  The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs.  We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.

Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.

While here, update the documentation, which was not touched when
the semantics were changed.

Per bug #17817 from Alan Hodgson.  Back-patch to v15; older versions
do not have this behavior.

Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
2023-03-04 13:32:35 -05:00
Daniel Gustafsson 7ab1bc2939 Fix outdated references to guc.c
Commit 0a20ff54f split out the GUC variables from guc.c into a new file
guc_tables.c. This updates comments referencing guc.c regarding variables
which are now in guc_tables.c.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6B50C70C-8C1F-4F9A-A7C0-EEAFCC032406@yesql.se
2023-03-02 13:49:39 +01:00
Michael Paquier 8a8661828a Fix corruption of templates after CREATE DATABASE .. STRATEGY WAL_LOG
WAL_LOG does a scan of the template's pg_class to determine the set of
relations that need to be copied from a template database to the new
one.  However, as coded in 9c08aea, this copy strategy would load the
pages of pg_class without considering it as a permanent relation,
causing the loaded pages to never be flushed when they should.  Any
modification of the template's pg_class, mostly through DDLs, would then
be missed, causing corruptions.

STRATEGY = WAL_LOG is the default over FILE_COPY since it has been
introduced, so any changes done to pg_class on a database template would
be gone.  Updates of database templates should be a rare thing, so the
impact of this bug should be hopefully limited.  The pre-14 default
strategy FILE_COPY is safe, and can be used as a workaround.

Ryo Matsumura has found and analyzed the issue, and Nathan has written a
test able to reproduce the failure (with few tweaks from me).

Backpatch down to 15, where STRATEGY = WAL_LOG has been introduced.

Author: Nathan Bossart, Ryo Matsumura
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/TYCPR01MB6868677E499C9AD5123084B5E8A39@TYCPR01MB6868.jpnprd01.prod.outlook.com
Backpatch-through: 15
2023-02-22 10:14:52 +09:00
Tom Lane a0fa18cc0d Fix check for child column generation status matching parent.
In commit 8bf6ec3ba, I mistakenly supposed that MergeAttributes'
loop over saved_schema was reprocessing column definitions that
had already been checked earlier: there is a variant syntax for
creating a child partition in which that's not true.  So we need
to duplicate the full check appearing further up.

(Actually, I believe that the "if (restdef->identity)" part is
not reachable, because we reject identity on partitions earlier.
But it seems wise to keep the check, in case that's ever relaxed,
and to keep this code in sync with the other instance.)

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/4a8200ca-8378-653e-38ed-b2e1f1611aa6@gmail.com
2023-02-16 18:51:55 -05:00
David Rowley 5352ca22e0 Rename force_parallel_mode to debug_parallel_query
force_parallel_mode is meant to be used to allow us to exercise the
parallel query infrastructure to ensure that it's working as we expect.
It seems some users think this GUC is for forcing the query planner into
picking a parallel plan regardless of the costs.  A quick look at the
documentation would have made them realize that they were wrong, but the
GUC is likely too conveniently named which, evidently, seems to often
result in users expecting that it forces the planner into usefully
parallelizing queries.

Here we rename the GUC to something which casual users are less likely to
mistakenly think is what they need to make their query run more quickly.

For now, the old name can still be used.  We'll revisit if the old name
mapping can be removed once the buildfarm configs are all updated.

Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvrsOi92_uA7PEaHZMH-S4Xv+MGhQWA+GrP8b1kjpS1HjQ@mail.gmail.com
2023-02-15 21:21:59 +13:00
Michael Paquier ef7002dbe0 Fix various typos in code and tests
Most of these are recent, and the documentation portions are new as of
v16 so there is no need for a backpatch.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20230208155644.GM1653@telsasoft.com
2023-02-09 14:43:53 +09:00
Peter Eisentraut aa69541046 Remove useless casts to (void *) in arguments of some system functions
The affected functions are: bsearch, memcmp, memcpy, memset, memmove,
qsort, repalloc

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
2023-02-07 06:57:59 +01:00
David Rowley e9aaf06328 Remove dead NoMovementScanDirection code
Here remove some dead code from heapgettup() and heapgettup_pagemode()
which was trying to support NoMovementScanDirection scans.  This code can
never be reached as standard_ExecutorRun() never calls ExecutePlan with
NoMovementScanDirection.

Additionally, plans which were scanning an unordered index would use
NoMovementScanDirection rather than ForwardScanDirection.  There was no
real need for this, so here we adjust this so we use ForwardScanDirection
for unordered index scans.  A comment in pathnodes.h claimed that
NoMovementScanDirection was used for PathKey reasons, but if that was
true, it no longer is, per code in build_index_paths().

This does change the non-text format of the EXPLAIN output so that
unordered index scans now have a "Forward" scan direction rather than
"NoMovement".  The text format of EXPLAIN has not changed.

Author: Melanie Plageman
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
2023-02-01 10:52:41 +13:00
Tom Lane 2489d76c49 Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees.  This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.

To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle.  PlaceHolderVars
receive similar decoration for the same reason.

In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos.  This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.

This change affects FDWs that want to plan foreign joins.  They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.

Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup.  (The README additions
and comments mention some stuff that will appear in the follow-up.)

Patch by me; thanks to Richard Guo for review.

Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 13:16:20 -05:00
Peter Geoghegan 6c6b497266 Revert "Add eager and lazy freezing strategies to VACUUM."
This reverts commit 4d41799261.  Broad
concerns about regressions caused by eager freezing strategy have been
raised.  Whether or not these concerns can be worked through in any time
frame is far from certain.

Discussion: https://postgr.es/m/20230126004347.gepcmyenk2csxrri@awork3.anarazel.de
2023-01-25 22:22:27 -08:00
Michael Paquier 9d2d9728b8 Make auto_explain print the query identifier in verbose mode
When auto_explain.log_verbose is on, auto_explain should print in the
logs plans equivalent to the EXPLAIN (VERBOSE).  However, when
compute_query_id is on, query identifiers were not showing up, being
only handled by EXPLAIN (VERBOSE).  This brings auto_explain on par with
EXPLAIN regarding that.  Note that like EXPLAIN, auto_explain does not
show the query identifier when compute_query_id=regress.

The change is done so as the choice of printing the query identifier is
done in ExplainPrintPlan() rather than in ExplainOnePlan(), to avoid a
duplication of the logic dealing with the query ID.  auto_explain is the
only in-core caller of ExplainPrintPlan().

While looking at the area, I have noticed that more consolidation
between EXPLAIN and auto_explain would be in order for the logging of
the plan duration and the buffer usage.  This refactoring is left as a
future change.

Author: Atsushi Torikoshi
Reviewed-by: Justin Pryzby, Julien Rouhaud
Discussion: https://postgr.es/m/1ea21936981f161bccfce05765c03bee@oss.nttdata.com
2023-01-26 12:23:16 +09:00
Peter Geoghegan 4d41799261 Add eager and lazy freezing strategies to VACUUM.
Eager freezing strategy avoids large build-ups of all-visible pages.  It
makes VACUUM trigger page-level freezing whenever doing so will enable
the page to become all-frozen in the visibility map.  This is useful for
tables that experience continual growth, particularly strict append-only
tables such as pgbench's history table.  Eager freezing significantly
improves performance stability by spreading out the cost of freezing
over time, rather than doing most freezing during aggressive VACUUMs.
It complements the insert autovacuum mechanism added by commit b07642db.

VACUUM determines its freezing strategy based on the value of the new
vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables.
Tables that exceed the size threshold use the eager freezing strategy.
Unlogged tables and temp tables always use eager freezing strategy,
since the added cost is negligible there.  Non-permanent relations won't
incur any extra overhead in WAL written (for the obvious reason), nor in
pages dirtied (since any extra freezing will only take place on pages
whose PD_ALL_VISIBLE bit needed to be set either way).

VACUUM uses lazy freezing strategy for logged tables that fall under the
GUC size threshold.  Page-level freezing triggers based on the criteria
established in commit 1de58df4, which added basic page-level freezing.

Eager freezing is strictly more aggressive than lazy freezing.  Settings
like vacuum_freeze_min_age still get applied in just the same way in
every VACUUM, independent of the strategy in use.  The only mechanical
difference between eager and lazy freezing strategies is that only the
former applies its own additional criteria to trigger freezing pages.
Note that even lazy freezing strategy will trigger freezing whenever a
page happens to have required that an FPI be written during pruning,
provided that the page will thereby become all-frozen in the visibility
map afterwards (due to the FPI optimization from commit 1de58df4).

The vacuum_freeze_strategy_threshold default setting is 4GB.  This is a
relatively low setting that prioritizes performance stability.  It will
be reviewed at the end of the Postgres 16 beta period.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
2023-01-25 14:15:38 -08:00
Robert Haas f1358ca52d Adjust interaction of CREATEROLE with role properties.
Previously, a CREATEROLE user without SUPERUSER could not alter
REPLICATION users in any way, and could not set the BYPASSRLS
attribute. However, they could manipulate the CREATEDB property
even if they themselves did not possess it.

With this change, a CREATEROLE user without SUPERUSER can set or
clear the REPLICATION, BYPASSRLS, or CREATEDB property on a new
role or a role that they have rights to manage if and only if
that property is set for their own role.

This implements the standard idea that you can't give permissions
you don't have (but you can give the ones you do have). We might
in the future want to provide more powerful ways to constrain
what a CREATEROLE user can do - for example, to limit whether
CONNECTION LIMIT can be set or the values to which it can be set -
but that is left as future work.

Patch by me, reviewed by Nathan Bossart, Tushar Ahuja, and Neha
Sharma.

Discussion: http://postgr.es/m/CA+TgmobX=LHg_J5aT=0pi9gJy=JdtrUVGAu0zhr-i5v5nNbJDg@mail.gmail.com
2023-01-24 10:57:09 -05:00
Tom Lane 5a3a95385b Track logrep apply workers' last start times to avoid useless waits.
Enforce wal_retrieve_retry_interval on a per-subscription basis,
rather than globally, and arrange to skip that delay in case of
an intentional worker exit.  This probably makes little difference
in the field, where apply workers wouldn't be restarted often;
but it has a significant impact on the runtime of our logical
replication regression tests (even though those tests use
artificially-small wal_retrieve_retry_interval settings already).

Nathan Bossart, with mostly-cosmetic editorialization by me

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-22 14:08:46 -05:00
Tom Lane c9f7f92648 Allow REPLICA IDENTITY to be set on an index that's not (yet) valid.
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail.  We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.

Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes.  I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid.  Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.

The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way.  I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.

Per bug #17756 from Sergey Belyashov.

Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
2023-01-21 13:10:29 -05:00
David Rowley 9f1ca6ce65 Use appendStringInfoSpaces in more places
This adjusts a few places which were appending a string constant
containing spaces onto a StringInfo.  We have appendStringInfoSpaces for
that job, so let's use that instead.

For the change to jsonb.c's add_indent() function, appendStringInfoString
was being called inside a loop to append 4 spaces on each loop.  This
meant that enlargeStringInfo would get called once per loop.  Here it
should be much more efficient to get rid of the loop and just calculate
the number of spaces with "level * 4" and just append all the spaces in
one go.

Here we additionally adjust the appendStringInfoSpaces function so it
makes use of memset rather than a while loop to apply the required spaces
to the StringInfo.  One of the problems with the while loop was that it
was incrementing one variable and decrementing another variable once per
loop.  That's more work than what's required to get the job done.  We may
as well use memset for this rather than trying to optimize the existing
loop.  Some testing has shown memset is faster even for very small sizes.

Discussion: https://postgr.es/m/CAApHDvp_rKkvwudBKgBHniNRg67bzXVjyvVKfX0G2zS967K43A@mail.gmail.com
2023-01-20 13:07:24 +13:00
Tom Lane 47bb9db759 Get rid of the "new" and "old" entries in a view's rangetable.
The rule system needs "old" and/or "new" pseudo-RTEs in rule actions
that are ON INSERT/UPDATE/DELETE.  Historically it's put such entries
into the ON SELECT rules of views as well, but those are really quite
vestigial.  The only thing we've used them for is to carry the
view's relid forward to AcquireExecutorLocks (so that we can
re-lock the view to verify it hasn't changed before re-using a plan)
and to carry its relid and permissions data forward to execution-time
permissions checks.  What we can do instead of that is to retain
these fields of the RTE_RELATION RTE for the view even after we
convert it to an RTE_SUBQUERY RTE.  This requires a tiny amount of
extra complication in the planner and AcquireExecutorLocks, but on
the other hand we can get rid of the logic that moves that data from
one place to another.

The principal immediate benefit of doing this, aside from a small
saving in the pg_rewrite data for views, is that these pseudo-RTEs
no longer trigger ruleutils.c's heuristic about qualifying variable
names when the rangetable's length is more than 1.  That results
in quite a number of small simplifications in regression test outputs,
which are all to the good IMO.

Bump catversion because we need to dump a few more fields of
RTE_SUBQUERY RTEs.  While those will always be zeroes anyway in
stored rules (because we'd never populate them until query rewrite)
they are useful for debugging, and it seems like we'd better make
sure to transmit such RTEs accurately in plans sent to parallel
workers.  I don't think the executor actually examines these fields
after startup, but someday it might.

This is a second attempt at committing 1b4d280ea.  The difference
from the first time is that now we can add some filtering rules to
AdjustUpgrade.pm to allow cross-version upgrade testing to pass
despite all the cosmetic changes in CREATE VIEW outputs.

Amit Langote (filtering rules by me)

Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com
Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
2023-01-18 13:23:57 -05:00
Jeff Davis ff9618e82a Fix MAINTAIN privileges for toast tables and partitions.
Commit 60684dd8 left loose ends when it came to maintaining toast
tables or partitions.

For toast tables, simply skip the privilege check if the toast table
is an indirect target of the maintenance command, because the main
table privileges have already been checked.

For partitions, allow the maintenance command if the user has the
MAINTAIN privilege on the partition or any parent.

Also make CLUSTER emit "skipping" messages when the user doesn't have
privileges, similar to VACUUM.

Author: Nathan Bossart
Reported-by: Pavel Luzanov
Reviewed-by: Pavel Luzanov, Ted Yu
Discussion: https://postgr.es/m/20230113231339.GA2422750@nathanxps13
2023-01-14 00:16:23 -08:00
Jeff Davis d46a9792a8 Clean up useless "skipping" messages for VACUUM/ANALYZE.
When VACUUM/ANALYZE are run on an entire database, it warns of
skipping relations for which the user doesn't have sufficient
privileges. That only makes sense for tables, so skip such messages
for indexes, etc.

Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/c0a85c2e83158560314b576b6241c8ed0aea1745.camel%40j-davis.com
2023-01-13 14:42:03 -08:00
Jeff Davis c44f6334ca Simplify permissions for LOCK TABLE.
The prior behavior was confusing and hard to document. For instance,
if you had UPDATE privileges, you could lock a table in any lock mode
except ACCESS SHARE mode.

Now, if granted a privilege to lock at a given mode, one also has
privileges to lock at a less-conflicting mode. MAINTAIN, UPDATE,
DELETE, and TRUNCATE privileges allow any lock mode. INSERT privileges
allow ROW EXCLUSIVE (or below). SELECT privileges allow ACCESS SHARE.

Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/9550c76535404a83156252b25a11babb4792ea1e.camel%40j-davis.com
2023-01-13 14:33:19 -08:00
Tom Lane f0e6d6d3c9 Revert "Get rid of the "new" and "old" entries in a view's rangetable."
This reverts commit 1b4d280ea1.
It's broken the buildfarm members that run cross-version-upgrade tests,
because they're not prepared to deal with cosmetic differences between
CREATE VIEW commands emitted by older servers and HEAD.  Even if we had
a solution to that, which we don't, it'd take some time to roll it out
to the affected animals.  This improvement isn't valuable enough to
justify addressing that problem on an emergency basis, so revert it
for now.
2023-01-11 23:01:22 -05:00
Tom Lane 1b4d280ea1 Get rid of the "new" and "old" entries in a view's rangetable.
The rule system needs "old" and/or "new" pseudo-RTEs in rule actions
that are ON INSERT/UPDATE/DELETE.  Historically it's put such entries
into the ON SELECT rules of views as well, but those are really quite
vestigial.  The only thing we've used them for is to carry the
view's relid forward to AcquireExecutorLocks (so that we can
re-lock the view to verify it hasn't changed before re-using a plan)
and to carry its relid and permissions data forward to execution-time
permissions checks.  What we can do instead of that is to retain
these fields of the RTE_RELATION RTE for the view even after we
convert it to an RTE_SUBQUERY RTE.  This requires a tiny amount of
extra complication in the planner and AcquireExecutorLocks, but on
the other hand we can get rid of the logic that moves that data from
one place to another.

The principal immediate benefit of doing this, aside from a small
saving in the pg_rewrite data for views, is that these pseudo-RTEs
no longer trigger ruleutils.c's heuristic about qualifying variable
names when the rangetable's length is more than 1.  That results
in quite a number of small simplifications in regression test outputs,
which are all to the good IMO.

Bump catversion because we need to dump a few more fields of
RTE_SUBQUERY RTEs.  While those will always be zeroes anyway in
stored rules (because we'd never populate them until query rewrite)
they are useful for debugging, and it seems like we'd better make
sure to transmit such RTEs accurately in plans sent to parallel
workers.  I don't think the executor actually examines these fields
after startup, but someday it might.

Amit Langote

Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com
2023-01-11 19:41:09 -05:00
Tom Lane 8bf6ec3ba3 Improve handling of inherited GENERATED expressions.
In both partitioning and traditional inheritance, require child
columns to be GENERATED if and only if their parent(s) are.
Formerly we allowed the case of an inherited column being
GENERATED when its parent isn't, but that results in inconsistent
behavior: the column can be directly updated through an UPDATE
on the parent table, leading to it containing a user-supplied
value that might not match the generation expression.  This also
fixes an oversight that we enforced partition-key-columns-can't-
be-GENERATED against parent tables, but not against child tables
that were dynamically attached to them.

Also, remove the restriction that the child's generation expression
be equivalent to the parent's.  In the wake of commit 3f7836ff6,
there doesn't seem to be any reason that we need that restriction,
since generation expressions are always computed per-table anyway.
By removing this, we can also allow a child to merge multiple
inheritance parents with inconsistent generation expressions, by
overriding them with its own expression, much as we've long allowed
for DEFAULT expressions.

Since we're rejecting a case that we used to accept, this doesn't
seem like a back-patchable change.  Given the lack of field
complaints about the inconsistent behavior, it's likely that no
one is doing this anyway, but we won't change it in minor releases.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
2023-01-11 15:55:02 -05:00
Robert Haas e5b8a4c098 Add new GUC createrole_self_grant.
Can be set to the empty string, or to either or both of "set" or
"inherit". If set to a non-empty value, a non-superuser who creates
a role (necessarily by relying up the CREATEROLE privilege) will
grant that role back to themselves with the specified options.

This isn't a security feature, because the grant that this feature
triggers can also be performed explicitly. Instead, it's a user experience
feature. A superuser would necessarily inherit the privileges of any
created role and be able to access all such roles via SET ROLE;
with this patch, you can configure createrole_self_grant = 'set, inherit'
to provide a similar experience for a user who has CREATEROLE but not
SUPERUSER.

Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
2023-01-10 12:44:49 -05:00
Robert Haas cf5eb37c5e Restrict the privileges of CREATEROLE users.
Previously, CREATEROLE users were permitted to make nearly arbitrary
changes to roles that they didn't create, with certain exceptions,
particularly superuser roles.  Instead, allow CREATEROLE users to make such
changes to roles for which they possess ADMIN OPTION, and to
grant membership only in roles for which they possess ADMIN OPTION.

When a CREATEROLE user who is not a superuser creates a role, grant
ADMIN OPTION on the newly-created role to the creator, so that they
can administer roles they create or for which they have been given
privileges.

With these changes, CREATEROLE users still have very significant
powers that unprivileged users do not receive: they can alter, rename,
drop, comment on, change the password for, and change security labels
on roles.  However, they can now do these things only for roles for
which they possess appropriate privileges, rather than all
non-superuser roles; moreover, they cannot grant a role such as
pg_execute_server_program unless they themselves possess it.

Patch by me, reviewed by Mark Dilger.

Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
2023-01-10 12:44:30 -05:00
Amit Kapila 216a784829 Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.

In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode -  in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply to avoid deadlocks. We keep this parallel apply assigned till the
transaction commit is received and also wait for the worker to finish at
commit. This preserves commit ordering and avoid writing to and reading
from files in most cases. We still need to spill if there is no worker
available.

This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).

In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.

Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 07:52:45 +05:30
Tom Lane c6e1f62e2c Wake up a subscription's replication worker processes after DDL.
Waken related worker processes immediately at commit of a transaction
that has performed ALTER SUBSCRIPTION (including the RENAME and
OWNER variants).  This reduces the response time for such operations.
In the real world that might not be worth much, but it shaves several
seconds off the runtime for the subscription test suite.

In the case of PREPARE, we just throw away this notification state;
it doesn't seem worth the work to preserve it.  The workers will
still react after the eventual COMMIT PREPARED, but not as quickly.

Nathan Bossart

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-06 17:27:58 -05:00
Tom Lane a46a7011b2 Add options to control whether VACUUM runs vac_update_datfrozenxid.
VACUUM normally ends by running vac_update_datfrozenxid(), which
requires a scan of pg_class.  Therefore, if one attempts to vacuum a
database one table at a time --- as vacuumdb has done since v12 ---
we will spend O(N^2) time in vac_update_datfrozenxid().  That causes
serious performance problems in databases with tens of thousands of
tables, and indeed the effect is measurable with only a few hundred.
To add insult to injury, only one process can run
vac_update_datfrozenxid at the same time per DB, so this behavior
largely defeats vacuumdb's -j option.

Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS
to allow applications to postpone vac_update_datfrozenxid() until the
end of a series of VACUUM requests, and teach vacuumdb to use them.

Per bug #17717 from Gunnar L.  Sadly, this answer doesn't seem
like something we'd consider back-patching, so the performance
problem will remain in v12-v15.

Tom Lane and Nathan Bossart

Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org
2023-01-06 14:17:25 -05:00
Robert Haas 39cffe95f2 Pass down current user ID to AddRoleMems and DelRoleMems.
This is just refactoring; there should be no functonal change. It
might have the effect of slightly reducing the number of calls to
GetUserId(), but the real point is to facilitate future work in
this area.

Patch by me, reviewed by Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmobFzTLkLwOquFrAcdsWBsOWDr-_H-jw+qBvfx-wSzMwDA@mail.gmail.com
2023-01-05 14:33:35 -05:00
Robert Haas 25bb03166b Refactor permissions-checking for role grants.
Instead of having checks in AddRoleMems() and DelRoleMems(), have
the callers perform checks where it's required. In some cases it
isn't, either because the caller has already performed a check for
the same condition, or because the check couldn't possibly fail.

The "Skip permission check if nothing to do" check in each of
AddRoleMems() and DelRoleMems() is pointless. Some call sites
can't pass an empty list. Others can, but in those cases, the role
being modified is one that the current user has just created.
Therefore, they must have permission to modify it, and so no
permission check is required at all.

This patch is intended to have no user-visible consequences. It is
intended to simplify future work in this area.

Patch by me, reviewed by Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmobFzTLkLwOquFrAcdsWBsOWDr-_H-jw+qBvfx-wSzMwDA@mail.gmail.com
2023-01-05 14:30:40 -05:00
Peter Eisentraut bf03cfd162 Windows support in pg_import_system_collations
Windows can enumerate the locales that are either installed or
supported by calling EnumSystemLocalesEx(), similar to what is already
done in the READ_LOCALE_A_OUTPUT switch.  We can refactor some of the
logic already used in that switch into a new function
create_collation_from_locale().

The enumerated locales have BCP 47 shape, that is with a hyphen
between language and territory, instead of POSIX's underscore.  The
created collations will retain the BCP 47 shape, but we will also
create a POSIX alias, so xx-YY will have an xx_YY alias.

A new test collate.windows.win1252 is added that is like
collate.linux.utf8.

Author: Juan Jose Santamaria Flecha <juanjo.santamaria@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac@postgrespro.ru
2023-01-03 14:21:56 +01:00
Bruce Momjian c8e1ba736b Update copyright for 2023
Backpatch-through: 11
2023-01-02 15:00:37 -05:00
Peter Geoghegan 4ce3afb82e Refactor how VACUUM passes around its XID cutoffs.
Use a dedicated struct for the XID/MXID cutoffs used by VACUUM, such as
FreezeLimit and OldestXmin.  This state is initialized in vacuum.c, and
then passed around by code from vacuumlazy.c to heapam.c freezing
related routines.  The new convention is that everybody works off of the
same cutoff state, which is passed around via pointers to const.

Also simplify some of the logic for dealing with frozen xmin in
heap_prepare_freeze_tuple: add dedicated "xmin_already_frozen" state to
clearly distinguish xmin XIDs that we're going to freeze from those that
were already frozen from before.  That way the routine's xmin handling
code is symmetrical with the existing xmax handling code.  This is
preparation for an upcoming commit that will add page level freezing.

Also refactor the control flow within FreezeMultiXactId(), while adding
stricter sanity checks.  We now test OldestXmin directly, instead of
using FreezeLimit as an inexact proxy for OldestXmin.  This is further
preparation for the page level freezing work, which will make the
function's caller cede control of page level freezing to the function
where appropriate (where heap_prepare_freeze_tuple sees a tuple that
happens to contain a MultiXactId in its xmax).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CAH2-WznS9TxXmz2_=SY+SyJyDFbiOftKofM9=aDo68BbXNBUMA@mail.gmail.com
2022-12-22 09:37:59 -08:00
Michael Paquier 01be9d498f Fix operator typo in tablecmds.c
A bitwise operator was getting used on two bools in
ATAddCheckConstraint() to track if constraints should be merged or not
with the existing ones of a relation, though obviously this should use
a boolean OR operator.  This led to the same result, but let's be
clean.

Oversight in 074c5cf.

Author: Ranier Vilela
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAEudQAp2R2fbbi0OHHhv_n4=Ch0t1VtjObR9YMqtGKHJ+faUFQ@mail.gmail.com
2022-12-22 12:08:45 +09:00
Michael Paquier 22e3b55805 Switch some system functions to use get_call_result_type()
This shaves some code by replacing the combinations of
CreateTemplateTupleDesc()/TupleDescInitEntry() hardcoding a mapping of
the attributes listed in pg_proc.dat by get_call_result_type() to build
the TupleDesc needed for the rows generated.

get_call_result_type() is more expensive than the former style, but this
removes some duplication with the lists of OUT parameters (pg_proc.dat
and the attributes hardcoded in these code paths).  This is applied to
functions that are not considered as critical (aka that could be called
repeatedly for monitoring purposes).

Author: Bharath Rupireddy
Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com
2022-12-21 10:11:22 +09:00
Andrew Dunstan 8284cf5f74 Add copyright notices to meson files
Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
2022-12-20 07:54:39 -05:00
Jeff Davis 60684dd834 Add grantable MAINTAIN privilege and pg_maintain role.
Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER,
and LOCK TABLE.

Effectively reverts 4441fc704d. Instead of creating separate
privileges for VACUUM, ANALYZE, and other maintenance commands, group
them together under a single MAINTAIN privilege.

Author: Nathan Bossart
Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13
Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us
2022-12-13 17:33:28 -08:00
Alvaro Herrera 840ff5f451
Get rid of recursion-marker values in enum AlterTableType
During ALTER TABLE execution, when prep-time handling of subcommands of
certain types determine that execution-time handling requires recursion,
they signal this by changing the subcommand type to a special value.
This can be done in a simpler way by using a separate flag introduced by
commit ec0925c22a, so do that.

Catversion bumped.  It's not clear to me that ALTER TABLE subcommands
are stored anywhere in catalogs (CREATE FUNCTION rejects it in BEGIN
ATOMIC function bodies), but we do have both write and read support for
them, so be safe.

Discussion: https://postgr.es/m/20220929090033.zxuaezcdwh2fgfjb@alvherre.pgsql
2022-12-12 11:13:26 +01:00
Michael Paquier eae7fe4859 Remove direct call to GetNewObjectId() for pg_auth_members.oid
This routine should not be called directly as mentioned at its top, so
replace it by GetNewOidWithIndex().  Issue introduced by 6566133 when
pg_auth_members.oid got added, so no backpatch is needed.

Author: Maciek Sakrejda
Discussion: https://postgr.es/m/CAOtHd0Ckbih7Ur7XeVyLAJ26VZOfTNcq9qV403bNF4uTGtAN+Q@mail.gmail.com
2022-12-12 09:01:39 +09:00
Alexander Korotkov 096dd80f3c Add USER SET parameter values for pg_db_role_setting
The USER SET flag specifies that the variable should be set on behalf of an
ordinary role.  That lets ordinary roles set placeholder variables, which
permission requirements are not known yet.  Such a value wouldn't be used if
the variable finally appear to require superuser privileges.

The new flags are stored in the pg_db_role_setting.setuser array.  Catversion
is bumped.

This commit is inspired by the previous work by Steve Chavez.

Discussion: https://postgr.es/m/CAPpHfdsLd6E--epnGqXENqLP6dLwuNZrPMcNYb3wJ87WR7UBOQ%40mail.gmail.com
Author: Alexander Korotkov, Steve Chavez
Reviewed-by: Pavel Borisov, Steve Chavez
2022-12-09 13:12:20 +03:00
Peter Eisentraut 5e9b122059 Fix FK comment think-o
from commit d6f96ed94e

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/6a7c7338-1aa2-4689-d171-0b0b294fdd84%40illuminatedcomputing.com
2022-12-07 17:06:50 +01:00
Alvaro Herrera a61b1f7482
Rework query relation permission checking
Currently, information about the permissions to be checked on relations
mentioned in a query is stored in their range table entries.  So the
executor must scan the entire range table looking for relations that
need to have permissions checked.  This can make the permission checking
part of the executor initialization needlessly expensive when many
inheritance children are present in the range range.  While the
permissions need not be checked on the individual child relations, the
executor still must visit every range table entry to filter them out.

This commit moves the permission checking information out of the range
table entries into a new plan node called RTEPermissionInfo.  Every
top-level (inheritance "root") RTE_RELATION entry in the range table
gets one and a list of those is maintained alongside the range table.
This new list is initialized by the parser when initializing the range
table.  The rewriter can add more entries to it as rules/views are
expanded.  Finally, the planner combines the lists of the individual
subqueries into one flat list that is passed to the executor for
checking.

To make it quick to find the RTEPermissionInfo entry belonging to a
given relation, RangeTblEntry gets a new Index field 'perminfoindex'
that stores the corresponding RTEPermissionInfo's index in the query's
list of the latter.

ExecutorCheckPerms_hook has gained another List * argument; the
signature is now:
typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable,
					      List *rtePermInfos,
					      bool ereport_on_violation);
The first argument is no longer used by any in-core uses of the hook,
but we leave it in place because there may be other implementations that
do.  Implementations should likely scan the rtePermInfos list to
determine which operations to allow or deny.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
2022-12-06 16:09:24 +01:00
Alvaro Herrera fb958b5da8
Generalize ri_RootToPartitionMap to use for non-partition children
ri_RootToPartitionMap is currently only initialized for tuple routing
target partitions, though a future commit will need the ability to use
it even for the non-partition child tables, so make adjustments to the
decouple it from the partitioning code.

Also, make it lazily initialized via ExecGetRootToChildMap(), making
that function its preferred access path.  Existing third-party code
accessing it directly should no longer do so; consequently, it's been
renamed to ri_RootToChildMap, which also makes it consistent with
ri_ChildToRootMap.

ExecGetRootToChildMap() houses the logic of setting the map appropriately
depending on whether a given child relation is partition or not.

To support this, also add a separate entry point for TupleConversionMap
creation that receives an AttrMap.  No new code here, just split an
existing function in two.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqEYUhDXSK5BTvG_xk=eaAEJCD4GS3C6uH7ybBvv+Z_Tmg@mail.gmail.com
2022-12-02 10:35:55 +01:00
Alvaro Herrera ad86d159b6
Add 'missing_ok' argument to build_attrmap_by_name
When it's given as true, return a 0 in the position of the missing
column rather than raising an error.

This is currently unused, but it allows us to reimplement column
permission checking in a subsequent commit.  It seems worth breaking
into a separate commit because it affects unrelated code.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFfiai=qBxPDTjaio_ZcaqUKh+FC=prESrB8ogZgFNNNQ@mail.gmail.com
2022-11-29 09:39:36 +01:00
Andrew Dunstan b5d6382496 Provide per-table permissions for vacuum and analyze.
Currently a table can only be vacuumed or analyzed by its owner or
a superuser. This can now be extended to any user by means of an
appropriate GRANT.

Nathan Bossart

Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert
Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael
Paquier.

Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13
2022-11-28 12:08:14 -05:00
Michael Paquier d13b684117 Introduce variables for initial and max nesting depth on configuration files
The code has been assuming already in a few places that the initial
recursion nesting depth is 0, and the recent changes in hba.c (mainly
783e8c6) have relies on this assumption in more places.  The maximum
recursion nesting level is assumed to be 10 for hba.c and GUCs.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20221124090724.n7amf5kpdhx6vb76@jrouhaud
2022-11-25 07:40:12 +09:00
Andrew Dunstan b7a5ef17cf Simplify WARNING messages from skipped vacuum/analyze on a table
This will more easily accomodate adding new permissions for vacuum and
analyze.

Nathan Bossart following a suggestion from Kyotaro Horiguchi

Discussion: https://postgr.es/m/20220726.104712.912995710251150228.horikyota.ntt@gmail.com
2022-11-23 14:43:16 -05:00
Peter Geoghegan b6074846ce Simplify vacuum_set_xid_limits() signature.
Pass VACUUM parameters (VacuumParams state) to vacuum_set_xid_limits()
directly, rather than passing most individual VacuumParams fields as
separate arguments.

Also make vacuum_set_xid_limits() output parameter symbol names match
those used by its vacuumlazy.c caller.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=TE7gW5DgSahDkf0UEZigFGAoHNNN6EvSrdzC=Kn+hrA@mail.gmail.com
2022-11-23 11:10:06 -08:00