Commit Graph

4381 Commits

Author SHA1 Message Date
Michael Paquier 68f7c4b57a Clean up more code using "(expr) ? true : false"
This is similar to fd0625c, taking care of any remaining code paths that
are worth the cleanup.  This also changes some cases using opposite
expression patterns.

Author: Justin Pryzby, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoCdF8dnUvr-BUWWGvA_XhKSoANacBMZb6jKyCk4TYfQ2Q@mail.gmail.com
2021-10-11 09:36:42 +09:00
Daniel Gustafsson 7111e332c5 Fix duplicate words in comments
Remove accidentally duplicated words in code comments.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87bl45t0co.fsf@wibble.ilmari.org
2021-10-04 15:12:57 +02:00
Alvaro Herrera c6bc655ee2
Error out if SKIP LOCKED and WITH TIES are both specified
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES breaks expectations when it comes
to rows returned to other sessions accessing the same row.  Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid it for now, with the potential to fix it in the future.
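
A minimal SQL sketch of the now-rejected combination (table and column
names are illustrative):

    SELECT * FROM jobs
      ORDER BY priority
      FETCH FIRST 1 ROWS WITH TIES
      FOR UPDATE SKIP LOCKED;  -- now raises an error up front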

[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org

Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
2021-10-01 18:29:18 -03:00
Michael Paquier 070d2e19e4 Clarify use of "statistics objects" in the code
The code inconsistently used "statistic object" or "statistics" where
the correct term, as discussed, is actually "statistics object".  This
makes the code more consistent.

While at it, fix an incorrect error message introduced in a4d75c8.  This
error should never happen, as the code states, but if it did, the
message would be misleading.

Author: Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
Backpatch-through: 14
2021-09-29 15:29:38 +09:00
Michael Paquier e767ddcd35 Fix typos and grammar in code comments
Several mistakes have piled up in the code comments over time,
including incorrect grammar, function names and simple typos.  This
commit takes care of a portion of these.

No backpatch is done as this is only cosmetic.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
2021-09-27 14:21:28 +09:00
Amit Kapila 4548c76738 Invalidate all partitions for a partitioned table in publication.
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache, so the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a
publication, which would prohibit Updates/Deletes on its partitions
without replica identity even though they no longer belonged to any
publication.

Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-22 08:00:54 +05:30
Tom Lane 4476bcb877 Fix misevaluation of STABLE parameters in CALL within plpgsql.
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot.  I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement.  That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list.  As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.

(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem.  There are no STABLE procedures.)

We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself.  Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.

Per bug #17199 from Alexander Nawratil.  Back-patch to v11,
like the previous patch.

Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
2021-09-21 19:06:53 -04:00
Tomas Vondra c9eeef2a15 Disallow extended statistics on system columns
Since introduction of extended statistics, we've disallowed references
to system columns. So for example

    CREATE STATISTICS s ON ctid FROM t;

would fail. But with extended statistics on expressions, it was possible
to work around this limitation quite easily

    CREATE STATISTICS s ON (ctid::text) FROM t;

This is an oversight in a4d75c86bf, fixed by adding a simple check.
Backpatch to PostgreSQL 14, where support for extended statistics on
expressions was introduced.

Backpatch-through: 14
Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com
2021-09-20 00:34:57 +02:00
Peter Eisentraut 4ac0f450b6 Message style improvements 2021-09-16 15:36:44 +02:00
Tom Lane 69e31d05b0 Improve log messages from pg_import_system_collations().
pg_import_system_collations() was a bit inconsistent about how it
reported locales (names output by "locale -a") that it didn't make
pg_collation entries for.  IMV we should print some suitable message
for every locale that we reject, except when it matches a pre-existing
pg_collation entry.  (This is all at DEBUG1 log level, though, so as
not to create noise during initdb.)  Add messages for the two cases
that were previously not logged, namely unrecognized encoding and
client-only encoding.  Re-word the existing messages to have a
consistent style.

Anton Voloshin and Tom Lane

Discussion: https://postgr.es/m/429d64ee-188d-3ce1-106a-53a8b45c4fce@postgrespro.ru
2021-09-14 18:55:15 -04:00
Tom Lane 2e4eae87d0 Send NOTIFY signals during CommitTransaction.
Formerly, we sent signals for outgoing NOTIFY messages within
ProcessCompletedNotifies, which was also responsible for sending
relevant ones of those messages to our connected client.  It therefore
had to run during the main-loop processing that occurs just before
going idle.  This arrangement had two big disadvantages:

* Now that procedures allow intra-command COMMITs, it would be
useful to send NOTIFYs to other sessions immediately at COMMIT
(though, for reasons of wire-protocol stability, we still shouldn't
forward them to our client until end of command).

* Background processes such as replication workers would not send
NOTIFYs at all, since they never execute the client communication
loop.  We've had requests to allow triggers running in replication
workers to send NOTIFYs, so that's a problem.

To fix these things, move transmission of outgoing NOTIFY signals
into AtCommit_Notify, where it will happen during CommitTransaction.
Also move the possible call of asyncQueueAdvanceTail there, to
ensure we don't bloat the async SLRU if a background worker sends
many NOTIFYs with no one listening.
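
A minimal sketch of the first case above (procedure and channel names
are illustrative):

    CREATE PROCEDURE report_progress() LANGUAGE plpgsql AS $$
    BEGIN
      PERFORM pg_notify('progress', 'step 1 done');
      COMMIT;  -- listeners in other sessions are now signaled here,
               -- not only when this session next goes idle
      PERFORM pg_notify('progress', 'step 2 done');
    END;
    $$;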

We can also drop the call of asyncQueueReadAllNotifications,
allowing ProcessCompletedNotifies to go away entirely.  That's
because commit 790026972 added a call of ProcessNotifyInterrupt
adjacent to PostgresMain's call of ProcessCompletedNotifies,
and that does its own call of asyncQueueReadAllNotifications,
meaning that we were uselessly doing two such calls (inside two
separate transactions) whenever inbound notify signals coincided
with an outbound notify.  We need only set notifyInterruptPending
to ensure that ProcessNotifyInterrupt runs, and we're done.

The existing documentation suggests that custom background workers
should call ProcessCompletedNotifies if they want to send NOTIFY
messages.  To avoid an ABI break in the back branches, reduce it
to an empty routine rather than removing it entirely.  Removal
will occur in v15.

Although the problems mentioned above have existed for a while,
I don't feel comfortable back-patching this any further than v13.
There was quite a bit of churn in adjacent code between 12 and 13.
At minimum we'd have to also backpatch 51004c717, and a good deal
of other adjustment would also be needed, so the benefit-to-risk
ratio doesn't look attractive.

Per bug #15293 from Michael Powers (and similar gripes from others).

Artur Zakirov and Tom Lane

Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-14 17:18:25 -04:00
Tom Lane c1b7a6c273 Fix some anomalies with NO SCROLL cursors.
We have long forbidden fetching backwards from a NO SCROLL cursor,
but the prohibition didn't extend to cases in which we rewind the
query altogether and then re-fetch forwards.  I think the reason is
that this logic was mainly meant to protect plan nodes that can't
be run in the reverse direction.  However, re-reading the query output
is problematic if the query is volatile (which includes SELECT FOR
UPDATE, not just queries with volatile functions): the re-read can
produce different results, which confuses the cursor navigation logic
completely.  Another reason for disliking this approach is that some
code paths will either fetch backwards or rewind-and-fetch-forwards
depending on the distance to the target row; so that seemingly
identical use-cases may or may not draw the "cursor can only scan
forward" error.  Hence, let's clean things up by disallowing rewind
as well as fetch-backwards in a NO SCROLL cursor.
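
A minimal sketch of a rewind that is now disallowed (table name is
illustrative; as noted below, the back branches restrict only held-over
WITH HOLD cursors):

    BEGIN;
    DECLARE c NO SCROLL CURSOR FOR SELECT * FROM accounts FOR UPDATE;
    FETCH FORWARD 10 FROM c;
    FETCH FIRST FROM c;  -- rewind-and-refetch now draws the
                         -- "cursor can only scan forward" error too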

Ordinarily we'd only make such a definitional change in HEAD, but
there is a third reason to consider this change now.  Commit ba2c6d6ce
created some new user-visible anomalies for non-scrollable cursors
WITH HOLD, in that navigation in the cursor result got confused if the
cursor had been partially read before committing.  The only good way
to resolve those anomalies is to forbid rewinding such a cursor, which
allows removal of the incorrect cursor state manipulations that
ba2c6d6ce added to PersistHoldablePortal.

To minimize the behavioral change in the back branches (including
v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore,
ie has been held over from a previous transaction due to WITH HOLD.
This should avoid breaking most applications that have been sloppy
about whether to declare cursors as scrollable.  We'll enforce the
prohibition across-the-board beginning in v15.

Back-patch to v11, as ba2c6d6ce was.

Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
2021-09-10 13:18:32 -04:00
Tom Lane cba79a1632 Avoid fetching from an already-terminated plan.
Some plan node types don't react well to being called again after
they've already returned NULL.  PortalRunSelect() has long dealt
with this by calling the executor with NoMovementScanDirection
if it sees that we've already run the portal to the end.  However,
commit ba2c6d6ce overlooked this point, so that persisting an
already-fully-fetched cursor would fail if it had such a plan.

Per report from Tomas Barton.  Back-patch to v11, as the faulty
commit was.  (I've omitted a test case because the type of plan
that causes a problem isn't all that stable.)

Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
2021-09-09 13:36:44 -04:00
Peter Eisentraut 639a86e36a Remove Value node struct
The Value node struct is a weird construct.  It is its own node type,
but most of the time, it actually has a node type of Integer, Float,
String, or BitString.  As a consequence, the struct name and the node
type don't match most of the time, and so it has to be treated
specially a lot.  There doesn't seem to be any value in the special
construct.  There is very little code that wants to accept all Value
variants but nothing else (and even if it did, this doesn't provide
any convenient way to check it), and most code wants either just one
particular node type (usually String), or it accepts a broader set of
node types besides just Value.

This change removes the Value struct and node type and replaces them
by separate Integer, Float, String, and BitString node types that are
proper node types and structs of their own and behave mostly like
normal node types.

Also, this removes the T_Null node tag, which was previously also a
possible variant of Value but wasn't actually used outside of the
Value contained in A_Const.  Replace that by an isnull field in
A_Const.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
2021-09-09 08:36:53 +02:00
Peter Eisentraut cbdf75bf80 Remove useless casts
Casting the argument of strVal() to (Value *) is useless, since
strVal() already does that.

Most code didn't do that anyway; this was apparently just a style that
snuck into certain files.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
2021-09-09 08:36:52 +02:00
Daniel Gustafsson f7c53bb9e3 Consistently use "superuser" instead of "super user"
The correct nomenclature for the highest-privileged user is "superuser",
not "super user"; this replaces the few instances where the latter was
used erroneously.  No user-visible changes are done as all changes are
in comments, so no back-patching.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW3snGBD8BAQiArMDS1Y43LuX3ymwO+N8aUg1Hrv6hYNw@mail.gmail.com
2021-09-08 17:02:18 +02:00
Amit Kapila 8bd5342740 Invalidate relcache for publications defined for all tables.
Updates/Deletes on a relation were allowed even without replica identity
after we defined a publication for all tables. This would later lead to
an error on subscribers. The reason was that for such publications we
were not invalidating the relcache, so the publication information for
relations was not getting rebuilt. Similarly, we were not invalidating
the relcache after dropping such publications, which would prohibit
Updates/Deletes without replica identity even though no publication
applied anymore.

Author: Vignesh C and Hou Zhijie
Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
2021-09-08 11:50:37 +05:30
Alvaro Herrera 0c6828fa98
Add PublicationTable and PublicationRelInfo structs
These encapsulate a relation when referred from replication DDL.
Currently they don't do anything useful (they're just wrappers around
RangeVar and Relation respectively) but in the future they'll be used to
carry column lists.

Extracted from a larger patch by Rahila Syed.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
2021-09-06 14:24:50 -03:00
Tom Lane db2760a841 Disallow creating an ICU collation if the DB encoding won't support it.
Previously this was allowed, but the collation effectively vanished
into the ether because of the way lookup_collation() works: you could
not use the collation, nor even drop it.  Seems better to give an
error up front than to leave the user wondering why it doesn't work.
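
A hedged sketch, assuming a database whose encoding ICU cannot handle
(for example SQL_ASCII):

    CREATE COLLATION mycoll (provider = icu, locale = 'en-US');
    -- now fails up front, instead of creating an entry that can
    -- neither be used nor dropped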

(Because this test is in DefineCollation not CreateCollation, it does
not prevent pg_import_system_collations from creating ICU collations,
regardless of the initially-chosen encoding.)

Per bug #17170 from Andrew Bille.  Back-patch to v10 where ICU support
was added.

Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
2021-09-03 16:39:03 -04:00
Tomas Vondra 537ca68dbb Identify simple column references in extended statistics
Until now, when defining extended statistics, everything except a plain
column reference was treated as a complex expression. So for example "a"
was a column reference, but "(a)" would be an expression. In most cases
this does not matter much, but there were a couple of strange consequences.
For example

    CREATE STATISTICS s ON a FROM t;

would fail, because extended stats require at least two columns. But

    CREATE STATISTICS s ON (a) FROM t;

would succeed, because that requirement does not apply to expressions.
Moreover, that statistics object is useless - the optimizer will always
use the regular statistics collected for attribute "a".

So do a bit more work to identify those expressions referencing a single
column, and translate them to a simple column reference. Backpatch to
14, where support for extended statistics on expressions was introduced.

Reported-by: Justin Pryzby
Backpatch-through: 14
Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com
2021-09-01 17:41:56 +02:00
Alvaro Herrera 375aed36ad
Keep stats up to date for partitioned tables
In the long-running saga of analyze on partitioned tables, one thing I
missed while reverting 0827e8af70 is the maintenance of the analyze
count and last-analyze time for partitioned tables.  This is a mostly
trivial change that enables users to assess the need for invoking manual
ANALYZE on partitioned tables.
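
The maintained counters can be inspected in pg_stat_user_tables; the
table name below is illustrative:

    SELECT relname, last_analyze, analyze_count
      FROM pg_stat_user_tables
     WHERE relname = 'parted_table';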

This patch, posted by Justin and modified a bit by me (Álvaro), can be
mostly traced back to Hosoya-san, though any problems introduced with
the scissors are mine.

Backpatch to 14, in line with 6f8127b739.

Co-authored-by: Yuzuko Hosoya <yuzukohosoya@gmail.com>
Co-authored-by: Justin Pryzby <pryzby@telsasoft.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20210816222810.GE10479@telsasoft.com
2021-08-28 15:58:23 -04:00
Noah Misch 97ddda8a82 Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE.
If the system crashed between CREATE TABLESPACE and the next checkpoint,
the result could be some files in the tablespace unexpectedly containing
no rows.  Affected files would be those for which the system did not
write WAL; see the wal_skip_threshold documentation.  Before v13, a
different set of conditions governed the writing of WAL; see v12's
<sect2 id="populate-pitr">.  (The v12 conditions were broader in some
ways and narrower in others.)  Users may want to audit non-default
tablespaces for unexpected short files.  The bug could have truncated an
index without affecting the associated table, and reindexing the index
would fix that particular problem.

This fixes the bug by making create_tablespace_directories() more like
TablespaceCreateDbspace().  create_tablespace_directories() was
recursively removing tablespace contents, reasoning that WAL redo would
recreate everything removed that way.  That assumption holds for other
wal_level values.  Under wal_level=minimal, the old approach could
delete files for which no other copy existed.  Back-patch to 9.6 (all
supported versions).

Reviewed by Robert Haas and Prabhat Sahu.  Reported by Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
2021-08-27 23:33:23 -07:00
Stephen Frost ce42efaa26 Use maintenance_io_concurrency for ANALYZE prefetch
When prefetching pages for ANALYZE, we should be using
maintenance_io_concurrency (by calling
get_tablespace_maintenance_io_concurrency(), not
get_tablespace_io_concurrency()).

ANALYZE prefetching was introduced in c6fc50c, so back-patch to 14.

Backpatch-through: 14
Reported-By: Egor Rogov
Discussion: https://postgr.es/m/9beada99-34ce-8c95-fadb-451768d08c64%40postgrespro.ru
2021-08-27 19:23:14 -04:00
Peter Geoghegan bda822554b track_io_timing logging: Don't special case 0 ms.
Adjust track_io_timing related logging code added by commit 94d13d474d.
Make it consistent with other nearby autovacuum and autoanalyze logging
code by removing logic that suppressed zero millisecond outputs.

log_autovacuum_min_duration log output now reliably shows "read:" and
"write:" millisecond-based values in its report (when track_io_timing is
enabled).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Stephen Frost <sfrost@snowman.net>
Discussion: https://postgr.es/m/CAH2-WznW0FNxSVQMSRazAMYNfZ6DR_gr5WE78hc6E1CBkkJpzw@mail.gmail.com
Backpatch: 14-, where the track_io_timing logging was introduced.
2021-08-27 13:34:00 -07:00
Peter Geoghegan fdfbfa24fa Reorder log_autovacuum_min_duration log output.
This order seems more natural.  It starts with details that are
particular to heap and index data structures, and ends with system-level
costs incurred during the autovacuum worker's VACUUM/ANALYZE operation.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkzxK6ahA9xxsOftRtBX_R0swuHZsvo4QUbak1Bz7hb7Q@mail.gmail.com
Backpatch: 14-, which enhanced the log output in various ways.
2021-08-27 13:08:41 -07:00
Amit Kapila 29b5905470 Fix toast rewrites in logical decoding.
Commit 325f2ec555 introduced pg_class.relrewrite to skip operations on
tables created as part of a heap rewrite during DDL. It links such
transient heaps to the original relation OID via this new field in
pg_class but forgot to do anything about toast tables. So, logical
decoding was not able to skip operations on internally created toast
tables. This led to an error when decoding the WAL for the next
operation, which appeared to contain toast data where actually there
was none.

To fix this, we set pg_class.relrewrite for internally created toast
tables as well, which allows skipping operations on them during logical
decoding.

Author: Bertrand Drouvot
Reviewed-by: David Zhang, Amit Kapila
Backpatch-through: 11, where it was introduced
Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com
2021-08-25 09:53:07 +05:30
Amit Kapila 1046a69b30 Fix Alter Subscription's Add/Drop Publication behavior.
The current refresh behavior tries to refresh just the added/dropped
publications, but that leads to removing the wrong tables from the
subscription. We can't refresh just the dropped publication because it
is quite possible that some of the tables were removed from the
publication by that time, and those would then remain part of the
subscription. Also, there is a chance that the tables that were part of
the publication being dropped are also part of another publication, so
we can't remove those.

So, we decided that by default, add/drop commands will also act like
REFRESH PUBLICATION which means they will refresh all the publications. We
can keep the old behavior for "add publication" but it is better to be
consistent with "drop publication".
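
A minimal sketch of the resulting default behavior (names are
illustrative):

    ALTER SUBSCRIPTION sub DROP PUBLICATION pub1;
    -- now refreshes all publications of the subscription, as
    -- REFRESH PUBLICATION would; use WITH (refresh = false) to skip it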

Author: Hou Zhijie
Reviewed-by: Masahiko Sawada, Amit Kapila
Backpatch-through: 14, where it was introduced
Discussion: https://postgr.es/m/OS0PR01MB5716935D4C2CC85A6143073F94EF9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-08-24 08:25:21 +05:30
Tom Lane 6b71c925cb Prevent ALTER TYPE/DOMAIN/OPERATOR from changing extension membership.
If recordDependencyOnCurrentExtension is invoked on a pre-existing,
free-standing object during an extension update script, that object
will become owned by the extension.  In our current code this is
possible in three cases:

* Replacing a "shell" type or operator.
* CREATE OR REPLACE overwriting an existing object.
* ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET.

The first of these cases is intentional behavior, as noted by the
existing comments for GenerateTypeDependencies.  It seems like
appropriate behavior for CREATE OR REPLACE too; at least, the obvious
alternatives are not better.  However, the fact that it happens during
ALTER is an artifact of trying to share code (GenerateTypeDependencies
and makeOperatorDependencies) between the CREATE and ALTER cases.
Since an extension script would be unlikely to ALTER an object that
didn't already belong to the extension, this behavior is not very
troubling for the direct target object ... but ALTER TYPE SET will
recurse to dependent domains, and it is very uncool for those to
become owned by the extension if they were not already.

Let's fix this by redefining the ALTER cases to never change extension
membership, full stop.  We could minimize the behavioral change by
only changing the behavior when ALTER TYPE SET is recursing to a
domain, but that would complicate the code and it does not seem like
a better definition.

Per bug #17144 from Alex Kozhemyakin.  Back-patch to v13 where ALTER
TYPE SET was added.  (The other cases are older, but since they only
affect the directly-named object, there's not enough of a problem to
justify changing the behavior further back.)

Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org
2021-08-17 14:29:22 -04:00
Alvaro Herrera 6f8127b739
Revert analyze support for partitioned tables
This reverts the following commits:
1b5617eb84 Describe (auto-)analyze behavior for partitioned tables
0e69f705cc Set pg_class.reltuples for partitioned tables
41badeaba8 Document ANALYZE storage parameters for partitioned tables
0827e8af70 autovacuum: handle analyze for partitioned tables

There are efficiency issues in this code when handling databases with
large numbers of partitions, and there doesn't seem to be any trivial
way to handle those.  There are some other issues as well.  It's now too
late in the cycle for nontrivial fixes, so we'll have to let Postgres 14
users continue to deal with ANALYZE on their partitioned tables
manually, and hopefully we can fix the issues for Postgres 15.

I kept [most of] be280cdad2 ("Don't reset relhasindex for partitioned
tables on ANALYZE") because while we added it due to 0827e8af70, it is
a good bugfix in its own right, since it affects manual analyze as well
as autovacuum-induced analyze, and there's no reason to revert it.

I retained the addition of relkind 'p' to tables included by
pg_stat_user_tables, because reverting that would require a catversion
bump.
Also, in pg14 only, I keep a struct member that was added to
PgStat_TabStatEntry to avoid breaking compatibility with existing stat
files.

Backpatch to 14.

Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de
2021-08-16 17:27:52 -04:00
Daniel Gustafsson 069d33d0c5 Emit namespace in the post-copy errmsg
During a VACUUM or CLUSTER command, the initial output emits a
fully qualified relation path with namespace.  The post-action
errmsg, however, only emitted the relation name, which may lead
to hard-to-parse output when using multiple jobs with vacuumdb,
as the output from different jobs may be interleaved.  Include
the full path in the post-action errmsg to be consistent with
the initial errmsg.

Author: Mike Fiedler <miketheman@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CAMerE0oz+8G-aORZL_BJcPxnBqewZAvND4bSUysjz+r-oT1BxQ@mail.gmail.com
2021-08-16 20:06:54 +02:00
Michael Paquier 7b565843a9 Add call to object access hook at the end of table rewrite in ALTER TABLE
ALTER TABLE .. SET {LOGGED,UNLOGGED,ACCESS METHOD} would never invoke a
table-level object access hook, which was inconsistent with SET
TABLESPACE.  Note that contrary to SET TABLESPACE, the no-op case is
left out for those commands, as supporting it would require tracking
whether the commands were invoked even when they do not execute a
physical rewrite.  Another thing
worth noting is that the physical file swap at the end of a rewrite
does a couple of access calls for internal objects created for the swap
operation (internal objects are for example skipped by the tests of
sepgsql), but this does not trigger the hook for the table on which the
operation is done.

f41872d, which added support for SET LOGGED/UNLOGGED in ALTER TABLE,
evidently forgot to consider that.

Based on what I checked, two regression tests of sepgsql in ddl.sql are
going to log more information with this change, something that buildfarm
member rhinoceros will tell soon enough.  I am not completely sure of
their format though, so these are not refreshed yet.

This is arguably a bug, but no backpatch is done as this could cause a
behavior change for anybody using object access hooks.

Reported-by: Jeff Davis
Discussion: https://postgr.es/m/YQJKV29/1a60uG68@paquier.xyz
2021-08-10 12:21:05 +09:00
David Rowley 4a3d806f38 Use ExplainPropertyInteger for queryid in EXPLAIN
This saves a few lines of code.  Also add a comment to mention why we use
ExplainPropertyInteger instead of ExplainPropertyUInteger given that
queryid is a uint64 type.

Author: David Rowley
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/CAApHDvqhSLYpSU_EqUdN39w9Uvb8ogmHV7_3YhJ0S3aScGBjsg@mail.gmail.com
Backpatch-through: 14, where this code was originally added
2021-08-09 15:47:23 +12:00
Tom Lane 9179a82d7a Really fix the ambiguity in REFRESH MATERIALIZED VIEW CONCURRENTLY.
Rather than trying to pick table aliases that won't conflict with
any possible user-defined matview column name, adjust the queries'
syntax so that the aliases are only used in places where they can't be
mistaken for column names.  Mostly this consists of writing "alias.*"
not just "alias", which adds clarity for humans as well as machines.
We do have the issue that "SELECT alias.*" acts differently from
"SELECT alias", but we can use the same hack ruleutils.c uses for
whole-row variables in SELECT lists: write "alias.*::compositetype".

We might as well revert to the original aliases after doing this;
they're a bit easier to read.

Like 75d66d10e, back-patch to all supported branches.

Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us
2021-08-07 13:29:32 -04:00
Peter Eisentraut f4f4a649d8 Message style improvements 2021-08-07 12:09:37 +02:00
Amit Kapila 63cf61cdeb Add prepare API support for streaming transactions in logical replication.
Commit a8fd13cab0 added support for prepared transactions to built-in
logical replication via a new option "two_phase" for a subscription. The
"two_phase" option was not allowed with the existing streaming option.

This commit permits the combination of "streaming" and "two_phase"
subscription options. It extends the pgoutput plugin and the subscriber
side code to add the prepare API for streaming transactions which will
apply the changes accumulated in the spool-file at prepare time.

Author: Peter Smith and Ajin Cherian
Reviewed-by: Vignesh C, Amit Kapila, Greg Nancarrow
Tested-By: Haiying Tang
Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru
Discussion: https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
2021-08-04 07:47:06 +05:30
Heikki Linnakangas 317632f307 Move InRecovery and standbyState global vars to xlogutils.c.
They are used in code that runs both during normal operation and during
WAL replay, and needs to behave differently during replay. Move them to
xlogutils.c, because that's where we have other helper functions used by
redo routines.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/b3b71061-4919-e882-4857-27e370ab134a%40iki.fi
2021-07-31 09:50:26 +03:00
Michael Paquier b0483263dd Add support for SET ACCESS METHOD in ALTER TABLE
The logic used to support a change of access method for a table is
similar to changes for tablespace or relation persistence, requiring a
table rewrite with an exclusive lock of the relation changed.  Table
rewrites done in ALTER TABLE already go through the table AM layer when
scanning tuples from the old relation and inserting them into the new
one, making this implementation straightforward.

Note that partitioned tables are not supported as these have no access
methods defined.
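
A minimal sketch, assuming an alternative table access method named
myheap is installed:

    ALTER TABLE measurements SET ACCESS METHOD myheap;
    -- rewrites the table through the table AM layer under an
    -- exclusive lock, much like SET TABLESPACE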

Author: Justin Pryzby, Jeff Davis
Reviewed-by: Michael Paquier, Vignesh C
Discussion: https://postgr.es/m/20210228222530.GD20769@telsasoft.com
2021-07-28 10:10:44 +09:00
Tom Lane 024515cac5 In event triggers, use "pg_temp" only for our own temp schema.
pg_event_trigger_ddl_commands used "pg_temp" to refer to any
temp schema, not only that of the current backend.  This seems
like overreach.  It's somewhat unlikely that DDL commands would
refer to temp objects of other sessions to begin with, but if they
do, "pg_temp" would be a most misleading way to display the action.

While this seems like a bug, it's not quite out of the realm of
possibility that somebody out there is expecting the current
behavior.  Hence, fix in HEAD, but don't back-patch.

Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
2021-07-27 12:08:20 -04:00
Tom Lane 48c5c90682 Use the "pg_temp" schema alias in EXPLAIN and related output.
This patch causes EXPLAIN output to refer to objects that are in
the current session's temp schema with the "pg_temp" schema alias
rather than that schema's actual name.  This is useful for our own
testing purposes since it will stabilize EXPLAIN VERBOSE output
for such cases, allowing us to use that in regression tests.
It should be less confusing for end users too.
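
A minimal sketch (table name is illustrative):

    CREATE TEMP TABLE tt (x int);
    EXPLAIN (VERBOSE, COSTS OFF) SELECT x FROM tt;
    -- the plan now refers to pg_temp.tt rather than, say, pg_temp_3.tt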

Since ruleutils.c needs to change behavior for this, the change
also leaks into a few other users of ruleutils.c, for example
pg_get_viewdef().  AFAICS that won't cause any problems.
We did find that aggressively trying to change this behavior
across-the-board would cause issues, but as long as "pg_temp"
only appears within generated SQL text, I think it'll be fine.

Along the way, make get_namespace_name_or_temp conform to the
same API as get_namespace_name, ie that it returns a palloc'd
string or NULL.  The current behavior hasn't caused any bugs
since no callers attempt to pfree the result, but if it gets
more widespread usage that could become a problem.

Amul Sul, reviewed and extended by me

Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
2021-07-27 12:03:16 -04:00
Alvaro Herrera 21b3aa9c8f
Remove newly added useless assertion check
Coverity complained that my commit 80ba4bb383 added a dubious coding
for a consistency check that there isn't more than one row for a certain
tgrelid/tgparentid combination.  But we don't check for that explicitly
anywhere else, and if we were to do it, it should be a full
shouldn't-happen elog, not just an assert.  It doesn't seem that this is
very important anyway, so remove it.

Discussion: https://postgr.es/m/1337562.1627224583@sss.pgh.pa.us
2021-07-26 12:56:33 -04:00
Alvaro Herrera 80ba4bb383
Make ALTER TRIGGER RENAME consistent for partitioned tables
Renaming triggers on partitioned tables had two problems: first,
it did not recurse to renaming the triggers on the partitions; and
second, it failed to prohibit renaming clone triggers.  Having triggers
with different names in partitions is pointless, and furthermore pg_dump
would not preserve names for partitions anyway.
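
A minimal sketch (names are illustrative):

    ALTER TRIGGER trg ON parted_table RENAME TO trg_renamed;
    -- now also renames the cloned triggers on all partitions; renaming
    -- a clone directly on a partition is rejected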

Not backpatched -- making the ALTER TRIGGER throw an error in stable
versions might cause problems for existing scripts.

Co-authored-by: Arne Roland <A.Roland@index.de>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/d0fd7040c2fb4de1a111b9d9ccc456b8@index.de
2021-07-22 18:33:47 -04:00
Peter Eisentraut 2b00db4fb0 Use l*_node() family of functions where appropriate
Instead of castNode(…, lfoo(…))

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/87eecahraj.fsf@wibble.ilmari.org
2021-07-19 08:20:24 +02:00
Amit Kapila 29abde637b Don't allow setting replication slot_name to ''.
We don't allow creating a replication slot with an empty name ('') via
the SQL API pg_create_logical_replication_slot(), but it was allowed to
be set via the ALTER SUBSCRIPTION command. This led to the apply worker
repeatedly trying to stream data via slot_name '', a slot the user is
not even allowed to create.
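
A minimal sketch (subscription name is illustrative):

    ALTER SUBSCRIPTION sub SET (slot_name = '');
    -- now rejected up front, rather than leaving the apply worker
    -- retrying a slot name that can never exist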

Author: Japin Li
Reviewed-By: Ranier Vilela, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/MEYP282MB1669CBD98E721C77CA696499B61A9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2021-07-19 10:36:15 +05:30
Amit Kapila dcecdfafbd Update comments for AlterSubscription.
Add explanation as to why the subscription needs to be disabled to allow
slot_name as none.

Author: Japin Li and Amit Kapila
Discussion: https://postgr.es/m/MEYP282MB1669CBD98E721C77CA696499B61A9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2021-07-19 08:32:37 +05:30
Dean Rasheed ba620760c4 Improve error checking of CREATE COLLATION options.
Check for conflicting or redundant options, as we do for most other
commands. Specifying any option more than once is at best redundant,
and quite likely indicates a bug in the user's code.

While at it, improve the error for conflicting locale options by
adding detail text (the same as for CREATE DATABASE).
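
Minimal sketches of the two cases (collation names are illustrative):

    CREATE COLLATION c1 (locale = 'en_US', lc_collate = 'C');  -- conflicting
    CREATE COLLATION c2 (lc_collate = 'C', lc_collate = 'C');  -- redundant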

Bharath Rupireddy, reviewed by Vignesh C. Some additional hacking by
me.

Discussion: https://postgr.es/m/CALj2ACWtL6fTLdyF4R_YkPtf1YEDb6FUoD5DGAki3rpD+sWqiA@mail.gmail.com
2021-07-18 11:08:34 +01:00
Alvaro Herrera df80fa2ee5
Preserve firing-on state when cloning row triggers to partitions
When triggers are cloned from partitioned tables to their partitions,
the 'tgenabled' flag (origin/replica/always/disable) was not propagated.
Make it so that the flag on the trigger on the partition is initially
set to the same value as on the partitioned table.

Add a test case to verify the behavior.

Backpatch to 11, where this appeared in commit 86f575948c.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
2021-07-16 13:01:43 -04:00
Tom Lane a49d081235 Replace explicit PIN entries in pg_depend with an OID range test.
As of v14, pg_depend contains almost 7000 "pin" entries recording
the OIDs of built-in objects.  This is a fair amount of bloat for
every database, and it adds time to pg_depend lookups as well as
initdb.  We can get rid of all of those entries in favor of an OID
range check, i.e. "OIDs below FirstUnpinnedObjectId are pinned".

(template1 and the public schema are exceptions.  Those exceptions
are now wired into IsPinnedObject() instead of initdb's code for
filling pg_depend, but it's the same amount of cruft either way.)

The contents of pg_shdepend are modified likewise.

Discussion: https://postgr.es/m/3737988.1618451008@sss.pgh.pa.us
2021-07-15 11:41:47 -04:00
Dean Rasheed 2bfb50b3df Improve reporting of "conflicting or redundant options" errors.
When reporting "conflicting or redundant options" errors, try to
ensure that errposition() is used, to help the user identify the
offending option.

Formerly, errposition() was invoked in less than 60% of cases. This
patch raises that to over 90%, but there remain a few places where the
ParseState is not readily available. Using errdetail() might improve
the error in such cases, but that is left as a task for the future.

Additionally, since this error is thrown from over 100 places in the
codebase, introduce a dedicated function to throw it, reducing code
duplication.

Extracted from a slightly larger patch by Vignesh C. Reviewed by
Bharath Rupireddy, Alvaro Herrera, Dilip Kumar, Hou Zhijie, Peter
Smith, Daniel Gustafsson, Julien Rouhaud and me.

Discussion: https://postgr.es/m/CALDaNm33FFSS5tVyvmkoK2cCMuDVxcui=gFrjti9ROfynqSAGA@mail.gmail.com
2021-07-15 08:49:45 +01:00
John Naylor c203dcddf9 Remove unused function parameter in get_qual_from_partbound
Commit 0563a3a8b changed how partition constraints were generated such
that this function no longer computes the mapping of parent attnos to
child attnos.

This is an external function that extensions could use, so this is
potentially a breaking change. No external callers are known, however,
and this will make it simpler to write such callers in the future.

Author: Hou Zhijie
Reviewed-by: David Rowley, Michael Paquier, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/flat/OS0PR01MB5716A75A45BE46101A1B489894379@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-07-14 09:52:04 -04:00
Amit Kapila a8fd13cab0 Add support for prepared transactions to built-in logical replication.
To add support for streaming transactions at prepare time into the
built-in logical replication, we need to do the following things:

* Modify the output plugin (pgoutput) to implement the new two-phase API
callbacks, by leveraging the extended replication protocol.

* Modify the replication apply worker, to properly handle two-phase
transactions by replaying them on prepare.

* Add a new SUBSCRIPTION option "two_phase" to allow users to enable
two-phase transactions. We enable two_phase once the initial data sync
is over.
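
Enabling the new option might look like this (connection string and
names are illustrative):

    CREATE SUBSCRIPTION sub
      CONNECTION 'host=publisher dbname=postgres'
      PUBLICATION pub
      WITH (two_phase = true);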

We however must explicitly disable replication of two-phase transactions
during replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover, we don't have a replication connection open so we don't know
where to send the data anyway.

The streaming option is not allowed with this new two_phase option. This
can be done as a separate patch.

We don't allow toggling the two_phase option of a subscription because
it can lead to an inconsistent replica. For the same reason, we don't
allow refreshing the publication once two_phase is enabled for a
subscription, unless the copy_data option is false.

Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich
Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow
Tested-By: Haiying Tang
Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru
Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com
2021-07-14 07:33:50 +05:30
David Rowley 83f4fcc655 Change the name of the Result Cache node to Memoize
"Result Cache" was never a great name for this node, but nobody managed
to come up with another name that anyone liked enough.  That was until
David Johnston mentioned "Node Memoization", which Tom Lane revised to
just "Memoize".  People seem to like "Memoize", so let's do the rename.

Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us
Backpatch-through: 14, where Result Cache was introduced
2021-07-14 12:43:58 +12:00
David Rowley e0271d5f1e Remove useless range checks on INT8 sequences
There's no point in checking whether an INT8 sequence's seqmin and
seqmax values are outside the range of the minimum and maximum values
for the int64 type.  These both use the same underlying type, so an INT8
sequence certainly cannot be outside the minimum and maximum values
supported by int64.

This code is fairly harmless and it seems likely that most compilers
would optimize it out anyway; nevertheless, let's remove it, replacing
it with a small comment to mention why the check is not needed.

Author: Greg Nancarrow, with the comment revised by David Rowley
Discussion: https://postgr.es/m/CAJcOf-c9KBUZ8ow_6e%3DWSfbbEyTKfqV%3DVwoFuODQVYMySHtusw%40mail.gmail.com
2021-07-13 13:56:59 +12:00
Tom Lane f10f0ae420 Replace RelationOpenSmgr() with RelationGetSmgr().
The idea behind this patch is to design out bugs like the one fixed
by commit 9d523119f.  Previously, once one did RelationOpenSmgr(rel),
it was considered okay to access rel->rd_smgr directly for some
not-very-clear interval.  But since that pointer will be cleared by
relcache flushes, we had bugs arising from overreliance on a previous
RelationOpenSmgr call still being effective.

Now, very little code except that in rel.h and relcache.c should ever
touch the rd_smgr field directly.  The normal coding rule is to use
RelationGetSmgr(rel) and not expect the result to be valid for longer
than one smgr function call.  There are a couple of places where using
the function every single time seemed like overkill, but they are now
annotated with large warning comments.

Amul Sul, after an idea of mine.

Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
2021-07-12 17:01:36 -04:00
Tom Lane 626731db26 Lock the extension during ALTER EXTENSION ADD/DROP.
Although we were careful to lock the object being added or dropped,
we failed to get any sort of lock on the extension itself.  This
allowed the ALTER to proceed in parallel with a DROP EXTENSION,
which is problematic for a couple of reasons.  If both commands
succeeded we'd be left with a dangling link in pg_depend, which
would cause problems later.  Also, if the ALTER failed for some
reason, it might try to print the extension's name, and that could
result in a crash or (in older branches) a silly error message
complaining about extension "(null)".

Per bug #17098 from Alexander Lakhin.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org
2021-07-11 12:54:24 -04:00
Peter Eisentraut 2ed532ee8c Improve error messages about mismatching relkind
Most error messages about a relkind that was not supported or
appropriate for the command were of the pattern

    "relation \"%s\" is not a table, foreign table, or materialized view"

This style can become verbose and tedious to maintain.  Moreover, it's
not very helpful: If I'm trying to create a comment on a TOAST table,
which is not supported, then the information that I could have created
a comment on a materialized view is pointless.

Instead, make the primary error message shorter, saying more directly
that what was attempted is not possible.  Then, in the detail message,
explain that the operation is not supported for the relkind of the
object.  To simplify that, add a new function
errdetail_relkind_not_supported() that does this.

In passing, make use of RELKIND_HAS_STORAGE() where appropriate,
instead of listing out the relkinds individually.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/dc35a398-37d0-75ce-07ea-1dd71d98f8ec@2ndquadrant.com
2021-07-08 09:44:51 +02:00
Amit Kapila 8aafb02616 Refactor function parse_subscription_options.
Instead of using multiple parameters in the parse_subscription_options
function signature, use the struct SubOpts that encapsulates all the
subscription options and their values. It will be useful for future work
where we need to add other options to the subscription. Also, use
bitmaps to pass the supported options and retrieve the specified ones,
much like the way it is done in commit a3dc926009.

Author: Bharath Rupireddy
Reviewed-By: Peter Smith, Amit Kapila, Alvaro Herrera
Discussion: https://postgr.es/m/CALj2ACXtoQczfNsDQWobypVvHbX2DtgEHn8DawS0eGFwuo72kw@mail.gmail.com
2021-07-06 07:46:50 +05:30
Alvaro Herrera d700518d74
Don't reset relhasindex for partitioned tables on ANALYZE
Commit 0e69f705cc introduced code to analyze partitioned tables;
however, that code fails to preserve pg_class.relhasindex correctly.
Fix by observing whether any indexes exist rather than accidentally
falling through to assuming none do.

Backpatch to 14.

Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CALNJ-vS1R3Qoe5t4tbzxrkpBtzRbPq1dDcW4RmA_a+oqweF30w@mail.gmail.com
2021-07-01 12:56:30 -04:00
Peter Eisentraut c31833779d Message style improvements 2021-06-28 08:36:44 +02:00
Michael Paquier 79718c1c6c Fix variable initialization with ALTER SUBSCRIPTION DROP PUBLICATION
copy_data is not a supported option for this sub-command of ALTER
SUBSCRIPTION, so the variable related to it was left uninitialized after
parsing the option set in the DefElems.  A refresh could then refer to
it.

Author: Ranier Vilela
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CAEudQAp5P8nr=ze2Gv=BMj=DJFZnrvendZCZcC-fos3QiDe2sg@mail.gmail.com
2021-06-28 12:11:18 +09:00
Tom Lane 5a0f1c8c01 Remove unnecessary failure cases in RemoveRoleFromObjectPolicy().
It's not really necessary for this function to open or lock the
relation associated with the pg_policy entry it's modifying.  The
error checks it's making on the rel are if anything counterproductive
(e.g., if we don't want to allow installation of policies on system
catalogs, here is not the place to prevent that).  In particular, it
seems just wrong to insist on an ownership check.  That has the net
effect of forcing people to use superuser for DROP OWNED BY, which
surely is not an effect we want.  Also there is no point in rebuilding
the dependencies of the policy expressions, which aren't being
changed.  Lastly, locking the table also seems counterproductive; it's
not helping to prevent race conditions, since we failed to re-read the
pg_policy row after acquiring the lock.  That means that concurrent
DDL would likely result in "tuple concurrently updated/deleted"
errors; which is the same behavior this code will produce, with less
overhead.

Per discussion of bug #17062.  Back-patch to all supported versions,
as the failure cases this eliminates seem just as undesirable in 9.6
as in HEAD.

Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us
2021-06-25 13:59:38 -04:00
Peter Eisentraut e59d428f34 Fixes in ALTER SUBSCRIPTION DROP PUBLICATION code
ALTER SUBSCRIPTION DROP PUBLICATION does not actually support
copy_data option, so remove it from tab completion.

Also, reword the error message that is thrown when all the
publications from a subscription are specified to be dropped.

Also, made a few doc and cosmetic adjustments.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CALDaNm21RwsDzs4xj14ApteAF7auyyomHNnp+NEL-sH8m-jMvQ@mail.gmail.com
2021-06-25 09:57:02 +02:00
Peter Geoghegan 3499df0dee Support disabling index bypassing by VACUUM.
Generalize the INDEX_CLEANUP VACUUM parameter (and the corresponding
reloption): make it into a ternary-style boolean parameter.  It now
exposes a third option, "auto".  The "auto" option (which is now the
default) enables the "bypass index vacuuming" optimization added by
commit 1e55e7d1.
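
The three settings, on an illustrative table name:

    VACUUM (INDEX_CLEANUP AUTO) t;   -- new default; may bypass index
                                     -- vacuuming when few dead tuples exist
    VACUUM (INDEX_CLEANUP TRUE) t;   -- always do required index vacuuming
    VACUUM (INDEX_CLEANUP FALSE) t;  -- disable index vacuuming and cleanup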

"VACUUM (INDEX_CLEANUP TRUE)" is redefined to once again make VACUUM
simply do any required index vacuuming, regardless of how few dead
tuples are encountered during the first scan of the target heap relation
(unless there are exactly zero).  This gives users a way of opting out
of the "bypass index vacuuming" optimization, if for whatever reason
that proves necessary.  It is also expected to be used by PostgreSQL
developers as a testing option from time to time.

"VACUUM (INDEX_CLEANUP FALSE)" does the same thing as it always has: it
forcibly disables both index vacuuming and index cleanup.  It's not
expected to be used much in PostgreSQL 14.  The failsafe mechanism added
by commit 1e55e7d1 addresses the same problem in a simpler way.
INDEX_CLEANUP can now be thought of as a testing and compatibility
option.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CAH2-WznrBoCST4_Gxh_G9hA8NzGUbeBGnOUC8FcXcrhqsv6OHQ@mail.gmail.com
2021-06-18 20:04:07 -07:00
Tom Lane d21fca0843 Fix misbehavior of DROP OWNED BY with duplicate polroles entries.
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that.  If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error.  Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.
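
A minimal sketch of the failing case (names are illustrative):

    CREATE POLICY p ON t TO alice, alice;  -- duplicates are not prevented
    DROP OWNED BY alice;  -- formerly hit the assertion or
                          -- tuple-updated-by-self error; now works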

Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.

Per bug #17062 from Alexander Lakhin.  It's been broken all along,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
2021-06-18 18:00:09 -04:00
Tom Lane 7c337b6b52 Centralize the logic for protective copying of utility statements.
In the "simple Query" code path, it's fine for parse analysis or
execution of a utility statement to scribble on the statement's node
tree, since that'll just be thrown away afterwards.  However it's
not fine if the node tree is in the plan cache, as then it'd be
corrupted for subsequent executions.  Up to now we've dealt with
that by having individual utility-statement functions apply
copyObject() if they were going to modify the tree.  But that's
prone to errors of omission.  Bug #17053 from Charles Samborski
shows that CREATE/ALTER DOMAIN didn't get this memo, and can
crash if executed repeatedly from plan cache.
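
One way such repeated execution can arise, as a hedged sketch (names
are illustrative):

    CREATE DOMAIN d AS int;
    CREATE PROCEDURE churn() LANGUAGE plpgsql AS $$
    BEGIN
      ALTER DOMAIN d ADD CONSTRAINT c CHECK (VALUE > 0);
      ALTER DOMAIN d DROP CONSTRAINT c;
    END;
    $$;
    CALL churn();
    CALL churn();  -- re-executes the utility trees cached by plpgsql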

In the back branches, we'll just apply a narrow band-aid for that,
but in HEAD it seems prudent to have a more principled fix that
will close off the possibility of other similar bugs in future.
Hence, let's hoist the responsibility for doing copyObject up into
ProcessUtility from its children, thus ensuring that it happens for
all utility statement types.

Also, modify ProcessUtility's API so that its callers can tell it
whether a copy step is necessary.  It turns out that in all cases,
the immediate caller knows whether the node tree is transient, so
this doesn't involve a huge amount of code thrashing.  In this way,
while we lose a little bit in the execute-from-cache code path due
to sometimes copying node trees that wouldn't be mutated anyway,
we gain something in the simple-Query code path by not copying
throwaway node trees.  Statements that are complex enough to be
expensive to copy are almost certainly ones that would have to be
copied anyway, so the loss in the cache code path shouldn't be much.

(Note that this whole problem applies only to utility statements.
Optimizable statements don't have the issue because we long ago made
the executor treat Plan trees as read-only.  Perhaps someday we will
make utility statement execution act likewise, but I'm not holding
my breath.)

Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
2021-06-18 11:22:58 -04:00
Tom Lane 131ea3e908 Fix plancache refcount leak after error in ExecuteQuery.
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked.  I managed to break this rule
while refactoring code in 9dbf2b7d7.  There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed.  Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.

Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
2021-06-16 19:30:17 -04:00
Tom Lane 6b787d9e32 Improve SQLSTATE reporting in some replication-related code.
I started out with the goal of reporting ERRCODE_CONNECTION_FAILURE
when walrcv_connect() fails, but as I looked around I realized that
whoever wrote this code was of the opinion that errcodes are purely
optional.  That's not my understanding of our project policy.  Hence,
make sure that an errcode is provided in each ereport that (a) is
ERROR or higher level and (b) isn't arguably an internal logic error.
Also fix some very dubious existing errcode assignments.

While this is not per policy, it's also largely cosmetic, since few
of these cases could get reported to applications.  So I don't
feel a need to back-patch.

Discussion: https://postgr.es/m/2189704.1623512522@sss.pgh.pa.us
2021-06-16 11:52:05 -04:00
Alexander Korotkov 817bb0a7d1 Revert 29854ee8d1 due to buildfarm failures
Reported-by: Tom Lane
Discussion: https://postgr.es/m/CAPpHfdvcnw3x7jdV3r52p4%3D5S4WUxBCzcQKB3JukQHoicv1LSQ%40mail.gmail.com
2021-06-15 21:44:40 +03:00
Alexander Korotkov 29854ee8d1 Support for unnest(multirange) and cast multirange as an array of ranges
It has been spotted that multiranges lack the ability to be decomposed
into individual ranges.  Subscripting and proper expanded object
representation require substantial work, and it's too late for v14.
This commit provides the implementation of unnest(multirange) and a
cast of multirange to an array of ranges, which is quite trivial.

unnest(multirange) is defined as a polymorphic procedure.  The catalog
description of the cast underlying procedure is duplicated for each multirange
type because we don't have anyrangearray polymorphic type to use here.

Catversion is bumped.
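
As an illustration, a minimal sketch of the two new capabilities
(the values are arbitrary):

    SELECT unnest('{[1,3),[5,7)}'::int4multirange);
    -- two rows: [1,3) and [5,7)

    SELECT '{[1,3),[5,7)}'::int4multirange::int4range[];
    -- one array: {"[1,3)","[5,7)"}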

Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org
Author: Alexander Korotkov
Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu
2021-06-15 15:59:20 +03:00
Michael Paquier 2d689babe3 Improve handling of dropped objects in pg_event_trigger_ddl_commands()
An object found as dropped when digging into the list of objects
returned by pg_event_trigger_ddl_commands() could cause a cache lookup
error, as the calls fetching the object address and the type name
would fail if the object was missing.

Those lookup errors could be seen with combinations of ALTER TABLE
sub-commands involving identity columns.  The lookup logic is changed in
this code path to get a behavior similar to any other SQL-callable
function by ignoring objects that are not found, taking advantage of
2a10fdc.  The back-branches are not changed, as they would require
that commit, which is too invasive for stable branches.

While at it, add test cases to exercise event triggers with identity
columns, and stress more cases with the event ddl_command_end for
relations.

Author: Sven Klemm, Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/CAMCrgp2R1cEXU53iYKtW6yVEp2_yKUz+z=3-CTrYpPP+xryRtg@mail.gmail.com
2021-06-14 14:57:22 +09:00
Tom Lane e56bce5d43 Reconsider the handling of procedure OUT parameters.
Commit 2453ea142 redefined pg_proc.proargtypes to include the types of
OUT parameters, for procedures only.  While that had some advantages
for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty
disastrous from a number of other perspectives.  Notably, since the
primary key of pg_proc is name + proargtypes, this made it possible to
have multiple procedures with identical names + input arguments and
differing output argument types.  That would make it impossible to call
any one of the procedures by writing just NULL (or "?", or any other
data-type-free notation) for the output argument(s).  The change also
seems likely to cause grave confusion for client applications that
examine pg_proc and expect the traditional definition of proargtypes.

Hence, revert the definition of proargtypes to what it was, and
undo a number of complications that had been added to support that.

To support the SQL-spec behavior of DROP PROCEDURE, when there are
no argmode markers in the command's parameter list, we perform the
lookup both ways (that is, matching against both proargtypes and
proallargtypes), succeeding if we get just one unique match.
In principle this could result in ambiguous-function failures
that would not happen when using only one of the two rules.
However, overloading of procedure names is thought to be a pretty
rare usage, so this shouldn't cause many problems in practice.
Postgres-specific code such as pg_dump can defend against any
possibility of such failures by being careful to specify argmodes
for all procedure arguments.

This also fixes a few other bugs in the area of CALL statements
with named parameters, and improves the documentation a little.

catversion bump forced because the representation of procedures
with OUT arguments changes.

Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
2021-06-10 17:11:36 -04:00
Tom Lane ba2c6d6cec Avoid misbehavior when persisting a non-stable cursor.
PersistHoldablePortal has long assumed that it should store the
entire output of the query-to-be-persisted, which requires rewinding
and re-reading the output.  This is problematic if the query is not
stable: we might get different row contents, or even a different
number of rows, which'd confuse the cursor state mightily.

In the case where the cursor is NO SCROLL, this is very easy to
solve: just store the remaining query output, without any rewinding,
and tweak the portal's cursor state to match.  Aside from removing
the semantic problem, this could be significantly more efficient
than storing the whole output.
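
A sketch of the resulting behavior (cursor and query are illustrative):

    BEGIN;
    DECLARE c NO SCROLL CURSOR WITH HOLD
        FOR SELECT i, random() FROM generate_series(1, 10) i;
    FETCH 3 FROM c;    -- consumes the first three rows
    COMMIT;            -- persists only the remaining rows, no rewind
    FETCH ALL FROM c;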

If the cursor is scrollable, there's not much we can do, but it
was already the case that scrolling a volatile query's result was
pretty unsafe.  We can just document more clearly that getting
correct results from that is not guaranteed.

There are already prohibitions in place on using SCROLL with
FOR UPDATE/SHARE, which is one way for a SELECT query to have
non-stable results.  We could imagine prohibiting SCROLL when
the query contains volatile functions, but that would be
expensive to enforce.  Moreover, it could break applications
that work just fine, if they have functions that are in fact
stable but the user neglected to mark them so.  So settle for
documenting the hazard.

While this problem has existed in some guise for a long time,
it got a lot worse in v11, which introduced the possibility
of persisting plpgsql cursors (perhaps implicit ones) even
when they violate the rules for what can be marked WITH HOLD.
Hence, I've chosen to back-patch to v11 but not further.

Per bug #17050 from Алексей Булгаков.

Discussion: https://postgr.es/m/17050-f77aa827dc85247c@postgresql.org
2021-06-08 17:50:29 -04:00
Michael Paquier 187682c321 Reduce risks of conflicts in internal queries of REFRESH MATVIEW CONCURRENTLY
The internal SQL queries used by REFRESH MATERIALIZED VIEW CONCURRENTLY
include some aliases for its diff and temporary relations with
rather-generic names: diff, newdata, newdata2 and mv.  Depending on the
queries used for the materialized view, using CONCURRENTLY could lead to
some internal failures if the matview query and those internal aliases
conflict.
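
A hypothetical reproducer of the kind of collision being avoided,
where the matview's own query reuses one of the internal aliases:

    CREATE MATERIALIZED VIEW mv AS
        SELECT x FROM (VALUES (1), (2)) AS newdata (x);
    CREATE UNIQUE INDEX ON mv (x);
    REFRESH MATERIALIZED VIEW CONCURRENTLY mv;  -- could previously fail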

Those names were chosen in 841c29c8.  This commit switches instead
to a naming pattern that is less likely to cause conflicts, based
on an idea from Thomas Munro, by appending _$ to those aliases.  This is
not perfect, as the new names could still conflict, but at least it has
the advantage of keeping the code readable and simple while reducing the
likelihood of conflicts to close to zero.

Reported-by: Mathis Rudolf
Author: Bharath Rupireddy
Reviewed-by: Bernd Helmle, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/109c267a-10d2-3c53-b60e-720fcf44d9e8@credativ.de
Backpatch-through: 9.6
2021-06-03 15:28:24 +09:00
Tom Lane e6241d8e03 Rethink definition of pg_attribute.attcompression.
Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to
compress, use the current setting of default_toast_compression".
This allows '\0' to be a suitable default choice regardless of
datatype, greatly simplifying code paths that initialize tupledescs
and the like.  It seems like a more user-friendly approach as well,
because now the default compression choice doesn't migrate into table
definitions, meaning that changing default_toast_compression is
usually sufficient to flip an installation's behavior; one needn't
tediously issue per-column ALTER SET COMPRESSION commands.
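
A sketch of the resulting behavior (assuming a build with lz4 support):

    SET default_toast_compression = 'lz4';
    CREATE TABLE t (x text);    -- x's attcompression stays '\0'
    -- values of x get compressed per the GUC in effect at the time
    ALTER TABLE t ALTER COLUMN x SET COMPRESSION pglz;  -- explicit override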

Along the way, fix a few minor bugs and documentation issues
with the per-column-compression feature.  Adopt more robust
APIs for SetIndexStorageProperties and GetAttributeCompression.

Bump catversion because typical contents of attcompression will now
be different.  We could get away without doing that, but it seems
better to ensure v14 installations all agree on this.  (We already
forced initdb for beta2, anyway.)

Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us
2021-05-27 13:24:27 -04:00
Peter Eisentraut 388e75ad33 Replace run-time error check with assertion
The run-time check verified that the structures returned from the
parser matched expectations.  That's something we usually use
assertions for, not a full user-facing error message.  So replace the
check with an assertion (hidden inside lfirst_node()).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/452e9df8-ec89-e01b-b64a-8cc6ce830458%40enterprisedb.com
2021-05-27 09:54:14 +02:00
Amit Kapila 0734b0e983 Improve docs and error messages for parallel vacuum.
The error messages, docs, and one of the options were using
'parallel degree' to indicate the parallelism used by the VACUUM
command.  We normally use 'parallel workers' elsewhere, so change it
for parallel vacuum accordingly.
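
For reference, parallelism is requested like this at the SQL level
(table name hypothetical):

    VACUUM (PARALLEL 4, VERBOSE) some_table;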

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com
2021-05-25 09:26:53 +05:30
Tom Lane f5024d8d7b Re-order pg_attribute columns to eliminate some padding space.
Now that attcompression is just a char, there's a lot of wasted
padding space after it.  Move it into the group of char-wide
columns to save a net of 4 bytes per pg_attribute entry.  While
we're at it, swap the order of attstorage and attalign to make for
a more logical grouping of these columns.

Also re-order actions in related code to match the new field ordering.

This patch also fixes one outright bug: equalTupleDescs() failed to
compare attcompression.  That could, for example, cause relcache
reload to fail to adopt a new value following a change.

Michael Paquier and Tom Lane, per a gripe from Andres Freund.

Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
2021-05-23 12:12:09 -04:00
Tom Lane b39630fd41 Fix access to no-longer-open relcache entry in logical-rep worker.
If we redirected a replicated tuple operation into a partition child
table, and then tried to fire AFTER triggers for that event, the
relation cache entry for the child table was already closed.  This has
no visible ill effects as long as the entry is still there and still
valid, but an unluckily-timed cache flush could result in a crash or
other misbehavior.

To fix, postpone the ExecCleanupTupleRouting call (which is what
closes the child table) until after we've fired triggers.  This
requires a bit of refactoring so that the cleanup function can
have access to the necessary state.

In HEAD, I took the opportunity to simplify some of worker.c's
function APIs based on use of the new ApplyExecutionData struct.
However, it doesn't seem safe/practical to back-patch that aspect,
at least not without a lot of analysis of possible interactions
with a04daa97a.

In passing, add an Assert to afterTriggerInvokeEvents to catch
such cases.  This seems worthwhile because we've grown a number
of fairly unstructured ways of calling AfterTriggerEndQuery.

Back-patch to v13, where worker.c grew the ability to deal with
partitioned target tables.

Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us
2021-05-22 21:25:29 -04:00
Tom Lane 2b0ee126bb Fix usage of "tableoid" in GENERATED expressions.
We consider this supported (though I've got my doubts that it's a
good idea, because tableoid is not immutable).  However, several
code paths failed to fill the field in soon enough, causing such
a GENERATED expression to see zero or the wrong value.  This
occurred when ALTER TABLE adds a new GENERATED column to a table
with existing rows, and during regular INSERT or UPDATE on a
foreign table with GENERATED columns.
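
A minimal sketch of the supported usage (table name hypothetical):

    CREATE TABLE t (a int, src oid GENERATED ALWAYS AS (tableoid) STORED);
    INSERT INTO t VALUES (1);
    SELECT src = tableoid AS consistent FROM t;  -- true once this is fixed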

Noted during investigation of a report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.

Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
2021-05-21 15:02:06 -04:00
Tom Lane 84f5c2908d Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.
COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
The original implementation of intra-procedure transactions just
cavalierly did that, ignoring the fact that this left us executing in
a rather different environment than normal.  In particular, it turns
out that handling of toasted datums depends rather critically on there
being an outer ActiveSnapshot: otherwise, when SPI or the core
executor pop whatever snapshot they used and return, it's unsafe to
dereference any toasted datums that may appear in the query result.
It's possible to demonstrate "no known snapshots" and "missing chunk
number N for toast value" errors as a result of this oversight.

Historically this outer snapshot has been held by the Portal code,
and that seems like a good plan to preserve.  So add infrastructure
to pquery.c to allow re-establishing the Portal-owned snapshot if it's
not there anymore, and add enough bookkeeping support that we can tell
whether it is or not.

We can't, however, just re-establish the Portal snapshot as part of
COMMIT/ROLLBACK.  As in normal transaction start, acquiring the first
snapshot should wait until after SET and LOCK commands.  Hence, teach
spi.c about doing this at the right time.  (Note that this patch
doesn't fix the problem for any PLs that try to run intra-procedure
transactions without using SPI to execute SQL commands.)

This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
rename that to allow_nonatomic.

replication/logical/worker.c also needs some fixes, because it wasn't
careful to hold a snapshot open around AFTER trigger execution.
That code doesn't use a Portal, which I suspect someday we're gonna
have to fix.  But for now, just rearrange the order of operations.
This includes back-patching the recent addition of finish_estate()
to centralize the cleanup logic there.

This also back-patches commit 2ecfeda3e into v13, to improve the
test coverage for worker.c (it was that test that exposed that
worker.c's snapshot management is wrong).

Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
intra-procedure COMMIT was added.

Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
2021-05-21 14:03:59 -04:00
Alvaro Herrera cafde58b33
Allow compute_query_id to be set to 'auto' and make it default
Allowing only on/off meant that either all existing configuration
guides would become obsolete if we disabled it by default, or that we
would have to accept a performance loss in the default config if we
enabled it by default.  By allowing 'auto' as a middle ground, the
performance cost is only paid by those who enable pg_stat_statements and
similar modules.
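
A sketch of the intended setup under the new default (postgresql.conf):

    # query IDs are computed only because a module needs them
    compute_query_id = auto
    shared_preload_libraries = 'pg_stat_statements'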

I only edited the release notes to comment-out a paragraph that is now
factually wrong; further edits are probably needed to describe the
related change in more detail.

Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20210513002623.eugftm4nk2lvvks3@nol
2021-05-15 14:13:09 -04:00
Tom Lane def5b065ff Initial pgindent and pgperltidy run for v14.
Also "make reformat-dat-files".

The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
2021-05-12 13:14:10 -04:00
Fujii Masao d780d7c088 Change data type of counters in BufferUsage and WalUsage from long to int64.
Previously long was used as the data type for some counters in BufferUsage
and WalUsage.  But long is only four bytes on some platforms, e.g.,
Windows, and it's entirely possible to wrap a four-byte counter.  For
example, emitting more than four billion WAL records in one transaction
isn't actually particularly rare.

To avoid overflowing those counters, this commit changes their data type
from long to int64.

Suggested-by: Andres Freund
Author: Masahiro Ikeda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20201221211650.k7b53tcnadrciqjo@alap3.anarazel.de
Discussion: https://postgr.es/m/af0964ac-7080-1984-dc23-513754987716@oss.nttdata.com
2021-05-12 09:56:34 +09:00
Michael Paquier 829daab4bb Fix typos in operatorcmds.c
Author: Kyotaro Horiguchi, Justin Pryzby
Discussion: https://postgr.es/m/20210428.173633.1525059946206239295.horikyota.ntt@gmail.com
2021-05-10 15:45:54 +09:00
Michael Paquier 9681f2160d Fix incorrect error code for CREATE/ALTER TABLE COMPRESSION
Specifying an incorrect value for the compression method of an attribute
caused ERRCODE_FEATURE_NOT_SUPPORTED to be raised as an error.  Use
ERRCODE_INVALID_PARAMETER_VALUE instead, to be more consistent.
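
For example, a statement like this one (the method name is
intentionally bogus) now raises ERRCODE_INVALID_PARAMETER_VALUE:

    CREATE TABLE t (x text COMPRESSION not_a_method);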

Author: Dilip Kumar
Discussion: https://postgr.es/m/CAFiTN-vH84fE-8C4zGZw4v0Wyh4Y2v=5JWg2fGE5+LPaDvz1GQ@mail.gmail.com
2021-05-08 10:18:05 +09:00
Alvaro Herrera 4e8c0f1a0d
AlterSubscription_refresh: avoid stomping on global variable
This patch replaces use of the global "wrconn" variable in
AlterSubscription_refresh with a local variable of the same name, making
it consistent with other functions in subscriptioncmds.c (e.g.
DropSubscription).

The global wrconn is only meant to be used by logical apply/tablesync workers.
Abusing it this way is known to cause trouble if an apply worker
manages to do a subscription refresh, such as reported by Jeremy Finzel
and diagnosed by Andres Freund back in November 2020, at
https://www.postgresql.org/message-id/20201111215820.qihhrz7fayu6myfi@alap3.anarazel.de

Backpatch to 10.  In branch master, also move the connection establishment
to occur outside the PG_TRY block; this way we can remove a test for NULL in
PG_FINALLY, and it also makes the code more consistent with similar code in
the same file.

Author: Peter Smith <peter.b.smith@fujitsu.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com
2021-05-07 11:46:37 -04:00
Thomas Munro ec48314708 Revert per-index collation version tracking feature.
Design problems were discovered in the handling of composite types and
record types that would cause some relevant versions not to be recorded.
Misgivings were also expressed about the use of the pg_depend catalog
for this purpose.  We're out of time for this release so we'll revert
and try again.

Commits reverted:

1bf946bd: Doc: Document known problem with Windows collation versions.
cf002008: Remove no-longer-relevant test case.
ef387bed: Fix bogus collation-version-recording logic.
0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>).
ff942057: Suppress "warning: variable 'collcollate' set but not used".
d50e3b1f: Fix assertion in collation version lookup.
f24b1569: Rethink extraction of collation dependencies.
257836a7: Track collation versions for indexes.
cd6f479e: Add pg_depend.refobjversion.
7d1297df: Remove pg_collation.collversion.

Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
2021-05-07 21:10:11 +12:00
Alvaro Herrera c250062df4
Remove unused argument of ATAddForeignConstraint
Commit 0325d7a595 made this unused but forgot to remove it. Do so now.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/209c99fe-b9a2-94f4-cd68-a8304186a09e@lab.ntt.co.jp
2021-05-05 12:27:39 -04:00
Alvaro Herrera 6f70d7ca1d
Have ALTER CONSTRAINT recurse on partitioned tables
When ALTER TABLE .. ALTER CONSTRAINT changed deferrability properties
in a partitioned table, we failed to propagate those changes
correctly to partitions and to triggers.  Repair by adding a recursion
mechanism to affect all derived constraints and all derived triggers.
(In particular, recurse to partitions even if their respective parents
are already in the desired state: it is possible for the partitions to
have been altered individually.)  Because foreign keys involve tables on
two sides, we cannot use the standard ALTER TABLE recursion mechanism,
so we invent our own by following pg_constraint.conparentid down.
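
A sketch of the now-recursing form (table and constraint names are
hypothetical):

    ALTER TABLE parted
        ALTER CONSTRAINT parted_fk DEFERRABLE INITIALLY DEFERRED;
    -- also updates the derived constraints and triggers of each partition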

When ALTER TABLE .. ALTER CONSTRAINT is invoked on the derived
pg_constraint object that's automatically created in a partition as a
result of a constraint added to its parent, raise an error instead of
pretending to work and then failing to modify all the affected triggers.
Before this commit such a command would be allowed but failed to affect
all triggers, so it would silently misbehave.  (Restoring dumps of
existing databases is not affected, because pg_dump does not produce
anything for such a derived constraint anyway.)

Add some tests for the case.

Backpatch to 11, where foreign key support was added to partitioned
tables by commit 3de241dba8.  (A related change is commit f56f8f8da6
in pg12 which added support for FKs *referencing* partitioned tables;
this is what forces us to use an ad-hoc recursion mechanism for this.)

Diagnosed by Tom Lane from bug report from Ron L Johnson.  As of this
writing, no reviews were offered.

Discussion: https://postgr.es/m/75fe0761-a291-86a9-c8d8-4906da077469@gmail.com
Discussion: https://postgr.es/m/3144850.1607369633@sss.pgh.pa.us
2021-05-05 12:21:50 -04:00
Alvaro Herrera e798d095da
Fix OID passed to object-alter hook during ALTER CONSTRAINT
The OID of the constraint is used instead of the OID of the trigger --
an easy mistake to make.  Apparently the object-alter hooks are not very
well tested :-(

Backpatch to 12, where this typo was introduced by 578b229718

Discussion: https://postgr.es/m/20210503231633.GA6994@alvherre.pgsql
2021-05-04 10:09:12 -04:00
Peter Eisentraut a970edbed3 Fix ALTER TABLE / INHERIT with generated columns
When running ALTER TABLE t2 INHERIT t1, we must check that columns in
t2 that correspond to a generated column in t1 are also generated and
have the same generation expression.  Otherwise, this would allow
creating setups that a normal CREATE TABLE sequence would not allow.
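
A minimal sketch of a setup that is now rejected (tables hypothetical):

    CREATE TABLE t1 (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
    CREATE TABLE t2 (a int, b int);  -- b is a plain column here
    ALTER TABLE t2 INHERIT t1;       -- now fails: b must be generated alike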

Discussion: https://www.postgresql.org/message-id/22de27f6-7096-8d96-4619-7b882932ca25@2ndquadrant.com
2021-05-04 12:09:08 +02:00
David Rowley 3c80e96dff Adjust EXPLAIN output for parallel Result Cache plans
Here we adjust the EXPLAIN ANALYZE output for Result Cache so that we
don't show any Result Cache stats for parallel workers who don't
contribute anything to Result Cache plan nodes.

I originally had ideas that workers who don't help could still have their
Result Cache stats displayed.  The idea with that was so that I could
write some parallel Result Cache regression tests that show the EXPLAIN
ANALYZE output.  However, I realized a little too late that such tests
would just not be possible to have run in a stable way on the buildfarm.

With that knowledge, before 9eacee2e6 went in, I had removed all of the
tests that were showing the EXPLAIN ANALYZE output of a parallel Result
Cache plan, however, I forgot to put back the code that adjusts the
EXPLAIN output to hide the Result Cache stats for parallel workers who
were not fast enough to help out before query execution was over. All
other nodes behave this way and so should Result Cache.

Additionally, with this change, it now seems safe enough to remove the SET
force_parallel_mode = off that I had added to the regression tests.

Also, perform some cleanup in the partition_prune tests. I had adjusted
the explain_parallel_append() function to sanitize the Result Cache
EXPLAIN ANALYZE output.  However, since I didn't actually include any
parallel Result Cache tests that show their EXPLAIN ANALYZE output, that
code does nothing and can be removed.

In passing, move the setting of memPeakKb into the scope where it's used.

Reported-by: Amit Khandekar
Author: David Rowley, Amit Khandekar
Discussion: https://postgr.es/m/CAJ3gD9d8SkfY95GpM1zmsOtX2-Ogx5q-WLsf8f0ykEb0hCRK3w@mail.gmail.com
2021-04-30 14:46:42 +12:00
Tom Lane 9626325da5 Add heuristic incoming-message-size limits in the server.
We had a report of confusing server behavior caused by a client bug
that sent junk to the server: the server thought the junk was a
very long message length and waited patiently for data that would
never come.  We can reduce the risk of that by being less trusting
about message lengths.

For a long time, libpq has had a heuristic rule that it wouldn't
believe large message size words, except for a small number of
message types that are expected to be (potentially) long.  This
provides some defense against loss of message-boundary sync and
other corrupted-data cases.  The server does something similar,
except that up to now it only limited the lengths of messages
received during the connection authentication phase.  Let's
do the same as in libpq and put restrictions on the allowed
length of all messages, while distinguishing between message
types that are expected to be long and those that aren't.

I used a limit of 10000 bytes for non-long messages.  (libpq's
corresponding limit is 30000 bytes, but given the asymmetry of
the FE/BE protocol, there's no good reason why the numbers should
be the same.)  Experimentation suggests that this is at least a
factor of 10, maybe a factor of 100, more than we really need;
but plenty of daylight seems desirable to avoid false positives.
In any case we can adjust the limit based on beta-test results.

For long messages, set a limit of MaxAllocSize - 1, which is the
most that we can absorb into the StringInfo buffer that the message
is collected in.  This just serves to make sure that a bogus message
size is reported as such, rather than as a confusing gripe about
not being able to enlarge a string buffer.

While at it, make sure that non-mainline code paths (such as
COPY FROM STDIN) are as paranoid as SocketBackend is, and validate
the message type code before believing the message length.
This provides an additional guard against getting stuck on corrupted
input.

Discussion: https://postgr.es/m/2003757.1619373089@sss.pgh.pa.us
2021-04-28 15:50:46 -04:00
Alvaro Herrera d6b8d29419
Allow a partdesc-omitting-partitions to be cached
Makes partition descriptor acquisition faster during the transient
period in which a partition is in the process of being detached.

This also adds the restriction that only one partition can be in
pending-detach state for a partitioned table.

While at it, return find_inheritance_children() API to what it was
before 71f4c8c6f7, and create a separate
find_inheritance_children_extended() that returns detailed info about
detached partitions.

(This incidentally fixes a bug in 8aba932251 whereby a memory context
holding a transient partdesc is reparented to a NULL PortalContext,
leading to permanent leak of that memory.  The fix is to no longer rely
on reparenting contexts to PortalContext.   Reported by Amit Langote.)

Per gripe from Amit Langote
Discussion: https://postgr.es/m/CA+HiwqFgpP1LxJZOBYGt9rpvTjXXkg5qG2+Xch2Z1Q7KrqZR1A@mail.gmail.com
2021-04-28 15:44:35 -04:00
Fujii Masao 8e9ea08bae Don't pass "ONLY" options specified in TRUNCATE to foreign data wrapper.
Commit 8ff1c94649 allowed TRUNCATE command to truncate foreign tables.
Previously the information about "ONLY" options specified in the TRUNCATE
command was passed to the foreign data wrapper.  Then postgres_fdw
constructed the TRUNCATE command to issue to the remote server and
included "ONLY" options in it based on the passed information.

On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE
have no effect when accessing or modifying the remote table, i.e.,
are not passed to the foreign data wrapper. So it's inconsistent to
make only the TRUNCATE command pass the "ONLY" options to the foreign data
wrapper.  Therefore this commit changes the TRUNCATE command so that
it doesn't pass the "ONLY" options to the foreign data wrapper,
for consistency with other statements.  Also, this commit changes
postgres_fdw so that it never includes "ONLY" options in
the TRUNCATE command that it constructs.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu
Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com
2021-04-27 14:41:27 +09:00
Alexander Korotkov 6bbcff096f Mark multirange_constructor0() and multirange_constructor2() strict
These functions shouldn't receive null arguments: multirange_constructor0()
doesn't have any arguments while multirange_constructor2() has a single array
argument, which is never null.

But mark them strict anyway for the sake of uniformity.

Also, make checks for null arguments use elog() instead of ereport() as these
errors should normally never be thrown.  And adjust corresponding comments.

Catversion is bumped.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/0f783a96-8d67-9e71-996b-f34a7352eeef%40enterprisedb.com
2021-04-23 13:25:45 +03:00
Alvaro Herrera 8aba932251
Fix relcache inconsistency hazard in partition detach
During queries coming from ri_triggers.c, we need to omit partitions
that are marked pending detach -- otherwise, the RI query is tricked
into allowing a row into the referencing table whose corresponding row
is in the detached partition.  Which is bogus: once the detach operation
completes, the row becomes an orphan.

However, the code was not doing that in repeatable-read transactions,
because relcache kept a copy of the partition descriptor that included
the partition, and used it in the RI query.  This commit changes the
partdesc cache code to only keep descriptors that aren't dependent on
a snapshot (namely: those where no detached partitions exist, and those
where detached partitions are included).  When a partdesc-without-
detached-partitions is requested, we create one afresh each time; also,
those partdescs are stored in PortalContext instead of
CacheMemoryContext.

find_inheritance_children gets a new output *detached_exist boolean,
which indicates whether any partition marked pending-detach is found.
Its "include_detached" input flag is changed to "omit_detached", because
that name captures the desired semantics more naturally.
CreatePartitionDirectory() and RelationGetPartitionDesc() arguments are
identically renamed.

This was noticed because a buildfarm member that runs with relcache
clobbering, which would not keep the improperly cached partdesc, broke
one test, which led us to realize that the expected output of that test
was bogus.  This commit also corrects that expected output.

Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/3269784.1617215412@sss.pgh.pa.us
2021-04-22 15:13:25 -04:00
Alvaro Herrera 7b357cc6ae
Don't add a redundant constraint when detaching a partition
On ALTER TABLE .. DETACH CONCURRENTLY, we add a new table constraint
that duplicates the partition constraint.  But if the partition already
has another constraint that implies that one, then that's unnecessary.
We were already avoiding the addition of a duplicate constraint if there
was an exact 'equal' match -- this just improves the quality of the check.

Author: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20210410184226.GY6592@telsasoft.com
2021-04-21 18:12:05 -04:00
Peter Eisentraut 544b28088f doc: Improve hyphenation consistency 2021-04-21 08:14:43 +02:00
Michael Paquier 7ef8b52cf0 Fix typos and grammar in comments and docs
Author: Justin Pryzby
Discussion: https://postgr.es/m/20210416070310.GG3315@telsasoft.com
2021-04-19 11:32:30 +09:00
Tom Lane f90c708a04 Fix wrong units in two ExplainPropertyFloat calls.
This is only a latent bug, since these calls are only reached for
non-text output formats, and currently none of those will print
the units.  Still, we should get it right in case that ever changes.

Justin Pryzby

Discussion: https://postgr.es/m/20210415163846.GA3315@telsasoft.com
2021-04-16 11:30:27 -04:00
Tom Lane 409723365b Provide query source text when parsing a SQL-standard function body.
Without this, we lose error cursor positions, as shown in the
modified regression test result.

Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us
2021-04-15 17:24:12 -04:00
Tom Lane 1111b2668d Undo decision to allow pg_proc.prosrc to be NULL.
Commit e717a9a18 changed the longstanding rule that prosrc is NOT NULL
because when a SQL-language function is written in SQL-standard style,
we don't currently have anything useful to put there.  This seems a poor
decision though, as it could easily have negative impacts on external
PLs (opening them to crashes they didn't use to have, for instance).
SQL-function-related code can just as easily test "is prosqlbody not
null" as "is prosrc null", so there's no real gain there either.
Hence, revert the NOT NULL marking removal and adjust related logic.

For now, we just put an empty string into prosrc for SQL-standard
functions.  Maybe we'll have a better idea later, although the
history of things like pg_attrdef.adsrc suggests that it's not
easy to maintain a string equivalent of a node tree.

This also adds an assertion that queryDesc->sourceText != NULL
to standard_ExecutorStart.  We'd been silently relying on that
for awhile, so let's make it less silent.

Also fix some overlooked documentation and test cases.

Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us
2021-04-15 17:17:20 -04:00
Tom Lane 69d5ca484b Fix some inappropriately-disallowed uses of ALTER ROLE/DATABASE SET.
Most GUC check hooks that inspect database state have special checks
that prevent them from throwing hard errors for state-dependent issues
when source == PGC_S_TEST.  This allows, for example,
"ALTER DATABASE d SET default_text_search_config = foo" when the "foo"
configuration hasn't been created yet.  Without this, we have problems
during dump/reload or pg_upgrade, because pg_dump has no idea about
possible dependencies of GUC values and can't ensure a safe restore
ordering.

However, check_role() and check_session_authorization() hadn't gotten
the memo about that, and would throw hard errors anyway.  It's not
entirely clear what is the use-case for "ALTER ROLE x SET role = y",
but we've now heard two independent complaints about that bollixing
an upgrade, so apparently some people are doing it.

Hence, fix these two functions to act more like other check hooks
with similar needs.  (But I did not change their insistence on
being inside a transaction, as it's still not apparent that setting
either GUC from the configuration file would be wise.)

Also fix check_temp_buffers, which had a different form of the disease
of making state-dependent checks without any exception for PGC_S_TEST.
A cursory survey of other GUC check hooks did not find any more issues
of this ilk.  (There are a lot of interdependencies among
PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but
they're not relevant to the immediate concern because they can't be
set via ALTER ROLE/DATABASE.)

Per reports from Charlie Hornsby and Nathan Bossart.  Back-patch
to all supported branches.

Discussion: https://postgr.es/m/HE1P189MB0523B31598B0C772C908088DB7709@HE1P189MB0523.EURP189.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/20160711223641.1426.86096@wrigleys.postgresql.org
2021-04-13 15:10:18 -04:00
Fujii Masao 81a23dd879 Avoid unnecessary table open/close in TRUNCATE command.
ExecuteTruncate() filters out the duplicate tables specified
in the TRUNCATE command, for example in the case where "TRUNCATE foo, foo"
is executed. Such duplicate tables obviously don't need to be opened
and closed because they are skipped.  But previously it always opened
the tables before checking whether they were duplicates or not,
and then closed them if they were.  That is, the duplicated tables were
opened and closed unnecessarily.

This commit changes ExecuteTruncate() so that it opens a table only
after confirming that it is not a duplicate, which avoids the
unnecessary table open/close.

Do not back-patch because such unnecessary table open/close is not
a bug though it exists in older versions.

Author: Bharath Rupireddy
Reviewed-by: Amul Sul, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACUdBO_sXJTa08OZ0YT0qk7F_gAmRa9hT4dxRcgPS4nsZA@mail.gmail.com
2021-04-12 00:05:58 +09:00
Noah Misch df5efaf441 Standardize pg_authid oid_symbol values.
Commit c9c41c7a33 used two different
naming patterns.  Standardize on the majority pattern, which was the
only pattern in the last reviewed version of that commit.
2021-04-10 12:01:41 -07:00
Alvaro Herrera 0e69f705cc
Set pg_class.reltuples for partitioned tables
When commit 0827e8af70 added auto-analyze support for partitioned
tables, it included code to obtain reltuples for the partitioned table
via a series of catalog accesses reading pg_class.reltuples for each
partition.  That's not only very inefficient, but also problematic
because autovacuum doesn't hold any locks on any of those tables -- and
doesn't want to.  Replace that code with a read of pg_class.reltuples
for the partitioned table, and make sure ANALYZE and TRUNCATE properly
maintain that value.
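
The maintained value is now visible directly, e.g. with a query along
these lines:

    SELECT relname, reltuples
    FROM pg_class
    WHERE relkind = 'p';   -- partitioned tables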

I found no code that would be affected by the change of relpages from
zero to non-zero for partitioned tables, and no other code that should
be maintaining it, but if there is, hopefully it'll be an easy fix.

Per buildfarm.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/1823909.1617862590@sss.pgh.pa.us
2021-04-09 11:50:33 -04:00
Michael Paquier 609b0652af Fix typos and grammar in documentation and code comments
Comment fixes are applied on HEAD, and documentation improvements are
applied on back-branches where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com
Backpatch-through: 9.6
2021-04-09 13:53:07 +09:00
Fujii Masao 8ff1c94649 Allow TRUNCATE command to truncate foreign tables.
This commit introduces new foreign data wrapper API for TRUNCATE.
It extends the TRUNCATE command so that it accepts foreign tables as
targets to truncate and invokes that API.  It also extends postgres_fdw
so that it can issue a TRUNCATE command to foreign servers, by adding
a new routine for that TRUNCATE API.

The information about options specified in the TRUNCATE command, e.g.,
ONLY, CASCADE, etc., is passed to the FDW via the API.  The list of
foreign tables to truncate is also passed.  The FDW truncates the
foreign data sources that the passed foreign tables specify, based on
that information.  For example, postgres_fdw constructs a TRUNCATE
command using it and issues that to the foreign server.
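
A sketch with postgres_fdw (the server, user mapping, and remote table
are assumed to exist; names are hypothetical):

    CREATE FOREIGN TABLE ft (id int)
        SERVER remote_srv OPTIONS (table_name 'tbl');
    TRUNCATE ft;          -- postgres_fdw issues TRUNCATE on the remote table
    TRUNCATE ft CASCADE;  -- options such as CASCADE reach the FDW as well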

For performance, the TRUNCATE command invokes the FDW routine for
TRUNCATE only once per foreign server to which the foreign tables being
truncated belong.

Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao
Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com
Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
2021-04-08 20:56:08 +09:00
Alvaro Herrera 0827e8af70
autovacuum: handle analyze for partitioned tables
Previously, autovacuum would completely ignore partitioned tables, which
is not good regarding analyze -- failing to analyze those tables means
poor plans may be chosen.  Make autovacuum aware of those tables by
propagating "changes since analyze" counts from the leaf partitions up
the partitioning hierarchy.

This also introduces necessary reloptions support for partitioned tables
(autovacuum_enabled, autovacuum_analyze_scale_factor,
autovacuum_analyze_threshold).  It's unclear how best to document this
aspect.
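
A sketch of the newly supported reloptions (table name hypothetical):

    ALTER TABLE parted
        SET (autovacuum_enabled = true,
             autovacuum_analyze_scale_factor = 0.02,
             autovacuum_analyze_threshold = 1000);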

Author: Yuzuko Hosoya <yuzukohosoya@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com
2021-04-08 01:19:36 -04:00
Peter Eisentraut e717a9a18b SQL-standard function body
This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
statements for language SQL with a function body that conforms to the
SQL standard and is portable to other implementations.

Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:

    CREATE FUNCTION add(a integer, b integer) RETURNS integer
        LANGUAGE SQL
        RETURN a + b;

or as a block

    CREATE PROCEDURE insert_data(a integer, b integer)
    LANGUAGE SQL
    BEGIN ATOMIC
      INSERT INTO tbl VALUES (a);
      INSERT INTO tbl VALUES (b);
    END;

The function body is parsed at function definition time and stored as
expression nodes in a new pg_proc column prosqlbody.  So at run time,
no further parsing is required.

However, this form does not support polymorphic arguments, because
no parse analysis is done at call time.

Dependencies between the function and the objects it uses are fully
tracked.

A new RETURN statement is introduced.  This can only be used inside
function bodies.  Internally, it is treated much like a SELECT
statement.

psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.

Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:47:55 +02:00
Peter Geoghegan 1e55e7d175 Add wraparound failsafe to VACUUM.
Add a failsafe mechanism that is triggered by VACUUM when it notices
that the table's relfrozenxid and/or relminmxid are dangerously far in
the past.  VACUUM checks the age of the table dynamically, at regular
intervals.

When the failsafe triggers, VACUUM takes extraordinary measures to
finish as quickly as possible so that relfrozenxid and/or relminmxid can
be advanced.  VACUUM will stop applying any cost-based delay that may be
in effect.  VACUUM will also bypass any further index vacuuming and heap
vacuuming -- it only completes whatever remaining pruning and freezing
is required.  Bypassing index/heap vacuuming is enabled by commit
8523492d, which made it possible to dynamically trigger the mechanism
already used within VACUUM when it is run with INDEX_CLEANUP off.

It is expected that the failsafe will almost always trigger within an
autovacuum to prevent wraparound, long after the autovacuum began.
However, the failsafe mechanism can trigger in any VACUUM operation.
Even in a non-aggressive VACUUM, where we're likely to not advance
relfrozenxid, it still seems like a good idea to finish off remaining
pruning and freezing.   An aggressive/anti-wraparound VACUUM will be
launched immediately afterwards.  Note that the anti-wraparound VACUUM
that follows will itself trigger the failsafe, usually before it even
begins its first (and only) pass over the heap.

The failsafe is controlled by two new GUCs: vacuum_failsafe_age, and
vacuum_multixact_failsafe_age.  There are no equivalent reloptions,
since that isn't expected to be useful.  The GUCs have rather high
defaults (both default to 1.6 billion), and are expected to generally
only be used to make the failsafe trigger sooner/more frequently.
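
For example, to make the failsafe trigger sooner than the defaults
(the values here are purely illustrative):

    SET vacuum_failsafe_age = 1200000000;
    SET vacuum_multixact_failsafe_age = 1200000000;
    VACUUM some_table;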

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com
2021-04-07 12:37:45 -07:00
Bruce Momjian 4f0b0966c8 Make use of in-core query id added by commit 5fd9dfa5f5
Use the in-core query id computation for pg_stat_activity,
log_line_prefix, and EXPLAIN VERBOSE.

Similar to other fields in pg_stat_activity, only the queryid from
top-level statements is exposed, and if the backend's status isn't
active then the queryid of the last executed statement is displayed.

Add a %Q placeholder to include the queryid in log_line_prefix, which
will also only expose top level statements.

For EXPLAIN VERBOSE, if a query identifier has been computed, either by
enabling compute_query_id or using a third-party module, display it.
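
A sketch of the EXPLAIN VERBOSE side (the identifier value will vary):

    SET compute_query_id = on;
    EXPLAIN (VERBOSE, COSTS OFF) SELECT 1;
    -- the output now includes a "Query Identifier: ..." line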

Bump catalog version.

Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol

Author: Julien Rouhaud

Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
2021-04-07 14:04:06 -04:00
Tom Lane c5b7ba4e67 Postpone some stuff out of ExecInitModifyTable.
Arrange to do some things on-demand, rather than immediately during
executor startup, because there's a fair chance of never having to do
them at all:

* Don't open result relations' indexes until needed.

* Don't initialize partition tuple routing, nor the child-to-root
tuple conversion map, until needed.

This wins in UPDATEs on partitioned tables when only some of the
partitions will actually receive updates; with larger partition
counts the savings is quite noticeable.  Also, we can remove some
sketchy heuristics in ExecInitModifyTable about whether to set up
tuple routing.

Also, remove execPartition.c's private hash table tracking which
partitions were already opened by the ModifyTable node.  Instead
use the hash added to ModifyTable itself by commit 86dc90056.

To allow lazy computation of the conversion maps, we now set
ri_RootResultRelInfo in all child ResultRelInfos.  We formerly set it
only in some, not terribly well-defined, cases.  This has user-visible
side effects in that now more error messages refer to the root
relation instead of some partition (and provide error data in the
root's column order, too).  It looks to me like this is a strict
improvement in consistency, so I don't have a problem with the
output changes visible in this commit.

Extracted from a larger patch, which seemed to me to be too messy
to push in one commit.

Amit Langote, reviewed at different times by Heikki Linnakangas and
myself

Discussion: https://postgr.es/m/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com
2021-04-06 15:57:11 -04:00
Tom Lane 091e22b2e6 Clean up treatment of missing default and CHECK-constraint records.
Andrew Gierth reported that it's possible to crash the backend if no
pg_attrdef record is found to match an attribute that has atthasdef set.
AttrDefaultFetch warns about this situation, but then leaves behind
a relation tupdesc that has null "adbin" pointer(s), which most places
don't guard against.

We considered promoting the warning to an error, but throwing errors
during relcache load is pretty drastic: it effectively locks one out
of using the relation at all.  What seems better is to leave the
load-time behavior as a warning, but then throw an error in any code
path that wants to use a default and can't find it.  This confines
the error to a subset of INSERT/UPDATE operations on the table, and
in particular will at least allow a pg_dump to succeed.

Also, we should fix AttrDefaultFetch to not leave any null pointers
in the tupdesc, because that just creates an untested bug hazard.

While at it, apply the same philosophy of "warn at load, throw error
only upon use of the known-missing info" to CHECK constraints.
CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch,
but for reasons lost in the mists of time, it was throwing ERROR for
the same cases that AttrDefaultFetch treats as WARNING.  Make the two
functions more nearly alike.

In passing, get rid of potentially-O(N^2) loops in equalTupleDesc
by making AttrDefaultFetch sort the entries after fetching them,
so that equalTupleDesc can assume that entries in two equal tupdescs
must be in matching order.  (CheckConstraintFetch already was sorting
CHECK constraints, but equalTupleDesc hadn't been told about it.)

There's some argument for back-patching this, but with such a small
number of field reports, I'm content to fix it in HEAD.

Discussion: https://postgr.es/m/87pmzaq4gx.fsf@news-spur.riddles.org.uk
2021-04-06 10:34:39 -04:00
Peter Eisentraut 82ed7748b7 ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION
At present, if we want to update the publications in a subscription, we
can use SET PUBLICATION.  However, it requires supplying all existing
publications as well as the new ones, which is inconvenient when we
just want to add publications.  The new syntax only requires supplying
the new publications.  When refresh is true, it only refreshes the new
publications.
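
A sketch of the new syntax (subscription and publication names are
hypothetical):

    ALTER SUBSCRIPTION mysub ADD PUBLICATION pub2 WITH (refresh = true);
    ALTER SUBSCRIPTION mysub DROP PUBLICATION pub2 WITH (refresh = false);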

Author: Japin Li <japinli@hotmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/MEYP282MB166939D0D6C480B7FBE7EFFBB6BC0@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2021-04-06 11:49:51 +02:00
Peter Geoghegan b4af70cb21 Simplify state managed by VACUUM.
Reorganize the state struct used by VACUUM -- group related items
together to make it easier to understand.  Also stop relying on stack
variables inside lazy_scan_heap() -- move those into the state struct
instead.  Doing things this way simplifies large groups of related
functions whose function signatures had a lot of unnecessary redundancy.

Switch over to using int64 for the struct fields used to count things
that are reported to the user via log_autovacuum and VACUUM VERBOSE
output.  We were using double, but that doesn't seem to have any
advantages.  Using int64 makes it possible to add assertions that verify
that the first pass over the heap (pruning) encounters precisely the
same number of LP_DEAD items that get deleted from indexes later on, in
the second pass over the heap.  These assertions will be added in later
commits.

Finally, adjust the signatures of functions with IndexBulkDeleteResult
pointer arguments in cases where there was ambiguity about whether or
not the argument relates to a single index or all indexes.  Functions
now use the idiom that both ambulkdelete() and amvacuumcleanup() have
always used (where appropriate): accept a mutable IndexBulkDeleteResult
pointer argument, and return a result IndexBulkDeleteResult pointer to
caller.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com
2021-04-05 13:21:44 -07:00
Michael Paquier 9f6f1f9b8e Fix typo in collationcmds.c
Introduced by 51e225d.

Author: Anton Voloshin
Discussion: https://postgr.es/m/05477da0-703c-7de7-998c-5879738e8f39@postgrespro.ru
2021-04-05 11:18:12 +09:00
David Rowley 9eacee2e62 Add Result Cache executor node (take 2)
Here we add a new executor node type named "Result Cache".  The planner
can include this node type in the plan to have the executor cache the
results from the inner side of parameterized nested loop joins.  This
allows caching of tuples for sets of parameters so that in the event that
the node sees the same parameter values again, it can just return the
cached tuples instead of rescanning the inner side of the join all over
again.  Internally, result cache uses a hash table in order to quickly
find tuples that have been previously cached.

For certain data sets, this can significantly improve the performance of
joins.  The best cases for using this new node type are for join problems
where a large portion of the tuples from the inner side of the join have
no join partner on the outer side of the join.  In such cases, hash join
would have to hash values that are never looked up, thus bloating the hash
table and possibly causing it to multi-batch.  Merge joins would have to
skip over all of the unmatched rows.  If we use a nested loop join with a
result cache, then we only cache tuples that have at least one join
partner on the outer side of the join.  The benefits of using a
parameterized nested loop with a result cache increase when there are
fewer distinct values being looked up and the number of lookups of each
value is large.  Also, hash probes to lookup the cache can be much faster
than the hash probe in a hash join as it's common that the result cache's
hash table is much smaller than the hash join's due to result cache only
caching useful tuples rather than all tuples from the inner side of the
join.  This variation in hash probe performance is more significant when
the hash join's hash table no longer fits into the CPU's L3 cache, but the
result cache's hash table does.  The apparent "random" access of hash
buckets with each hash probe can cause a poor L3 cache hit ratio for large
hash tables.  Smaller hash tables generally perform better.

The hash table used for the cache limits itself to not exceeding work_mem
* hash_mem_multiplier in size.  We maintain a dlist of keys for this cache
and when we're adding new tuples and realize we've exceeded the memory
budget, we evict cache entries starting with the least recently used ones
until we have enough memory to add the new tuples to the cache.

For parameterized nested loop joins, we now consider using one of these
result cache nodes in between the nested loop node and its inner node.  We
determine when this might be useful based on cost, which is primarily
driven off of what the expected cache hit ratio will be.  Estimating the
cache hit ratio relies on having good distinct estimates on the nested
loop's parameters.

For now, the planner will only consider using a result cache for
parameterized nested loop joins.  This works for both normal joins and
also for LATERAL type joins to subqueries.  It is possible to use this new
node for other uses in the future.  For example, to cache results from
correlated subqueries.  However, that's not done here due to some
difficulties obtaining a distinct estimation on the outer plan to
calculate the estimated cache hit ratio.  Currently we plan the inner plan
before planning the outer plan so there is no good way to know if a result
cache would be useful or not since we can't estimate the number of times
the subplan will be called until the outer plan is generated.

The functionality being added here is newly introducing a dependency on
the return value of estimate_num_groups() during the join search.
Previously, during the join search, we only ever needed to perform
selectivity estimations.  With this commit, we need to use
estimate_num_groups() in order to estimate what the hit ratio on the
result cache will be.   In simple terms, if we expect 10 distinct values
and we expect 1000 outer rows, then we'll estimate the hit ratio to be
99%.  Since cache hits are very cheap compared to scanning the underlying
nodes on the inner side of the nested loop join, then this will
significantly reduce the planner's cost for the join.   However, it's
fairly easy to see here that things will go bad when estimate_num_groups()
incorrectly returns a value that's significantly lower than the actual
number of distinct values.  If this happens then that may cause us to make
use of a nested loop join with a result cache instead of some other join
type, such as a merge or hash join.  Our distinct estimations have been
known to be a source of trouble in the past, so the extra reliance on them
here could cause the planner to choose slower plans than it did previous
to having this feature.  Distinct estimations are also fairly hard to
estimate accurately when several tables have been joined already or when a
WHERE clause filters out a set of values that are correlated to the
expressions we're estimating the number of distinct value for.

For now, the costing we perform during query planning for result caches
does put quite a bit of faith in the distinct estimations being accurate.
When these are accurate then we should generally see faster execution
times for plans containing a result cache.  However, in the real world, we
may find that we need to either change the costings to put less trust in
the distinct estimations being accurate or perhaps even disable this
feature by default.  There's always an element of risk when we teach the
query planner to do new tricks that it decides to use that new trick at
the wrong time and causes a regression.  Users may opt to get the old
behavior by turning the feature off using the enable_resultcache GUC.
Currently, this is enabled by default.  It remains to be seen if we'll
maintain that setting for the release.
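
Opting out is a single setting, per the GUC named above:

    SET enable_resultcache = off;  -- planner stops considering Result Cache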

Additionally, the name "Result Cache" is the best name I could think of
for this new node at the time I started writing the patch.  Nobody seems
to strongly dislike the name. A few people did suggest other names but no
other name seemed to dominate in the brief discussion that there was about
names. Let's allow the beta period to see if the current name pleases
enough people.  If there's some consensus on a better name, then we can
change it before the release.  Please see the 2nd discussion link below
for the discussion on the "Result Cache" name.

Author: David Rowley
Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu, Hou Zhijie
Tested-By: Konstantin Knizhnik
Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com
2021-04-02 14:10:56 +13:00
Stephen Frost c9c41c7a33 Rename Default Roles to Predefined Roles
The term 'default roles' wasn't quite apt, as these roles can't be
modified or removed after installation, so rename them to 'Predefined
Roles' instead, adding an entry into the newly added
Obsolete Appendix to help users of current releases find the new
documentation.

Bruce Momjian and Stephen Frost

Discussion: https://postgr.es/m/157742545062.1149.11052653770497832538%40wrigleys.postgresql.org
and https://www.postgresql.org/message-id/20201120211304.GG16415@tamriel.snowman.net
2021-04-01 15:32:06 -04:00
Heikki Linnakangas f82de5c46b Do COPY FROM encoding conversion/verification in larger chunks.
This gives a small performance gain, by reducing the number of calls
to the conversion/verification function, and letting it work with
larger inputs. Also, reorganizing the input pipeline makes it easier
to parallelize the input parsing: after the input has been converted
to the database encoding, the next stage of finding the newlines can
be done in parallel, because there cannot be any newline chars
"embedded" in multi-byte characters in the encodings that we support
as server encodings.

This changes behavior in one corner case: if client and server
encodings are the same single-byte encoding (e.g. latin1), previously
the input would not be checked for zero bytes ('\0'). Any fields
containing zero bytes would be truncated at the zero. But if encoding
conversion was needed, the conversion routine would throw an error on
the zero. After this commit, the input is always checked for zeros.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi
2021-04-01 12:23:40 +03:00
Heikki Linnakangas ea1b99a661 Add 'noError' argument to encoding conversion functions.
With the 'noError' argument, you can try to convert a buffer without
knowing the character boundaries beforehand. The functions now need to
return the number of input bytes successfully converted.

This is a backwards-incompatible change if you have created a custom
encoding conversion with CREATE CONVERSION. This adds a check to
pg_upgrade for that, refusing the upgrade if there are any user-defined
encoding conversions.  Custom conversions are very rare; there are no
commonly used extensions that I know of that use that feature.  No other
objects can depend on conversions, so if you do have one, you can fairly
easily drop it before upgrading, and recreate it after the upgrade with
an updated version.

Add regression tests for built-in encoding conversions. This doesn't cover
every conversion, but it covers all the internal functions in conv.c that
are used to implement the conversions.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi
2021-04-01 11:45:22 +03:00
David Rowley 28b3e3905c Revert b6002a796
This removes "Add Result Cache executor node".  It seems that something
weird is going on with the tracking of cache hits and misses as
highlighted by many buildfarm animals.  It's not yet clear what the
problem is, as other parts of the plan indicate that the cache did work
correctly; it's just the hits and misses that were being reported as 0.

This is an especially bad time to have the buildfarm so broken, so
reverting before too many more animals go red.

Discussion: https://postgr.es/m/CAApHDvq_hydhfovm4=izgWs+C5HqEeRScjMbOgbpC-jRAeK3Yw@mail.gmail.com
2021-04-01 13:33:23 +13:00
David Rowley b6002a796d Add Result Cache executor node
Here we add a new executor node type named "Result Cache".  The planner
can include this node type in the plan to have the executor cache the
results from the inner side of parameterized nested loop joins.  This
allows caching of tuples for sets of parameters so that in the event that
the node sees the same parameter values again, it can just return the
cached tuples instead of rescanning the inner side of the join all over
again.  Internally, result cache uses a hash table in order to quickly
find tuples that have been previously cached.

For certain data sets, this can significantly improve the performance of
joins.  The best cases for using this new node type are for join problems
where a large portion of the tuples from the inner side of the join have
no join partner on the outer side of the join.  In such cases, hash join
would have to hash values that are never looked up, thus bloating the hash
table and possibly causing it to multi-batch.  Merge joins would have to
skip over all of the unmatched rows.  If we use a nested loop join with a
result cache, then we only cache tuples that have at least one join
partner on the outer side of the join.  The benefits of using a
parameterized nested loop with a result cache increase when there are
fewer distinct values being looked up and the number of lookups of each
value is large.  Also, hash probes to look up the cache can be much faster
than the hash probe in a hash join as it's common that the result cache's
hash table is much smaller than the hash join's due to result cache only
caching useful tuples rather than all tuples from the inner side of the
join.  This variation in hash probe performance is more significant when
the hash join's hash table no longer fits into the CPU's L3 cache, but the
result cache's hash table does.  The apparent "random" access of hash
buckets with each hash probe can cause a poor L3 cache hit ratio for large
hash tables.  Smaller hash tables generally perform better.

The hash table used for the cache limits itself to not exceeding work_mem
* hash_mem_multiplier in size.  We maintain a dlist of keys for this cache
and when we're adding new tuples and realize we've exceeded the memory
budget, we evict cache entries starting with the least recently used ones
until we have enough memory to add the new tuples to the cache.

For parameterized nested loop joins, we now consider using one of these
result cache nodes in between the nested loop node and its inner node.  We
determine when this might be useful based on cost, which is primarily
driven off of what the expected cache hit ratio will be.  Estimating the
cache hit ratio relies on having good distinct estimates on the nested
loop's parameters.

For now, the planner will only consider using a result cache for
parameterized nested loop joins.  This works for both normal joins and
also for LATERAL-type joins to subqueries.  It is possible to use this new
node for other purposes in the future, for example to cache results from
correlated subqueries.  However, that's not done here due to the
difficulty of obtaining a distinct estimate on the outer plan to
calculate the estimated cache hit ratio.  Currently we plan the inner plan
before planning the outer plan so there is no good way to know if a result
cache would be useful or not since we can't estimate the number of times
the subplan will be called until the outer plan is generated.

The functionality being added here newly introduces a dependency on
the return value of estimate_num_groups() during the join search.
Previously, during the join search, we only ever needed to perform
selectivity estimations.  With this commit, we need to use
estimate_num_groups() in order to estimate what the hit ratio on the
result cache will be.   In simple terms, if we expect 10 distinct values
and we expect 1000 outer rows, then we'll estimate the hit ratio to be
99%.  Since cache hits are very cheap compared to scanning the underlying
nodes on the inner side of the nested loop join, then this will
significantly reduce the planner's cost for the join.   However, it's
fairly easy to see here that things will go bad when estimate_num_groups()
incorrectly returns a value that's significantly lower than the actual
number of distinct values.  If that happens, the planner may choose a
nested loop join with a result cache instead of some other join type,
such as a merge or hash join.  Our distinct estimates have been known to
be a source of trouble in the past, so the extra reliance on them here
could cause the planner to choose slower plans than it did before this
feature existed.  Distinct counts are also fairly hard to estimate
accurately when several tables have already been joined or when a WHERE
clause filters out a set of values that are correlated to the
expressions we're estimating the number of distinct values for.

For now, the costing we perform during query planning for result caches
does put quite a bit of faith in the distinct estimations being accurate.
When these are accurate then we should generally see faster execution
times for plans containing a result cache.  However, in the real world, we
may find that we need to either change the costings to put less trust in
the distinct estimations being accurate or perhaps even disable this
feature by default.  There's always a risk, when we teach the query
planner a new trick, that it will use that trick at the wrong time and
cause a regression.  Users may opt to get the old
behavior by turning the feature off using the enable_resultcache GUC.
Currently, this is enabled by default.  It remains to be seen if we'll
maintain that setting for the release.

Additionally, the name "Result Cache" is the best name I could think of
for this new node at the time I started writing the patch.  Nobody seems
to strongly dislike the name. A few people did suggest other names but no
other name seemed to dominate in the brief discussion that there was about
names. Let's allow the beta period to see if the current name pleases
enough people.  If there's some consensus on a better name, then we can
change it before the release.  Please see the 2nd discussion link below
for the discussion on the "Result Cache" name.

Author: David Rowley
Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu
Tested-By: Konstantin Knizhnik
Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com
2021-04-01 12:32:22 +13:00
Tom Lane 6197db5340 Improve style of some replication-related error messages.
Put the remote end's error message into the primary error string,
instead of relegating it to errdetail().  Although this could end up
being awkward if the remote sends us a really long error message,
it seems more in keeping with our message style guidelines, and more
helpful in situations where the errdetail could get dropped.

Peter Smith

Discussion: https://postgr.es/m/CAHut+Ps-Qv2yQceCwobQDP0aJOkfDzRFrOaR6+2Op2K=WHGeWg@mail.gmail.com
2021-03-31 15:25:53 -04:00
Tom Lane 86dc90056d Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:

1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns.  The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree.  The disadvantage of course
is an extra fetch of each tuple to be updated.  However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost.  At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.

2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that.  To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information.  This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.

Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy.  Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD.  (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)

There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.

FDW authors should note several API changes:

* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query.  Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.

* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable.  See postgres_fdw for sample code.

* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 11:52:37 -04:00
Etsuro Fujita 27e1f14563 Add support for asynchronous execution.
This implements asynchronous execution, which runs multiple parts of a
non-parallel-aware Append concurrently rather than serially to improve
performance when possible.  Currently, the only node type that can be
run concurrently is a ForeignScan that is an immediate child of such an
Append.  In the case where such ForeignScans access data on different
remote servers, this runs those ForeignScans concurrently, overlapping
the remote operations so that they are performed simultaneously.  This
improves performance especially when the operations are time-consuming
ones such as remote joins and remote aggregations.

We may extend this to other node types such as joins or aggregates over
ForeignScans in the future.

This also adds the support for postgres_fdw, which is enabled by the
table-level/server-level option "async_capable".  The default is false.
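
As a minimal sketch (the server and table names are invented), the
option is set like any other postgres_fdw option:

  ALTER SERVER remote_srv OPTIONS (ADD async_capable 'true');
  ALTER FOREIGN TABLE remote_tab OPTIONS (ADD async_capable 'false');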

Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself.  This commit
is mostly based on the patch proposed by Robert Haas, but also uses
stuff from the patch proposed by Kyotaro Horiguchi and from the patch
proposed by Thomas Munro.  Reviewed by Kyotaro Horiguchi, Konstantin
Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and
others.

Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com
Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 18:45:00 +09:00
Tom Lane 65158f497a Remove small inefficiency in ExecARDeleteTriggers/ExecARUpdateTriggers.
Whilst poking at nodeModifyTable.c, I chanced to notice that while
its calls to ExecBR*Triggers and ExecIR*Triggers are protected by
tests to see if there are any relevant triggers to fire, its calls
to ExecAR*Triggers are not; the latter functions do the equivalent
tests themselves.  This seems possibly reasonable given the more
complex conditions involved, but what's less reasonable is that
the ExecAR* functions aren't careful to do no work when there is
no work to be done.  ExecARInsertTriggers gets this right, but the
other two will both force creation of a slot that the query may
have no use for.  ExecARUpdateTriggers additionally performed a
usually-useless ExecClearTuple() on that slot.  This is probably
all pretty microscopic in real workloads, but a cycle shaved is a
cycle earned.
2021-03-30 20:01:31 -04:00
Stephen Frost 4753ef37e0 Use a WaitLatch for vacuum/autovacuum sleeping
Instead of using pg_usleep() in vacuum_delay_point(), use a WaitLatch.
This has the advantage that we will realize if the postmaster has been
killed since the last time we decided to sleep while vacuuming.

Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CAFh8B=kcdk8k-Y21RfXPu5dX=bgPqJ8TC3p_qxR_ygdBS=JN5w@mail.gmail.com
2021-03-30 12:52:56 -04:00
Tomas Vondra a4d75c86bf Extended statistics on expressions
Allow defining extended statistics on expressions, not just on
simple column references.  With this commit, expressions are supported
by all existing extended statistics kinds, improving the same types of
estimates. A simple example may look like this:

  CREATE TABLE t (a int);
  CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t;
  ANALYZE t;

The collected statistics are useful e.g. to estimate queries with those
expressions in WHERE or GROUP BY clauses:

  SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0;

  SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20);

This introduces a new internal statistics kind 'e' (expressions), which is
built automatically when the statistics object definition includes any
expressions.  This represents single-expression statistics, as if there
were an expression index (but without the index maintenance overhead).
The statistics are stored in pg_statistic_ext_data as an array of
composite types, which is possible thanks to 79f6a942bd.

CREATE STATISTICS allows building statistics on a single expression, in
which case it's not possible to specify statistics kinds.

A new system view pg_stats_ext_exprs can be used to display expression
statistics, similarly to pg_stats and pg_stats_ext views.
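
For example, a sketch of inspecting the expression statistics built for
the statistics object "s" above (the column list is abridged):

  SELECT statistics_name, expr, null_frac, n_distinct
    FROM pg_stats_ext_exprs
   WHERE statistics_name = 's';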

ALTER TABLE ... ALTER COLUMN ... TYPE now treats statistics the same way
it treats indexes, i.e. it drops and recreates the statistics.  This means
all statistics are reset, and we no longer try to preserve at least the
functional dependencies. This should not be a major issue in practice,
as the functional dependencies actually rely on per-column statistics,
which were always reset anyway.

Author: Tomas Vondra
Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu
Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-27 00:01:11 +01:00
Noah Misch a14a0118a1 Add "pg_database_owner" default role.
Membership consists, implicitly, of the current database owner.  Expect
use in template databases.  Once pg_database_owner has rights within a
template, each owner of a database instantiated from that template will
exercise those rights.
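
A minimal sketch of the expected usage, run in a template database
("audit_log" is an invented table name):

  GRANT SELECT ON audit_log TO pg_database_owner;
  -- the owner of any database later created from this template can
  -- then read audit_log in their own database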

Reviewed by John Naylor.

Discussion: https://postgr.es/m/20201228043148.GA1053024@rfd.leadboat.com
2021-03-26 10:42:17 -07:00
Alvaro Herrera 71f4c8c6f7
ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY
Allow a partition to be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive in the partitioned table.

Because it runs in two transactions, it cannot be used in a transaction
block.  This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it.  But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)

In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
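
A sketch of the syntax ("parent" and "part1" are invented names):

  ALTER TABLE parent DETACH PARTITION part1 CONCURRENTLY;
  -- if the command is cancelled or crashes between the two
  -- transactions, complete the detach with:
  ALTER TABLE parent DETACH PARTITION part1 FINALIZE;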

The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; it can only be set true in
the first part of this command.  Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in the optimizer they are ignored if the row is
marked committed for the snapshot; in the executor they are always
included.  As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.

A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
2021-03-25 18:00:28 -03:00
Alvaro Herrera cc121d5596
Add comments for AlteredTableInfo->rel
The prior commit which introduced it was pretty squalid in terms of
code documentation, so add some comments.
2021-03-25 16:07:15 -03:00
Alvaro Herrera cd03c6e94b
Let ALTER TABLE Phase 2 routines manage the relation pointer
Struct AlteredTableInfo gains a new Relation member, to be used only
by Phase 2 (ATRewriteCatalogs); this allows ATExecCmd() subroutines to
open and close the relation internally.

A future commit will use this facility to implement an ALTER TABLE
subcommand that closes and reopens the relation across transaction
boundaries.

(It is possible to keep the relation open past phase 2 to be used by
phase 3 instead of having to reopen it at that point, but there are some
minor complications with that; it's not clear that there is much to be
won from doing that, though.)

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
2021-03-25 15:56:11 -03:00
Alvaro Herrera a24ae3d7b9
Remove StoreSingleInheritance reimplementation
I introduced this duplicate code in commit 8b08f7d482 for no good
reason.  Remove it, and backpatch to 11 where it was introduced.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
2021-03-25 10:47:38 -03:00
Bruce Momjian 95d77149c5 Add macro RelationIsPermanent() to report relation permanence
Previously, to check relation permanence, the Relation's Form_pg_class
structure member relpersistence was compared to the value
RELPERSISTENCE_PERMANENT ("p"). This commit adds the macro
RelationIsPermanent() and uses it in appropriate places to simplify the
code.  This matches other RelationIs* macros.

This macro will be used in more places in future cluster file encryption
patches.

Discussion: https://postgr.es/m/20210318153134.GH20766@tamriel.snowman.net
2021-03-22 20:23:52 -04:00
Robert Haas a4d5284a10 Error on invalid TOAST compression in CREATE or ALTER TABLE.
The previous coding treated an invalid compression method name as
equivalent to the default, which is certainly not right.

Justin Pryzby

Discussion: http://postgr.es/m/20210321235544.GD4203@telsasoft.com
2021-03-22 10:57:08 -04:00
Robert Haas 226e2be387 More code cleanup for configurable TOAST compression.
Remove unused macro. Fix confusion about whether a TOAST compression
method is identified by an OID or a char.

Justin Pryzby

Discussion: http://postgr.es/m/20210321235544.GD4203@telsasoft.com
2021-03-22 09:21:37 -04:00
Tom Lane ac897c4834 Fix assorted silliness in ATExecSetCompression().
It's not okay to scribble directly on a syscache entry.
Nor to continue accessing said entry after releasing it.

Also get rid of not-used local variables.

Per valgrind testing.
2021-03-21 18:43:07 -04:00
Robert Haas d00fbdc431 Fix use-after-ReleaseSysCache problem in ATExecAlterColumnType.
Introduced by commit bbe0a81db6.

Per buildfarm member prion.
2021-03-19 17:17:48 -04:00
Robert Haas bbe0a81db6 Allow configurable LZ4 TOAST compression.
There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option in up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.
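
A minimal sketch of both ways to select lz4 (the table names are
invented; this requires a build configured --with-lz4):

  CREATE TABLE t1 (payload text COMPRESSION lz4);
  SET default_toast_compression = 'lz4';
  CREATE TABLE t2 (payload text);  -- payload now defaults to lz4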

In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.

Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compressed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compression types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.

It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT.  It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data.  These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project.  However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.

Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allowing a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored.  More recently, Justin Pryzby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.

Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
2021-03-19 15:10:38 -04:00
Peter Eisentraut e1ae40f381 Small error message improvement 2021-03-17 08:17:33 +01:00
Thomas Munro 9e7ccd9ef6 Enable parallelism in REFRESH MATERIALIZED VIEW.
Pass CURSOR_OPT_PARALLEL_OK to pg_plan_query() so that parallel plans
are considered when running the underlying SELECT query.  This wasn't
done in commit e9baa5e9, which did this for CREATE MATERIALIZED VIEW,
because it wasn't yet known to be safe.

Since REFRESH always inserts into a freshly created table before later
merging or swapping the data into place with separate operations, we can
enable such plans here too.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Hou, Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Luc Vlaming <luc@swarm64.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CALj2ACXg-4hNKJC6nFnepRHYT4t5jJVstYvri%2BtKQHy7ydcr8A%40mail.gmail.com
2021-03-17 15:04:17 +13:00
Stephen Frost c6fc50cb40 Use pre-fetching for ANALYZE
When we have posix_fadvise() available, we can improve the performance
of an ANALYZE by quite a bit by using it to inform the kernel of the
blocks that we're going to be asking for.  Similar to bitmap index
scans, the number of buffers pre-fetched is based off of the
maintenance_io_concurrency setting (for the particular tablespace or,
if not set, globally, via get_tablespace_maintenance_io_concurrency()).

Reviewed-By: Heikki Linnakangas, Tomas Vondra
Discussion: https://www.postgresql.org/message-id/VI1PR0701MB69603A433348EDCF783C6ECBF6EF0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2021-03-16 14:46:48 -04:00
Stephen Frost 94d13d474d Improve logging of auto-vacuum and auto-analyze
When logging auto-vacuum and auto-analyze activity, include the I/O
timing if track_io_timing is enabled.  Also, for auto-analyze, add the
read rate and the dirty rate, similar to how that information has
historically been logged for auto-vacuum.

Stephen Frost and Jakub Wartak

Reviewed-By: Heikki Linnakangas, Tomas Vondra
Discussion: https://www.postgresql.org/message-id/VI1PR0701MB69603A433348EDCF783C6ECBF6EF0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2021-03-16 14:46:48 -04:00
Peter Geoghegan 5f8727f5a6 VACUUM ANALYZE: Always update pg_class.reltuples.
vacuumlazy.c sometimes fails to update pg_class entries for each index
(to ensure that pg_class.reltuples is current), even though analyze.c
assumed that that must have happened during VACUUM ANALYZE.  There are
at least a couple of reasons for this.  For example, vacuumlazy.c could
fail to update pg_class when the index AM indicated that its statistics
are merely an estimate, per the contract for amvacuumcleanup() routines
established by commit e57345975c back in 2006.

Stop assuming that pg_class must have been updated with accurate
statistics within VACUUM ANALYZE -- update pg_class for indexes at the
same time as the table relation in all cases.  That way VACUUM ANALYZE
will never fail to keep pg_class.reltuples reasonably accurate.

The only downside of this approach (compared to the old approach) is
that it might inaccurately set pg_class.reltuples for indexes whose heap
relation ends up with the same inaccurate value anyway.  This doesn't
seem too bad.  We already consistently called vac_update_relstats() (to
update pg_class) for the heap/table relation twice during any VACUUM
ANALYZE -- once in vacuumlazy.c, and once in analyze.c.  We now make
sure that we call vac_update_relstats() at least once (though often
twice) for each index.

This is follow up work to commit 9f3665fb, which dealt with issues in
btvacuumcleanup().  Technically this fixes an unrelated issue, though.
btvacuumcleanup() no longer provides an accurate num_index_tuples value
following commit 9f3665fb (when there was no btbulkdelete() call during
the VACUUM operation in question), but hashvacuumcleanup() has worked in
the same way for many years now.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com
Backpatch: 13-, just like commit 9f3665fb.
2021-03-10 17:07:57 -08:00
Michael Paquier 9d2d457009 Add support for more progress reporting in COPY
The command (TO or FROM), its type (file, pipe, program or callback),
and the number of tuples excluded by a WHERE clause in COPY FROM are
added to the progress reporting already available.

The column "lines_processed" is renamed to "tuples_processed" to
disambiguate the meaning of this column in the cases of CSV and BINARY
COPY and to be more consistent with the other catalog progress views.
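
A sketch of monitoring a running COPY from another session:

  SELECT relid::regclass, command, type,
         tuples_processed, tuples_excluded
    FROM pg_stat_progress_copy;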

Bump catalog version, again.

Author: Matthias van de Meent
Reviewed-by: Michael Paquier, Justin Pryzby, Bharath Rupireddy, Josef
Šimánek, Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiOcgdH4aQA8NtZq-4dgvnJzp8PohdeKchPkhMY-jWZXA@mail.gmail.com
2021-03-09 14:21:03 +09:00
Tom Lane 5c06abb9b9 Validate the OID argument of pg_import_system_collations().
"SELECT pg_import_system_collations(0)" caused an assertion failure.
With a random nonzero argument --- or indeed with zero, in non-assert
builds --- it would happily make pg_collation entries with garbage
values of collnamespace.  These are harmless as far as I can tell
(unless maybe the OID happens to become used for a schema, later on?).
In any case this isn't a security issue, since the function is
superuser-only.  But it seems like a gotcha for unwary DBAs, so let's
add a check that the given OID belongs to some schema.
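
A sketch of the effect of the new check:

  SELECT pg_import_system_collations('pg_catalog'::regnamespace);  -- OK
  SELECT pg_import_system_collations(0);  -- now raises a clean error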

Back-patch to v10 where this function was introduced.
2021-03-08 18:21:51 -05:00
Heikki Linnakangas 3174d69fb9 Remove server and libpq support for old FE/BE protocol version 2.
Protocol version 3 was introduced in PostgreSQL 7.4. There shouldn't be
many clients or servers left out there without version 3 support. But as
a courtesy, I kept just enough of the old protocol support that we can
still send the "unsupported protocol version" error in v2 format, so that
old clients can display the message properly. Likewise, libpq still
understands v2 ErrorResponse messages when establishing a connection.

The impetus to do this now is that I'm working on a patch to COPY
FROM, to always prefetch some data. We cannot do that safely with the
old protocol, because it requires parsing the input one byte at a time
to detect the end-of-copy marker.

Reviewed-by: Tom Lane, Alvaro Herrera, John Naylor
Discussion: https://www.postgresql.org/message-id/9ec25819-0a8a-d51a-17dc-4150bb3cca3b%40iki.fi
2021-03-04 10:45:55 +02:00
Michael Paquier fabde52fab Simplify code to switch pg_class.relrowsecurity in tablecmds.c
The same code pattern was repeated twice to enable or disable ROW LEVEL
SECURITY with an ALTER TABLE command.  This makes the code slightly
cleaner.

Author: Justin Pryzby
Reviewed-by: Zhihong Yu
Discussion: https://postgr.es/m/20210228211854.GC20769@telsasoft.com
2021-03-02 12:30:21 +09:00
Alvaro Herrera 25936fd46c
Fix use-after-free bug with AfterTriggersTableData.storeslot
AfterTriggerSaveEvent() wrongly allocates the slot in the execution-span
memory context, whereas the correct thing is to allocate it in a
transaction-span context, because that's where the enclosing
AfterTriggersTableData instance belongs.

Backpatch to 12 (the test back to 11, where it works well with no code
changes, and it's good to have to confirm that the case was previously
well supported); this bug seems introduced by commit ff11e7f4b9.

Reported-by: Bertrand Drouvot <bdrouvot@amazon.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/39a71864-b120-5a5c-8cc5-c632b6f16761@amazon.com
2021-02-27 18:09:15 -03:00
David Rowley bb437f995d Add TID Range Scans to support efficient scanning ranges of TIDs
This adds a new executor node named TID Range Scan.  The query planner
will generate paths for TID Range scans when quals are discovered on base
relations which search for ranges on the table's ctid column.  These
ranges may be open at either end. For example, WHERE ctid >= '(10,0)';
will return all tuples on page 10 and over.
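
A sketch of a query that can now use the new node ("t" is an invented
table name):

  EXPLAIN (COSTS OFF)
    SELECT ctid, * FROM t WHERE ctid >= '(10,0)' AND ctid < '(20,0)';
  -- may show:  Tid Range Scan on t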

To support this, two new optional callback functions have been added to
table AM.  scan_set_tidrange is used to set the scan range to just the
given range of TIDs.  scan_getnextslot_tidrange fetches the next tuple
in the given range.

For AMs where scanning ranges of TIDs would not make sense, these functions
can be set to NULL in the TableAmRoutine.  The query planner won't
generate TID Range Scan Paths in that case.

Author: Edmund Horner, David Rowley
Reviewed-by: David Rowley, Tomas Vondra, Tom Lane, Andres Freund, Zhihong Yu
Discussion: https://postgr.es/m/CAMyN-kB-nFTkF=VA_JPwFNo08S0d-Yk0F741S2B7LDmYAi8eyA@mail.gmail.com
2021-02-27 22:59:36 +13:00
Thomas Munro 8556267b2b Revert "pg_collation_actual_version() -> pg_collation_current_version()."
This reverts commit 9cf184cc05.  The name change was less well received
than anticipated.

Discussion: https://postgr.es/m/afcfb97e-88a1-a540-db95-6c573b93bc2b%40eisentraut.org
2021-02-26 15:29:27 +13:00
Michael Paquier bcf2667bf6 Fix some typos, grammar and style in docs and comments
The portions fixing the documentation are backpatched where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210210235557.GQ20012@telsasoft.com
backpatch-through: 9.6
2021-02-24 16:13:17 +09:00
Thomas Munro 9cf184cc05 pg_collation_actual_version() -> pg_collation_current_version().
The new name seems a bit more natural.

Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
2021-02-22 23:32:16 +13:00
Thomas Munro 0fb0a0503b Hide internal error for pg_collation_actual_version(<bad OID>).
Instead of an unsightly internal "cache lookup failed" message, just
return NULL for bad OIDs, as is the convention for other similar things.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
2021-02-22 23:01:20 +13:00
Michael Paquier 9294264278 Use pgstat_progress_update_multi_param() where possible
This commit changes one code path in REINDEX INDEX and one code path
in CREATE INDEX CONCURRENTLY to report the progress of each operation
using pgstat_progress_update_multi_param() rather than
multiple calls to pgstat_progress_update_param().  This has the
advantage of making the progress report more consistent for the end user
without impacting the amount of information provided.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACV5zW7GxD8D_tyO==bcj6ZktQchEKWKPBOAGKiLhAQo=w@mail.gmail.com
2021-02-22 14:21:40 +09:00
Peter Eisentraut 0e392fcc0d Use errmsg_internal for debug messages
An inconsistent set of debug-level messages was not using
errmsg_internal(), thus uselessly exposing the messages to translation
work.  Fix those.
2021-02-17 11:33:25 +01:00
Amit Kapila d9b0767bec Fix the warnings introduced in commit ce0fdbfe97.
Author: Amit Kapila
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/1610789.1613170207@sss.pgh.pa.us
2021-02-15 07:28:02 +05:30
Thomas Munro c7ecd6af01 ReadNewTransactionId() -> ReadNextTransactionId().
The new name conveys the effect better, is more consistent with similar
functions ReadNextMultiXactId(), ReadNextFullTransactionId(), and
matches the name of the variable that it reads.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzmVR4SakBXQUdhhPpMf1aYvZCnna5%3DHKa7DAgEmBAg%2B8g%40mail.gmail.com
2021-02-15 13:17:02 +13:00
Amit Kapila ce0fdbfe97 Allow multiple xacts during table sync in logical replication.
For the initial table data synchronization in logical replication, we use
a single transaction to copy the entire table and then synchronize the
position in the stream with the main apply worker.

There are multiple downsides of this approach: (a) We have to perform the
entire copy operation again if there is any error (network breakdown,
error in the database operation, etc.) while we synchronize the WAL
position between tablesync worker and apply worker; this will be onerous
especially for large copies, (b) Using a single transaction in the
synchronization-phase (where we can receive WAL from multiple
transactions) will have the risk of exceeding the CID limit, (c) The slot
will hold the WAL till the entire sync is complete because we never commit
till the end.

This patch solves all the above downsides by allowing multiple
transactions during the tablesync phase. The initial copy is done in a
single transaction and after that, we commit each transaction as we
receive. To allow recovery after any error or crash, we use a permanent
slot and origin to track the progress. The slot and origin will be removed
once we finish the synchronization of the table. We also remove slot and
origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or
ALTER SUBSCRIPTION .. REFRESH and some of the table syncs are still not
finished.

The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and
ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true
cannot be executed inside a transaction block because they can now drop
the slots for which we have no provision to rollback.

This will also open up the path for logical replication of 2PC
transactions on the subscriber side. Previously, we can't do that because
of the requirement of maintaining a single transaction in tablesync
workers.

Bump catalog version due to change of state in the catalog
(pg_subscription_rel).

Author: Peter Smith, Amit Kapila, and Takamichi Osumi
Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com
2021-02-12 07:41:51 +05:30
Amit Kapila cd142e032e Make pg_replication_origin_drop safe against concurrent drops.
Currently, we get the origin id from the name and then drop the origin by
taking ExclusiveLock on ReplicationOriginRelationId. So, two concurrent
sessions can get the id from the name at the same time and then when they
try to drop the origin, one of the sessions will get either
"tuple concurrently deleted" or "cache lookup failed for replication
origin ..".

To prevent this race condition we do the entire operation under lock. This
obviates the need for replorigin_drop() API and we have removed it so if
any extension authors are using it they need to instead use
replorigin_drop_by_name.  See its usage in pg_replication_origin_drop().
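
SQL-level usage is unchanged; a sketch with an invented origin name:

  SELECT pg_replication_origin_create('test_origin');
  SELECT pg_replication_origin_drop('test_origin');  -- now safe against
                                                     -- concurrent drops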

Author: Peter Smith
Reviewed-by: Amit Kapila, Euler Taveira, Petr Jelinek, and Alvaro
Herrera
Discussion: https://www.postgresql.org/message-id/CAHut%2BPuW8DWV5fskkMWWMqzt-x7RPcNQOtJQBp6SdwyRghCk7A%40mail.gmail.com
2021-02-10 07:17:09 +05:30
Michael Paquier 7cb3048f38 Add option PROCESS_TOAST to VACUUM
This option controls if toast tables associated with a relation are
vacuumed or not when running a manual VACUUM.  It was already possible
to trigger a manual VACUUM on a toast relation without processing its
main relation, but a manual vacuum on a main relation always forced a
vacuum on its toast table.  This is useful in scenarios where the level
of bloat or transaction age of the main and toast relations differs a
lot.

This option is an extension of the existing VACOPT_SKIPTOAST that was
used by autovacuum to control if toast relations should be skipped or
not.  This internal flag is renamed to VACOPT_PROCESS_TOAST for
consistency with the new option.

A new option switch, called --no-process-toast, is added to vacuumdb.
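
A sketch of the new option ("main_tbl" is an invented table name):

  VACUUM (PROCESS_TOAST FALSE) main_tbl;  -- skip the toast table
  VACUUM main_tbl;                        -- still processes it by default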

Author: Nathan Bossart
Reviewed-by: Kirk Jamison, Michael Paquier, Justin Pryzby
Discussion: https://postgr.es/m/BA8951E9-1524-48C5-94AF-73B1F0D7857F@amazon.com
2021-02-09 14:13:57 +09:00
Heikki Linnakangas 6214e2b228 Fix permission checks on constraint violation errors on partitions.
If a cross-partition UPDATE violates a constraint on the target partition,
and the columns in the new partition are in different physical order than
in the parent, the error message can reveal columns that the user does not
have SELECT permission on. A similar bug was fixed earlier in commit
804b6b6db4.

The cause of the bug is that the callers of the
ExecBuildSlotValueDescription() function got confused when constructing
the list of modified columns. If the tuple was routed from a parent, we
converted the tuple to the parent's format, but the list of modified
columns was grabbed directly from the child's RTE entry.

ExecUpdateLockMode() had a similar issue.  That led to confusion about which
columns are key columns, leading to wrong tuple lock being taken on tables
referenced by foreign keys, when a row is updated with INSERT ON CONFLICT
UPDATE. A new isolation test is added for that corner case.

With this patch, the ri_RangeTableIndex field is no longer set for
partitions that don't have an entry in the range table. Previously, it was
set to the RTE entry of the parent relation, but that was confusing.

NOTE: This modifies the ResultRelInfo struct, replacing the
ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to
backpatch, because it breaks any extensions accessing the field. The
change that ri_RangeTableIndex is not set for partitions could potentially
break extensions, too. The ResultRelInfos are visible to FDWs at least,
and this patch required small changes to postgres_fdw.  Nevertheless,
this seems like the least bad option.  I don't think these fields are
widely used in extensions; I don't think there are FDWs out there that
use the FDW "direct update" API, other than postgres_fdw.  If there is,
you will get a
compilation error, so hopefully it is caught quickly.

Backpatch to 11, where support for both cross-partition UPDATEs, and unique
indexes on partitioned tables, were added.

Reviewed-by: Amit Langote
Security: CVE-2021-3393
2021-02-08 11:01:51 +02:00
Heikki Linnakangas c444472af5 Fix backslash-escaping multibyte chars in COPY FROM.
If a multi-byte character is escaped with a backslash in TEXT mode input,
and the encoding is one of the client-only encodings where the bytes after
the first one can have an ASCII byte "embedded" in the char, we didn't
skip the character correctly. After a backslash, we only skipped the first
byte of the next character, so if it was a multi-byte character, we would
try to process its second byte as if it was a separate character. If it
was one of the characters with special meaning, like '\n', '\r', or
another '\\', that would cause trouble.

One such example is the byte sequence '\x5ca45c2e666f6f' in Big5 encoding.
That's supposed to be [backslash][two-byte character][.][f][o][o], but
because the second byte of the two-byte character is 0x5c, we incorrectly
treat it as another backslash. And because the next character is a dot, we
parse it as end-of-copy marker, and throw an "end-of-copy marker corrupt"
error.

Backpatch to all supported versions.

Reviewed-by: John Naylor, Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/a897f84f-8dca-8798-3139-07da5bb38728%40iki.fi
2021-02-05 11:14:56 +02:00
Heikki Linnakangas 2f86ab305e Fix small error in COPY FROM progress reporting.
The # of bytes processed was accumulated slightly incorrectly. After
loading more data to the input buffer, we added the number of bytes in
the buffer to the sum. But in case of multi-byte characters or escapes,
there can be a few unprocessed bytes left over from previous load in the
buffer. Those bytes got counted twice.
2021-02-04 17:40:33 +02:00
Michael Paquier c5b286047c Add TABLESPACE option to REINDEX
This patch adds the possibility to move indexes to a new tablespace
while rebuilding them.  Both the concurrent and the non-concurrent cases
are supported, and the following set of restrictions apply:
- When using TABLESPACE with a REINDEX command that targets a
partitioned table or index, all the indexes of the leaf partitions are
moved to the new tablespace.  The tablespace references of the non-leaf,
partitioned tables in pg_class.reltablespace are not changed. This
requires an extra ALTER TABLE SET TABLESPACE.
- Any index on a toast table rebuilt as part of a parent table is kept
in its original tablespace.
- The operation is forbidden on system catalogs, including trying to
directly move a toast relation with REINDEX.  This results in an error
if doing REINDEX on a single object.  REINDEX SCHEMA, DATABASE and
SYSTEM skip system relations when TABLESPACE is used.
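
A sketch of the new syntax ("idx", "tbl" and "fast_ts" are invented
names):

  REINDEX (TABLESPACE fast_ts) INDEX idx;
  REINDEX (TABLESPACE fast_ts, CONCURRENTLY) TABLE tbl;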

Author: Alexey Kondratov, Michael Paquier, Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/8a8f5f73-00d3-55f8-7583-1375ca8f6a91@postgrespro.ru
2021-02-04 14:34:20 +09:00
Michael Paquier 24843297a9 Adjust comments of CheckRelationTableSpaceMove() and SetRelationTableSpace()
4c9c359, which introduced those two functions, was overoptimistic in
assuming that only ShareUpdateExclusiveLock would be required when
moving a relation to a new tablespace.  AccessExclusiveLock is a
requirement, but ShareUpdateExclusiveLock may be used under specific
conditions like REINDEX CONCURRENTLY where waits on past transactions
make the operation safe even with a lower-level lock.  The current code
does only the former, so update the existing comments to reflect that.

Once a REINDEX (TABLESPACE) is introduced, those comments would require
an extra refresh to mention their new use case.

While on it, fix an incorrect variable name.

Per discussion with Álvaro Herrera.

Discussion: https://postgr.es/m/20210127140741.GA14174@alvherre.pgsql
2021-01-29 13:59:18 +09:00
Alvaro Herrera 6f5c8a8ec2
Remove bogus restriction from BEFORE UPDATE triggers
In trying to protect the user from inconsistent behavior, commit
487e9861d0 "Enable BEFORE row-level triggers for partitioned tables"
tried to prevent BEFORE UPDATE FOR EACH ROW triggers from moving the row
from one partition to another.  However, it turns out that the
restriction is wrong in two ways: first, it fails spuriously, preventing
valid situations from working, as in bug #16794; and second, it doesn't
protect from any misbehavior, because tuple routing would cope anyway.

Fix by removing that restriction.

We keep the same restriction on BEFORE INSERT FOR EACH ROW triggers,
though.  It is valid and useful there.  In the future we could remove it
by having tuple reroute work for inserts as it does for updates.

Backpatch to 13.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Phillip Menke <pg@pmenke.de>
Discussion: https://postgr.es/m/16794-350a655580fbb9ae@postgresql.org
2021-01-28 16:56:07 -03:00
Heikki Linnakangas b80e10638e Add mbverifystr() functions specific to each encoding.
This makes the pg_verify_mbstr() function faster, by allowing more efficient
encoding-specific implementations. All the implementations included in
this commit are pretty naive, they just call the same encoding-specific
verifychar functions that were used previously, but that already gives a
performance boost because the tight character-at-a-time loop is simpler.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01@iki.fi
2021-01-28 14:40:07 +02:00
Michael Paquier 4c9c359d38 Refactor code in tablecmds.c to check and process tablespace moves
Two code paths of tablecmds.c (for relations with storage and without
storage) use the same logic to check if the move of a relation to a
new tablespace is allowed or not and to update pg_class.reltablespace
and pg_class.relfilenode.

A potential TABLESPACE clause for REINDEX, CLUSTER and VACUUM FULL needs
similar checks to make sure that nothing is moved around in illegal ways
(no mapped relations, shared relations only in pg_global, no move of
temp tables owned by other backends).

This reorganizes the existing code of ALTER TABLE so that all this logic
is controlled by two new routines that can be reused for the other
commands able to move relations across tablespaces, limiting the number
of code paths in need of the same protections.  This also removes some
code that was duplicated for tables with and without storage for ALTER
TABLE.

Author: Alexey Kondratov, Michael Paquier
Discussion: https://postgr.es/m/YA+9mAMWYLXJMVPL@paquier.xyz
2021-01-27 11:54:16 +09:00
Tom Lane ee895a655c Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4.  That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context.  That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did.  While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.

This patch therefore makes the following API changes:

* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.

* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan.  This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).

* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot".  It's better to treat that as an option of
SPI_prepare_extended.

Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands.  (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.

Pavel Stehule, with some further hacking by me

Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-25 22:28:29 -05:00
Michael Paquier a3dc926009 Refactor option handling of CLUSTER, REINDEX and VACUUM
This continues the work done in b5913f6.  All the options of those
commands are changed to use hex values rather than enums to reduce the
risk of compatibility bugs when introducing new options.  Each option
set is moved into a new structure that can be extended with more
non-boolean options (this was already the case of VACUUM).  The code of
REINDEX is restructured so that manual REINDEX commands go through a
single routine from utility.c, like VACUUM, to ease the allocation
handling of option parameters when a command needs to go through
multiple transactions.

This can be used as a base infrastructure for future patches related to
those commands, including reindex filtering and tablespace support.

Per discussion with people mentioned below, as well as Alvaro Herrera
and Peter Eisentraut.

Author: Michael Paquier, Justin Pryzby
Reviewed-by: Alexey Kondratov, Justin Pryzby
Discussion: https://postgr.es/m/X8riynBLwxAD9uKk@paquier.xyz
2021-01-18 14:03:10 +09:00
Noah Misch 6db992833c Prevent excess SimpleLruTruncate() deletion.
Every core SLRU wraps around.  With the exception of pg_notify, the wrap
point can fall in the middle of a page.  Account for this in the
PagePrecedes callback specification and in SimpleLruTruncate()'s use of
said callback.  Update each callback implementation to fit the new
specification.  This changes SerialPagePrecedesLogically() from the
style of asyncQueuePagePrecedes() to the style of CLOGPagePrecedes().
(Whereas pg_clog and pg_serial share a key space, pg_serial is nothing
like pg_notify.)  The bug fixed here has the same symptoms and user
followup steps as 592a589a04.  Back-patch
to 9.5 (all supported versions).

Reviewed by Andrey Borodin and (in earlier versions) by Tom Lane.

Discussion: https://postgr.es/m/20190202083822.GC32531@gust.leadboat.com
2021-01-16 12:21:35 -08:00
Amit Kapila c95765f476 Remove unnecessary pstrdup in fetch_table_list.
The result of TextDatumGetCString is already palloc'ed so we don't need to
allocate memory for it again.  We decided not to backpatch this, as there
doesn't seem to be any case where it can create a meaningful leak.

Author: Zhijie Hou
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/229fed2eb8c54c71a96ccb99e516eb12@G08CNEXMBPEKD05.g08.fujitsu.local
2021-01-16 10:15:32 +05:30
Tomas Vondra c9a0dc3486 Disallow CREATE STATISTICS on system catalogs
Add a check that CREATE STATISTICS does not add extended statistics on
system catalogs, similarly to indexes etc.  It can be overridden using
the allow_system_table_mods GUC.

This bug has existed since 7b504eb282, which added extended statistics, so
backpatch all the way back to PostgreSQL 10.

Author: Tomas Vondra
Reported-by: Dean Rasheed
Backpatch-through: 10
Discussion: https://postgr.es/m/CAEZATCXAPrrOKwEsyZKQ4uzzJQWBCt6QAvOcgqRGdWwT1zb%2BrQ%40mail.gmail.com
2021-01-15 23:31:22 +01:00
Alvaro Herrera f9900df5f9
Avoid spurious wait in concurrent reindex
This is like commit c98763bf51, but for REINDEX CONCURRENTLY.  To wit:
this flag indicates that the current process is safe to ignore for the
purposes of waiting for other snapshots, when doing CREATE INDEX
CONCURRENTLY and REINDEX CONCURRENTLY.  This helps two processes doing
either of those things avoid deadlocking each other, and also avoids
spurious waits.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Hamid Akhtar <hamid.akhtar@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20201130195439.GA24598@alvherre.pgsql
2021-01-15 10:31:42 -03:00
Alvaro Herrera ebfe2dbd6b
Prevent drop of tablespaces used by partitioned relations
When a tablespace is used in a partitioned relation (per commits
ca4103025d in pg12 for tables and 33e6c34c32 in pg11 for indexes),
it is possible to drop the tablespace, potentially causing various
problems.  One such was reported in bug #16577, where a rewriting ALTER
TABLE causes a server crash.

Protect against this by using pg_shdepend to keep track of tablespaces
when used for relations that don't keep physical files; we now abort a
tablespace drop if we see that the tablespace is referenced from any
partitioned relations.

Backpatch this to 11, where this problem has been latent all along.  We
don't try to create pg_shdepend entries for existing partitioned
indexes/tables, but any that are modified going forward will be
protected.

Note slight behavior change: when trying to drop a tablespace that
contains both regular tables as well as partitioned ones, you'd
previously get ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE and now you'll
get ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST.  Arguably, the latter is more
correct.

It is possible to add protective pg_shdepend entries for existing
tables/indexes, by doing
  ALTER TABLE ONLY some_partitioned_table SET TABLESPACE pg_default;
  ALTER TABLE ONLY some_partitioned_table SET TABLESPACE original_tablespace;
for each partitioned table/index that is not in the database default
tablespace.  Because these partitioned objects do not have storage, no
file needs to be actually moved, so it shouldn't take more time than
what's required to acquire locks.

This query can be used to search for such relations:
  SELECT ... FROM pg_class WHERE relkind IN ('p', 'I') AND reltablespace <> 0

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/16577-881633a9f9894fd5@postgresql.org
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2021-01-14 15:32:14 -03:00
Peter Geoghegan 9dc718bdf2 Pass down "logically unchanged index" hint.
Add an executor aminsert() hint mechanism that informs index AMs that
the incoming index tuple (the tuple that accompanies the hint) is not
being inserted by execution of an SQL statement that logically modifies
any of the index's key columns.

The hint is received by indexes when an UPDATE takes place that does not
apply an optimization like heapam's HOT (though only for indexes where
all key columns are logically unchanged).  Any index tuple that receives
the hint on insert is expected to be a duplicate of at least one
existing older version that is needed for the same logical row.  Related
versions will typically be stored on the same index page, at least
within index AMs that apply the hint.
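
In sketch form, the hint arrives as a new boolean argument in the
aminsert() AM callback (argument name assumed here for illustration):

	typedef bool (*aminsert_function) (Relation indexRelation,
					   Datum *values,
					   bool *isnull,
					   ItemPointer heap_tid,
					   Relation heapRelation,
					   IndexUniqueCheck checkUnique,
					   bool indexUnchanged,	/* new executor hint: key
						 * columns logically unchanged */
					   IndexInfo *indexInfo);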

Recognizing the difference between MVCC version churn duplicates and
true logical row duplicates at the index AM level can help with cleanup
of garbage index tuples.  Cleanup can intelligently target tuples that
are likely to be garbage, without wasting too many cycles on less
promising tuples/pages (index pages with little or no version churn).

This is infrastructure for an upcoming commit that will teach nbtree to
perform bottom-up index deletion.  No index AM actually applies the hint
just yet.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
2021-01-13 08:11:00 -08:00
Alvaro Herrera c6c4b37395
Invent struct ReindexIndexInfo
This struct is used by ReindexRelationConcurrently to keep track of the
relations to process.  This saves having to obtain some data repeatedly,
and has future uses as well.
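
A rough sketch of the struct's shape (field names assumed for
illustration, not taken verbatim from the code):

	typedef struct ReindexIndexInfo
	{
		Oid		indexId;	/* OID of the index to process */
		Oid		tableId;	/* OID of the index's table */
		Oid		amId;		/* OID of the index's access method */
	} ReindexIndexInfo;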

Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Hamid Akhtar <hamid.akhtar@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20201130195439.GA24598@alvherre.pgsql
2021-01-12 17:05:06 -03:00
Tomas Vondra 8a4f618e7a Report progress of COPY commands
This commit introduces a view pg_stat_progress_copy, reporting progress
of COPY commands.  This allows rough estimates of how far a running COPY
has progressed, with the caveat that the total number of bytes may not be
available in some cases (e.g. when the input comes from the client).

Author: Josef Šimánek
Reviewed-by: Fujii Masao, Bharath Rupireddy, Vignesh C, Matthias van de Meent
Discussion: https://postgr.es/m/CAFp7QwqMGEi4OyyaLEK9DR0+E+oK3UtA4bEjDVCa4bNkwUY2PQ@mail.gmail.com
Discussion: https://postgr.es/m/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com
2021-01-06 21:51:06 +01:00
Tom Lane c9d5298485 Re-implement pl/pgsql's expression and assignment parsing.
Invent new RawParseModes that allow the core grammar to handle
pl/pgsql expressions and assignments directly, and thereby get rid
of a lot of hackery in pl/pgsql's parser.  This moves a good deal
of knowledge about pl/pgsql into the core code: notably, we have to
invent a CoercionContext that matches pl/pgsql's (rather dubious)
historical behavior for assignment coercions.  That's getting away
from the original idea of pl/pgsql as an arm's-length extension of
the core, but really we crossed that bridge a long time ago.

The main advantage of doing this is that we can now use the core
parser to generate FieldStore and/or SubscriptingRef nodes to handle
assignments to pl/pgsql variables that are records or arrays.  That
fixes a number of cases that had never been implemented in pl/pgsql
assignment, such as nested records and array slicing, and it allows
pl/pgsql assignment to support the datatype-specific subscripting
behaviors introduced in commit c7aba7c14.

There are cosmetic benefits too: when a syntax error occurs in a
pl/pgsql expression, the error report no longer includes the confusing
"SELECT" keyword that used to get prefixed to the expression text.
Also, there seem to be some small speed gains.

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 11:52:00 -05:00
Tom Lane 844fe9f159 Add the ability for the core grammar to have more than one parse target.
This patch essentially allows gram.y to implement a family of related
syntax trees, rather than necessarily always parsing a list of SQL
statements.  raw_parser() gains a new argument, enum RawParseMode,
to say what to do.  As proof of concept, add a mode that just parses
a TypeName without any other decoration, and use that to greatly
simplify typeStringToTypeName().

In addition, invent a new SPI entry point SPI_prepare_extended() to
allow SPI users (particularly plpgsql) to get at this new functionality.
In hopes of making this the last variant of SPI_prepare(), set up its
additional arguments as a struct rather than direct arguments, and
promise that future additions to the struct can default to zero.
SPI_prepare_cursor() and SPI_prepare_params() can perhaps go away at
some point.
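
A minimal usage sketch (field names assumed); zeroing the struct is the
promised forward-compatible default:

	SPIPrepareOptions options;
	SPIPlanPtr	plan;

	memset(&options, 0, sizeof(options));
	options.parseMode = RAW_PARSE_DEFAULT;	/* ordinary SQL statement list */
	options.cursorOptions = CURSOR_OPT_PARALLEL_OK;
	plan = SPI_prepare_extended("SELECT 1", &options);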

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 11:03:22 -05:00
Bruce Momjian ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Michael Paquier e665769e6d Sanitize IF NOT EXISTS in EXPLAIN for CTAS and matviews
IF NOT EXISTS was ignored when specified in an EXPLAIN query for CREATE
MATERIALIZED VIEW or CREATE TABLE AS.  Hence, if this clause was
specified, the caller would get a failure if the relation already
exists instead of a success with a NOTICE message.

This commit makes the behavior of IF NOT EXISTS in EXPLAIN consistent
with the non-EXPLAIN'd DDL queries, preventing a failure with IF NOT
EXISTS if the relation to-be-created already exists.  The skip is done
before the SELECT query used for the relation is planned or executed,
and a "dummy" plan is generated instead depending on the format used by
EXPLAIN.

Author: Bharath Rupireddy
Reviewed-by: Zhijie Hou, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACVa3oJ9O_wcGd+FtHWZds04dEKcakxphGz5POVgD4wC7Q@mail.gmail.com
2020-12-30 21:23:24 +09:00
Michael Paquier 93e8ff8701 Refactor logic to check for ASCII-only characters in string
The same logic was present for collation commands, SASLprep and
pgcrypto, so this removes some code.

Author: Michael Paquier
Reviewed-by: Stephen Frost, Heikki Linnakangas
Discussion: https://postgr.es/m/X9womIn6rne6Gud2@paquier.xyz
2020-12-21 09:37:11 +09:00
Alexander Korotkov 11072e8693 Fix compiler warning introduced in 6df7a9698b 2020-12-20 16:27:01 +03:00
Alexander Korotkov 6df7a9698b Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.

Since v14, each range type automatically gets a corresponding multirange
datatype.  There are both manual and automatic mechanisms for naming multirange
types.  One can specify a multirange type name using the multirange_type_name
attribute in CREATE TYPE.  Otherwise, a multirange type name is generated
automatically.  If the range type name contains "range" then we change that to
"multirange" (for instance, int4range gets int4multirange); otherwise, we add
"_multirange" to the end.

The implementation of multiranges comes with a space-efficient internal
representation format, which avoids extra padding and duplicated storage of
OIDs.  Altogether this format allows fetching a particular range by its index
in O(n).

Statistics gathering and selectivity estimation are implemented for
multiranges.  For this purpose, a stored multirange is approximated as a union
range without gaps.  This area will likely need improvement in the future.

Catversion is bumped.

Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 07:20:33 +03:00
Tom Lane b3817f5f77 Improve hash_create()'s API for some added robustness.
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).

Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes.  This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)

Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize.  Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.

Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one.  This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not.  We might as well save a few
cycles by standardizing on "not".
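
A minimal sketch of a call following the new rules (entry type and
table name hypothetical): HASH_ELEM is now mandatory, exactly one of
the hashing flags is set, the string keysize exceeds 8 bytes, and the
HASHCTL is deliberately not memset:

	typedef struct MyEntry
	{
		char	name[NAMEDATALEN];	/* hash key; must be first */
		int	count;
	} MyEntry;

	HASHCTL	ctl;
	HTAB   *htab;

	ctl.keysize = NAMEDATALEN;	/* > 8 bytes, per the new check */
	ctl.entrysize = sizeof(MyEntry);
	htab = hash_create("my entry table", 128, &ctl,
			   HASH_ELEM | HASH_STRINGS);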

Also improve the documentation for hash_create().

In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.

Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-15 11:38:53 -05:00
Tom Lane 8c15a29745 Allow ALTER TYPE to update an existing type's typsubscript value.
This is essential if we'd like to allow existing extension data types
to support subscripting in future, since dropping and recreating the
type isn't a practical thing for an extension upgrade script, and
direct manipulation of pg_type isn't a great answer either.

There was some discussion about also allowing alteration of typelem,
but it's less clear whether that's a good idea or not, so for now
I forbore.

Discussion: https://postgr.es/m/3724341.1607551174@sss.pgh.pa.us
2020-12-11 18:58:21 -05:00
Tom Lane c7aba7c14e Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means.  Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers.  (This patch provides no such new
features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler.  On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines.  (Thus, essentially
no new run-time overhead should be caused by this patch.  Indeed,
there is room to remove some overhead by supplying specialized
execution routines.  This patch does a little bit in that line,
but more could be done.)

Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER.  For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.
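
In sketch form, such a test could look like this (helper name
hypothetical, error handling simplified):

	static bool
	type_is_true_array(Oid typid)
	{
		HeapTuple	tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
		bool		result = false;

		if (HeapTupleIsValid(tup))
		{
			Form_pg_type typform = (Form_pg_type) GETSTRUCT(tup);

			result = (typform->typsubscript == F_ARRAY_SUBSCRIPT_HANDLER);
			ReleaseSysCache(tup);
		}
		return result;
	}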

This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09 12:40:37 -05:00
Peter Eisentraut 8b069ef5dc Change get_constraint_index() to use pg_constraint.conindid
It was still using a scan of pg_depend instead of the conindid
column that has since been added.

Since it is now just a catalog lookup wrapper and not related to
pg_depend, move from pg_depend.c to lsyscache.c.
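
In sketch form, the function becomes a plain syscache lookup
(simplified here; any filtering by constraint type is omitted):

	Oid
	get_constraint_index(Oid conoid)
	{
		HeapTuple	tp = SearchSysCache1(CONSTROID, ObjectIdGetDatum(conoid));
		Oid		result = InvalidOid;

		if (HeapTupleIsValid(tp))
		{
			result = ((Form_pg_constraint) GETSTRUCT(tp))->conindid;
			ReleaseSysCache(tp);
		}
		return result;
	}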

Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com
2020-12-09 15:41:45 +01:00
Michael Paquier 947789f1f5 Avoid using tuple from syscache for update of pg_database.datfrozenxid
pg_database.datfrozenxid gets updated using an in-place update at the
end of vacuum or autovacuum.  Since 96cdeae, as pg_database has a toast
relation, it is possible for a pg_database tuple to have toast values
if there is a large set of ACLs in place.  In such a case, the in-place
update would fail because of the flattening of the toast values done for
the catcache entry fetched.  Instead of using a copy from the catcache,
this changes the logic to fetch a copy of the tuple by directly
scanning pg_database.

Per the lack of complaints on the matter, no backpatch is done.  Note
that before 96cdeae, attempting to insert such a tuple to pg_database
would cause a "row is too big" error, so the end-of-vacuum problem was
not reachable.

Author: Ashwin Agrawal, Junfeng Yang
Discussion: https://postgr.es/m/DM5PR0501MB38800D9E4605BCA72DD35557CCE10@DM5PR0501MB3880.namprd05.prod.outlook.com
2020-12-08 12:13:19 +09:00
Michael Paquier b5913f6120 Refactor CLUSTER and REINDEX grammar to use DefElem for option lists
This changes CLUSTER and REINDEX so that a parenthesized grammar becomes
possible for options, while unifying the grammar parsing rules for
option lists with the existing ones.

This is a follow-up of the work done in 873ea9e for VACUUM, ANALYZE and
EXPLAIN.  This benefits REINDEX for a potential backend-side filtering
of collation-sensitive indexes and for TABLESPACE, while CLUSTER would
benefit from the latter.

Author: Alexey Kondratov, Justin Pryzby
Discussion: https://postgr.es/m/8a8f5f73-00d3-55f8-7583-1375ca8f6a91@postgrespro.ru
2020-12-03 10:13:21 +09:00
Alvaro Herrera 677f74e5bb
Avoid memcpy() with a NULL source pointer and count == 0
When memcpy() is called on a pointer, the compiler is entitled to assume
that the pointer is not null, which can lead to optimizing nearby code
in potentially undesirable ways.  We still want such optimizations
(gcc's -fdelete-null-pointer-checks) in cases where they're valid.
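
The defensive pattern, in sketch form (variable names hypothetical), is
simply to skip the call when there is nothing to copy:

	/* src may be NULL when count == 0, so guard the memcpy() */
	if (count > 0)
		memcpy(dest, src, count * sizeof(Datum));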

Related: commit 13bba02271.

Backpatch to pg11, where this particular instance appeared.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Reported-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CAEudQApUndmQkr5fLrCKXQ7+ib44i7S+Kk93pyVThS85PnG3bQ@mail.gmail.com
Discussion: https://postgr.es/m/CALNJ-vSdhwSM5f4tnNn1cdLHvXMVe_S+V3nR5GwNrmCPNB2VtQ@mail.gmail.com
2020-12-01 11:46:56 -03:00
Heikki Linnakangas 2bc588798b Remove leftover comments, left behind by removal of WITH OIDS.
Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGaRoF3XrhPW-Y7P%2BG7bKo84Z_h%3DkQHvMh-80%3Dav3wmOw%40mail.gmail.com
2020-11-30 10:26:43 +02:00
Tom Lane 9c83b54a9c Fix a recently-introduced race condition in LISTEN/NOTIFY handling.
Commit 566372b3d fixed some race conditions involving concurrent
SimpleLruTruncate calls, but it introduced new ones in async.c.
A newly-listening backend could attempt to read Notify SLRU pages that
were in process of being truncated, possibly causing an error.  Also,
the QUEUE_TAIL pointer could become set to a value that's not equal to
the queue position of any backend.  While that's fairly harmless in
v13 and up (thanks to commit 51004c717), in older branches it resulted
in near-permanent disabling of the queue truncation logic, so that
continued use of NOTIFY led to queue-fill warnings and eventual
inability to send any more notifies.  (A server restart is enough to
make that go away, but it's still pretty unpleasant.)

The core of the problem is confusion about whether QUEUE_TAIL
represents the "logical" tail of the queue (i.e., the oldest
still-interesting data) or the "physical" tail (the oldest data we've
not yet truncated away).  To fix, split that into two variables.
QUEUE_TAIL regains its definition as the logical tail, and we
introduce a new variable to track the oldest un-truncated page.

Per report from Mikael Gustavsson.  Like the previous patch,
back-patch to all supported branches.

Discussion: https://postgr.es/m/1b8561412e8a4f038d7a491c8b922788@smhi.se
2020-11-28 14:03:40 -05:00
Alvaro Herrera dcfff74fb1
Restore lock level to update statusFlags
Reverts 27838981be (some comments are kept).  Per discussion, it does
not seem safe to relax the lock level used for this; in order for it to
be safe, there would have to be memory barriers between the point we set
the flag and the point we set the trasaction Xid, which perhaps would
not be so bad; but there would also have to be barriers at the readers'
side, which from a performance perspective might be bad.

Now maybe this analysis is wrong and it *is* safe for some reason, but
proof of that is not trivial.

Discussion: https://postgr.es/m/20201118190928.vnztes7c2sldu43a@alap3.anarazel.de
2020-11-26 12:30:48 -03:00
Alvaro Herrera c98763bf51
Avoid spurious waits in concurrent indexing
In the various waiting phases of CREATE INDEX CONCURRENTLY (CIC) and
REINDEX CONCURRENTLY (RC), we wait for other processes to release their
snapshots; this is necessary in general for correctness.  However,
processes doing CIC in other tables cannot possibly affect CIC or RC
done in "this" table, so we don't need to wait for those.  This commit
adds a flag in MyProc->statusFlags to indicate that the current process
is doing CIC, so that other processes doing CIC or RC can ignore it when
waiting.

Note that this logic is only valid if the index does not access other
tables.  For simplicity we avoid setting the flag if the index has a
column that's an expression, or has a WHERE predicate.  (It is possible
to have expressional or partial indexes that do not access other tables,
but figuring that out would require more work.)

This flag could potentially also let processes doing REINDEX
CONCURRENTLY be skipped; and let VACUUM ignore processes in CIC or
RC for the purposes of computing an Xmin.  That's left for future
commits.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20200810233815.GA18970@alvherre.pgsql
2020-11-25 18:22:57 -03:00
Tom Lane ec05bafdbb Put "inline" marker on declarations of inline functions.
I'm having a hard time telling whether the letter of the C standard
requires this, but we do have a couple of buildfarm members that
throw warnings when this is not done.  Oversight in c532d15dd.
2020-11-24 15:43:01 -05:00
Heikki Linnakangas c532d15ddd Split copy.c into four files.
Copy.c has grown really large. Split it into more manageable parts:

- copy.c now contains only a few functions that are common to COPY FROM
  and COPY TO.

- copyto.c contains code for COPY TO.

- copyfrom.c contains code for initializing COPY FROM, and inserting the
  tuples to the correct table.

- copyfromparse.c contains code for reading from the client/file/program,
  and parsing the input text/CSV/binary format into tuples.

All of these parts are fairly complicated, and fairly independent of each
other. There is a patch being discussed to implement parallel COPY FROM,
which will add a lot of new code to the COPY FROM path, and another patch
which would allow INSERTs to use the same multi-insert machinery as COPY
FROM, both of which will require refactoring that code. With those two
patches, there's going to be a lot of code churn in copy.c anyway, so now
seems like a good time to do this refactoring.

The CopyStateData struct is also split. All the formatting options, like
FORMAT, QUOTE, ESCAPE, are put in a new CopyFormatOption struct, which
is used by both COPY FROM and TO. Other state data are kept in separate
CopyFromStateData and CopyToStateData structs.

Reviewed-by: Soumyadeep Chakraborty, Erik Rijkers, Vignesh C, Andres Freund
Discussion: https://www.postgresql.org/message-id/8e15b560-f387-7acc-ac90-763986617bfb%40iki.fi
2020-11-23 10:50:50 +02:00
Michael Paquier 878f3a19c6 Remove INSERT privilege check at table creation of CTAS and matview
As per discussion with Peter Eisentraut, the SQL standard specifies
that any tuple insertion done as part of CREATE TABLE AS happens without
any extra ACL check, so it makes little sense to keep a check for INSERT
privileges when using WITH DATA.  Materialized views are not part of the
standard, but similarly, this check can be confusing as this refers to
an access check on a table created within the same command as the one
that would insert data into this table.

This commit removes the INSERT privilege check for WITH DATA, the
default; 846005e had already removed it, but only for WITH NO DATA.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/d049c272-9a47-d783-46b0-46665b011598@enterprisedb.com
2020-11-21 19:45:30 +09:00
Alvaro Herrera 27838981be
Relax lock level for setting PGPROC->statusFlags
We don't actually need a lock to set PGPROC->statusFlags itself; what we
do need is a shared lock on either XidGenLock or ProcArrayLock in order to
ensure MyProc->pgxactoff keeps still while we modify the mirror array in
ProcGlobal->statusFlags.  Some places were using an exclusive lock for
that, which is excessive.  Relax those to use shared lock only.

procarray.c has a couple of places with somewhat brittle assumptions
about PGPROC changes: ProcArrayEndTransaction uses only shared lock, so
it's permissible to change MyProc only.  On the other hand,
ProcArrayEndTransactionInternal also changes other procs, so it must
hold exclusive lock.  Add asserts to ensure those assumptions continue
to hold.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20201117155501.GA13805@alvherre.pgsql
2020-11-18 13:24:22 -03:00
Alvaro Herrera 7684b6fbed
indexcmds.c: reorder function prototypes
... out of an overabundance of neatnikism, perhaps.
2020-11-17 14:22:26 -03:00
Alvaro Herrera cd9c1b3e19
Rename PGPROC->vacuumFlags to statusFlags
With more flags associated with a PGPROC entry that are not related to
vacuum (currently existing or planned), the name "statusFlags" describes
its purpose better.

(The same is done to the mirroring PROC_HDR->vacuumFlags.)

No functional changes in this commit.

This was suggested first by Hari Babu Kommi in [1] and then by Michael
Paquier at [2].

[1] https://postgr.es/m/CAJrrPGcsDC-oy1AhqH0JkXYa0Z2AgbuXzHPpByLoBGMxfOZMEQ@mail.gmail.com
[2] https://postgr.es/m/20200820060929.GB3730@paquier.xyz

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20201116182446.qcg3o6szo2zookyr@localhost
2020-11-16 19:42:55 -03:00
Tom Lane ad84ecc98d Use "true" not "TRUE" in one ICU function call.
This was evidently missed in commit 6337865f3, which generally did
s/TRUE/true/ everywhere.  It escaped notice up to now because ICU
versions before ICU 68 provided definitions of "TRUE" and "FALSE"
regardless.  With ICU 68, it fails to compile.

Per report from Condor.  Back-patch to v11 where 6337865f3 came in.
(I've not tested v10, where this call originated, but I imagine
it's fine since we defined TRUE in c.h back then.)

Discussion: https://postgr.es/m/7a6f3336165bfe3ca66abcda7966f9d0@stz-bg.com
2020-11-16 15:16:39 -05:00
Michael Paquier 846005e4f3 Relax INSERT privilege requirement for CTAS and matviews WITH NO DATA
When specified, WITH NO DATA does not insert any data into the relation
created, so skip the check for INSERT permissions.  With WITH DATA or
WITH NO DATA, the user is still always required to have CREATE
privileges on the schema targeted for the relation.

Note that plain CREATE TABLE AS or CREATE MATERIALIZED VIEW queries have
accidentally worked without INSERT privilege checks as of
874fe3ae, while using EXECUTE or EXPLAIN ANALYZE would fail with the ACL
check, so this makes the behavior for all the command flavors consistent
with each other.  This is arguably a bug fix, but there have been no
complaints about the current behavior either so stable branches are not
changed.

While on it, properly document the privilege requirements for each
command with more tests for all the possible scenarios, and avoid a
useless bulk-insert allocation when using WITH NO DATA.

Author: Bharath Rupireddy
Reviewed-by: Anastasia Lubennikova, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACWc3N8j0_9nMPz9wcAUnVcdKHzFdDZJ3hVFNEbqtcyG9w@mail.gmail.com
2020-11-16 11:52:40 +09:00
Tom Lane 29d29d652f Fix fuzzy thinking about amcanmulticol versus amcaninclude.
These flags should be independent: in particular an index AM should
be able to say that it supports include columns without necessarily
supporting multiple key columns.  The included-columns patch got
this wrong, possibly aided by the fact that it didn't bother to
update the documentation.

While here, clarify some text about amcanreturn, which was a little
vague about what should happen when amcanreturn reports that only
some of the index columns are returnable.

Noted while reviewing the SP-GiST included-columns patch, which
quite incorrectly (and unsafely) changed SP-GiST to claim
amcanmulticol = true as a workaround for this bug.

Backpatch to v11 where included columns were introduced.
2020-11-15 16:10:58 -05:00
Tom Lane 92bf7e2d02 Provide the OR REPLACE option for CREATE TRIGGER.
This is mostly straightforward.  However, we disallow replacing
constraint triggers or changing the is-constraint property; perhaps
that can be added later, but the complexity versus benefit tradeoff
doesn't look very good.

Also, no special thought is taken here for whether replacing an
existing trigger should result in changes to queued-but-not-fired
trigger actions.  We just document that if you're surprised by the
results, too bad, don't do that.  (Note that any such pending trigger
activity would have to be within the current session.)

Takamichi Osumi, reviewed at various times by Surafel Temesgen,
Peter Smith, and myself

Discussion: https://postgr.es/m/0DDF369B45A1B44B8A687ED43F06557C010BC362@G01JPEXMBYT03
2020-11-14 17:05:34 -05:00
Fujii Masao b62e6056a0 pg_stat_statements: track number of rows processed by REFRESH MATERIALIZED VIEW.
Commit 6023b7ea71 allowed pg_stat_statements to track the number
of rows retrieved or affected by some utility commands including
CREATE MATERIALIZED VIEW. However it did not track the rowcount
of REFRESH MATERIALIZED VIEW. This commit allows pg_stat_statements
to track that.

To track that, this commit changes the query completion for
REFRESH MATERIALIZED VIEW so that it saves the rowcount. But note that
the rowcount is still not displayed in the command completion tag output.
That is, the display_rowcount flag of CMDTAG_REFRESH_MATERIALIZED_VIEW
command tag is left false in cmdtaglist.h. Otherwise, the change of
completion tag output might break applications using it.

Author: Katsuragi Yuta, Seino Yuki
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/71f6bc72f8bbaa06e701f8bd2562c347@oss.nttdata.com
Discussion: https://postgr.es/m/aadbfba9-e4bb-9531-6b3a-d13c31c8f4fe@oss.nttdata.com
2020-11-12 11:26:55 +09:00
Peter Eisentraut c77f6f50e4 Fix cases of discarding result from list API functions
Two cases violated list APIs by throwing away the return value.  While
the code was technically correct, it relied on internal knowledge of
the list implementation, and the code wasn't really gaining anything
that way.  It is planned to make this a compiler warning in the
future, so just fix these cases by assigning the return value
properly.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/e3753562-99cd-b65f-5aca-687dfd1ec2fc@2ndquadrant.com
2020-11-11 08:03:51 +01:00
Noah Misch 0c3185e963 In security-restricted operations, block enqueue of at-commit user code.
Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred
triggers within index expressions and materialized view queries.  An
attacker having permission to create non-temp objects in at least one
schema could execute arbitrary SQL functions under the identity of the
bootstrap superuser.  One can work around the vulnerability by disabling
autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE
INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW.  (Don't restore from
pg_dump, since it runs some of those commands.)  Plain VACUUM (without
FULL) is safe, and all commands are fine when a trusted user owns the
target object.  Performance may degrade quickly under this workaround,
however.  Back-patch to 9.5 (all supported versions).

Reviewed by Robert Haas.  Reported by Etienne Stalmans.

Security: CVE-2020-25695
2020-11-09 07:32:09 -08:00
Alvaro Herrera 52eec1c53a
Message style improvements
* Avoid pointlessly highlighting that an index vacuum was executed by a
  parallel worker; user doesn't care.

* Don't give the impression that a non-concurrent reindex of an invalid
  index on a TOAST table would work, because it wouldn't.

* Add a "translator:" comment for a mysterious message.

Discussion: https://postgr.es/m/20201107034943.GA16596@alvherre.pgsql
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2020-11-07 19:33:43 -03:00
Peter Eisentraut bdc4edbea6 Move catalog index declarations
Move the system catalog index declarations from catalog/indexing.h to
the respective parent tables' catalog/pg_*.h files.  The original
reason for having it split was that the old genbki system produced the
output in the order of the catalog files it read, so all the indexing
stuff needed to come separately.  But this is no longer the case, and
keeping it together makes more sense.

Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/c7cc82d6-f976-75d6-2e3e-b03d2cab26bb@2ndquadrant.com
2020-11-07 12:26:24 +01:00
Tom Lane eeda7f6338 Revert "Accept relations of any kind in LOCK TABLE".
Revert 59ab4ac32, as well as the followup fix 33862cb9c, in all
branches.  We need to think a bit harder about what the behavior
of LOCK TABLE on views should be, and there's no time for that
before next week's releases.  We'll take another crack at this
later.

Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org
2020-11-06 16:17:56 -05:00
Tom Lane 5b7bfc3972 Don't throw an error for LOCK TABLE on a self-referential view.
LOCK TABLE has complained about "infinite recursion" when applied
to a self-referential view, ever since we made it recurse into views
in v11.  However, that breaks pg_dump's new assumption that it's
okay to lock every relation.  There doesn't seem to be any good
reason to throw an error: if we just abandon the recursion, we've
still satisfied the requirement of locking every referenced relation.

Per bug #16703 from Andrew Bille (via Alexander Lakhin).

Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org
2020-11-05 11:44:32 -05:00
Tom Lane 17fb60387c Improve error messages around REPLICATION and BYPASSRLS properties.
Clarify wording as per suggestion from Wolfgang Walther.
No back-patch; this doesn't seem worth thrashing translatable
strings in the back branches.

Tom Lane and Stephen Frost

Discussion: https://postgr.es/m/a5548a9f-89ee-3167-129d-162b5985fcf8@technowledgy.de
2020-11-03 15:49:05 -05:00
Tom Lane d907bd0543 Allow users with BYPASSRLS to alter their own passwords.
The intention in commit 491c029db was to require superuserness to
change the BYPASSRLS property, but the actual effect of the coding
in AlterRole() was to require superuserness to change anything at all
about a BYPASSRLS role.  Other properties of a BYPASSRLS role should
be changeable under the same rules as for a normal role, though.

Fix that, and also take care of some documentation omissions related
to BYPASSRLS and REPLICATION role properties.

Tom Lane and Stephen Frost, per bug report from Wolfgang Walther.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/a5548a9f-89ee-3167-129d-162b5985fcf8@technowledgy.de
2020-11-03 15:41:32 -05:00
Peter Eisentraut bf797a8d97 Disallow ALTER TABLE ONLY / DROP EXPRESSION
The current implementation cannot handle this correctly, so just
forbid it for now.

GENERATED clauses must be attached to the column definition and cannot
be added later like DEFAULT, so if a child table has a generation
expression that the parent does not have, the child column will
necessarily be an attislocal column.  So to implement ALTER TABLE ONLY /
DROP EXPRESSION, we'd need extra code to update attislocal of the
direct child tables, somewhat similar to how DROP COLUMN does it, so
that the resulting state can be properly dumped and restored.

Discussion: https://www.postgresql.org/message-id/flat/15830.1575468847%40sss.pgh.pa.us
2020-11-03 15:28:23 +01:00
Thomas Munro 257836a755 Track collation versions for indexes.
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index.  When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.

Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-11-03 01:19:50 +13:00
Thomas Munro 7d1297df08 Remove pg_collation.collversion.
This model couldn't be extended to cover the default collation, and
didn't have any information about the affected database objects when the
version changed.  Remove, in preparation for a follow-up commit that
will add a new mechanism.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-11-03 00:44:59 +13:00
Michael Paquier 8a15e735be Fix some grammar and typos in comments and docs
The documentation fixes are backpatched down to where they apply.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20201031020801.GD3080@telsasoft.com
Backpatch-through: 9.6
2020-11-02 15:14:41 +09:00
Tom Lane f90149e628 Don't use custom OID symbols in pg_type.dat, either.
On the same reasoning as in commit 36b931214, forbid using custom
oid_symbol macros in pg_type as well as pg_proc, so that we always
rely on the predictable macro names generated by genbki.pl.

We do continue to grant grandfather status to the names CASHOID and
LSNOID, although those are now considered deprecated aliases for the
preferred names MONEYOID and PG_LSNOID.  This is because there's
likely to be client-side code using the old names, and this bout of
neatnik-ism doesn't quite seem worth breaking client code.

There might be a case for grandfathering EVTTRIGGEROID, too, since
externally-maintained PLs may reference that symbol.  But renaming
such references to EVENT_TRIGGEROID doesn't seem like a particularly
heavy lift --- we make far more significant backend API changes in
every major release.  For now I didn't add that, but we could
reconsider if there's pushback.

The other names changed here seem pretty unlikely to have any outside
uses.  Again, we could add alias macros if there are complaints, but
for now I didn't.

As before, no need for a catversion bump.

John Naylor

Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
2020-10-29 13:33:38 -04:00
Andres Freund 94bc27b576 Centralize horizon determination for temp tables, fixing bug due to skew.
This fixes a bug in the edge case where, for a temp table, heap_page_prune()
can end up with a different horizon than heap_vacuum_rel(), which can trigger
errors like "ERROR: cannot freeze committed xmax ...".

The bug was introduced due to interaction of a7212be8b9 "Set cutoff xmin more
aggressively when vacuuming a temporary table." with dc7420c2c9 "snapshot
scalability: Don't compute global horizons while building snapshots.".

The problem is caused by lazy_scan_heap() assuming that the only reason its
HeapTupleSatisfiesVacuum() call would return HEAPTUPLE_DEAD is if the tuple is
a HOT tuple, or if the tuple's inserting transaction has aborted since the
heap_page_prune() call. But after a7212be8b9 that was also possible in other
cases for temp tables, because heap_page_prune() uses a different visibility
test after dc7420c2c9.

The fix is fairly simple: Move the special case logic for temp tables from
vacuum_set_xid_limits() to the infrastructure introduced in dc7420c2c9. That
ensures that the horizon used for pruning is at least as aggressive as the one
used by lazy_scan_heap(). The concrete horizon used for temp tables is
slightly different than the logic in dc7420c2c9, but should always be as
aggressive as before (see comments).

A significant benefit of centralizing the logic in procarray.c is that now the
more aggressive horizons for temp tables do not just apply to VACUUM but
also to e.g. HOT pruning and the nbtree killtuples logic.

Because isTopLevel is not needed by vacuum_set_xid_limits() anymore, I
undid the related changes from a7212be8b9.

This commit also adds an isolation test ensuring that the more aggressive
vacuuming and pruning of temp tables keeps working.

Debugged-By: Amit Kapila <amit.kapila16@gmail.com>
Debugged-By: Tom Lane <tgl@sss.pgh.pa.us>
Debugged-By: Ashutosh Sharma <ashu.coek88@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20201014203103.72oke6hqywcyhx7s@alap3.anarazel.de
Discussion: https://postgr.es/m/20201015083735.derdzysdtqdvxshp@alap3.anarazel.de
2020-10-28 18:02:31 -07:00
Alvaro Herrera 59ab4ac324
Accept relations of any kind in LOCK TABLE
The restriction that only tables and views can be locked by LOCK TABLE
is quite arbitrary, since the underlying mechanism can lock any relation
type.  Drop the restriction so that programs such as pg_dump can lock
all relations they're interested in, preventing schema changes that
could cause a dump to fail after expending much effort.

Backpatch to 9.5.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Wells Oliver <wells.oliver@gmail.com>
Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
2020-10-27 13:49:19 -03:00
Tom Lane 20d3fe9009 In INSERT/UPDATE, use the table's real tuple descriptor as target.
Previously, ExecInitModifyTable relied on ExecInitJunkFilter,
and thence ExecCleanTypeFromTL, to build the target descriptor from
the query tlist.  While we just checked (in ExecCheckPlanOutput)
that the tlist produces compatible output, this is not a great
substitute for the relation's actual tuple descriptor that's
available from the relcache.  For one thing, dropped columns will
not be correctly marked attisdropped; it's a bit surprising that
we've gotten away with that this long.  But the real reason for
being concerned with this is that using the table's descriptor means
that the slot will have correct attrmissing data, allowing us to
revert the klugy fix of commit ba9f18abd.  (This commit undoes
that one's changes in trigger.c, but keeps the new test case.)
Thus we can solve the bogus-trigger-tuple problem with fewer cycles
rather than more.

No back-patch, since this doesn't fix any additional bug, and it
seems somewhat more likely to have unforeseen side effects than
ba9f18abd's narrow fix.

Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org
2020-10-26 11:36:53 -04:00
Tom Lane ba9f18abd3 Fix corner case for a BEFORE ROW UPDATE trigger returning OLD.
If the old row has any "missing" attributes that are supposed to
be retrieved from an associated tuple descriptor, the wrong things
happened because the trigger result is shoved directly into an
executor slot that lacks the missing-attribute data.  Notably,
CHECK-constraint verification would incorrectly see those columns
as NULL, and so would RETURNING-list evaluation.

Band-aid around this by forcibly expanding the tuple before passing
it to the trigger function.  (IMO it was a fundamental misdesign to
put the missing-attribute data into tuple constraints, which so
much of the system considers to be optional.  But we're probably
stuck with that now, and will have to continue to apply band-aids
as we find other places with similar issues.)

Back-patch to v12.  v11 would also have the issue, except that
commit 920311ab1 already applied a similar band-aid.  That forced
expansion in more cases than seem really necessary, though, so
this isn't a directly equivalent fix.

Amit Langote, with some cosmetic changes by me

Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org
2020-10-25 13:57:46 -04:00
Tom Lane 87a174c0e7 Fix broken XML formatting in EXPLAIN output for incremental sorts.
The ExplainCloseGroup arguments for incremental sort usage data
didn't match the corresponding ExplainOpenGroup.  This only matters
for XML-format output, which is probably why we'd not noticed.

Daniel Gustafsson, per bug #16683 from Frits Jalvingh

Discussion: https://postgr.es/m/16683-8005033324ad34e9@postgresql.org
2020-10-23 11:32:33 -04:00
Alvaro Herrera bbb927b4db
Fix ALTER TABLE .. ENABLE/DISABLE TRIGGER recursion
More precisely, correctly handle the ONLY flag indicating not to
recurse.  This was implemented in 86f575948c by recursing in
trigger.c, but that's the wrong place; use ATSimpleRecursion instead,
which behaves properly.  However, because legacy inheritance has never
recursed in that situation, make sure to do that only for new-style
partitioning.

I noticed this problem while testing a fix for another bug in the
vicinity.

This has been wrong all along, so backpatch to 11.

Discussion: https://postgr.es/m/20201016235925.GA29829@alvherre.pgsql
2020-10-20 19:22:09 -03:00
Heikki Linnakangas fb5883da86 Remove PartitionRoutingInfo struct.
The extra indirection needed to access its members via its enclosing
ResultRelInfo seems pointless. Move all the fields from
PartitionRoutingInfo to ResultRelInfo.

Author: Amit Langote
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com
2020-10-19 14:42:55 +03:00
Heikki Linnakangas 6973533650 Revise child-to-root tuple conversion map management.
Store the tuple conversion map to convert a tuple from a child table's
format to the root format in a new ri_ChildToRootMap field in
ResultRelInfo. It is initialized if transition tuple capture for FOR
STATEMENT triggers or INSERT tuple routing on a partitioned table is
needed. Previously, ModifyTable kept the maps in the per-subplan
ModifyTableState->mt_per_subplan_tupconv_maps array, or when tuple
routing was used, in
ResultRelInfo->ri_Partitioninfo->pi_PartitionToRootMap. The new field
replaces both of those.

Now that the child-to-root tuple conversion map is always available in
ResultRelInfo (when needed), remove the TransitionCaptureState.tcs_map
field. The callers of Exec*Trigger() functions no longer need to set or
save it, which is much less confusing and bug-prone. Also, as a future
optimization, this will allow us to delay creating the map for a given
result relation until the relation is actually processed during
execution.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqHtCWLdK-LO%3DNEsvOdHx%2B7yv4mE_zYK0i3BH7dXb-wxog%40mail.gmail.com
2020-10-19 14:42:55 +03:00
David Rowley 110d81728a Fixup some appendStringInfo and appendPQExpBuffer calls
A number of places were using appendStringInfo() when they could have been
using appendStringInfoString() instead.  While there's no functionality
change there, it's just more efficient to use appendStringInfoString()
when no formatting is required.  Likewise for some
appendStringInfoString() calls which were just appending a single char.
We can just use appendStringInfoChar() for that.

Additionally, many places were using appendPQExpBuffer() when they could
have used appendPQExpBufferStr(). Change those too.
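
In sketch form, the three replacements look like this (buf and pbuf
being a hypothetical StringInfo and PQExpBuffer):

	appendStringInfoString(buf, "WHERE ");	/* not appendStringInfo() */
	appendStringInfoChar(buf, ',');		/* not appendStringInfoString() */
	appendPQExpBufferStr(pbuf, "SELECT ");	/* not appendPQExpBuffer() */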

Patch by Zhijie Hou, but further searching by me found significantly more
places that deserved the same treatment.

Author: Zhijie Hou, David Rowley
Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local
2020-10-15 20:35:17 +13:00
Heikki Linnakangas a04daa97a4 Remove es_result_relation_info from EState.
Maintaining 'es_result_relation_info' correctly at all times has become
cumbersome, especially with partitioning where each partition gets its
own result relation info. Having to set and reset it across arbitrary
operations has caused bugs in the past.

This changes all the places that used 'es_result_relation_info', to
receive the currently active ResultRelInfo via function parameters
instead.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com
2020-10-14 11:41:40 +03:00
Heikki Linnakangas 1375422c78 Create ResultRelInfos later in InitPlan, index them by RT index.
Instead of allocating all the ResultRelInfos upfront in one big array,
allocate them in ExecInitModifyTable(). es_result_relations is now an
array of ResultRelInfo pointers, rather than an array of structs, and it
is indexed by the RT index.

This simplifies things: we get rid of the separate concept of a "result
rel index", and don't need to set it in setrefs.c anymore. This also
allows follow-up optimizations (not included in this commit yet) to skip
initializing ResultRelInfos for target relations that were not needed at
runtime, and removal of the es_result_relation_info pointer.

The EState arrays of regular result rels and root result rels are merged
into one array. Similarly, the resultRelations and rootResultRelations
lists in PlannedStmt are merged into one. It's not actually clear to me
why they were kept separate in the first place, but now that the
es_result_relations array is indexed by RT index, it certainly seems
pointless.

The PlannedStmt->resultRelations list is now only needed for
ExecRelationIsTargetRelation(). One visible effect of this change is that
ExecRelationIsTargetRelation() will now return 'true' also for the
partition root, if a partitioned table is updated. That seems like a good
thing, although the function isn't used in core code, and I don't see any
reason for an FDW to call it on a partition root.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com
2020-10-13 12:57:02 +03:00
Peter Eisentraut 2453ea1422 Support for OUT parameters in procedures
Unlike for functions, OUT parameters for procedures are part of the
signature.  Therefore, they have to be listed in pg_proc.proargtypes
as well as mentioned in ALTER PROCEDURE and DROP PROCEDURE.

Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/2b8490fe-51af-e671-c504-47359dc453c5@2ndquadrant.com
2020-10-05 09:21:43 +02:00
Michael Paquier 10c5291cc2 Fix handling of redundant options with COPY for "freeze" and "header"
The handling of those options was inconsistent, as the processing directly
used the value assigned to the option to check whether it was redundant,
allowing patterns like this one to succeed (note that false is
specified first):
COPY hoge to '/path/to/file/' (header off, header on);

And the opposite would fail correctly (note that true is first here):
COPY hoge to '/path/to/file/' (header on, header off);
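
The corrected pattern, in sketch form (flag and field names assumed),
tracks whether the option has been seen at all rather than testing its
current value:

	if (strcmp(defel->defname, "header") == 0)
	{
		if (header_specified)	/* already seen, whatever its value */
			ereport(ERROR,
				(errcode(ERRCODE_SYNTAX_ERROR),
				 errmsg("conflicting or redundant options")));
		header_specified = true;
		cstate->header_line = defGetBoolean(defel);
	}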

While on it, add some tests to check for all redundant patterns with the
options of COPY.  I have gone through the code and did not notice
similar mistakes for other commands.

"header" got it wrong since b63990c, and "freeze" was wrong from the
start as of 8de72b6.  No backpatch is done per the lack of complaints.

Reported-by: Rémi Lapeyre
Discussion: https://postgr.es/m/20200929072433.GA15570@paquier.xyz
Discussion: https://postgr.es/m/0B55BD07-83E4-439F-AACC-FA2D7CF50532@lenstra.fr
2020-10-05 09:43:17 +09:00
Alvaro Herrera 9fc2122712
Reword partitioning error message
The error message about columns in the primary key not including all of
the partition key was unclear; reword it.

Backpatch all the way to pg11, where it appeared.

Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com>
Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
2020-09-30 18:25:23 -03:00
Michael Paquier e66bcfb4c6 Fix progress reporting of REINDEX CONCURRENTLY
This addresses a couple of issues with REINDEX CONCURRENTLY's progress
reporting:
- Report the correct parent relation with the index actually being
rebuilt or validated.  Previously, the progress of the index build and
validation kept reporting the last index created, which would be
incorrect when working on a table that has
more than one index.
- Use the correct phase when waiting before the drop of the old
indexes.  Previously, this was reported with the same status as when
waiting before the old indexes are marked as dead.

Author: Matthias van de Meent, Michael Paquier
Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com
Backpatch-through: 12
2020-09-29 14:15:57 +09:00
Tom Lane 56fe008996 Add for_each_from, to simplify loops starting from non-first list cells.
We have a dozen or so places that need to iterate over all but the
first cell of a List.  Prior to v13 this was typically written as
	for_each_cell(lc, lnext(list_head(list)))
Commit 1cff1b95a changed these to
	for_each_cell(lc, list, list_second_cell(list))
This patch introduces a new macro for_each_from() which expresses
the start point as a list index, allowing these to be written as
	for_each_from(lc, list, 1)
This is marginally more efficient, since ForEachState.i can be
initialized directly instead of backing into it from a ListCell
address.  It also seems clearer and less typo-prone.
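
For example (list name hypothetical):

	ListCell   *lc;

	for_each_from(lc, my_list, 1)
	{
		Node	   *item = (Node *) lfirst(lc);

		/* process every element except the first */
	}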

Some of the remaining uses of for_each_cell() look like they could
profitably be changed to for_each_from(), but here I confined myself
to changing uses of list_second_cell().

Also, fix for_each_cell_setup() and for_both_cell_setup() to
const-ify their arguments; that's a simple oversight in 1cff1b95a.

Back-patch into v13, on the grounds that (1) the const-ification
is a minor bug fix, and (2) it's better for back-patching purposes
if we only have two ways to write these loops rather than three.

In HEAD, also remove list_third_cell() and list_fourth_cell(),
which were also introduced in 1cff1b95a, and are unused as of
cc99baa43.  It seems unlikely that any third-party code would
have started to use them already; anyone who has can be directed
to list_nth_cell instead.

Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
2020-09-28 20:33:13 -04:00
Tom Lane 9d299a4924 Minor mop-up for List improvements.
Fix a few places that were using written-out versions of the
pg_list.h macros that commit cc99baa43 just improved, making them
also use those macros so as to gain whatever performance improvement
is to be had.

Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
2020-09-27 22:30:52 -04:00
Tom Lane e55f718fc4 Revise RelationBuildRowSecurity() to avoid memory leaks.
This function leaked some memory while loading qual clauses for
an RLS policy.  While ordinarily negligible, that could build up
in some repeated-reload cases, as reported by Konstantin Knizhnik.
We can improve matters by borrowing the coding long used in
RelationBuildRuleLock: build stringToNode's result directly in
the target context, and remember to explicitly pfree the
input string.

This patch by no means completely guarantees zero leaks within
this function, since we have no real guarantee that the catalog-
reading subroutines it calls don't leak anything.  However,
practical tests suggest that this is enough to resolve the issue.
In any case, any remaining leaks are similar to those risked by
RelationBuildRuleLock and other relcache-loading subroutines.
If we need to fix them, we should adopt a more global approach
such as that used by the RECOVER_RELATION_BUILD_MEMORY hack.

While here, let's remove the need for an expensive PG_TRY block by
using MemoryContextSetParent to reparent an initially-short-lived
context for the RLS data.
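
A minimal sketch of the reparenting pattern (assumed names; the real
function carries more detail):

	MemoryContext tmpcxt;
	MemoryContext oldcxt;

	tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
								   "row security descriptor",
								   ALLOCSET_SMALL_SIZES);
	oldcxt = MemoryContextSwitchTo(tmpcxt);

	/* ... stringToNode() and friends may leak, but only into tmpcxt ... */

	MemoryContextSwitchTo(oldcxt);

	/*
	 * On success, keep everything by attaching tmpcxt to a long-lived
	 * parent; on error, tmpcxt is freed along with its transient parent,
	 * so no PG_TRY block is needed.
	 */
	MemoryContextSetParent(tmpcxt, CacheMemoryContext);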

Back-patch to all supported branches.

Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru
2020-09-26 16:04:06 -04:00
Thomas Munro dee663f784 Defer flushing of SLRU files.
Previously, we called fsync() after writing out individual pg_xact,
pg_multixact and pg_commit_ts pages due to cache pressure, leading to
regular I/O stalls in user backends and recovery.  Collapse requests for
the same file into a single system call as part of the next checkpoint,
as we already did for relation files, using the infrastructure developed
by commit 3eb77eba.  This can cause a significant improvement to
recovery performance, especially when it's otherwise CPU-bound.
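
A hedged sketch of the deferral pattern ("tag" identifies the SLRU
segment; its construction is omitted here):

	/* after writing an SLRU page, queue the fsync instead of doing it */
	if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
		pg_fsync(fd);		/* queue full: fall back to a direct fsync */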

Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer
that it applies to all the SLRU mini-buffer-pools as well as the main
buffer pool.  Rearrange things so that data collected in CheckpointStats
includes SLRU activity.

Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions,
because they were redundant after the shutdown checkpoint that
immediately precedes them.  (I'm not sure if they were ever needed, but
they aren't now.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com>
Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
2020-09-25 19:00:15 +12:00
Tom Lane 6b2c4e59d0 Improve error cursor positions for problems with partition bounds.
We failed to pass down the query string to check_new_partition_bound,
so that its attempts to provide error cursor positions were for naught;
one must have the query string to get parser_errposition to do anything.
Adjust its API to require a ParseState to be passed down.

Also, improve the logic inside check_new_partition_bound so that the
cursor points at the partition bound for the specific column causing
the issue, when one can be identified.

That part is also for naught if we can't determine the query position of
the column with the problem.  Improve transformPartitionBoundValue so
that it makes sure that const-simplified partition expressions will be
properly labeled with positions.  In passing, skip calling evaluate_expr
if the value is already a Const, which is surely the most common case.
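
The pattern for attaching a cursor position looks roughly like this
(the error message and "bound_expr" are hypothetical):

	ereport(ERROR,
			(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
			 errmsg("invalid bound specification"),
			 parser_errposition(pstate, exprLocation((Node *) bound_expr))));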

Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com
Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
2020-09-23 18:04:53 -04:00
Tom Lane 1ed6b89563 Remove support for postfix (right-unary) operators.
This feature has been a thorn in our sides for a long time, causing
many grammatical ambiguity problems.  It doesn't seem worth the
pain to continue to support it, so remove it.

There are some follow-on improvements we can make in the grammar,
but this commit only removes the bare minimum number of productions,
plus assorted backend support code.

Note that pg_dump and psql continue to have full support, since
they may be used against older servers.  However, pg_dump warns
about postfix operators.  There is also a check in pg_upgrade.

Documentation-wise, I (tgl) largely removed the "left unary"
terminology in favor of saying "prefix operator", which is
a more standard and IMO less confusing term.

I included a catversion bump, although no initial catalog data
changes here, to mark the boundary at which oprkind = 'r'
stopped being valid in pg_operator.

Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor

Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17 19:38:05 -04:00
Tom Lane 2000b6c10a Don't fetch partition check expression during InitResultRelInfo.
Since there is only one place that actually needs the partition check
expression, namely ExecPartitionCheck, it's better to fetch it from
the relcache there.  In this way we will never fetch it at all if
the query never has use for it, and we still fetch it just once when
we do need it.

The reason for taking an interest in this is that if the relcache
doesn't already have the check expression cached, fetching it
requires obtaining AccessShareLock on the partition root.  That
means that operations that look like they should only touch the
partition itself will also take a lock on the root.  In particular
we observed that TRUNCATE on a partition may take a lock on the
partition's root, contributing to a deadlock situation in parallel
pg_restore.

As written, this patch does have a small cost, which is that we
are microscopically reducing efficiency for the case where a partition
has an empty check expression.  ExecPartitionCheck will be called,
and will go through the motions of setting up and checking an empty
qual, where before it would not have been called at all.  We could
avoid that by adding a separate boolean flag to track whether there
is a partition expression to test.  However, this case only arises
for a default partition with no siblings, which surely is not an
interesting case in practice.  Hence adding complexity for it
does not seem like a good trade-off.
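
A minimal sketch of the lazy retrieval (assumed field and function
names, not the exact patch):

	/* in ExecPartitionCheck: fetch the qual from the relcache on first use */
	if (resultRelInfo->ri_PartitionCheck == NIL)
		resultRelInfo->ri_PartitionCheck =
			RelationGetPartitionQual(resultRelInfo->ri_RelationDesc);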

Amit Langote, per a suggestion by me

Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16 14:28:18 -04:00
Tom Lane e5fac1cb19 Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.
If a partitioned table's column is already marked NOT NULL, there is
no need to examine its partitions, because we can rely on previous
DDL to have enforced that the child columns are NOT NULL as well.
(Unfortunately, the same cannot be said for traditional inheritance,
so for now we have to restrict the optimization to partitioned tables.)
Hence, we may skip recursing to child tables in this situation.
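
A rough sketch of the test (hypothetical variable names):

	/*
	 * If a partitioned table's column is already NOT NULL, previous DDL
	 * guarantees the same for every partition, so don't recurse.
	 */
	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
		TupleDescAttr(RelationGetDescr(rel), attnum - 1)->attnotnull)
		recurse = false;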

The reason this case is worth worrying about is that when pg_dump dumps
a partitioned table having a primary key, it will include the requisite
NOT NULL markings in the CREATE TABLE commands, and then add the
primary key as a separate step.  The primary key addition generates a
SET NOT NULL as a subcommand, just to be sure.  So the situation where
a SET NOT NULL is redundant does arise in the real world.

Skipping the recursion does more than just save a few cycles: it means
that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY
KEY" will take locks only on the partition parent table, not on the
partitions.  It turns out that parallel pg_restore is effectively
assuming that that's true, and has little choice but to do so because
the dependencies listed for such a TOC entry don't include the
partitions.  pg_restore could thus issue this ALTER while data restores
on the partitions are still in progress.  Taking unnecessary locks on
the partitions not only hurts concurrency, but can lead to actual
deadlock failures, as reported by Domagoj Smoljanovic.

(A contributing factor in the deadlock is that TRUNCATE on a child
partition wants a non-exclusive lock on the parent.  This seems
likewise unnecessary, but the fix for it is more invasive so we
won't consider back-patching it.  Fortunately, getting rid of one
of these two poor behaviors is enough to remove the deadlock.)

Although support for partitioned primary keys came in with v11,
this patch is dependent on the SET NOT NULL refactoring done by
commit f4a3fdfbd, so we can only patch back to v12.

Patch by me; thanks to Alvaro Herrera and Amit Langote for review.

Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16 13:38:26 -04:00
Alvaro Herrera ced138e8cb
Fix use-after-free bug with event triggers in an extension script
ALTER TABLE commands in an extension script are added to an event
trigger command list; but starting with commit b5810de3f4 this happens
in a memory context that's too short-lived, so when execution ends and
the time comes to use the entries, they have already been freed.

(This would also be a problem with ALTER TABLE commands in a
multi-command query string, but these serendipitously end in
PortalContext -- which probably explains why it took so long for this to
be reported.)

Fix by using the memory context specifically set for that, instead.
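
A hedged sketch of the fix (the event-trigger state names are assumed):

	MemoryContext oldcxt;

	/* allocate the collected command in the list's own long-lived context */
	oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
	/* ... build and append the command entry ... */
	MemoryContextSwitchTo(oldcxt);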

Backpatch to 13, where the aforementioned commit appeared.

Reported-by: Philippe Beaudoin
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20200902193715.6e0269d4@firost
2020-09-15 21:03:14 -03:00
Peter Eisentraut 3e0242b24c Message fixes and style improvements 2020-09-14 06:42:30 +02:00
Michael Paquier a6642b3ae0 Add support for partitioned tables and indexes in REINDEX
Until now, REINDEX was not able to work with partitioned tables and
indexes, forcing users to reindex partitions one by one.  This extends
REINDEX INDEX and REINDEX TABLE so that they can accept a partitioned
index or table as input, respectively, reindexing all the partitions
assigned to them that have physical storage (foreign tables,
partitioned tables and partitioned indexes are skipped).

This shares some logic with schema and database REINDEX, as each
partition gets processed in its own transaction after building a list
of relations to work on.  This choice has the advantage of limiting
the invalid indexes left behind by an in-flight cancellation or
failure of REINDEX CONCURRENTLY to a single partition, as the only
indexes handled at once in a single REINDEX CONCURRENTLY loop are the
ones from the partition being worked on.
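
The multi-transaction flow looks roughly like this (an assumed shape,
simplified from the schema/database REINDEX logic):

	foreach(lc, relids)
	{
		Oid			relid = lfirst_oid(lc);

		StartTransactionCommand();
		/* ... reindex the partition identified by relid ... */
		CommitTransactionCommand();
	}
	StartTransactionCommand();	/* the caller expects a live transaction */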

Isolation tests are added to emulate some cases I bumped into while
developing this feature, particularly with the concurrent drop of a
leaf partition being reindexed.  However, this is rather limited, as
LOCK would cause REINDEX to block in the first transaction building
the list of partitions.

Because of its multi-transaction nature, this new flavor cannot run in
a transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE.

Author: Justin Pryzby, Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com
2020-09-08 10:09:22 +09:00
Tom Lane a5cc4dab6d Yet more elimination of dead stores and useless initializations.
I'm not sure what tool Ranier was using, but the ones I contributed
were found by using a newer version of scan-build than I tried before.

Ranier Vilela and Tom Lane

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-05 13:17:32 -04:00
Michael Paquier 8febfd1855 Switch to multi-inserts when registering dependencies for many code paths
This commit improves the dependency registrations by taking advantage
of the preliminary work done in 63110c62, grouping together insertions
of dependencies of the same type into pg_depend.  With the current
layer of routines available, and as only dependencies of the same type
can be grouped, some code paths still perform more than one
multi-insert when dependencies of multiple types need to be registered
(constraint and index creation are two cases doing that).

While at it, this refactors some of the code to use ObjectAddressSet()
when manipulating object addresses.
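
A rough sketch of the grouped form (the OID variables are
hypothetical):

	ObjectAddress myself;
	ObjectAddress referenced[2];

	ObjectAddressSet(myself, ConstraintRelationId, constraintOid);
	ObjectAddressSet(referenced[0], RelationRelationId, tableOid);
	ObjectAddressSet(referenced[1], TypeRelationId, typeOid);

	/* one multi-insert into pg_depend rather than two single inserts */
	recordMultipleDependencies(&myself, referenced, 2, DEPENDENCY_NORMAL);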

Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera
Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz
2020-09-05 21:33:53 +09:00