Commit Graph

4244 Commits

Author SHA1 Message Date
Peter Eisentraut 37851a8b83 Database-level collation version tracking
This adds to database objects the same version tracking that collation
objects have.  There is a new pg_database column datcollversion that
stores the version, a new function
pg_database_collation_actual_version() to get the version from the
operating system, and a new subcommand ALTER DATABASE ... REFRESH
COLLATION VERSION.

This was not originally added together with pg_collation.collversion,
since originally version tracking was only supported for ICU, and ICU
on a database-level is not currently supported.  But we now have
version tracking for glibc (since PG13), FreeBSD (since PG14), and
Windows (since PG13), so this is useful to have now.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f0ff3190-29a3-5b39-a179-fa32eee57db6%40enterprisedb.com
2022-02-14 08:27:26 +01:00
Peter Geoghegan efa4a9462a Consolidate VACUUM xid cutoff logic.
Push the logic for determining whether or not a VACUUM operation will be
aggressive down into vacuum_set_xid_limits().  This makes the function's
signature significantly simpler, and seems clearer overall.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkymFbz6D_vL+jmqSn_5q1wsFvFrE+37yLgL_Rkfd6Gzg@mail.gmail.com
2022-02-11 18:26:15 -08:00
Peter Geoghegan 872770fd6c Add VACUUM instrumentation for scanned pages, relfrozenxid.
Report on scanned pages within VACUUM VERBOSE and autovacuum logging.
These are pages that were physically examined during the VACUUM
operation.  Note that this can include a small number of pages that were
marked all-visible in the visibility map by some earlier VACUUM
operation.  VACUUM won't skip all-visible pages that aren't part of a
range of all-visible pages that's at least 32 blocks in length (partly
to avoid missing out on opportunities to advance relfrozenxid during
non-aggressive VACUUMs).

Commit 44fa8488 simplified the definition of scanned pages.  It became
the complement of the pages (of those pages from rel_pages) that were
skipped using the visibility map.  And so scanned pages precisely
indicates how effective the visibility map was at saving work.  (Before
now we displayed the number of pages skipped via the visibility map when
happened to be frozen pages, but not when they were merely all-visible,
which was less useful to users.)

Rename the user-visible OldestXmin output field to "removal cutoff", and
show some supplementary information: how far behind the cutoff is
(number of XIDs behind) by the time the VACUUM operation finished.  This
will help users to figure out what's _not_ working in extreme cases
where VACUUM is fundamentally unable to remove dead tuples or freeze
older tuples (e.g., due to a leaked replication slot).  Also report when
relfrozenxid is advanced by VACUUM in output that immediately follows
"removal cutoff".  This structure is intended to highlight the
relationship between the new relfrozenxid value for the table, and the
VACUUM operation's removal cutoff.

Finally, add instrumentation of "missed dead tuples", and the number of
pages that had at least one such tuple.  These are fully DEAD (not just
RECENTLY_DEAD) tuples with storage that could not be pruned due to
failure to acquire a cleanup lock on a heap page.  This is a replacement
for the "skipped due to pin" instrumentation removed by commit 44fa8488.
It shows more details than before for pages where failing to get a
cleanup lock actually resulted in VACUUM missing out on useful work, but
usually shows nothing at all instead (the mere fact that we couldn't get
a cleanup lock is usually of no consequence whatsoever now).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wznp=c=Opj8Z7RMR3G=ec3_JfGYMN_YvmCEjoPCHzWbx0g@mail.gmail.com
2022-02-11 16:48:40 -08:00
Thomas Munro 4eb2176318 Fix DROP {DATABASE,TABLESPACE} on Windows.
Previously, it was possible for DROP DATABASE, DROP TABLESPACE and ALTER
DATABASE SET TABLESPACE to fail because other backends still had file
handles open for dropped tables.  Windows won't allow a directory
containing unlinked-but-still-open files to be unlinked.  Tackle this
problem by forcing all backends to close all smgr fds.  No change for
Unix systems, which don't suffer from the problem, but the new code path
can be tested by Unix-based developers by defining
USE_BARRIER_SMGRRELEASE explicitly.

It's possible that PROCSIGNAL_BARRIER_SMGRRELEASE will have more
bug-fixing applications soon (under discussion).  Note that this is the
first user of the ProcSignalBarrier mechanism from commit 16a4e4aec.  It
could in principle be back-patched as far as 14, but since field
complaints are rare and ProcSignalBarrier hasn't been battle-tested,
that seems like a bad idea.  Fix in master only, where these failures
have started to show up in automated testing due to new tests.

Suggested-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg@mail.gmail.com
2022-02-12 10:21:23 +13:00
Tomas Vondra 0da92dc530 Logical decoding of sequences
This extends the logical decoding to also decode sequence increments.
We differentiate between sequences created in the current (in-progress)
transaction, and sequences created earlier. This mixed behavior is
necessary because while sequences are not transactional (increments are
not subject to ROLLBACK), relfilenode changes are. So we do this:

* Changes for sequences created in the same top-level transaction are
  treated as transactional, i.e. just like any other change from that
  transaction, and discarded in case of a rollback.

* Changes for sequences created earlier are applied immediately, as if
  performed outside any transaction. This applies also after ALTER
  SEQUENCE, which may create a new relfilenode.

Moreover, if we ever get support for DDL replication, the sequence
won't exist until the transaction gets applied.

Sequences created in the current transaction are tracked in a simple
hash table, identified by a relfilenode. That means a sequence may
already exist, but if a transaction does ALTER SEQUENCE then the
increments for the new relfilenode will be treated as transactional.

For each relfilenode we track the XID of (sub)transaction that created
it, which is needed for cleanup at transaction end. We don't need to
check the XID to decide if an increment is transactional - if we find a
match in the hash table, it has to be the same transaction.

This requires two minor changes to WAL-logging. Firstly, we need to
ensure the sequence record has a valid XID - until now the the increment
might have XID 0 if it was the first change in a subxact. But the
sequence might have been created in the same top-level transaction. So
we ensure the XID is assigned when WAL-logging increments.

The other change is addition of "created" flag, marking increments for
newly created relfilenodes. This makes it easier to maintain the hash
table of sequences that need transactional handling.
Note: This is needed because of subxacts. A XID 0 might still have the
sequence created in a different subxact of the same top-level xact.

This does not include any changes to test_decoding and/or the built-in
replication - those will be committed in separate patches.

A patch adding decoding of sequences was originally submitted by Cary
Huang. This commit reworks various important aspects (e.g. the WAL
logging and transactional/non-transactional handling). However, the
original patch and reviews were very useful.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Hannu Krosing, Andres Freund
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-02-10 18:43:51 +01:00
Michael Paquier 4567596316 Reduce more the number of calls to GetMaxBackends()
Some of the code paths changed by aa64f23 can reduce the number of times
GetMaxBackends() is called.  The performance gain is marginal, but most
of the code changed by this commit already did that.  Hence, let's be
clean and apply the same rule everywhere, for consistency.

Some of the code paths, like in deadlock.c, involve only assertions.
These are left unchanged.

Reviewed-by: Nathan Bossart, Robert Haas
Discussion: https://postgr.es/m/YgMpGZhPOjNfS7er@paquier.xyz
2022-02-10 10:27:29 +09:00
Robert Haas aa64f23b02 Remove MaxBackends variable in favor of GetMaxBackends() function.
Previously, it was really easy to write code that accessed MaxBackends
before we'd actually initialized it, especially when coding up an
extension. To make this less error-prune, introduce a new function
GetMaxBackends() which should be used to obtain the correct value.
This will ERROR if called too early. Demote the global variable to
a file-level static, so that nobody can peak at it directly.

Nathan Bossart. Idea by Andres Freund. Review by Greg Sabino Mullane,
by Michael Paquier (who had doubts about the approach), and by me.

Discussion: http://postgr.es/m/20210802224204.bckcikl45uezv5e4@alap3.anarazel.de
2022-02-08 15:53:19 -05:00
Peter Eisentraut 94aa7cc5f7 Add UNIQUE null treatment option
The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not.  Different
implementations have different behaviors.  In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.

This patch adds this option to PostgreSQL.  The default behavior
remains UNIQUE NULLS DISTINCT.  Making this happen in the btree code
is pretty easy; most of the patch is just to carry the flag around to
all the places that need it.

The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.

I named all the internal flags, catalog columns, etc. in the negative
("nulls not distinct") so that the default PostgreSQL behavior is the
default if the flag is false.

Reviewed-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com
2022-02-03 11:48:21 +01:00
Peter Eisentraut 87669de72c Some cleanup for change of collate and ctype fields to type text
Some cleanup for commit 54637508f87bd5f07fb9406bac6b08240283be3b:
Reformat pg_database.dat to reflect the new field order.  Also update
the corresponding example in bki.sgml.  Reorder the way the fields are
filled in dbcommands.c to correspond to the new order.
2022-02-02 11:58:55 +01:00
Alvaro Herrera b3d7d6e462
Remove xloginsert.h from xlog.h
xlog.h is directly and indirectly #included in a lot of places.  With
this change, xloginsert.h is no longer unnecessarily included in the
large number of them that don't need it.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACVe-W+WM5P44N7eG9C2_FmaeM8Dq5aCnD3fHt0Ba=WR6w@mail.gmail.com
2022-01-30 12:25:24 -03:00
Michael Paquier dc084d7c73 Fix incorrect memory context switch in COPY TO execution
c532d15 has split the logic of COPY commands into multiple files, one
change being to move the internals of BeginCopy() to BeginCopyTo().
Originally the code was written so as we'd switch back-and-forth between
the current execution memory context and the dedicated memory context
for the COPY command, and this refactoring has introduced an extra
switch to the current memory context from the COPY context once
BeginCopyTo() is done with the past logic coming from BeginCopy().

The code was correctly doing the analyze, rewrite and planning phases in
the COPY context, but it was not assigning "copy_file" (FILE* used when
copying to a source file) and "filename" in the COPY context, making the
COPY status data inconsistent.

Author: Bharath Rupireddy
Reviewed-by: Japin Li
Discussion: https://postgr.es/m/CALj2ACWvVa69foi9jhHFY=2BuHxAoYboyE+vXQTARwxZcJnVrQ@mail.gmail.com
Backpatch-through: 14
2022-01-29 10:22:42 +09:00
Peter Eisentraut 43f33dc018 Add HEADER support to COPY text format
The COPY CSV format supports the HEADER option to output a header
line.  This patch adds the same option to the default text format.  On
input, the HEADER option causes the first line to be skipped, same as
with CSV.

Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr>
Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
2022-01-28 09:44:47 +01:00
Peter Eisentraut 5553cbd4fe Add some const decorations 2022-01-28 09:13:11 +01:00
Peter Eisentraut 54637508f8 Change collate and ctype fields to type text
This changes the data type of the catalog fields datcollate, datctype,
collcollate, and collctype from name to text.  There wasn't ever a
really good reason for them to be of type name; presumably this was
just carried over from when they were fixed-size fields in pg_control,
first into the corresponding pg_database fields, and then to
pg_collation.  The values are not identifiers or object names, and we
don't ever look them up that way.

Changing to type text saves space in the typical case, since locale
names are typically only a few bytes long.  But it is also possible
that an ICU locale name with several customization options appended
could be longer than 63 bytes, so this also enables that case, which
was previously probably broken.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com
2022-01-27 08:54:25 +01:00
Michael Paquier 410aa248e5 Fix various typos, grammar and code style in comments and docs
This fixes a set of issues that have accumulated over the past months
(or years) in various code areas.  Most fixes are related to some recent
additions, as of the development of v15.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220124030001.GQ23027@telsasoft.com
2022-01-25 09:40:04 +09:00
Robert Haas aa01051418 pg_upgrade: Preserve database OIDs.
Commit 9a974cbcba arranged to preserve
relfilenodes and tablespace OIDs. For similar reasons, also arrange
to preserve database OIDs.

One problem is that, up until now, the OIDs assigned to the template0
and postgres databases have not been fixed. This could be a problem
when upgrading, because pg_upgrade might try to migrate a database
from the old cluster to the new cluster while keeping the OID and find
a different database with that OID, resulting in a failure. If it finds
a database with the same name and the same OID that's OK: it will be
dropped and recreated. But the same OID and a different name is a
problem.

To prevent that, fix the OIDs for postgres and template0 to specific
values less than 16384. To avoid running afoul of this rule, these
values should not be changed in future releases. It's not a problem
that these OIDs aren't fixed in existing releases, because the OIDs
that we're assigning here weren't used for either of these databases
in any previous release. Thus, there's no chance that an upgrade of
a cluster from any previous release will collide with the OIDs we're
assigning here. And going forward, the OIDs will always be fixed, so
the only potential collision is with a system database having the
same name and the same OID, which is OK.

This patch lets users assign a specific OID to a database as well,
provided however that it can't be less than 16384. I (rhaas) thought
it might be better not to expose this capability to users, but the
consensus was otherwise, so the syntax is documented. Letting users
assign OIDs below 16384 would not be OK, though, because a
user-created database with a low-numbered OID might collide with a
system-created database in a future release. We therefore prohibit
that.

Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.

Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
Discussion: http://postgr.es/m/CAASxf_Mnwm1Dh2vd5FAhVX6S1nwNSZUB1z12VddYtM++H2+p7w@mail.gmail.com
2022-01-24 14:23:43 -05:00
Tomas Vondra 6d554e3fcd Check syscache result in AlterStatistics
The syscache lookup may return NULL even for valid OID, for example due
to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary.
Without it, it may fail with a segfault.

Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER
STATISTICS ... SET STATISTICS was introduced.

Backpatch-through: 13
Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
2022-01-23 03:16:31 +01:00
Tom Lane 62e28097ce Remove useless inline marker.
Putting "inline" on a function that's not used anywhere in its
own file is useless unless the linker is doing global optimization,
a method we don't generally enable.  Moreover, it draws warnings
from some buildfarm members (curculio at least).

Looks like this was sloppiness in cc8b25712, which moved the
function from somewhere else where the inline marker was
more appropriate.
2022-01-22 17:11:33 -05:00
Robert Haas ab4fd4f868 Remove 'datlastsysoid'.
It hasn't been used for anything for a long time. Up until recently,
we still queried it when dumping very old servers, but since
commit 30e7c175b8, there's no longer any
code at all that cares about it.

Discussion: http://postgr.es/m/CA+Tgmoa14=BRq0WEd0eevjEMn9EkghDB1FZEkBw7+UAb7tF49A@mail.gmail.com
2022-01-20 09:01:12 -05:00
Tom Lane 89f059bdf5 Remove redundant memory context switches in BeginCopyFrom().
This is probably a leftover from code refactoring.

Japin Li

Discussion: https://postgr.es/m/MEYP282MB16693DDABDFEC7949AC31857B6599@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2022-01-19 12:31:15 -05:00
Robert Haas 9a974cbcba pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.

This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.

Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.

Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.

Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 13:40:27 -05:00
Peter Eisentraut 941460fcf7 Add Boolean node
Before, SQL-level boolean constants were represented by a string with
a cast, and internal Boolean values in DDL commands were usually
represented by Integer nodes.  This takes the place of both of these
uses, making the intent clearer and having some amount of type safety.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
2022-01-17 10:38:23 +01:00
Tomas Vondra 269b532aef Add stxdinherit flag to pg_statistic_ext_data
Add pg_statistic_ext_data.stxdinherit flag, so that for each extended
statistics definition we can store two versions of data - one for the
relation alone, one for the whole inheritance tree. This is analogous to
pg_statistic.stainherit, but we failed to include such flag in catalogs
for extended statistics, and we had to work around it (see commits
859b3003de, 36c4bc6e72 and 20b9fa308e).

This changes the relationship between the two catalogs storing extended
statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until
now, there was a simple 1:1 mapping - for each definition there was one
pg_statistic_ext_data row, and this row was inserted while creating the
statistics (and then updated during ANALYZE). With the stxdinherit flag,
we don't know how many rows there will be (child relations may be added
after the statistics object is defined), so there may be up to two rows.

We could make CREATE STATISTICS to always create both rows, but that
seems wasteful - without partitioning we only need stxdinherit=false
rows, and declaratively partitioned tables need only stxdinherit=true.
So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS,
and instead make that a responsibility of ANALYZE. Which is what we do
for regular statistics too.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-16 13:38:01 +01:00
Tomas Vondra 20b9fa308e Build inherited extended stats on partitioned tables
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.

That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-15 19:06:48 +01:00
Peter Geoghegan 49c9d9fcfa Unify VACUUM VERBOSE and autovacuum logging.
The log_autovacuum_min_duration instrumentation used its own dedicated
code for logging, which was not reused by VACUUM VERBOSE.  This was
highly duplicative, and sometimes led to each code path using slightly
different accounting for essentially the same information.

Clean things up by making VACUUM VERBOSE reuse the same instrumentation
code.  This code restructuring changes the structure of the VACUUM
VERBOSE output itself, but that seems like an overall improvement.  The
most noticeable change in VACUUM VERBOSE output is that it no longer
outputs a distinct message per index per round of index vacuuming.  Most
of the same information (about each index) is now shown in its new
per-operation summary message.  This is far more legible.

A few details are no longer displayed by VACUUM VERBOSE, but that's no
real loss in practice, especially in the common case where we don't need
multiple index scans/rounds of vacuuming.  This super fine-grained
information is still available via DEBUG2 messages, which might still be
useful in debugging scenarios.

VACUUM VERBOSE now shows new instrumentation, which is typically very
useful: all of the log_autovacuum_min_duration instrumentation that it
missed out on before now.  This includes information about WAL overhead,
buffers hit/missed/dirtied information, and I/O timing information.

VACUUM VERBOSE still retains a few INFO messages of its own.  This is
limited to output concerning the progress of heap rel truncation, as
well as some basic information about parallel workers.  These details
are still potentially quite useful.  They aren't a good fit for the log
output, which must summarize the whole operation.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzmW4Me7_qR4X4ka7pxP-jGmn7=Npma_-Z-9Y1eD0MQRLw@mail.gmail.com
2022-01-14 16:50:34 -08:00
Thomas Munro 7170f2159f Allow "in place" tablespaces.
Provide a developer-only GUC allow_in_place_tablespaces, disabled by
default.  When enabled, tablespaces can be created with an empty
LOCATION string, meaning that they should be created as a directory
directly beneath pg_tblspc.  This can be used for new testing scenarios,
in a follow-up patch.  Not intended for end-user usage, since it might
confuse backup tools that expect symlinks.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
2022-01-15 00:09:24 +13:00
Peter Eisentraut c4cc2850f4 Rename value node fields
For the formerly-Value node types, rename the "val" field to a name
specific to the node type, namely "ival", "fval", "sval", and "bsval".
This makes some code clearer and catches mixups better.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
2022-01-14 11:26:08 +01:00
Peter Eisentraut 93415a3b5a Refactor AlterRole()
Get rid of the three-valued logic for the Boolean variables to track
whether the value was been specified and what the new value should be.
Instead, we can use the "dfoo" variables to determine whether the
value was specified and should be applied.  This was already done in
some cases, so this makes this more uniform and removes one layer of
indirection.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
2022-01-14 10:53:21 +01:00
Tom Lane 6f6943fc94 Improve error message for missing extension.
If we get ENOENT while trying to read an extension control file,
report that as a missing extension (with a HINT to install it)
rather than as a filesystem access problem.  The message wording
was extensively bikeshedded in hopes of pointing people to the
idea that they need to do a software installation before they
can install the extension into the current database.

Nathan Bossart, with review/wording suggestions from Daniel
Gustafsson, Chapman Flack, and myself

Discussion: https://postgr.es/m/3950D56A-4E47-48E7-BF9B-F5F22E268BE7@amazon.com
2022-01-11 14:22:00 -05:00
Michael Paquier b69aba7457 Improve error handling of cryptohash computations
The existing cryptohash facility was causing problems in some code paths
related to MD5 (frontend and backend) that relied on the fact that the
only type of error that could happen would be an OOM, as the MD5
implementation used in PostgreSQL ~13 (the in-core implementation is
used when compiling with or without OpenSSL in those older versions),
could fail only under this circumstance.

The new cryptohash facilities can fail for reasons other than OOMs, like
attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
would cause incorrect reports to show up.

This commit extends the cryptohash APIs so as callers of those routines
can fetch more context when an error happens, by using a new routine
called pg_cryptohash_error().  The error states are stored within each
implementation's internal context data, so as it is possible to extend
the logic depending on what's suited for an implementation.  The default
implementation requires few error states, but OpenSSL could report
various issues depending on its internal state so more is needed in
cryptohash_openssl.c, and the code is shaped so as we are always able to
grab the necessary information.

The core code is changed to adapt to the new error routine, painting
more "const" across the call stack where the static errors are stored,
particularly in authentication code paths on variables that provide
log details.  This way, any future changes would warn if attempting to
free these strings.  The MD5 authentication code was also a bit blurry
about the handling of "logdetail" (LOG sent to the postmaster), so
improve the comments related that, while on it.

The origin of the problem is 87ae969, that introduced the centralized
cryptohash facility.  Extra changes are done for pgcrypto in v14 for the
non-OpenSSL code path to cope with the improvements done by this
commit.

Reported-by: Michael Mühlbeyer
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
Backpatch-through: 14
2022-01-11 09:55:16 +09:00
Jeff Davis a2ab9c06ea Respect permissions within logical replication.
Prevent logical replication workers from performing insert, update,
delete, truncate, or copy commands on tables unless the subscription
owner has permission to do so.

Prevent subscription owners from circumventing row-level security by
forbidding replication into tables with row-level security policies
which the subscription owner is subject to, without regard to whether
the policy would ordinarily allow the INSERT, UPDATE, DELETE or
TRUNCATE which is being replicated.  This seems sufficient for now, as
superusers, roles with bypassrls, and target table owners should still
be able to replicate despite RLS policies.  We can revisit the
question of applying row-level security policies on a per-row basis if
this restriction proves too severe in practice.

Author: Mark Dilger
Reviewed-by: Jeff Davis, Andrew Dunstan, Ronan Dunklau
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53%40enterprisedb.com
2022-01-07 17:40:56 -08:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Tom Lane 7ead9925ff Prevent altering partitioned table's rowtype, if it's used elsewhere.
We disallow altering a column datatype within a regular table,
if the table's rowtype is used as a column type elsewhere,
because we lack code to go around and rewrite the other tables.
This restriction should apply to partitioned tables as well, but it
was not checked because ATRewriteTables and ATPrepAlterColumnType
were not on the same page about who should do it for which relkinds.

Per bug #17351 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
2022-01-06 16:46:46 -05:00
Alvaro Herrera f4566345cf
Create foreign key triggers in partitioned tables too
While user-defined triggers defined on a partitioned table have
a catalog definition for both it and its partitions, internal
triggers used by foreign keys defined on partitioned tables only
have a catalog definition for its partitions.  This commit fixes
that so that partitioned tables get the foreign key triggers too,
just like user-defined triggers.  Moreover, like user-defined
triggers, partitions' internal triggers will now also have their
tgparentid set appropriately.  This is to allow subsequent commit(s)
to make the foreign key related events to be fired in some cases
using the parent table triggers instead of those of partitions'.

This also changes what tgisinternal means in some cases.  Currently,
it means either that the trigger is an internal implementation object
of a foreign key constraint, or a "child" trigger on a partition
cloned from the trigger on the parent.  This commit changes it to
only mean the former to avoid confusion.  As for the latter, it can
be told by tgparentid being nonzero, which is now true both for user-
defined and foreign key's internal triggers.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Arne Roland <A.Roland@index.de>
Discussion: https://postgr.es/m/CA+HiwqG7LQSK+n8Bki8tWv7piHD=PnZro2y6ysU2-28JS6cfgQ@mail.gmail.com
2022-01-05 19:00:13 -03:00
Tom Lane 9a3ddeb519 Fix index-only scan plans, take 2.
Commit 4ace45677 failed to fix the problem fully, because the
same issue of attempting to fetch a non-returnable index column
can occur when rechecking the indexqual after using a lossy index
operator.  Moreover, it broke EXPLAIN for such indexquals (which
indicates a gap in our test cases :-().

Revert the code changes of 4ace45677 in favor of adding a new field
to struct IndexOnlyScan, containing a version of the indexqual that
can be executed against the index-returned tuple without using any
non-returnable columns.  (The restrictions imposed by check_index_only
guarantee this is possible, although we may have to recompute indexed
expressions.)  Support construction of that during setrefs.c
processing by marking IndexOnlyScan.indextlist entries as resjunk
if they can't be returned, rather than removing them entirely.
(We could alternatively require setrefs.c to look up the IndexOptInfo
again, but abusing resjunk this way seems like a reasonably safe way
to avoid needing to do that.)

This solution isn't great from an API-stability standpoint: if there
are any extensions out there that build IndexOnlyScan structs directly,
they'll be broken in the next minor releases.  However, only a very
invasive extension would be likely to do such a thing.  There's no
change in the Path representation, so typical planner extensions
shouldn't have a problem.

As before, back-patch to all supported branches.

Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
2022-01-03 15:42:27 -05:00
Alvaro Herrera 9623d89996
Avoid using DefElemAction in AlterPublicationStmt
Create a new enum type for it.  This allows to add new values for future
functionality without disrupting unrelated uses of DefElem.

Discussion: https://postgr.es/m/202112302021.ca7ihogysgh3@alvherre.pgsql
2022-01-03 10:48:48 -03:00
Alvaro Herrera c9105dd366
Small cleanups related to PUBLICATION framework code
Discussion: https://postgr.es/m/202112302021.ca7ihogysgh3@alvherre.pgsql
2021-12-30 19:24:26 -03:00
Daniel Gustafsson e68570e388 Revert b2a459edf "Fix GRANTED BY support in REVOKE ROLE statements"
The reverted commit attempted to fix SQL specification compliance for
the cases which 6aaaa76bb left.  This however broke existing behavior
which takes precedence over spec compliance so revert. The introduced
tests are left after the revert since the codepath isn't well covered.
Per bug report 17346. Backpatch down to 14 where it was introduced.

Reported-by: Andrew Bille <andrewbille@gmail.com>
Discussion: https://postgr.es/m/17346-f72b28bd1a341060@postgresql.org
2021-12-30 13:23:47 +01:00
Amit Kapila 94226d4506 Fix compilation error introduced by commit 8e1fae1938.
Author: Masahiko Sawada
Discussion: https://postgr.es/m/E1n0HSK-00048l-RE@gemulon.postgresql.org
2021-12-23 12:46:27 +05:30
Amit Kapila 8e1fae1938 Move parallel vacuum code to vacuumparallel.c.
This commit moves parallel vacuum related code to a new file
commands/vacuumparallel.c so that any table AM supporting indexes can
utilize parallel vacuum in order to call index AM callbacks (ambulkdelete
and amvacuumcleanup) with parallel workers.

Another reason for this refactoring is that the parallel vacuum isn't
specific to heap so it doesn't make sense to keep this code in
heap/vacuumlazy.c.

Author: Masahiko Sawada, based on suggestion from Andres Freund
Reviewed-by: Hou Zhijie, Amit Kapila, Haiying Tang
Discussion: https://www.postgresql.org/message-id/20211030212101.ae3qcouatwmy7tbr%40alap3.anarazel.de
2021-12-23 11:42:52 +05:30
Michael Paquier 2e577c9446 Remove assertion for ALTER TABLE .. DETACH PARTITION CONCURRENTLY
One code path related to this flavor of ALTER TABLE was checking that
the relation to detach has to be a normal table or a partitioned table,
which would fail if using the command with a different relation kind.

Views, sequences and materialized views cannot be part of a partition
tree, so these would cause the command to fail anyway, but the assertion
was triggered.  Foreign tables can be part of a partition tree, and
again the assertion would have failed.  The simplest solution is just to
remove this assertion, so as we get the same failure as the
non-concurrent code path.

While on it, add a regression test in postgres_fdw for the concurrent
partition detach of a foreign table, as per a suggestion from Alexander
Lakhin.

Issue introduced in 71f4c8c.

Reported-by: Alexander Lakhin
Author: Michael Paquier, Alexander Lakhin
Reviewed-by: Peter Eisentraut, Kyotaro Horiguchi
Discussion: https://postgr.es/m/17339-a9e09aaf38a3457a@postgresql.org
Backpatch-through: 14
2021-12-22 15:38:00 +09:00
Amit Kapila cc8b25712b Move index vacuum routines to vacuum.c.
An upcoming patch moves parallel vacuum code out of vacuumlazy.c. This
code restructuring will allow both lazy vacuum and parallel vacuum to use
index vacuum functions.

Author: Masahiko Sawada
Reviewed-by: Hou Zhijie, Amit Kapila
Discussion: https://www.postgresql.org/message-id/20211030212101.ae3qcouatwmy7tbr%40alap3.anarazel.de
2021-12-22 07:55:14 +05:30
Tom Lane bbc227e951 Always use ReleaseTupleDesc after lookup_rowtype_tupdesc et al.
The API spec for lookup_rowtype_tupdesc previously said you could use
either ReleaseTupleDesc or DecrTupleDescRefCount.  However, the latter
choice means the caller must be certain that the returned tupdesc is
refcounted.  I don't recall right now whether that was always true
when this spec was written, but it's certainly not always true since
we introduced shared record typcaches for parallel workers.  That means
that callers using DecrTupleDescRefCount are dependent on typcache
behavior details that they probably shouldn't be.  Hence, change the API
spec to say that you must call ReleaseTupleDesc, and fix the half-dozen
callers that weren't.

AFAICT this is just future-proofing, there's no live bug here.
So no back-patch.

Per gripe from Chapman Flack.

Discussion: https://postgr.es/m/61B901A4.1050808@anastigmatix.net
2021-12-15 18:58:20 -05:00
Peter Eisentraut d6f96ed94e Allow specifying column list for foreign key ON DELETE SET actions
Extend the foreign key ON DELETE actions SET NULL and SET DEFAULT by
allowing the specification of a column list, like

    CREATE TABLE posts (
        ...
        FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL (author_id)
    );

If a column list is specified, only those columns are set to
null/default, instead of all the columns in the foreign-key
constraint.

This is useful for multitenant or sharded schemas, where the tenant or
shard ID is included in the primary key of all tables but shouldn't be
set to null.

Author: Paul Martinez <paulmtz@google.com>
Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
2021-12-08 11:13:57 +01:00
Amit Kapila 1a2aaeb0db Fix changing the ownership of ALL TABLES IN SCHEMA publication.
Ensure that the new owner of ALL TABLES IN SCHEMA publication must be a
superuser. The same is already ensured during CREATE PUBLICATION.

Author: Vignesh C
Reviewed-by: Nathan Bossart, Greg Nancarrow, Michael Paquier, Haiying Tang
Discussion: https://postgr.es/m/CALDaNm0E5U-RqxFuFrkZrQeG7ae5trGa=xs=iRtPPHULtT4zOw@mail.gmail.com
2021-12-08 11:31:16 +05:30
Michael Paquier 00029deaf6 Improve parsing of options of CREATE/ALTER SUBSCRIPTION
This simplifies the code so as it is not necessary anymore for the
caller of parse_subscription_options() to zero SubOpts, holding a
bitmaps of the provided options as well as the default/parsed option
values.  This also simplifies some checks related to the options
supported by a command when checking for incompatibilities.

While on it, the errors generated for unsupported combinations with
"slot_name = NONE" are reordered.  This may generate a different errors
compared to the previous major versions, but users have to go through
all those errors to get a correct command in this case when using
incorrect values for options "enabled" and "create\slot", so at the end
the resulting command would remain the same.

Author: Peter Smith
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAHut+PtXHfLgLHDDJ8ZN5f5Be_37mJoxpEsRg8LNmm4XCr06Rw@mail.gmail.com
2021-12-08 12:36:31 +09:00
Peter Eisentraut 37b2764593 Some RELKIND macro refactoring
Add more macros to group some RELKIND_* macros:

- RELKIND_HAS_PARTITIONS()
- RELKIND_HAS_TABLESPACE()
- RELKIND_HAS_TABLE_AM()

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f%40enterprisedb.com
2021-12-03 14:08:19 +01:00
Amit Kapila 8d74fc96db Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.

It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.

The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.

This view can be extended in the future to track other xact related
statistics like the number of xacts committed/aborted for subscription
workers.

Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2021-11-30 08:54:30 +05:30
Tom Lane 3804539e48 Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.
Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating
a bunch of platform dependencies as well as fundamentally-obsolete PRNG
code.  In addition, this API replacement will ease replacing the
algorithm again in future, should that become necessary.

xoroshiro128** is a few percent slower than the drand48 family,
but it can produce full-width 64-bit random values not only 48-bit,
and it should be much more trustworthy.  It's likely to be noticeably
faster than the platform's random(), depending on which platform you
are thinking about; and we can have non-global state vectors easily,
unlike with random().  It is not cryptographically strong, but neither
are the functions it replaces.

Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself

Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
2021-11-28 21:33:07 -05:00
Daniel Gustafsson b2a459edfe Fix GRANTED BY support in REVOKE ROLE statements
Commit 6aaaa76bb added support for the GRANTED BY clause in GRANT and
REVOKE statements, but missed adding support for checking the role in
the REVOKE ROLE case. Fix by checking that the parsed role matches the
CURRENT_ROLE/CURRENT_USER requirement, and also add some tests for it.
Backpatch to v14 where GRANTED BY support was introduced.

Discussion: https://postgr.es/m/B7F6699A-A984-4943-B9BF-CEB84C003527@yesql.se
Backpatch-through: 14
2021-11-26 14:02:01 +01:00