Commit Graph

1414 Commits

Author SHA1 Message Date
Peter Eisentraut 26f7802beb Message style improvements 2022-09-24 18:41:25 -04:00
Amit Kapila 13a185f54b Allow publications with schema and table of the same schema.
We previously thought that allowing such cases can confuse users when they
specify DROP TABLES IN SCHEMA but that doesn't seem to be the case based
on discussion. This helps to uplift the restriction during
ALTER TABLE ... SET SCHEMA which used to ensure that we couldn't end up
with a publication having both a schema and the same schema's table.

To allow this, we need to forbid having any schema on a publication if
column lists on a table are specified (and vice versa). This is because
otherwise we still need a restriction during ALTER TABLE ... SET SCHEMA to
forbid cases where it could lead to a publication having both a schema and
the same schema's table with column list.

Based on suggestions by Peter Eisentraut.

Author: Hou Zhijie and Vignesh C
Reviewed-By: Peter Smith, Amit Kapila
Backpatch-through: 15, where it was introduced
Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e@enterprisedb.com
2022-09-23 08:21:26 +05:30
Alvaro Herrera 790bf615dd
Remove ALL keyword from TABLES IN SCHEMA for publication
This may be a bit too subtle, but removing that word from there makes
this clause no longer a perfect parallel of the GRANT variant "ALL
TABLES IN SCHEMA": indeed, for publications what we record is the schema
itself, not the tables therein, which means that any tables added to the
schema in the future are also published.  This is completely different
to what GRANT does, which is affect only the tables that exist when the
command is executed.

There isn't resounding support for this change, but there are a few
positive votes and no opposition.  Because the time to 15 RC1 is very
short, let's get this out now.

Backpatch to 15.

Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e
2022-09-22 19:02:25 +02:00
Andres Freund e6927270cd meson: Add initial version of meson based build system
Autoconf is showing its age, fewer and fewer contributors know how to wrangle
it. Recursive make has a lot of hard to resolve dependency issues and slow
incremental rebuilds. Our home-grown MSVC build system is hard to maintain for
developers not using Windows and runs tests serially. While these and other
issues could individually be addressed with incremental improvements, together
they seem best addressed by moving to a more modern build system.

After evaluating different build system choices, we chose to use meson, to a
good degree based on the adoption by other open source projects.

We decided that it's more realistic to commit a relatively early version of
the new build system and mature it in tree.

This commit adds an initial version of a meson based build system. It supports
building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD,
Solaris and Windows (however only gcc is supported on aix, solaris). For
Windows/MSVC postgres can now be built with ninja (faster, particularly for
incremental builds) and msbuild (supporting the visual studio GUI, but
building slower).

Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM
bitcode generation, documentation adjustments) are done in subsequent commits
requiring further review. Other aspects (e.g. not installing test-only
extensions) are not yet addressed.

When building on Windows with msbuild, builds are slower when using a visual
studio version older than 2019, because those versions do not support
MultiToolTask, required by meson for intra-target parallelism.

The plan is to remove the MSVC specific build system in src/tools/msvc soon
after reaching feature parity. However, we're not planning to remove the
autoconf/make build system in the near future. Likely we're going to keep at
least the parts required for PGXS to keep working around until all supported
versions build with meson.

Some initial help for postgres developers is at
https://wiki.postgresql.org/wiki/Meson

With contributions from Thomas Munro, John Naylor, Stone Tickle and others.

Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de
2022-09-21 22:37:17 -07:00
Amit Kapila a932824dfe Pass Size as a 2nd argument for snprintf() in tablesync.c.
Previously the following snprintf() wrappers:

* ReplicationSlotNameForTablesync()
* ReplicationOriginNameForTablesync()

... used int as a second argument of snprintf() while the actual type of it
is size_t. Although it doesn't fail at present better replace it with Size
for consistency with the rest of the system.

Author: Aleksander Alekseev
Reviewed-By: Peter Smith
Discussion: https://postgr.es/m/CAHut%2BPsa8hhfSE6ozUK-ih7GkQziAVAf4f3bqiXEj2nQiu-43g%40mail.gmail.com
2022-09-21 10:20:37 +05:30
Amit Kapila 6971a839cc Improve some error messages.
It is not our usual style to use "we" in the error messages.

Author: Kyotaro Horiguchi
Reviewed-By: Amit Kapila
Discussion: https://postgr.es/m/20220914.111507.13049297635620898.horikyota.ntt@gmail.com
2022-09-21 09:43:59 +05:30
Michael Paquier e9123197c8 Fix incorrect variable types for origin IDs in decode.c
These variables used XLogRecPtr instead of RepOriginId.

Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBm-vNyBSXGp4bmJGvhr=S-EGc5q1dtV70cFTcJvLhC=Q@mail.gmail.com
Backpatch-through: 14
2022-09-20 18:13:00 +09:00
Peter Geoghegan bfcf1b3480 Harmonize parameter names in storage and AM code.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in storage, catalog,
access method, executor, and logical replication code, as well as in
miscellaneous utility/library code.

Like other recent commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.  Later commits will do the
same for other parts of the codebase.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
2022-09-19 19:18:36 -07:00
Amit Kapila a234177906 Fix typos.
Author: Hou Zhijie and Zhang Mingli
Discussion: https://postgr.es/m/OS0PR01MB57162559C01FE2848C12E8F7944D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2022-09-19 14:21:39 +05:30
Peter Geoghegan 035ce1feb2 Harmonize reorderbuffer parameter names.
Make reorderbuffer.h function declarations consistently use named
parameters.  Also make sure that the declarations use names that match
corresponding names from function definitions in reorderbuffer.c.  This
makes the definitions easier to follow, especially in the case of
functions that happen to have adjoining arguments of the same type.

This patch was written with help from clang-tidy.  Specifically, its
"readability-inconsistent-declaration-parameter-name" check and its
"readability-named-parameter" check were used.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/3955318.1663377656@sss.pgh.pa.us
2022-09-17 17:20:17 -07:00
Tom Lane 31dcfae83c Use the terminology "WAL file" not "log file" more consistently.
Referring to the WAL as just "log" invites confusion with the
postmaster log, so avoid doing that in docs and error messages.
Also shorten "WAL segment file" to just "WAL file" in various
places.

Bharath Rupireddy, reviewed by Nathan Bossart and Kyotaro Horiguchi

Discussion: https://postgr.es/m/CALj2ACUeXa8tDPaiTLexBDMZ7hgvaN+RTb957-cn5qwv9zf-MQ@mail.gmail.com
2022-09-14 18:40:58 -04:00
Tom Lane 0a20ff54f5 Split up guc.c for better build speed and ease of maintenance.
guc.c has grown to be one of our largest .c files, making it
a bottleneck for compilation.  It's also acquired a bunch of
knowledge that'd be better kept elsewhere, because of our not
very good habit of putting variable-specific check hooks here.
Hence, split it up along these lines:

* guc.c itself retains just the core GUC housekeeping mechanisms.
* New file guc_funcs.c contains the SET/SHOW interfaces and some
  SQL-accessible functions for GUC manipulation.
* New file guc_tables.c contains the data arrays that define the
  built-in GUC variables, along with some already-exported constant
  tables.
* GUC check/assign/show hook functions are moved to the variable's
  home module, whenever that's clearly identifiable.  A few hard-
  to-classify hooks ended up in commands/variable.c, which was
  already a home for miscellaneous GUC hook functions.

To avoid cluttering a lot more header files with #include "guc.h",
I also invented a new header file utils/guc_hooks.h and put all
the GUC hook functions' declarations there, regardless of their
originating module.  That allowed removal of #include "guc.h"
from some existing headers.  The fallout from that (hopefully
all caught here) demonstrates clearly why such inclusions are
best minimized: there are a lot of files that, for example,
were getting array.h at two or more levels of remove, despite
not having any connection at all to GUCs in themselves.

There is some very minor code beautification here, such as
renaming a couple of inconsistently-named hook functions
and improving some comments.  But mostly this just moves
code from point A to point B and deals with the ensuing
needs for #include adjustments and exporting a few functions
that previously weren't exported.

Patch by me, per a suggestion from Andres Freund; thanks also
to Michael Paquier for the idea to invent guc_funcs.c.

Discussion: https://postgr.es/m/587607.1662836699@sss.pgh.pa.us
2022-09-13 11:11:45 -04:00
Amit Kapila 88f488319b Make the tablesync worker's replication origin drop logic robust.
In commit f6c5edb8ab, we started to drop the replication origin slots
before tablesync worker exits to avoid consuming more slots than required.
We were dropping the replication origin in the same transaction where we
were marking the tablesync state as SYNCDONE. Now, if there is any error
after we have dropped the origin but before we commit the containing
transaction, the in-memory state of replication progress won't be rolled
back. Due to this, after the restart, tablesync worker can start streaming
from the wrong location and can apply the already processed transaction.

To fix this, we need to opportunistically drop the origin after marking
the tablesync state as SYNCDONE. Even, if the tablesync worker fails to
remove the replication origin before exit, the apply worker ensures to
clean it up afterward.

Reported by Tom Lane as per buildfarm.
Diagnosed-by: Masahiko Sawada
Author: Hou Zhijie
Reviewed-By: Masahiko Sawada, Amit Kapila
Discussion: https://postgr.es/m/20220714115155.GA5439@depesz.com
Discussion: https://postgr.es/m/CAD21AoAw0Oofi4kiDpJBOwpYyBBBkJj=sLUOn4Gd2GjUAKG-fw@mail.gmail.com
2022-09-12 12:40:57 +05:30
John Naylor b086a47a27 Bump minimum version of Bison to 2.3
Since the retirement of some older buildfarm members, the oldest Bison
that gets regular testing is 2.3. MacOS ships that version, and will
continue doing so for the forseeable future because of Apple's policy
regarding GPLv3. While Mac users could use a package manager to install
a newer version, there is no compelling reason to force them do so at
this time.

Reviewed by Andres Freund
Discussion: https://www.postgresql.org/message-id/1097762.1662145681@sss.pgh.pa.us
2022-09-09 12:31:41 +07:00
Alvaro Herrera 4b4663fb4a
Message style fixes 2022-09-07 17:33:49 +02:00
David Rowley 8b26769bc4 Fix an assortment of improper usages of string functions
In a similar effort to f736e188c and 110d81728, fixup various usages of
string functions where a more appropriate function is available and more
fit for purpose.

These changes include:

1. Use cstring_to_text_with_len() instead of cstring_to_text() when
   working with a StringInfoData and the length can easily be obtained.
2. Use appendStringInfoString() instead of appendStringInfo() when no
   formatting is required.
3. Use pstrdup(...) instead of psprintf("%s", ...)
4. Use pstrdup(...) instead of psprintf(...) (with no formatting)
5. Use appendPQExpBufferChar() instead of appendPQExpBufferStr() when the
   length of the string being appended is 1.
6. appendStringInfoChar() instead of appendStringInfo() when no formatting
   is required and string is 1 char long.
7. Use appendPQExpBufferStr(b, .) instead of appendPQExpBuffer(b, "%s", .)
8. Don't use pstrdup when it's fine to just point to the string constant.

I (David) did find other cases of #8 but opted to use #4 instead as I
wasn't certain enough that applying #8 was ok (e.g in hba.c)

Author: Ranier Vilela, David Rowley
Discussion: https://postgr.es/m/CAApHDvo2j2+RJBGhNtUz6BxabWWh2Jx16wMUMWKUjv70Ver1vg@mail.gmail.com
2022-09-06 13:19:44 +12:00
John Naylor dac048f71e Build all Flex files standalone
The proposed Meson build system will need a way to ignore certain
generated files in order to coexist with the autoconf build system,
and C files generated by Flex which are #include'd into .y files make
this more difficult. In similar vein to 72b1e3a21, arrange for all Flex
C files to compile to their own .o targets.

Reviewed by Andres Freund

Discussion: https://www.postgresql.org/message-id/20220810171935.7k5zgnjwqzalzmtm%40awork3.anarazel.de
Discussion: https://www.postgresql.org/message-id/CAFBsxsF8Gc2StS3haXofshHCzqNMRXiSxvQEYGwnFsTmsdwNeg@mail.gmail.com
2022-09-04 12:09:01 +07:00
Michael Paquier bfb9dfd937 Expand the use of get_dirent_type(), shaving a few calls to stat()/lstat()
Several backend-side loops scanning one or more directories with
ReadDir() (WAL segment recycle/removal in xlog.c, backend-side directory
copy, temporary file removal, configuration file parsing, some logical
decoding logic and some pgtz stuff) already know the type of the entry
being scanned thanks to the dirent structure associated to the entry, on
platforms where we know about DT_REG, DT_DIR and DT_LNK to make the
difference between a regular file, a directory and a symbolic link.

Relying on the direct structure of an entry saves a few system calls to
stat() and lstat() in the loops updated here, shaving some code while on
it.  The logic of the code remains the same, calling stat() or lstat()
depending on if it is necessary to look through symlinks.

Authors: Nathan Bossart, Bharath Rupireddy
Reviewed-by: Andres Freund, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACV8n-J-f=yiLUOx2=HrQGPSOZM3nWzyQQvLPcccPXxEdg@mail.gmail.com
2022-09-02 16:58:06 +09:00
Amit Kapila f6c5edb8ab Drop replication origin slots before tablesync worker exits.
Currently, the replication origin tracking of the tablesync worker is
dropped by the apply worker. So, there will be a small lag between the
tablesync worker exit and its origin tracking got removed. In the
meantime, new tablesync workers can be launched and will try to set up
a new origin tracking. This can lead the system to reach max configured
limit (max_replication_slots) even if the user has configured the max
limit considering the number of tablesync workers required in the system.

We decided not to back-patch as this can occur in very narrow
circumstances and users have to option to increase the configured limit by
increasing max_replication_slots.

Reported-by: Hubert Depesz Lubaczewski
Author: Ajin Cherian
Reviwed-by: Masahiko Sawada, Peter Smith, Hou Zhijie, Amit Kapila
Discussion: https://postgr.es/m/20220714115155.GA5439@depesz.com
2022-08-30 08:51:41 +05:30
Amit Kapila d2169c9985 Fix the incorrect assertion introduced in commit 7f13ac8123.
It has been incorrectly assumed in commit 7f13ac8123 that we can either
purge all or none in the catalog modifying xids list retrieved from a
serialized snapshot. It is quite possible that some of the xids in that
array are old enough to be pruned but not others.

As per buildfarm

Author: Amit Kapila and Masahiko Sawada
Reviwed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1LBtv6ayE+TvCcPmC-xse=DVg=SmbyQD1nv_AaqcpUJEg@mail.gmail.com
2022-08-29 08:10:10 +05:30
Peter Eisentraut e890ce7a4f Remove unneeded null pointer checks before PQfreemem()
PQfreemem() just calls free(), and the latter already checks for null
pointers.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/cf26e970-8e92-59f1-247a-aa265235075b%40enterprisedb.com
2022-08-26 19:16:28 +02:00
Amit Kapila f972ec5c28 Add CHECK_FOR_INTERRUPTS while decoding changes.
While decoding changes in a loop, if we skip all the changes there is no
CFI making the loop uninterruptible.

Reported-by: Whale Song and Andrey Borodin
Bug: 17580
Author: Masahiko Sawada
Reviwed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org
Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
2022-08-23 10:20:02 +05:30
John Naylor ba8321349b Switch format specifier for replication origins to %d
Using %u with uint16 causes warnings with -Wformat-signedness. There are many
other warnings, but for now change only these since c920fe4818 already changed
the message string for most of them.

Per report from Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/31e63649-0355-7088-831e-b07d5f908a8c%40enterprisedb.com
2022-08-23 09:55:05 +07:00
David Rowley f01592f915 Remove shadowed local variables that are new in v15
Compiling with -Wshadow=compatible-local yields quite a few warnings about
local variables being shadowed by compatible local variables in an inner
scope.  Of course, this is perfectly valid in C, but we have had bugs in
the past as a result of developers failing to notice this.  af7d270dd is a
recent example.

Here we do a cleanup of warnings we receive from -Wshadow=compatible-local
for code which is new to PostgreSQL 15.  We've yet to have the discussion
about if we actually ever want to run that as a standard compilation flag.
We'll need to at least get the number of warnings down to something easier
to manage before we can realistically consider if we want this or not.
This commit is the first step towards reducing the warnings.

The changes being made here are all fairly trivial.  Because of that, and
the fact that v15 is still in beta, this is being back-patched into 15.
It seems more risky not to do this as the risk of future bugs is increased
by the additional conflicts that this commit could cause for any future
bug fixes touching the same areas as this commit.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220817145434.GC26426%40telsasoft.com
Backpatch-through: 15
2022-08-20 11:40:44 +12:00
John Naylor c920fe4818 Refer to replication origin roident as "ID" in user facing messages and docs
The table column that stores this is of type oid, but is actually limited
to uint16 and has a different path for creating new values. Some of
the documentation already referred to it as an ID, so let's standardize
on that.

While at it, most format strings already use %u, so for consintency
change the remaining stragglers using %d.

Per suggestions from Tom Lane and Justin Pryzby
Discussion: https://www.postgresql.org/message-id/3437166.1659620465%40sss.pgh.pa.us
Backpatch to v15
2022-08-18 08:57:13 +07:00
Tom Lane efd0c16bec Avoid using list_length() to test for empty list.
The standard way to check for list emptiness is to compare the
List pointer to NIL; our list code goes out of its way to ensure
that that is the only representation of an empty list.  (An
acceptable alternative is a plain boolean test for non-null
pointer, but explicit mention of NIL is usually preferable.)

Various places didn't get that memo and expressed the condition
with list_length(), which might not be so bad except that there
were such a variety of ways to check it exactly: equal to zero,
less than or equal to zero, less than one, yadda yadda.  In the
name of code readability, let's standardize all those spellings
as "list == NIL" or "list != NIL".  (There's probably some
microscopic efficiency gain too, though few of these look to be
at all performance-critical.)

A very small number of cases were left as-is because they seemed
more consistent with other adjacent list_length tests that way.

Peter Smith, with bikeshedding from a number of us

Discussion: https://postgr.es/m/CAHut+PtQYe+ENX5KrONMfugf0q6NHg4hR5dAhqEXEc2eefFeig@mail.gmail.com
2022-08-17 11:12:35 -04:00
Alvaro Herrera 2c86077765
struct PQWalReceiverFunctions: use designated initializers
We now require that compilers support this, and it makes the code easier
to trace, so change it.  I'm fixated on this particular struct because
I've had to navigate around it a number of times, but there are others
elsewhere that could use the same treatment.

Discussion: https://postgr.es/m/20220810140300.ixhbmm4svo5yypv6@alvherre.pgsql
2022-08-11 12:07:05 +02:00
Amit Kapila 7f13ac8123 Fix catalog lookup with the wrong snapshot during logical decoding.
Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION
records to know if the transaction has modified the catalog, and that
information is not serialized to snapshot. Therefore, after the restart,
if the logical decoding decodes only the commit record of the transaction
that has actually modified a catalog, we will miss adding its XID to the
snapshot. Thus, we will end up looking at catalogs with the wrong
snapshot.

To fix this problem, this change adds the list of transaction IDs and
sub-transaction IDs, that have modified catalogs and are running during
snapshot serialization, to the serialized snapshot. After restart or
otherwise, when we restore from such a serialized snapshot, the
corresponding list is restored in memory. Now, when decoding a COMMIT
record, we check both the list and the ReorderBuffer to see if the
transaction has modified catalogs.

Since this adds additional information to the serialized snapshot, we
cannot backpatch it. For back branches, we took another approach.
We remember the last-running-xacts list of the decoded RUNNING_XACTS
record after restoring the previously serialized snapshot. Then, we mark
the transaction as containing catalog changes if it's in the list of
initial running transactions and its commit record has
XACT_XINFO_HAS_INVALS. This doesn't require any file format changes but
the transaction will end up being added to the snapshot even if it has
only relcache invalidations. But that won't be a problem since we use
snapshot built during decoding only to read system catalogs.

This commit bumps SNAPBUILD_VERSION because of a change in SnapBuild.

Reported-by: Mike Oh
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi
Backpatch-through: 10
Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
2022-08-11 10:09:24 +05:30
Robert Haas a8c0128697 Move basebackup code to new directory src/backend/backup
Reviewed by David Steele and Justin Pryzby

Discussion: http://postgr.es/m/CA+TgmoafqboATDSoXHz8VLrSwK_MDhjthK4hEpYjqf9_1Fmczw%40mail.gmail.com
2022-08-10 14:03:23 -04:00
Thomas Munro 5fc88c5d53 Replace pgwin32_is_junction() with lstat().
Now that lstat() reports junction points with S_IFLNK/S_ISLINK(), and
unlink() can unlink them, there is no need for conditional code for
Windows in a few places.  That was expressed by testing for WIN32 or
S_ISLNK, which we can now constant-fold.

The coding around pgwin32_is_junction() was a bit suspect anyway, as we
never checked for errors, and we also know that errors can be spuriously
reported because of transient sharing violations on this OS.  The
lstat()-based code has handling for that.

This also reverts 4fc6b6ee on master only.  That was done because
lstat() didn't previously work for symlinks (junction points), but now
it does.

Tested-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CA%2BhUKGLfOOeyZpm5ByVcAt7x5Pn-%3DxGRNCvgiUPVVzjFLtnY0w%40mail.gmail.com
2022-08-06 12:50:59 +12:00
Thomas Munro cf112c1220 Remove dead pread and pwrite replacement code.
pread() and pwrite() are in SUSv2, and all targeted Unix systems have
them.

Previously, we defined pg_pread and pg_pwrite to emulate these function
with lseek() on old Unixen.  The names with a pg_ prefix were a reminder
of a portability hazard: they might change the current file position.
That hazard is gone, so we can drop the prefixes.

Since the remaining replacement code is Windows-only, move it into
src/port/win32p{read,write}.c, and move the declarations into
src/include/port/win32_port.h.

No need for vestigial HAVE_PREAD, HAVE_PWRITE macros as they were only
used for declarations in port.h which have now moved into win32_port.h.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
2022-08-05 09:49:21 +12:00
Thomas Munro 2b1f580ee2 Remove configure probes for symlink/readlink, and dead code.
symlink() and readlink() are in SUSv2 and all targeted Unix systems have
them.  We have partial emulation on Windows.  Code that raised runtime
errors on systems without it has been dead for years, so we can remove
that and also references to such systems in the documentation.

Define HAVE_READLINK and HAVE_SYMLINK macros on Unix.  Our Windows
replacement functions based on junction points can't be used for
relative paths or for non-directories, so the macros can be used to
check for full symlink support.  The places that deal with tablespaces
can just use symlink functions without checking the macros.  (If they
did check the macros, they'd need to provide an #else branch with a
runtime or compile time error, and it'd be dead code.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com
2022-08-05 09:22:56 +12:00
Robert Haas 851f4cc75c Clean up some residual confusion between OIDs and RelFileNumbers.
Commit b0a55e4329 missed a few places
where we are referring to the number used as a part of the relation
filename as an "OID". We now want to call that a "RelFileNumber".

Some of these places actually made it sound like the OID in question
is pg_class.oid rather than pg_class.relfilenode, which is especially
good to clean up.

Dilip Kumar with some editing by me.
2022-07-28 10:20:29 -04:00
Michael Paquier ce3049b021 Refactor code in charge of grabbing the relations of a subscription
GetSubscriptionRelations() and GetSubscriptionNotReadyRelations() share
mostly the same code, which scans pg_subscription_rel and fetches all
the relations of a given subscription.  The only difference is that the
second routine looks for all the relations not in a ready state.  This
commit refactors the code to use a single routine, shaving a bit of
code.

Author: Vignesh C
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Michael Paquier, Peter
Smith
Discussion: https://postgr.es/m/CALDaNm0eW-9g4G_EzHebnFT5zZoasWCS_EzZQ5BgnLZny9S=pg@mail.gmail.com
2022-07-27 19:50:06 +09:00
Amit Kapila 366283961a Allow users to skip logical replication of data having origin.
This patch adds a new SUBSCRIPTION parameter "origin". It specifies
whether the subscription will request the publisher to only send changes
that don't have an origin or send changes regardless of origin. Setting it
to "none" means that the subscription will request the publisher to only
send changes that have no origin associated. Setting it to "any" means
that the publisher sends changes regardless of their origin. The default
is "any".
Usage:
CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=9999'
PUBLICATION pub1 WITH (origin = none);

This can be used to avoid loops (infinite replication of the same data)
among replication nodes.

This feature allows filtering only the replication data originating from
WAL but for initial sync (initial copy of table data) we don't have such a
facility as we can only distinguish the data based on origin from WAL. As
a follow-up patch, we are planning to forbid the initial sync if the
origin is specified as none and we notice that the publication tables were
also replicated from other publishers to avoid duplicate data or loops.

We forbid to allow creating origin with names 'none' and 'any' to avoid
confusion with the same name options.

Author: Vignesh C, Amit Kapila
Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Shi yu, Ashutosh Bapat, Hayato Kuroda
Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubhjcQ@mail.gmail.com
2022-07-21 08:47:38 +05:30
Fujii Masao ee79647769 Prevent BASE_BACKUP in the middle of another backup in the same session.
Multiple non-exclusive backups are able to be run conrrently in different
sessions. But, in the same session, only one non-exclusive backup can be
run at the same moment. If pg_backup_start (pg_start_backup in v14 or before)
is called in the middle of another non-exclusive backup in the same session,
an error is thrown.

However, previously, in logical replication walsender mode, even if that
walsender session had already called pg_backup_start and started
a non-exclusive backup, it could execute BASE_BACKUP command and
start another non-exclusive backup. Which caused subsequent pg_backup_stop
to throw an error because BASE_BACKUP unexpectedly reset the session state
marked by pg_backup_start.

This commit prevents BASE_BACKUP command in the middle of another
non-exclusive backup in the same session.

Back-patch to all supported branches.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
2022-07-20 09:56:42 +09:00
Andres Freund fd4bad1655 Remove now superfluous declarations of dlsym()ed symbols.
The prior commit declared them centrally.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de
2022-07-17 17:29:32 -07:00
Peter Eisentraut 9fd45870c1 Replace many MemSet calls with struct initialization
This replaces all MemSet() calls with struct initialization where that
is easily and obviously possible.  (For example, some cases have to
worry about padding bits, so I left those.)

(The same could be done with appropriate memset() calls, but this
patch is part of an effort to phase out MemSet(), so it doesn't touch
memset() calls.)

Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/9847b13c-b785-f4e2-75c3-12ec77a3b05c@enterprisedb.com
2022-07-16 08:50:49 +02:00
Robert Haas b0a55e4329 Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.

Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".

Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.

On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.

Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.

Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.

Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 11:39:09 -04:00
Peter Eisentraut 16d52fc89d Refactor sending of DataRow messages in replication protocol
Some routines open-coded the construction of DataRow messages.  Use
TupOutputState struct and associated functions instead, which was
already done in some places.

SendTimeLineHistory() is a bit more complicated and isn't converted by
this.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com
2022-07-06 08:42:56 +02:00
Andres Freund 3f8148c256 Revert 019_replslot_limit.pl related debugging aids.
This reverts most of 91c0570a79, f28bf667f6, fe0972ee5e, afdeff1052. The
only thing left is the retry loop in 019_replslot_limit.pl that avoids
spurious failures by retrying a couple times.

We haven't seen any hard evidence that this is caused by anything but slow
process shutdown. We did not find any cases where walsenders did not vanish
after waiting for longer. Therefore there's no reason for this debugging code
to remain.

Discussion: https://postgr.es/m/20220530190155.47wr3x2prdwyciah@alap3.anarazel.de
Backpatch: 15-
2022-07-05 11:01:10 -07:00
Peter Eisentraut 2ce648f750 Refactor sending of RowDescription messages in replication protocol
Some routines open-coded the construction of RowDescription messages.
Instead, we have support for doing this using tuple descriptors and
DestRemoteSimple, so use that instead.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com
2022-07-04 19:43:58 +02:00
Alvaro Herrera f10a025cfe
Implement List support for TransactionId
Use it for RelationSyncEntry->streamed_txns, which is currently using an
integer list.

The API support is not complete, not because it is hard to write but
because it's unclear that it's worth the code space, there being so
little use of XID lists.

Discussion: https://postgr.es/m/202205130830.g5ntonhztspb@alvherre.pgsql
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
2022-07-04 14:52:12 +02:00
Peter Eisentraut 8ba3cb2f18 Fix for change timeline field of IDENTIFY_SYSTEM to int8
Amendment to ec40f3422412cfdc140b5d3f67db7fd2dac0f1e2: We also need to
change the way the datum is supplied to int8.  Otherwise, the value is
still cut off as an int4, and it will crash on 32-bit platforms.
2022-07-04 08:06:05 +02:00
Peter Eisentraut ec40f34224 Change timeline field of IDENTIFY_SYSTEM to int8
It was int4, but in the other replication commands, timelines are
returned as int8.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com
2022-07-04 07:32:48 +02:00
Peter Eisentraut 4e85b97304 Fix attlen in RowDescription of BASE_BACKUP response
Should be 8 for int8, not -1.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/7e4fdbdc-699c-4cd0-115d-fb78a957fc22@enterprisedb.com
2022-07-04 07:31:54 +02:00
Peter Eisentraut d746021de1 Add construct_array_builtin, deconstruct_array_builtin
There were many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcoded the type attributes necessary to pass to these
functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge.  This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2914356f-9e5f-8c59-2995-5997fc48bcba%40enterprisedb.com
2022-07-01 11:23:15 +02:00
Michael Paquier 33bd4698c1 Fix code comments still referring to pg_start/stop_backup()
pg_start_backup() and pg_stop_backup() have been respectively renamed to
pg_backup_start() and pg_backup_stop() as of 39969e2, but a few comments
did not get the call.

Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/YrqGlj1+4DF3dbZ/@paquier.xyz
2022-07-01 09:37:17 +09:00
Peter Eisentraut 258f48f858 Change some unnecessary MemSet calls
MemSet() with a value other than 0 just falls back to memset(), so the
indirection is unnecessary if the value is constant and not 0.  Since
there is some interest in getting rid of MemSet(), this gets some easy
cases out of the way.  (There are a few MemSet() calls that I didn't
change to maintain the consistency with their surrounding code.)

Discussion: https://www.postgresql.org/message-id/flat/CAEudQApCeq4JjW1BdnwU=m=-DvG5WyUik0Yfn3p6UNphiHjj+w@mail.gmail.com
2022-07-01 00:16:38 +02:00
Amit Kapila ac0e2d387a Fix memory leak due to LogicalRepRelMapEntry.attrmap.
When rebuilding the relation mapping on subscribers, we were not releasing
the attribute mapping's memory which was no longer required.

The attribute mapping used in logical tuple conversion was refactored in
PG13 (by commit e1551f96e6) but we forgot to update the related code that
frees the attribute map.

Author: Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila, Shi yu
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-23 09:23:46 +05:30
Amit Kapila 75bfe7434d Fix stale values in partition map entries on subscribers.
We build the partition map entries on subscribers while applying the
changes for update/delete on partitions. The component relation in each
entry is closed after its use so we need to update it on successive use of
cache entries.

This problem was there since the original commit f1ac27bfda that
introduced this code but we didn't notice it till the recent commit
26b3455afa started to use the component relation of partition map cache
entry.

Reported-by: Tom Lane, as per buildfarm
Author: Amit Langote, Hou Zhijie
Reviewed-by: Amit Kapila, Shi Yu
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-21 15:39:35 +05:30
Amit Kapila 26b3455afa Fix partition table's REPLICA IDENTITY checking on the subscriber.
In logical replication, we will check if the target table on the
subscriber is updatable by comparing the replica identity of the table on
the publisher with the table on the subscriber. When the target table is a
partitioned table, we only check its replica identity but not for the
partition tables. This leads to assertion failure while applying changes
for update/delete as we expect those to succeed only when the
corresponding partition table has a primary key or has a replica
identity defined.

Fix it by checking the replica identity of the partition table while
applying changes.

Reported-by: Shi Yu
Author: Shi Yu, Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-21 08:07:43 +05:30
Amit Kapila b7658c24c7 Fix data inconsistency between publisher and subscriber.
We were not updating the partition map cache in the subscriber even when
the corresponding remote rel is changed. Due to this data was getting
incorrectly replicated for partition tables after the publisher has
changed the table schema.

Fix it by resetting the required entries in the partition map cache after
receiving a new relation mapping from the publisher.

Reported-by: Shi Yu
Author: Shi Yu, Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-16 08:45:07 +05:30
Amit Kapila 5a97b13254 Fix cache look-up failures while applying changes in logical replication.
While building a new attrmap which maps partition attribute numbers to
remoterel's, we incorrectly update the map for dropped column attributes.
Later, it caused cache look-up failure when we tried to use the map to
fetch the information about attributes.

This also fixes the partition map cache invalidation which was using the
wrong type cast to fetch the entry. We were using stale partition map
entry after invalidation which leads to the assertion or cache look-up
failure.

Reported-by: Shi Yu
Author: Hou Zhijie, Shi Yu
Reviewed-by: Amit Langote, Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-15 09:52:12 +05:30
Tom Lane bf4717b091 Fix off-by-one loop termination condition in pg_stat_get_subscription().
pg_stat_get_subscription scanned one more LogicalRepWorker array entry
than is really allocated.  In the worst case this could lead to SIGSEGV,
if the LogicalRepCtx data structure is near the end of shared memory.
That seems quite unlikely though (thanks to the ordering of calls in
CreateSharedMemoryAndSemaphores) and we've heard no field reports of it.
A more likely misbehavior is one row of garbage data in the function's
result, but even that is not real likely because of the check that the
pid field matches some live backend.

Report and fix by Kuntal Ghosh.  This bug is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com
2022-06-07 15:34:30 -04:00
Amit Kapila fd0b9dcebd Prohibit combining publications with different column lists.
Currently, we simply combine the column lists when publishing tables on
multiple publications and that can sometimes lead to unexpected behavior.
Say, if a column is published in any row-filtered publication, then the
values for that column are sent to the subscriber even for rows that don't
match the row filter, as long as the row matches the row filter for any
other publication, even if that other publication doesn't include the
column.

The main purpose of introducing a column list is to have statically
different shapes on publisher and subscriber or hide sensitive column
data. In both cases, it doesn't seem to make sense to combine column
lists.

So, we disallow the cases where the column list is different for the same
table when combining publications. It can be later extended to combine the
column lists for selective cases where required.

Reported-by: Alvaro Herrera
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
2022-06-02 08:31:50 +05:30
Amit Kapila 0ff20288e1 Extend pg_publication_tables to display column list and row filter.
Commit 923def9a53 and 52e4f0cd47 allowed to specify column lists and row
filters for publication tables. This commit extends the
pg_publication_tables view and pg_get_publication_tables function to
display that information.

This information will be useful to users and we also need this for the
later commit that prohibits combining multiple publications with different
column lists for the same table.

Author: Hou Zhijie
Reviewed By: Amit Kapila, Alvaro Herrera, Shi Yu, Takamichi Osumi
Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
2022-05-19 08:20:55 +05:30
Michael Paquier bbf7c2d9e9 Fix typo in walreceiver.c
s/primary_slotname/primary_slot_name/.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACX3=pHkCpoGG-z+O=7Gp5YZv70jmfTyGnNV7YF3SkK73g@mail.gmail.com
2022-05-18 09:06:22 +09:00
Alvaro Herrera c4f113e8fe
Clean up newlines following left parentheses
Like commit c9d2977519.
2022-05-13 23:52:35 +02:00
Peter Eisentraut 30ed71e423 Indent C code in flex and bison files
In the style of pgindent, done semi-manually.

Discussion: https://www.postgresql.org/message-id/flat/7d062ecc-7444-23ec-a159-acd8adf9b586%40enterprisedb.com
2022-05-13 07:17:29 +02:00
Andres Freund 0cf16cb8ca Don't report stats in LogicalRepApplyLoop() when in xact.
pgstat_report_stat() is only supposed to be called outside of transactions. In
5891c7a8ed I added a pgstat_report_stat() call into LogicalRepApplyLoop()'s
timeout branch. While not commonly reached inside a transaction, it is
reachable (e.g. due to network bottlenecks or the sender being stalled / slow
for some reason).

To fix, add a !IsTransactionState() check.

No test added because there's no easy way to reproduce this case without
patching the code.

Reported-By: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/b3463b8c-2328-dcac-0136-af95715493c1@xs4all.nl
2022-05-12 18:54:26 -07:00
Andres Freund 07d683b54a Remove function declaration for function in pg_proc.
The declaration is automatically generated. Noticed when experimenting with
adding PGDLLIMPORT markers for functions.

Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
2022-05-12 12:39:33 -07:00
Andres Freund 09cd33f47b Add 'static' to file-local variables missing it.
Noticed when comparing the set of exported symbols without / with
-fvisibility=hidden after adding PGDLLIMPORT to intentionally exported
symbols.

Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
2022-05-12 12:39:33 -07:00
Tom Lane 23e7b38bfe Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
2022-05-12 15:17:30 -04:00
Andres Freund b5f44225b8 Mark a few 'bbsink' related functions / variables static.
Discussion: https://postgr.es/m/20220506234924.6mxxotl3xl63db3l@alap3.anarazel.de
2022-05-12 09:11:31 -07:00
Michael Paquier 45edde037e Fix typos and grammar in code and test comments
This fixes the grammar of some comments in a couple of tests (SQL and
TAP), and in some C files.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220511020334.GH19626@telsasoft.com
2022-05-11 15:38:55 +09:00
Amit Kapila f95d53eded Fix the logical replication timeout during large transactions.
The problem is that we don't send keep-alive messages for a long time
while processing large transactions during logical replication where we
don't send any data of such transactions. This can happen when the table
modified in the transaction is not published or because all the changes
got filtered. We do try to send the keep_alive if necessary at the end of
the transaction (via WalSndWriteData()) but by that time the
subscriber-side can timeout and exit.

To fix this we try to send the keepalive message if required after
processing certain threshold of changes.

Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
2022-05-11 11:11:44 +05:30
Michael Paquier 59a32f0093 Fix typo in origin.c
Introduced in 5aa2350.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PsuWz6_7aCmivNU5FahgQxDUTQtc3+__XnWkBzQcfn43w@mail.gmail.com
2022-05-06 20:01:15 +09:00
John Naylor e84f82ab5c Fix SQL syntax in comment in logical/worker.c
Euler Taveira

Disussion: https://www.postgresql.org/message-id/25f95189-eef8-43c4-9d7b-419b651963c8%40www.fastmail.com
2022-04-28 09:27:32 +07:00
Michael Paquier 83cca409ed Remove duplicated word in comment of basebackup.c
Oversight in 39969e2.

Author: Martín Marqués
Discussion: https://postgr.es/m/CABeG9LviA01oHC5h=ksLUuhMyXxmZR_tftRq6q3341CMT=j=4g@mail.gmail.com
2022-04-20 11:05:34 +09:00
Amit Kapila dd4ab6fd65 Fix the check to limit sync workers.
We don't allow to invoke more sync workers once we have reached the sync
worker limit per subscription. But the check to enforce this also doesn't
allow to launch an apply worker if it gets restarted.

This code was introduced by commit de43897122 but we caught the problem
only with the test added by recent commit c91f71b9dc which started failing
occasionally in the buildfarm.

As per buildfarm.
Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
	    https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com
2022-04-19 08:49:49 +05:30
Tom Lane 6fea65508a Tighten ComputeXidHorizons' handling of walsenders.
ComputeXidHorizons (nee GetOldestXmin) thought that it could identify
walsenders by checking for proc->databaseId == 0.  Perhaps that was
safe when the code was written, but it's been wrong at least since
autovacuum was invented.  Background processes that aren't connected
to any particular database, such as the autovacuum launcher and
logical replication launcher, look like that too.

This imprecision is harmful because when such a process advertises an
xmin, the result is to hold back dead-tuple cleanup in all databases,
though it'd be sufficient to hold it back in shared catalogs (which
are the only relations such a process can access).  Aside from being
generally inefficient, this has recently been seen to cause regression
test failures in the buildfarm, as a consequence of the logical
replication launcher's startup transaction preventing VACUUM from
marking pages of a user table as all-visible.

We only want that global hold-back effect for the case where a
walsender is advertising a hot standby feedback xmin.  Therefore,
invent a new PGPROC flag that says that a process' xmin should be
considered globally, and check that instead of using the incorrect
databaseId == 0 test.  Currently only a walsender sets that flag,
and only if it is not connected to any particular database.  (This is
for bug-compatibility with the undocumented behavior of the existing
code, namely that feedback sent by a client who has connected to a
particular database would not be applied globally.  I'm not sure this
is a great definition; however, such a client is capable of issuing
plain SQL commands, and I don't think we want xmins advertised for
such commands to be applied globally.  Perhaps this could do with
refinement later.)

While at it, I rewrote the comment in ComputeXidHorizons, and
re-ordered the commented-upon if-tests, to make them match up
for intelligibility's sake.

This is arguably a back-patchable bug fix, but given the lack of
complaints I think it prudent to let it age awhile in HEAD first.

Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us
2022-04-15 17:50:05 -04:00
Alvaro Herrera 24d2b2680a
Remove extraneous blank lines before block-closing braces
These are useless and distracting.  We wouldn't have written the code
with them to begin with, so there's no reason to keep them.

Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
2022-04-13 19:16:02 +02:00
Michael Paquier a4b57543ac Rename backup_compression.{c,h} to compression.{c,h}
Compression option handling (level, algorithm or even workers) can be
used across several parts of the system and not only base backups.
Structures, objects and routines are renamed in consequence, to remove
the concept of base backups from this part of the code making this
change straight-forward.

pg_receivewal, that has gained support for LZ4 since babbbb5, will make
use of this infrastructure for its set of compression options, bringing
more consistency with pg_basebackup.  This cleanup needs to be done
before releasing a beta of 15.  pg_dump is a potential future target, as
well, and adding more compression options to it may happen in 16~.

Author: Michael Paquier
Reviewed-by: Robert Haas, Georgios Kokolatos
Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
2022-04-12 13:38:54 +09:00
David Rowley b0e5f02ddc Fix various typos and spelling mistakes in code comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
2022-04-11 20:49:41 +12:00
Michael Paquier 7597cc3083 Fix the dates of some copyright notices
0ad8032 and 4e34747 are at the origin of that.  Julien has found the one
in parse_jsontable.c, while I have spotted the rest.

Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/20220411060838.ftnzyvflpwu6f74w@jrouhaud
2022-04-11 16:36:25 +09:00
Tomas Vondra 2c7ea57e56 Revert "Logical decoding of sequences"
This reverts a sequence of commits, implementing features related to
logical decoding and replication of sequences:

 - 0da92dc530
 - 80901b3291
 - b779d7d8fd
 - d5ed9da41d
 - a180c2b34d
 - 75b1521dae
 - 2d2232933b
 - 002c9dd97a
 - 05843b1aa4

The implementation has issues, mostly due to combining transactional and
non-transactional behavior of sequences. It's not clear how this could
be fixed, but it'll require reworking significant part of the patch.

Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
2022-04-07 20:06:36 +02:00
Jeff Davis 5c279a6d35 Custom WAL Resource Managers.
Allow extensions to specify a new custom resource manager (rmgr),
which allows specialized WAL. This is meant to be used by a Table
Access Method or Index Access Method.

Prior to this commit, only Generic WAL was available, which offers
support for recovery and physical replication but not logical
replication.

Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund
Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
2022-04-06 23:06:46 -07:00
Andres Freund 6f0cf87872 pgstat: remove stats_temp_directory.
With stats now being stored in shared memory, the GUC isn't needed
anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are
kept, as pg_stat_statements (and some out-of-core extensions) store data in
it.

Docs will be updated in a subsequent commit, together with the other pending
docs updates due to shared memory stats.

Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 5891c7a8ed pgstat: store statistics in shared memory.
Previously the statistics collector received statistics updates via UDP and
shared statistics data by writing them out to temporary files regularly. These
files can reach tens of megabytes and are written out up to twice a
second. This has repeatedly prevented us from adding additional useful
statistics.

Now statistics are stored in shared memory. Statistics for variable-numbered
objects are stored in a dshash hashtable (backed by dynamic shared
memory). Fixed-numbered stats are stored in plain shared memory.

The header for pgstat.c contains an overview of the architecture.

The stats collector is not needed anymore, remove it.

By utilizing the transactional statistics drop infrastructure introduced in a
prior commit statistics entries cannot "leak" anymore. Previously leaked
statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On
systems with many small relations pgstat_vacuum_stat() could be quite
expensive.

Now that replicas drop statistics entries for dropped objects, it is not
necessary anymore to reset stats when starting from a cleanly shut down
replica.

Subsequent commits will perform some further code cleanup, adapt docs and add
tests.

Bumps PGSTAT_FILE_FORMAT_ID.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version)
Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version)
Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version)
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund e41aed674f pgstat: revise replication slot API in preparation for shared memory stats.
Previously the pgstat <-> replication slots API was done with on the basis of
names. However, the upcoming move to storing stats in shared memory makes it
more convenient to use a integer as key.

Change the replication slot functions to take the slot rather than the slot
name, and expose ReplicationSlotIndex() to compute the index of an replication
slot. Special handling will be required for restarts, as the index is not
stable across restarts. For now pgstat internally still uses names.

Rename pgstat_report_replslot_{create,drop}() to
pgstat_{create,drop}_replslot() to match the functions for other kinds of
stats.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 18:38:24 -07:00
Andres Freund bdbd3d9064 pgstat: stats collector references in comments.
Soon the stats collector will be no more, with statistics instead getting
stored in shared memory. There are a lot of references to the stats collector
in comments. This commit replaces most of these references with "cumulative
statistics system", with the remaining ones getting replaced as part of
subsequent commits.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
2022-04-06 13:56:06 -07:00
Stephen Frost 39969e2a1e Remove exclusive backup mode
Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.

This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.

Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.net
https://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com
2022-04-06 14:41:03 -04:00
Amit Kapila 2d09e44d30 Improve comments for row filtering and toast interaction in logical replication.
Reported-by: Antonin Houska
Author: Amit Kapila
Reviewed-by: Antonin Houska, Ajin Cherian
Discussion: https://postgr.es/m/84638.1649152255@antos
2022-04-06 08:20:40 +05:30
David Rowley 1b0d9aa4f7 Improve the generation memory allocator
Here we make a series of improvements to the generation memory
allocator, namely:

1. Allow generation contexts to have a minimum, initial and maximum block
sizes. The standard allocator allows this already but when the generation
context was added, it only allowed fixed-sized blocks.  The problem with
fixed-sized blocks is that it's difficult to choose how large to make the
blocks.  If the chosen size is too small then we'd end up with a large
number of blocks and a large number of malloc calls. If the block size is
made too large, then memory is wasted.

2. Add support for "keeper" blocks.  This is a special block that is
allocated along with the context itself but is never freed.  Instead,
when the last chunk in the keeper block is freed, we simply mark the block
as empty to allow new allocations to make use of it.

3. Add facility to "recycle" newly empty blocks instead of freeing them
and having to later malloc an entire new block again.  We do this by
recording a single GenerationBlock which has become empty of any chunks.
When we run out of space in the current block, we check to see if there is
a "freeblock" and use that if it contains enough space for the allocation.

Author: David Rowley, Tomas Vondra
Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/d987fd54-01f8-0f73-af6c-519f799a0ab8@enterprisedb.com
2022-04-04 20:53:13 +12:00
Joe Conway 9752436f04 Use has_privs_for_roles for predefined role checks: round 2
Similar to commit 6198420ad, replace is_member_of_role with
has_privs_for_role for predefined role access checks in recently
committed basebackup code. In passing fix a double-word error
in a nearby comment.

Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
2022-04-02 13:24:38 -04:00
Robert Haas 51c0d186d9 Allow parallel zstd compression when taking a base backup.
libzstd allows transparent parallel compression just by setting
an option when creating the compression context, so permit that
for both client and server-side backup compression. To use this,
use something like pg_basebackup --compress WHERE-zstd:workers=N
where WHERE is "client" or "server" and N is an integer.

When compression is performed on the server side, this will spawn
threads inside the PostgreSQL backend. While there is almost no
PostgreSQL server code which is thread-safe, the threads here are used
internally by libzstd and touch only data structures controlled by
libzstd.

Patch by me, based in part on earlier work by Dipesh Pandit
and Jeevan Ladhe. Reviewed by Justin Pryzby.

Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com
2022-03-30 09:41:26 -04:00
Amit Kapila d5a9d86d8f Skip empty transactions for logical replication.
The current logical replication behavior is to send every transaction to
subscriber even if the transaction is empty. This can happen because
transaction doesn't contain changes from the selected publications or all
the changes got filtered. It is a waste of CPU cycles and network
bandwidth to build/transmit these empty transactions.

This patch addresses the above problem by postponing the BEGIN message
until the first change is sent. While processing a COMMIT message, if
there was no other change for that transaction, do not send the COMMIT
message. This allows us to skip sending BEGIN/COMMIT messages for empty
transactions.

When skipping empty transactions in synchronous replication mode, we send
a keepalive message to avoid delaying such transactions.

Author: Ajin Cherian, Hou Zhijie, Euler Taveira
Reviewed-by: Peter Smith, Takamichi Osumi, Shi Yu, Masahiko Sawada, Greg Nancarrow, Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CAMkU=1yohp9-dv48FLoSPrMqYEyyS5ZWkaZGD41RJr10xiNo_Q@mail.gmail.com
2022-03-30 07:41:05 +05:30
Joe Conway 6198420ad8 Use has_privs_for_roles for predefined role checks
Generally if a role is granted membership to another role with NOINHERIT
they must use SET ROLE to access the privileges of that role, however
with predefined roles the membership and privilege is conflated. Fix that
by replacing is_member_of_role with has_privs_for_role for predefined
roles. Patch does not remove is_member_of_role from acl.h, but it does
add a warning not to use that function for privilege checking. Not
backpatched based on hackers list discussion.

Author: Joshua Brindle
Reviewed-by: Stephen Frost, Nathan Bossart, Joe Conway
Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
2022-03-28 15:10:04 -04:00
Robert Haas 61762426e6 Fix a few goofs in new backup compression code.
When we try to set the zstd compression level either on the client
or on the server, check for errors.

For any algorithm, on the client side, don't try to set the compression
level unless the user specified one. This was visibly broken for
zstd, which managed to set -1 rather than 0 in this case, but tidy
up the code for the other methods, too.

On the client side, if we fail to create a ZSTD_CCtx, exit after
reporting the error. Otherwise we'll dereference a null pointer.

Patch by me, reviewed by Dipesh Pandit.

Discussion: http://postgr.es/m/CA+TgmoZK3zLQUCGi1h4XZw4jHiAWtcACc+GsdJR1_Mc19jUjXA@mail.gmail.com
2022-03-28 12:19:05 -04:00
Andres Freund 91c0570a79 Don't fail for > 1 walsenders in 019_replslot_limit, add debug messages.
So far the first of the retries introduced in f28bf667f6 resolves the
issue. But I (Andres) am still suspicious that the start of the failures might
indicate a problem.

To reduce noise, stop reporting a failure if a retry resolves the problem. To
allow figuring out what causes the slow slot drop, add a few more debug
messages to ReplicationSlotDropPtr.

See also commit afdeff1052, fe0972ee5e and f28bf667f6.

Discussion: https://postgr.es/m/20220327213219.smdvfkq2fl74flow@alap3.anarazel.de
2022-03-27 22:35:42 -07:00
Tomas Vondra 923def9a53 Allow specifying column lists for logical replication
This allows specifying an optional column list when adding a table to
logical replication. The column list may be specified after the table
name, enclosed in parentheses. Columns not included in this list are not
sent to the subscriber, allowing the schema on the subscriber to be a
subset of the publisher schema.

For UPDATE/DELETE publications, the column list needs to cover all
REPLICA IDENTITY columns. For INSERT publications, the column list is
arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if
the table uses REPLICA IDENTITY FULL, column list is not allowed.

The column list can contain only simple column references. Complex
expressions, function calls etc. are not allowed. This restriction could
be relaxed in the future.

During the initial table synchronization, only columns included in the
column list are copied to the subscriber. If the subscription has
several publications, containing the same table with different column
lists, columns specified in any of the lists will be copied.

This means all columns are replicated if the table has no column list
at all (which is treated as column list with all columns), or when of
the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that
matches the schema of the table).

For partitioned tables, publish_via_partition_root determines whether
the column list for the root or the leaf relation will be used. If the
parameter is 'false' (the default), the list defined for the leaf
relation is used. Otherwise, the column list for the root partition
will be used.

Psql commands \dRp+ and \d <table-name> now display any column lists.

Author: Tomas Vondra, Alvaro Herrera, Rahila Syed
Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed,
Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
2022-03-26 01:01:27 +01:00
Tomas Vondra 05843b1aa4 Minor improvements in sequence decoding code and docs
A couple minor comment improvements and code cleanups, based on
post-commit feedback to the sequence decoding patch.

Author: Amit Kapila, vignesh C
Discussion: https://postgr.es/m/aeb2ba8d-e6f4-5486-cc4c-0d4982c291cb@enterprisedb.com
2022-03-25 21:07:17 +01:00
Tomas Vondra 75b1521dae Add decoding of sequences to built-in replication
This commit adds support for decoding of sequences to the built-in
replication (the infrastructure was added by commit 0da92dc530).

The syntax and behavior mostly mimics handling of tables, i.e. a
publication may be defined as FOR ALL SEQUENCES (replicating all
sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating
all sequences in a particular schema) or individual sequences.

To publish sequence modifications, the publication has to include
'sequence' action. The protocol is extended with a new message,
describing sequence increments.

A new system view pg_publication_sequences lists all the sequences
added to a publication, both directly and indirectly. Various psql
commands (\d and \dRp) are improved to also display publications
including a given sequence, or sequences included in a publication.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres
             Freund, Petr Jelinek
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-03-24 18:49:27 +01:00
Robert Haas 591767150f pg_basebackup: Try to fix some failures on Windows.
Commit ffd53659c4 messed up the
mechanism that was being used to pass parameters to LogStreamerMain()
on Windows. It worked on Linux because only Windows was using threads.
Repair by moving the additional parameters added by that commit into
the 'logstreamer_param' struct.

Along the way, fix a compiler warning on builds without HAVE_LIBZ.

Discussion: http://postgr.es/m/CA+TgmoY5=AmWOtMj3v+cySP2rR=Bt6EGyF_joAq4CfczMddKtw@mail.gmail.com
2022-03-23 13:25:26 -04:00
Robert Haas 607e75e8f0 Unbreak the build.
Commit ffd53659c4 broke the build for
anyone not compiling with LZ4 and ZSTD enabled. Woops.
2022-03-23 10:22:54 -04:00
Robert Haas ffd53659c4 Replace BASE_BACKUP COMPRESSION_LEVEL option with COMPRESSION_DETAIL.
There are more compression parameters that can be specified than just
an integer compression level, so rename the new COMPRESSION_LEVEL
option to COMPRESSION_DETAIL before it gets released. Introduce a
flexible syntax for that option to allow arbitrary options to be
specified without needing to adjust the main replication grammar,
and common code to parse it that is shared between the client and
the server.

This commit doesn't actually add any new compression parameters,
so the only user-visible change is that you can now type something
like pg_basebackup --compress gzip:level=5 instead of writing just
pg_basebackup --compress gzip:5. However, it should make it easy to
add new options. If for example gzip starts offering fries, we can
support pg_basebackup --compress gzip:level=5,fries=true for the
benefit of users who want fries with that.

Along the way, this fixes a few things in pg_basebackup so that the
pg_basebackup can be used with a server-side compression algorithm
that pg_basebackup itself does not understand. For example,
pg_basebackup --compress server-lz4 could still succeed even if
only the server and not the client has LZ4 support, provided that
the other options to pg_basebackup don't require the client to
decompress the archive.

Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker.

Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com
2022-03-23 09:19:14 -04:00
Amit Kapila 208c5d65bb Add ALTER SUBSCRIPTION ... SKIP.
This feature allows skipping the transaction on subscriber nodes.

If incoming change violates any constraint, logical replication stops
until it's resolved. Currently, users need to either manually resolve the
conflict by updating a subscriber-side database or by using function
pg_replication_origin_advance() to skip the conflicting transaction. This
commit introduces a simpler way to skip the conflicting transactions.

The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which allows the apply worker to skip the transaction finished at
specified LSN. The apply worker skips all data modification changes within
the transaction.

Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2022-03-22 07:11:19 +05:30
Magnus Hagander c540d37157 Fix typo in file identification
Clearly a simple copy/paste mistake when the file was created.
2022-03-21 12:35:48 +02:00
Thomas Munro 3f1ce97346 Add circular WAL decoding buffer, take II.
Teach xlogreader.c to decode the WAL into a circular buffer.  This will
support optimizations based on looking ahead, to follow in a later
commit.

 * XLogReadRecord() works as before, decoding records one by one, and
   allowing them to be examined via the traditional XLogRecGetXXX()
   macros and certain traditional members like xlogreader->ReadRecPtr.

 * An alternative new interface XLogReadAhead()/XLogNextRecord() is
   added that returns pointers to DecodedXLogRecord objects so that it's
   now possible to look ahead in the WAL stream while replaying.

 * In order to be able to use the new interface effectively while
   streaming data, support is added for the page_read() callback to
   respond to a new nonblocking mode with XLREAD_WOULDBLOCK instead of
   waiting for more data to arrive.

No direct user of the new interface is included in this commit, though
XLogReadRecord() uses it internally.  Existing code doesn't need to
change, except in a few places where it was accessing reader internals
directly and now needs to go through accessor macros.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA+hUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq=AovOddfHpA@mail.gmail.com
2022-03-18 18:45:47 +13:00
Tomas Vondra 5a07966225 Fix row filters with multiple publications
When publishing changes through a artition root, we should use the row
filter for the top-most ancestor. The relation may be added to multiple
publications, using different ancestors, and 52e4f0cd47 handled this
incorrectly. With c91f71b9dc we find the correct top-most ancestor, but
the code tried to fetch the row filter from all publications, including
those using a different ancestor etc. No row filter can be found for
such publications, which was treated as replicating all rows.

Similarly to c91f71b9dc, this seems to be a rare issue in practice. It
requires multiple publications including the same partitioned relation,
through different ancestors.

Fixed by only passing publications containing the top-most ancestor to
pgoutput_row_filter_init(), so that treating a missing row filter as
replicating all rows is correct.

Report and fix by me, test case by Hou zj. Reviews and improvements by
Amit Kapila.

Author: Tomas Vondra, Hou zj, Amit Kapila
Reviewed-by: Amit Kapila, Hou zj
Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com
2022-03-17 17:03:48 +01:00
Tomas Vondra c91f71b9dc Fix publish_as_relid with multiple publications
Commit 83fd4532a7 allowed publishing of changes via ancestors, for
publications defined with publish_via_partition_root. But the way
the ancestor was determined in get_rel_sync_entry() was incorrect,
simply updating the same variable. So with multiple publications,
replicating different ancestors, the outcome depended on the order
of publications in the list - the value from the last loop was used,
even if it wasn't the top-most ancestor.

This is a probably rare situation, as in most cases publications do
not overlap, so each partition has exactly one candidate ancestor
to replicate as and there's no ambiguity.

Fixed by tracking the "ancestor level" for each publication, and
picking the top-most ancestor. Adds a test case, verifying the
correct ancestor is used for publishing the changes and that this
does not depend on order of publications in the list.

Older releases have another bug in this loop - once all actions are
replicated, the loop is terminated, on the assumption that inspecting
additional publications is unecessary. But that misses the fact that
those additional applications may replicate different ancestors.

Fixed by removal of this break condition. We might still terminate the
loop in some cases (e.g. when replicating all actions and the ancestor
is the partition root).

Backpatch to 13, where publish_via_partition_root was introduced.

Initial report and fix by me, test added by Hou zj. Reviews and
improvements by Amit Kapila.

Author: Tomas Vondra, Hou zj, Amit Kapila
Reviewed-by: Amit Kapila, Hou zj
Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com
2022-03-16 18:05:58 +01:00
Robert Haas d0083c1d2a Suppress compiler warnings.
Michael Paquier

Discussion: http://postgr.es/m/YjGvq4zPDT6j15go@paquier.xyz
2022-03-16 09:26:48 -04:00
Robert Haas 8ef1fa3ee0 Remove accidentally-committed file. 2022-03-15 13:41:36 -04:00
Robert Haas e4ba69f3f4 Allow extensions to add new backup targets.
Commit 3500ccc39b allowed for base backup
targets, meaning that we could do something with the backup other than
send it to the client, but all of those targets had to be baked in to
the core code. This commit makes it possible for extensions to define
additional backup targets.

Patch by me, reviewed by Abhijit Menon-Sen.

Discussion: http://postgr.es/m/CA+TgmoaqvdT-u3nt+_kkZ7bgDAyqDB0i-+XOMmr5JN2Rd37hxw@mail.gmail.com
2022-03-15 13:22:04 -04:00
Robert Haas 75eae09087 Change HAVE_LIBLZ4 and HAVE_LIBZSTD tests to USE_LZ4 and USE_ZSTD.
These tests were added recently, but older code tests USE_LZ4 rathr
than HAVE_LIBLZ4, so let's follow the established precedent. It
also seems more consistent with the intent of the configure tests,
since I think that the USE_* symbols are intended to correspond to
what the user requested, and the HAVE_* symbols to what configure
found while probing.

Discussion: http://postgr.es/m/CA+Tgmoap+hTD2-QNPJLH4tffeFE8MX5+xkbFKMU3FKBy=ZSNKA@mail.gmail.com
2022-03-15 13:06:25 -04:00
Amit Kapila 695f459f17 Fix compiler warning introduced in commit 705e20f855.
Reported-by: Nathan Bossart
Author: Nathan Bossart
Reviewed-by: Osumi Takamichi
Discussion : https://postgr.es/m/20220314230424.GA1085716@nathanxps13
2022-03-15 08:11:17 +05:30
Michael Paquier 6bdf1a1400 Fix collection of typos in the code and the documentation
Some words were duplicated while other places were grammatically
incorrect, including one variable name in the code.

Author: Otto Kekalainen, Justin Pryzby
Discussion: https://postgr.es/m/7DDBEFC5-09B6-4325-B942-B563D1A24BDC@amazon.com
2022-03-15 11:29:35 +09:00
Amit Kapila 705e20f855 Optionally disable subscriptions on error.
Logical replication apply workers for a subscription can easily get stuck
in an infinite loop of attempting to apply a change, triggering an error
(such as a constraint violation), exiting with the error written to the
subscription server log, and restarting.

To partially remedy the situation, this patch adds a new subscription
option named 'disable_on_error'. To be consistent with old behavior, this
option defaults to false. When true, both the tablesync worker and apply
worker catch any errors thrown and disable the subscription in order to
break the loop. The error is still also written in the logs.

Once the subscription is disabled, users can either manually resolve the
conflict/error or skip the conflicting transaction by using
pg_replication_origin_advance() function. After resolving the conflict,
users need to enable the subscription to allow apply process to proceed.

Author: Osumi Takamichi and Mark Dilger
Reviewed-by: Greg Nancarrow, Vignesh C, Amit Kapila, Wang wei, Tang Haiying, Peter Smith, Masahiko Sawada, Shi Yu
Discussion : https://postgr.es/m/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com
2022-03-14 09:32:40 +05:30
Robert Haas 1d4be6be65 Fix LZ4 tests for remaining buffer space.
We should flush the buffer when the remaining space is less than
the maximum amount that we might need, not when it is less than or
equal to the maximum amount we might need.

Jeevan Ladhe, per an observation from me.

Discussion: http://postgr.es/m/CANm22CgVMa85O1akgs+DOPE8NSrT1zbz5_vYfS83_r+6nCivLQ@mail.gmail.com
2022-03-08 10:05:55 -05:00
Robert Haas 7cf085f077 Add support for zstd base backup compression.
Both client-side compression and server-side compression are now
supported for zstd. In addition, a backup compressed by the server
using zstd can now be decompressed by the client in order to
accommodate the use of -Fp.

Jeevan Ladhe, with some edits by me.

Discussion: http://postgr.es/m/CA+Tgmobyzfbz=gyze2_LL1ZumZunmaEKbHQxjrFkOR7APZGu-g@mail.gmail.com
2022-03-08 09:52:43 -05:00
Amit Kapila d3e8368c4b Add the additional information to the logical replication worker errcontext.
This commits adds both the finish LSN (commit_lsn in case transaction got
committed, prepare_lsn in case of a prepared transaction, etc.) and
replication origin name to the existing error context message.

This will help users in specifying the origin name and transaction finish
LSN to pg_replication_origin_advance() SQL function to skip a particular
transaction.

Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Euler Taveira, and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com
2022-03-08 08:08:32 +05:30
Tomas Vondra d5ed9da41d Call ReorderBufferProcessXid from sequence_decode
Commit 0da92dc530 added sequence_decode() implementing logical decoding
of sequences, but it failed to call ReorderBufferProcessXid() as it
should. So add the missing call.

Reported-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KGn6cQqJEsubOOENwQOANsExiV2sKL52r4U10J8NJEMQ%40mail.gmail.com
2022-03-07 20:53:16 +01:00
Amit Kapila 5e0e99a80b Make the errcontext message in logical replication worker translation friendly.
Previously, the message for logical replication worker errcontext is
incrementally built, which was not translation friendly.  Instead, we use
complete sentences with if-else branches.

We also remove the commit timestamp from the context message since it's
not important information and made the message long.

Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com
2022-03-07 08:33:58 +05:30
Michael Paquier 9e98583898 Create routine able to set single-call SRFs for Materialize mode
Set-returning functions that use the Materialize mode, creating a
tuplestore to include all the tuples returned in a set rather than doing
so in multiple calls, use roughly the same set of steps to prepare
ReturnSetInfo for this job:
- Check if ReturnSetInfo supports returning a tuplestore and if the
materialize mode is enabled.
- Create a tuplestore for all the tuples part of the returned set in the
per-query memory context, stored in ReturnSetInfo->setResult.
- Build a tuple descriptor mostly from get_call_result_type(), then
stored in ReturnSetInfo->setDesc.  Note that there are some cases where
the SRF's tuple descriptor has to be the one specified by the function
caller.

This refactoring is done so as there are (well, should be) no behavior
changes in any of the in-core functions refactored, and the centralized
function that checks and sets up the function's ReturnSetInfo can be
controlled with a set of bits32 options.  Two of them prove to be
necessary now:
- SRF_SINGLE_USE_EXPECTED to use expectedDesc as tuple descriptor, as
expected by the function's caller.
- SRF_SINGLE_BLESS to validate the tuple descriptor for the SRF.

The same initialization pattern is simplified in 28 places per my
count as of src/backend/, shaving up to ~900 lines of code.  These
mostly come from the removal of the per-query initializations and the
sanity checks now grouped in a single location.  There are more
locations that could be simplified in contrib/, that are left for a
follow-up cleanup.

fcc2817, 07daca5 and d61a361 have prepared the areas of the code related
to this change, to ease this refactoring.

Author: Melanie Plageman, Michael Paquier
Reviewed-by: Álvaro Herrera, Justin Pryzby
Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com
2022-03-07 10:26:29 +09:00
Amit Kapila 7a85073290 Reconsider pg_stat_subscription_workers view.
It was decided (refer to the Discussion link below) that the stats
collector is not an appropriate place to store the error information of
subscription workers.

This patch changes the pg_stat_subscription_workers view (introduced by
commit 8d74fc96db) so that it stores only statistics counters:
apply_error_count and sync_error_count, and has one entry for
each subscription. The removed error information such as error-XID and
the error message would be stored in another way in the future which is
more reliable and persistent.

After removing these error details, there is no longer any relation
information, so the subscription statistics are now a cluster-wide
statistics.

The patch also changes the view name to pg_stat_subscription_stats since
the word "worker" is an implementation detail that we use one worker for
one tablesync and one apply.

Author: Masahiko Sawada, based on suggestions by Andres Freund
Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de
2022-03-01 06:17:52 +05:30
Andres Freund d33aeefd9b Fix warning on mingw due to pid_t width, introduced in fe0972ee5e. 2022-02-26 16:07:07 -08:00
Amit Kapila a89850a57e Fix typo in logicalfuncs.c.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACX1mVtw8LWEnZgnpPdk2bPFR1xX2ZN+8GfXCffyip_9=Q@mail.gmail.com
2022-02-26 10:38:37 +05:30
Andres Freund fe0972ee5e Add further debug info to help debug 019_replslot_limit.pl failures.
See also afdeff1052. Failures after that commit provided a few more hints,
but not yet enough to understand what's going on.

In 019_replslot_limit.pl shut down nodes with fast instead of immediate mode
if we observe the failure mode. That should tell us whether the failures we're
observing are just a timing issue under high load. PGCTLTIMEOUT should prevent
buildfarm animals from hanging endlessly.

Also adds a bit more logging to replication slot drop and ShutdownPostgres().

Discussion: https://postgr.es/m/20220225192941.hqnvefgdzaro6gzg@alap3.anarazel.de
2022-02-25 17:04:39 -08:00
Andres Freund afdeff1052 Add temporary debug info to help debug 019_replslot_limit.pl failures.
I have not been able to reproduce the occasional failures of
019_replslot_limit.pl we are seeing in the buildfarm and not for lack of
trying. The additional logging and increased log level will hopefully help.

Will be reverted once the cause is identified.

Discussion: https://postgr.es/m/20220218231415.c4plkp4i3reqcwip@alap3.anarazel.de
2022-02-22 18:02:34 -08:00
Amit Kapila 52e4f0cd47 Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.

The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.

If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.

The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.

This means all the other filters become redundant if (a) one of the
publications have no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.

If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.

Psql commands \dRp+ and \d <table-name> will display any row filters.

Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 08:11:50 +05:30
Amit Kapila c476f380e2 Fix a comment in worker.c.
The comment incorrectly states that worker gets killed during
ALTER SUBSCRIPTION ... DISABLE. Remove that part of the comment.

Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoCbEN==oH7BhP3U6WPHg3zgH6sDOeKhJjy4W2dx-qoVCw@mail.gmail.com
2022-02-18 07:46:51 +05:30
Michael Paquier d61a361d1a Remove all traces of tuplestore_donestoring() in the C code
This routine is a no-op since dd04e95 from 2003, with a macro kept
around for compatibility purposes.  This has led to the same code
patterns being copy-pasted around for no effect, sometimes in confusing
ways like in pg_logical_slot_get_changes_guts() from logical.c where the
code was actually incorrect.

This issue has been discussed on two different threads recently, so
rather than living with this legacy, remove any uses of this routine in
the C code to simplify things.  The compatibility macro is kept to avoid
breaking any out-of-core modules that depend on it.

Reported-by: Tatsuhito Kasahara, Justin Pryzby
Author: Tatsuhito Kasahara
Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com
Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
2022-02-17 09:52:02 +09:00
Heikki Linnakangas 70e81861fa Split xlog.c into xlog.c and xlogrecovery.c.
This moves the functions related to performing WAL recovery into the new
xlogrecovery.c source file, leaving xlog.c responsible for maintaining
the WAL buffers, coordinating the startup and switch from recovery to
normal operations, and other miscellaneous stuff that have always been in
xlog.c.

Reviewed-by: Andres Freund, Kyotaro Horiguchi, Robert Haas
Discussion: https://www.postgresql.org/message-id/a31f27b4-a31d-f976-6217-2b03be646ffa%40iki.fi
2022-02-16 09:30:38 +02:00
Andres Freund 2f6501fa3c Move replication slot release to before_shmem_exit().
Previously, replication slots were released in ProcKill() on error, resulting
in reporting replication slot drop of ephemeral slots after the stats
subsystem was already shut down.

To fix this problem, move replication slot release to a before_shmem_exit()
hook that is called before the stats collector shuts down. There wasn't really
a good reason for the slot handling to be in ProcKill() anyway.

Patch by Masahiko Sawada, with very minor polishing by me.

I, Andres, wrote a test for dropping slots during process exit, but there may
be some OS dependent issues around the number of times FATAL error messages
are displayed due to a still debated libpq issue. So that test will be
committed separately / later.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoDAeEpAbZEyYJsPZJUmSPaRicVSBObaL7sPaofnKz+9zg@mail.gmail.com
2022-02-14 17:08:17 -08:00
Peter Eisentraut cfc7191dfe Move scanint8() to numutils.c
Move scanint8() to numutils.c and rename to pg_strtoint64().  We
already have a "16" and "32" version of that, and the code inside the
functions was aligned, so this move makes all three versions
consistent.  The API is also changed to no longer provide the errorOK
case.  Users that need the error checking can use strtoi64().

Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
2022-02-14 21:57:26 +01:00
Tom Lane 302612a6c7 Silence minor compiler warnings.
Depending on compiler version and optimization level, we might
get a complaint that lazy_scan_heap's "freespace" is used
uninitialized.

Compilers not aware that ereport(ERROR) doesn't return complained
about bbsink_lz4_new().

Assigning "-1" to a uint64 value has unportable results; fortunately,
the value of xlogreadsegno is unimportant when xlogreadfd is -1.
(It looks to me like there is no need for xlogreadsegno to be static
in the first place, but I didn't venture to change that.)
2022-02-13 13:06:55 -05:00
Robert Haas dab298471f Add suport for server-side LZ4 base backup compression.
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.

Jeevan Ladhe, reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/CANm22Cg9cArXEaYgHVZhCnzPLfqXCZLAzjwTq7Fc0quXRPfbxA@mail.gmail.com
2022-02-11 08:29:38 -05:00
Tomas Vondra 0da92dc530 Logical decoding of sequences
This extends the logical decoding to also decode sequence increments.
We differentiate between sequences created in the current (in-progress)
transaction, and sequences created earlier. This mixed behavior is
necessary because while sequences are not transactional (increments are
not subject to ROLLBACK), relfilenode changes are. So we do this:

* Changes for sequences created in the same top-level transaction are
  treated as transactional, i.e. just like any other change from that
  transaction, and discarded in case of a rollback.

* Changes for sequences created earlier are applied immediately, as if
  performed outside any transaction. This applies also after ALTER
  SEQUENCE, which may create a new relfilenode.

Moreover, if we ever get support for DDL replication, the sequence
won't exist until the transaction gets applied.

Sequences created in the current transaction are tracked in a simple
hash table, identified by a relfilenode. That means a sequence may
already exist, but if a transaction does ALTER SEQUENCE then the
increments for the new relfilenode will be treated as transactional.

For each relfilenode we track the XID of (sub)transaction that created
it, which is needed for cleanup at transaction end. We don't need to
check the XID to decide if an increment is transactional - if we find a
match in the hash table, it has to be the same transaction.

This requires two minor changes to WAL-logging. Firstly, we need to
ensure the sequence record has a valid XID - until now the the increment
might have XID 0 if it was the first change in a subxact. But the
sequence might have been created in the same top-level transaction. So
we ensure the XID is assigned when WAL-logging increments.

The other change is addition of "created" flag, marking increments for
newly created relfilenodes. This makes it easier to maintain the hash
table of sequences that need transactional handling.
Note: This is needed because of subxacts. A XID 0 might still have the
sequence created in a different subxact of the same top-level xact.

This does not include any changes to test_decoding and/or the built-in
replication - those will be committed in separate patches.

A patch adding decoding of sequences was originally submitted by Cary
Huang. This commit reworks various important aspects (e.g. the WAL
logging and transactional/non-transactional handling). However, the
original patch and reviews were very useful.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Hannu Krosing, Andres Freund
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-02-10 18:43:51 +01:00
Robert Haas 0d4513b613 Remove server support for the previous base backup protocol.
Commit cc333f3233 added a new COPY
sub-protocol for taking base backups, but retained support for the
previous protocol. For the same reasons articulated in the message
for commit 9cd28c2e5f, remove support
for the previous protocol from the server.

Discussion: http://postgr.es/m/CA+TgmoazKcKUWtqVa0xZqSzbKgTH+X-aw4V7GyLD68EpDLMh8A@mail.gmail.com
2022-02-10 12:12:43 -05:00
Robert Haas 9cd28c2e5f Remove server support for old BASE_BACKUP command syntax.
Commit 0ba281cb4b added a new syntax
for the BASE_BACKUP command, with extensible options, but maintained
support for the legacy syntax. This isn't important for PostgreSQL,
where pg_basebackup works with older server versions but not newer
ones, but it could in theory matter for out-of-core users of the
replication protocol.

Discussion on pgsql-hackers, however, suggests that no one is aware
of any out-of-core use of the BASE_BACKUP command, and the consensus
is in favor of removing support for the old syntax to simplify the
code, so do that.

Discussion: http://postgr.es/m/CA+TgmoazKcKUWtqVa0xZqSzbKgTH+X-aw4V7GyLD68EpDLMh8A@mail.gmail.com
2022-02-10 10:48:33 -05:00
Amit Kapila 7f481b8d38 Improve invalidation handling in pgoutput.c.
Fix the following issues in pgoutput.c:

* rel_sync_cache_relation_cb does the wrong thing when called for a cache
flush (i.e., relid == 0). Instead of invalidating all RelationSyncCache
entries as it should, it does nothing.

* When rel_sync_cache_relation_cb does invalidate an entry, it immediately
zaps the entry->map structure, even though that might still be in use. We
instead just mark the entry as invalid and rebuild it at a later safe
point.

* Similarly, rel_sync_cache_publication_cb is way too eager to reset the
pubactions flags, which would likely lead to failing to transmit changes
that we should transmit. In this case also, we just mark the entry as
invalid and rebuild it at a later safe point.

Author: Tom Lane
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/885288.1641420714@sss.pgh.pa.us
2022-02-04 07:30:40 +05:30
Robert Haas 8e2b6d45a0 Fix server crash bug in 'server' backup target.
When this code executed as superuser it appeared to work because no
system catalog lookups happened, but otherwise it crashes because there
is no transaction environment. Fix that.

Report and code change by me. Test case by Dagfinn Ilmari Mannsåker.

Discussion: http://postgr.es/m/CA+TgmobiKLXne-2AVzYyWRiO8=rChBQ=7ywoxp=2SmcFw=oDDw@mail.gmail.com
2022-02-02 13:50:33 -05:00
Alvaro Herrera b3d7d6e462
Remove xloginsert.h from xlog.h
xlog.h is directly and indirectly #included in a lot of places.  With
this change, xloginsert.h is no longer unnecessarily included in the
large number of them that don't need it.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACVe-W+WM5P44N7eG9C2_FmaeM8Dq5aCnD3fHt0Ba=WR6w@mail.gmail.com
2022-01-30 12:25:24 -03:00
Robert Haas 7f6772317b Adjust server-side backup to depend on pg_write_server_files.
I had made it depend on superuser, but that seems clearly inferior.
Also document the permissions requirement in the straming replication
protocol section of the documentation, rather than only in the
section having to do with pg_basebackup.

Idea and patch from Dagfinn Ilmari Mannsåker.

Discussion: http://postgr.es/m/87bkzw160u.fsf@wibble.ilmari.org
2022-01-28 12:31:40 -05:00
Robert Haas 71cbbbbe80 pg_basebackup: Add a dummy return to bbsink_gzip_new().
Apparently, this is needed to avoid warnings on MVCC.

David Rowley

Discussion: http://postgr.es/m/CAApHDvosHkgyo_PZs7CSB4Kgs2ey4FdmFpcK0N_QOci9DJ=wnw@mail.gmail.com
2022-01-27 14:20:18 -05:00
Michael Paquier 410aa248e5 Fix various typos, grammar and code style in comments and docs
This fixes a set of issues that have accumulated over the past months
(or years) in various code areas.  Most fixes are related to some recent
additions, as of the development of v15.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220124030001.GQ23027@telsasoft.com
2022-01-25 09:40:04 +09:00
Tom Lane 6aa5186146 Fix limitations on what SQL commands can be issued to a walsender.
In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands.  Poor design of the replication-command
parser caused it to fail in various cases, notably:

* semicolons embedded in a command, or multiple SQL commands
sent in a single message;

* dollar-quoted literals containing odd numbers of single
or double quote marks;

* commands starting with a comment.

The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command.  Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.

We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command.  That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token.  If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing.  Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.

(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)

In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type.  (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)

I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it.  The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.

Per bug #17379 from Greg Rychlewski.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
2022-01-24 15:33:38 -05:00
Robert Haas 0ad8032910 Server-side gzip compression.
pg_basebackup's --compression option now lets you write either
"client-gzip" or "server-gzip" instead of just "gzip" to specify
where the compression should be performed. If you write simply
"gzip" it's taken to mean "client-gzip" unless you also use
--target, in which case it is interpreted to mean "server-gzip",
because that's the only thing that makes any sense in that case.

To make this work, the BASE_BACKUP command now takes new
COMPRESSION and COMPRESSION_LEVEL options.

At present, pg_basebackup cannot decompress .gz files, so
server-side compression will cause a failure if (1) -Ft is not
used or (2) -R is used or (3) -D- is used without --no-manifest.

Along the way, I removed the information message added by commit
5c649fe153 which occurred if you
specified no compression level and told you that the default level
had been used instead. That seemed like more output than most
people would want.

Also along the way, this adds a check to the server for
unrecognized base backup options. This repairs a bug introduced
by commit 0ba281cb4b.

This commit also adds some new test cases for pg_verifybackup.
They take a server-side backup with and without compression, and
then extract the backup if we have the OS facilities available
to do so, and then run pg_verifybackup on the extracted
directory. That is a good test of the functionality added by
this commit and also improves test coverage for the backup target
patch (commit 3500ccc39b) and for
pg_verifybackup itself.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+Tgmoa-ST7fMLsVJduOB7Eub=2WjfpHS+QxHVEpUoinf4bOSg@mail.gmail.com
2022-01-24 15:13:18 -05:00
Tom Lane 3c06ec6d14 Remember to reset yy_start state when firing up repl_scanner.l.
Without this, we get odd behavior when the previous cycle of
lexing exited in a non-default exclusive state.  Every other
copy of this code is aware that it has to do BEGIN(INITIAL),
but repl_scanner.l did not get that memo.

The real-world impact of this is probably limited, since most
replication clients would abandon their connection after getting
a syntax error.  Still, it's a bug.

This mistake is old, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us
2022-01-24 12:09:46 -05:00
Tom Lane 353708e1fb Clean up recent Coverity complaints.
Commit 5c649fe15 introduced a memory leak into pg_basebackup's
parse_compress_options.  (I simplified nearby code while at it.)

Commit 9a974cbcb introduced a memory leak into pg_dump's
binary_upgrade_set_pg_class_oids.

Coverity also complained about a call of SnapBuildProcessChange that
ignored the result, unlike every other call of that function.  This
is evidently intentional, so add a (void) cast to indicate that.
(It's also old, dating to b89e15105; I suppose the reason it showed
up now is 7a5f6b474's recent rearrangement of nearby code.)
2022-01-23 12:51:38 -05:00
Robert Haas 3500ccc39b Support base backup targets.
pg_basebackup now has a --target=TARGET[:DETAIL] option. If specfied,
it is sent to the server as the value of the TARGET option to the
BASE_BACKUP command. If DETAIL is included, it is sent as the value of
the new TARGET_DETAIL option to the BASE_BACKUP command.  If the
target is anything other than 'client', pg_basebackup assumes that it
will now be the server's job to write the backup in a location somehow
defined by the target, and that it therefore needs to write nothing
locally. However, the server will still send messages to the client
for progress reporting purposes.

On the server side, we now support two additional types of backup
targets.  There is a 'blackhole' target, which just throws away the
backup data without doing anything at all with it. Naturally, this
should only be used for testing and debugging purposes, since you will
not actually have a backup when it finishes running. More usefully,
there is also a 'server' target, so you can now use something like
'pg_basebackup -Xnone -t server:/SOME/PATH' to write a backup to some
location on the server. We can extend this to more types of targets
in the future, and might even want to create an extensibility
mechanism for adding new target types.

Since WAL fetching is handled with separate client-side logic, it's
not part of this mechanism; thus, backups with non-default targets
must use -Xnone or -Xfetch.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-20 10:46:33 -05:00
Jeff Davis 7a5f6b4748 Make logical decoding a part of the rmgr.
Add a new rmgr method, rm_decode, and use that rather than a switch
statement.

In preparation for rmgr extensibility.

Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
Discussion: https://postgr.es/m/20220118095332.6xtlcjoyxobv6cbk@jrouhaud
2022-01-19 14:58:49 -08:00
Robert Haas 0f47e833bf Fix alignment problem with bbsink_copystream buffer.
bbsink_copystream wants to store a type byte just before the buffer,
but basebackup.c wants the buffer to be aligned so that it can call
PageIsNew() and PageGetLSN() on it. Therefore, instead of inserting
1 extra byte before the buffer, insert MAXIMUM_ALIGNOF extra bytes
and only use the last one.

On most machines this doesn't cause any problem (except perhaps for
performance) but some buildfarm machines with -fsanitize=alignment
dump core.

Discussion: http://postgr.es/m/CA+TgmoYx5=1A2K9JYV-9zdhyokU4KKTyNQ9q7CiXrX=YBBMWVw@mail.gmail.com
2022-01-19 08:12:08 -05:00
Robert Haas cc333f3233 Modify pg_basebackup to use a new COPY subprotocol for base backups.
In the new approach, all files across all tablespaces are sent in a
single COPY OUT operation. The CopyData messages are no longer raw
archive content; rather, each message is prefixed with a type byte
that describes its purpose, e.g. 'n' signifies the start of a new
archive and 'd' signifies archive or manifest data. This protocol
is significantly more extensible than the old approach, since we can
later create more message types, though not without concern for
backward compatibility.

The new protocol sends a few things to the client that the old one
did not. First, it sends the name of each archive explicitly, instead
of letting the client compute it. This is intended to make it easier
to write future patches that might send archives in a format other
that tar (e.g. cpio, pax, tar.gz). Second, it sends explicit progress
messages rather than allowing the client to assume that progress is
defined by the number of bytes received. This will help with future
features where the server compresses the data, or sends it someplace
directly rather than transmitting it to the client.

The old protocol is still supported for compatibility with previous
releases. The new protocol is selected by means of a new
TARGET option to the BASE_BACKUP command. Currently, the
only supported target is 'client'. Support for additional
targets will be added in a later commit.

Patch by me. The patch set of which this is a part has had review
and/or testing from Jeevan Ladhe, Tushar Ahuja, Suraj Kharage,
Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-18 13:47:49 -05:00
Peter Eisentraut 941460fcf7 Add Boolean node
Before, SQL-level boolean constants were represented by a string with
a cast, and internal Boolean values in DDL commands were usually
represented by Integer nodes.  This takes the place of both of these
uses, making the intent clearer and having some amount of type safety.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
2022-01-17 10:38:23 +01:00
Michael Paquier b69aba7457 Improve error handling of cryptohash computations
The existing cryptohash facility was causing problems in some code paths
related to MD5 (frontend and backend) that relied on the fact that the
only type of error that could happen would be an OOM, as the MD5
implementation used in PostgreSQL ~13 (the in-core implementation is
used when compiling with or without OpenSSL in those older versions),
could fail only under this circumstance.

The new cryptohash facilities can fail for reasons other than OOMs, like
attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
would cause incorrect reports to show up.

This commit extends the cryptohash APIs so as callers of those routines
can fetch more context when an error happens, by using a new routine
called pg_cryptohash_error().  The error states are stored within each
implementation's internal context data, so as it is possible to extend
the logic depending on what's suited for an implementation.  The default
implementation requires few error states, but OpenSSL could report
various issues depending on its internal state so more is needed in
cryptohash_openssl.c, and the code is shaped so as we are always able to
grab the necessary information.

The core code is changed to adapt to the new error routine, painting
more "const" across the call stack where the static errors are stored,
particularly in authentication code paths on variables that provide
log details.  This way, any future changes would warn if attempting to
free these strings.  The MD5 authentication code was also a bit blurry
about the handling of "logdetail" (LOG sent to the postmaster), so
improve the comments related that, while on it.

The origin of the problem is 87ae969, that introduced the centralized
cryptohash facility.  Extra changes are done for pgcrypto in v14 for the
non-OpenSSL code path to cope with the improvements done by this
commit.

Reported-by: Michael Mühlbeyer
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
Backpatch-through: 14
2022-01-11 09:55:16 +09:00
Jeff Davis 96a6f11c06 More cleanup of a2ab9c06ea.
Require SELECT privileges when performing UPDATE or DELETE, to be
consistent with the way a normal UPDATE or DELETE command works.

Simplify subscription test it so that it runs faster. Also, wait for
initial table sync to complete to avoid intermittent failures.

Minor doc fixup.

Discussion: https://postgr.es/m/CAA4eK1L3-qAtLO4sNGaNhzcyRi_Ufmh2YPPnUjkROBK0tN%3Dx%3Dg%40mail.gmail.com
Discussion: https://postgr.es/m/1514479.1641664638%40sss.pgh.pa.us
Discussion: https://postgr.es/m/Ydkfj5IsZg7mQR0g@paquier.xyz
2022-01-08 20:07:16 -08:00
Jeff Davis a2ab9c06ea Respect permissions within logical replication.
Prevent logical replication workers from performing insert, update,
delete, truncate, or copy commands on tables unless the subscription
owner has permission to do so.

Prevent subscription owners from circumventing row-level security by
forbidding replication into tables with row-level security policies
which the subscription owner is subject to, without regard to whether
the policy would ordinarily allow the INSERT, UPDATE, DELETE or
TRUNCATE which is being replicated.  This seems sufficient for now, as
superusers, roles with bypassrls, and target table owners should still
be able to replicate despite RLS policies.  We can revisit the
question of applying row-level security policies on a per-row basis if
this restriction proves too severe in practice.

Author: Mark Dilger
Reviewed-by: Jeff Davis, Andrew Dunstan, Ronan Dunklau
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53%40enterprisedb.com
2022-01-07 17:40:56 -08:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00