Commit Graph

709 Commits

Author SHA1 Message Date
Amit Kapila edca342434 Allow the use of a hash index on the subscriber during replication.
Commit 89e46da5e5 allowed using BTREE indexes that are neither
PRIMARY KEY nor REPLICA IDENTITY on the subscriber during apply of
update/delete. This patch extends that functionality to also allow HASH
indexes.

We explored supporting other index access methods as well but they don't
have a fixed strategy for equality operation which is required by the
current infrastructure in logical replication to scan the indexes.

Author: Kuroda Hayato
Reviewed-by: Peter Smith, Onder Kalaci, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB58669D7414E59664E17A5827F522A@TYAPR01MB5866.jpnprd01.prod.outlook.com
2023-07-14 08:21:54 +05:30
Peter Eisentraut e1c83e7b96 Fix untranslatable log message assembly
We can't inject the name of the logical replication worker into a log
message like that.  But for these messages we don't really need the
precision of knowing what kind of worker it was, so just write
"logical replication worker" and keep the message in one piece.

Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPt1xwATviPGjjtJy5L631SGf3qjV9XUCmxLu16cHamfgg%40mail.gmail.com
2023-07-13 13:21:43 +02:00
Masahiko Sawada fd48a86c62 Doc: clarify the conditions of usable indexes for REPLICA IDENTITY FULL tables.
Commit 89e46da5e allowed REPLICA IDENTITY FULL tables to use an index
on the subscriber during apply of update/delete. This commit clarifies
in the documentation that the leftmost field of candidate indexes must
be a column (not an expression) that references the published relation
column.

The source code comments are also updated accordingly.

Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com
Backpatch-through: 16
2023-07-13 15:03:17 +09:00
Daniel Gustafsson 48efb2302b Fix assertion failure in snapshot building
Clear any potential stale next_phase_at value from the snapshot
builder which otherwise may trip an assertion check ensuring
that there is no next_phase_at value.

This can be reproduced by running 80 concurrent sessions like
the below where $c is a loop counter (assumes there has been
1..$c databases created) :

  echo "
    CREATE TABLE replication_example(id SERIAL PRIMARY KEY,
                                     somedata int,
                                     text varchar(120));
    SELECT 'init' FROM
      pg_create_logical_replication_slot('regression_slot_$c',
                                         'test_decoding');
    SELECT data FROM
      pg_logical_slot_get_changes('regression_slot_$c', NULL,
                                  NULL, 'include-xids', '0',
                                  'skip-empty-xacts', '1');
  " | psql -d regress_$c >>psql.log &

Backpatch down to v16.

Bug: #17695
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: bowenshi <zxwsbg@qq.com>
Discussion: https://postgr.es/m/17695-6be9277c9295985f@postgresql.org
Backpatch-through: v16
2023-07-04 17:36:13 +02:00
Nathan Bossart 957845789b Increase size of bgw_library_name.
This commit increases the size of the bgw_library_name member of
the BackgroundWorker struct from BGW_MAXLEN (96) bytes to MAXPGPATH
(default of 1024) bytes so that it can store longer file names
(e.g., absolute paths).

Author: Yurii Rashkovskii
Reviewed-by: Daniel Gustafsson, Aleksander Alekseev
Discussion: https://postgr.es/m/CA%2BRLCQyjFV5Y8tG5QgUb6gjteL4S3p%2B1gcyqWTqigyM93WZ9Pg%40mail.gmail.com
2023-07-03 15:02:16 -07:00
Heikki Linnakangas e251e780bf Remove redundant check for fast_forward.
We already checked for it earlier in the function.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/1ba2899e-77f8-7866-79e5-f3b7d1251a3e@iki.fi
2023-06-30 18:31:10 +03:00
Heikki Linnakangas a0dd4c95b9 Improve comment on why we need ctid->(cmin,cmax) mapping.
Combocids are only part of the problem. Explain the problem in more detail.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/1ba2899e-77f8-7866-79e5-f3b7d1251a3e@iki.fi
2023-06-30 18:30:32 +03:00
Peter Eisentraut 046c8c5c8f Reword error messages for consistency 2023-06-28 19:30:26 +02:00
Peter Eisentraut 3ad5f07c0f Error message refactoring
Take some untranslatable things out of the message and replace by
format placeholders, to reduce translatable strings and reduce
translation mistakes.
2023-06-23 16:36:17 +02:00
Tom Lane b334612b8a Pre-beta2 mechanical code beautification.
Run pgindent and pgperltidy.  It seems we're still some ways
away from all committers doing this automatically.  Now that
we have a buildfarm animal that will whine about poorly-indented
code, we'll try to keep the tree more tidy.

Discussion: https://postgr.es/m/3156045.1687208823@sss.pgh.pa.us
2023-06-20 09:50:43 -04:00
Amit Kapila b5c517379a Fix possible crash in tablesync worker.
Commit c3afe8cf5a added a new password_required option but forgot that you
need database access to check whether an arbitrary role ID is a superuser.

Commit e7e7da2f8d fixed a similar bug in apply worker, and this patch
fixes a similar bug in tablesync worker.

Author: Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB571607F5A9D723755268D36294759@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-06-15 08:37:48 +05:30
Masahiko Sawada a83edeaf68 Honor run_as_owner option in tablesync worker.
Commit 482675987 introduced "run_as_owner" subscription option so that
subscription runs with either the permissions of the subscription
owner or the permission of the table owner. However, tablesync workers
did not use this option for the initial data copy.

With this change, tablesync workers run with appropriate permissions
based on "run_as_owner" option.

Ajin Cherian, with changes and regression tests added by me.

Reported-By: Amit Kapila
Author: Ajin Cherian, Masahiko Sawada
Reviewed-by: Ajin Cherian, Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1L=qzRHPEn+qeMoKQGFBzqGoLBzt_ov0A89iFFiut+ppA@mail.gmail.com
2023-06-09 10:43:03 +09:00
Amit Kapila d64e6468f4 Reload configuration more frequently in apply worker.
The apply worker was not reloading the configuration while processing
messages if there is a continuous flow of messages from upstream. It was
also not reloading the configuration if there is a change in the
configuration after it has waited for the message and before receiving the
new replication message. This can lead to failure in tests because we
expect that after reload, the behavior of apply worker to respect the
changed GUCs.

We found this while analyzing a rare buildfarm failure.

Author: Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716AF9079CC0755CD015322947E9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-06-07 09:19:17 +05:30
Tom Lane 0245f8db36 Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.

This set of diffs is a bit larger than typical.  We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop).  We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up.  Going
forward, that should make for fewer random-seeming changes to existing
code.

Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
2023-05-19 17:24:48 -04:00
Tom Lane 70b42f2790 Fix misbehavior of EvalPlanQual checks with multiple result relations.
The idea of EvalPlanQual is that we replace the query's scan of the
result relation with a single injected tuple, and see if we get a
tuple out, thereby implying that the injected tuple still passes the
query quals.  (In join cases, other relations in the query are still
scanned normally.)  This logic was not updated when commit 86dc90056
made it possible for a single DML query plan to have multiple result
relations, when the query target relation has inheritance or partition
children.  We replaced the output for the current result relation
successfully, but other result relations were still scanned normally;
thus, if any other result relation contained a tuple satisfying the
quals, we'd think the EPQ check passed, even if it did not pass for
the injected tuple itself.  This would lead to update or delete
actions getting performed when they should have been skipped due to
a conflicting concurrent update in READ COMMITTED isolation mode.

Fix by blocking all sibling result relations from emitting tuples
during an EvalPlanQual recheck.  In the back branches, the fix is
complicated a bit by the need to not change the size of struct
EPQState (else we'd have ABI-breaking changes in offsets in
struct ModifyTableState).  Like the back-patches of 3f7836ff6
and 4b3e37993, add a separately palloc'd struct to avoid that.
The logic is the same as in HEAD otherwise.

This is only a live bug back to v14 where 86dc90056 came in.
However, I chose to back-patch the test cases further, on the
grounds that this whole area is none too well tested.  I skipped
doing so in v11 though because none of the test applied cleanly,
and it didn't quite seem worth extra work for a branch with only
six months to live.

Per report from Ante Krešić (via Aleksander Alekseev)

Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com
2023-05-19 14:26:40 -04:00
Amit Kapila 3d144c6c86 Fix invalid memory access during the shutdown of the parallel apply worker.
The callback function pa_shutdown() accesses MyLogicalRepWorker which may
not be initialized if there is an error during the initialization of the
parallel apply worker. The other problem is that by the time it is invoked
even after the initialization of the worker, the MyLogicalRepWorker will
be reset by another callback logicalrep_worker_onexit. So, it won't have
the required information.

To fix this, register the shutdown callback after we are attached to the
worker slot.

After this fix, we observed another issue which is that sometimes the
leader apply worker tries to receive the message from the error queue that
might already be detached by the parallel apply worker leading to an
error. To prevent such an error, we ensure that the leader apply worker
detaches from the parallel apply worker's error queue before stopping it.

Reported-by: Sawada Masahiko
Author: Hou Zhijie
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDo+yUwNq6nTrvE2h9bB2vZfcag=jxWc7QxuWCmkDAqcA@mail.gmail.com
2023-05-09 09:28:06 +05:30
Amit Kapila de63f8dade Fix assertion failure in apply worker.
During exit, the logical replication apply worker tries to release session
level locks, if any. However, if the apply worker exits due to an error
before its connection is initialized, trying to release locks can lead to
assertion failure. The locks will be acquired once the worker is
initialized, so we don't need to release them till the worker
initialization is complete.

Reported-by: Alexander Lakhin
Author: Hou Zhijie based on inputs from Sawada Masahiko and Amit Kapila
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/2185d65f-5aae-3efa-c48f-fb42b173ef5c@gmail.com
2023-05-03 10:17:49 +05:30
Michael Paquier 8961cb9a03 Fix typos in comments
The changes done in this commit impact comments with no direct
user-visible changes, with fixes for incorrect function, variable or
structure names.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
2023-05-02 12:23:08 +09:00
Masahiko Sawada 781ac42d43 Use elog to report unexpected action in handle_streamed_transaction().
An oversight in commit 216a784829.

Author: Masahiko Sawada
Reviewed-by: Kyotaro Horiguchi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDDbM8_HJt-nMCvcjTK8K9hPzXWqJj7pyaUvR4mm_NrSg@mail.gmail.com
2023-04-24 15:37:14 +09:00
Amit Kapila c1cc4e688b Restart the apply worker if the 'password_required' option is changed.
The apply worker is restarted if any subscription option that affects the
remote connection was changed. In commit c3afe8cf5a, we added the option
'password_required' which can affect the remote connection, so we should
restart the worker if it was changed.

Author: Amit Kapila
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CAA4eK1+z9UDFEynXLsWeMMuUZc1iQkRwj2HNDtxUHTPo-u1F4A@mail.gmail.com
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53@enterprisedb.com
2023-04-20 08:56:18 +05:30
David Rowley 3f58a4e296 Fix various typos and incorrect/outdated name references
Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
2023-04-19 13:50:33 +12:00
Andres Freund 5ec69b71f1 Improve error messages introduced in be87200efd and 0fdab27ad6
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20230411.120301.93333867350615278.horikyota.ntt@gmail.com
Discussion: https://postgr.es/m/20230412174244.6njadz4uoiez3l74@awork3.anarazel.de
2023-04-12 11:00:37 -07:00
Andres Freund 0fdab27ad6 Allow logical decoding on standbys
Unsurprisingly, this requires wal_level = logical to be set on the primary and
standby. The infrastructure added in 26669757b6 ensures that slots are
invalidated if the primary's wal_level is lowered.

Creating a slot on a standby waits for a xl_running_xact record to be
processed. If the primary is idle (and thus not emitting xl_running_xact
records), that can take a while.  To make that faster, this commit also
introduces the pg_log_standby_snapshot() function. By executing it on the
primary, completion of slot creation on the standby can be accelerated.

Note that logical decoding on a standby does not itself enforce that required
catalog rows are not removed. The user has to use physical replication slots +
hot_standby_feedback or other measures to prevent that. If catalog rows
required for a slot are removed, the slot is invalidated.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion, for the addition of the pg_log_standby_snapshot() function.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: FabrÌzio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
2023-04-08 02:20:05 -07:00
Andres Freund be87200efd Support invalidating replication slots due to horizon and wal_level
Needed for logical decoding on a standby. Slots need to be invalidated because
of the horizon if rows required for logical decoding are removed. If the
primary's wal_level is lowered from 'logical', logical slots on the standby
need to be invalidated.

The new invalidation methods will be used in a subsequent commit.

Logical slots that have been invalidated can be identified via the new
pg_replication_slots.conflicting column.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion for the addition of the new pg_replication_slots column.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-07 22:40:27 -07:00
Andres Freund 4397abd0a2 Prevent use of invalidated logical slot in CreateDecodingContext()
Previously we had checks for this in multiple places. Support for logical
decoding on standbys will add other forms of invalidation, making it worth
while to centralize the checks.

This slightly changes the error message for both the walsender and SQL
interface. Particularly the SQL interface error was inaccurate, as the "This
slot has never previously reserved WAL" portion was unreachable.

Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-07 22:19:05 -07:00
Robert Haas 482675987b Add a run_as_owner option to subscriptions.
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.

On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.

Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
2023-04-04 12:03:03 -04:00
Robert Haas 1e10d49b65 Perform logical replication actions as the table owner.
Up until now, logical replication actions have been performed as the
subscription owner, who will generally be a superuser.  Commit
cec57b1a0f documented hazards
associated with that situation, namely, that any user who owns a
table on the subscriber side could assume the privileges of the
subscription owner by attaching a trigger, expression index, or
some other kind of executable code to it. As a remedy, it suggested
not creating configurations where users who are not fully trusted
own tables on the subscriber.

Although that will work, it basically precludes using logical
replication in the way that people typically want to use it,
namely, to replicate a database from one node to another
without necessarily having any restrictions on which database
users can own tables. So, instead, change logical replication to
execute INSERT, UPDATE, DELETE, and TRUNCATE operations as the
table owner when they are replicated.

Since this involves switching the active user frequently within
a session that is authenticated as the subscription user, also
impose SECURITY_RESTRICTED_OPERATION restrictions on logical
replication code. As an exception, if the table owner can SET
ROLE to the subscription owner, these restrictions have no
security value, so don't impose them in that case.

Subscription owners are now required to have the ability to
SET ROLE to every role that owns a table that the subscription
is replicating. If they don't, replication will fail. Superusers,
who normally own subscriptions, satisfy this property by default.
Non-superusers users who own subscriptions will need to be
granted the roles that own relevant tables.

Patch by me, reviewed (but not necessarily in its entirety) by
Jelte Fennema, Jeff Davis, and Noah Misch.

Discussion: http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
2023-04-04 11:25:23 -04:00
Robert Haas e7e7da2f8d Fix possible logical replication crash.
Commit c3afe8cf5a added a new
password_required option but forgot that you need database access
to check whether an arbitrary role ID is a superuser.

Report and patch by Hou Zhijie. I added a comment. Thanks to
Alexander Lakhin for devising a way to reproduce the crash.

Discussion: http://postgr.es/m/OS0PR01MB5716BFD7EC44284C89F40808948F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-04-03 13:54:21 -04:00
Robert Haas c3afe8cf5a Add new predefined role pg_create_subscription.
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink.  A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property.  In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
2023-03-30 11:37:19 -04:00
Amit Kapila ecb696527c Allow logical replication to copy tables in binary format.
This patch allows copying tables in the binary format during table
synchronization when the binary option for a subscription is enabled.
Previously, tables are copied in text format even if the subscription is
created with the binary option enabled. Copying tables in binary format
may reduce the time spent depending on column types.

A binary copy for initial table synchronization is supported only when
both publisher and subscriber are v16 or later.

Author: Melih Mutlu
Reviewed-by: Peter Smith, Shi yu, Euler Taveira, Vignesh C,  Kuroda Hayato, Osumi Takamichi, Bharath Rupireddy, Hou Zhijie
Discussion: https://postgr.es/m/CAGPVpCQvAziCLknEnygY0v1-KBtg%2BOm-9JHJYZOnNPKFJPompw%40mail.gmail.com
2023-03-23 08:45:51 +05:30
Amit Kapila e709596b25 Add macros for ReorderBufferTXN toptxn.
Currently, there are quite a few places in reorderbuffer.c that tries to
access top-transaction for a subtransaction. This makes the code to access
top-transaction consistent and easier to follow.

Author: Peter Smith
Reviewed-by: Vignesh C, Sawada Masahiko
Discussion: https://postgr.es/m/CAHut+PuCznOyTqBQwjRUu-ibG-=KHyCv-0FTcWQtZUdR88umfg@mail.gmail.com
2023-03-17 08:29:41 +05:30
Amit Kapila 89e46da5e5 Allow the use of indexes other than PK and REPLICA IDENTITY on the subscriber.
Using REPLICA IDENTITY FULL on the publisher can lead to a full table scan
per tuple change on the subscription when REPLICA IDENTITY or PK index is
not available. This makes REPLICA IDENTITY FULL impractical to use apart
from some small number of use cases.

This patch allows using indexes other than PRIMARY KEY or REPLICA
IDENTITY on the subscriber during apply of update/delete. The index that
can be used must be a btree index, not a partial index, and it must have
at least one column reference (i.e. cannot consist of only expressions).
We can uplift these restrictions in the future. There is no smart
mechanism to pick the index. If there is more than one index that
satisfies these requirements, we just pick the first one. We discussed
using some of the optimizer's low-level APIs for this but ruled it out
as that can be a maintenance burden in the long run.

This patch improves the performance in the vast majority of cases and the
improvement is proportional to the amount of data in the table. However,
there could be some regression in a small number of cases where the indexes
have a lot of duplicate and dead rows. It was discussed that those are
mostly impractical cases but we can provide a table or subscription level
option to disable this feature if required.

Author: Onder Kalaci, Amit Kapila
Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com
2023-03-15 08:49:04 +05:30
Tom Lane b803b7d132 Fill EState.es_rteperminfos more systematically.
While testing a fix for bug #17823, I discovered that EvalPlanQualStart
failed to copy es_rteperminfos from the parent EState, resulting in
failure if anything in EPQ execution wanted to consult that information.

This led me to conclude that commit a61b1f748 had been too haphazard
about where to fill es_rteperminfos, and that we need to be sure that
that happens exactly where es_range_table gets filled.  So I changed the
signature of ExecInitRangeTable to help ensure that this new requirement
doesn't get missed.  (Indeed, pgoutput.c was also failing to fill it.
Maybe we don't ever need it there, but I wouldn't bet on that.)

No test case yet; one will arrive with the fix for #17823.
But that needs to be back-patched, while this fix is HEAD-only.

Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
2023-03-06 13:10:57 -05:00
Amit Kapila 9effa55236 Deduplicate handling of binary and text modes in logicalrep_read_tuple().
Author: Bharath Rupireddy
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CALj2ACXdbq7kW_+bRrSGMsR6nefCvwbHBJ5J51mr3gFf7QysTA@mail.gmail.com
2023-03-06 09:54:57 +05:30
Tom Lane 462bb7f128 Remove bms_first_member().
This function has been semi-deprecated ever since we invented
bms_next_member().  Its habit of scribbling on the input bitmapset
isn't great, plus for sufficiently large bitmapsets it would take
O(N^2) time to complete a loop.  Now we have the additional problem
that reducing the input to empty while leaving it still accessible
would violate a planned invariant.  So let's just get rid of it,
after updating the few extant callers to use bms_next_member().

Patch by me; thanks to Nathan Bossart and Richard Guo for review.

Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
2023-03-02 11:34:29 -05:00
Tomas Vondra 7fe1aa991b Fix snapshot handling in logicalmsg_decode
Whe decoding a transactional logical message, logicalmsg_decode called
SnapBuildGetOrBuildSnapshot. But we may not have a consistent snapshot
yet at that point. We don't actually need the snapshot in this case
(during replay we'll have the snapshot from the transaction), so in
practice this is harmless. But in assert-enabled build this crashes.

Fixed by requesting the snapshot only in non-transactional case, where
we are guaranteed to have SNAPBUILD_CONSISTENT.

Backpatch to 11. The issue exists since 9.6.

Backpatch-through: 11
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/84d60912-6eab-9b84-5de3-41765a5449e8@enterprisedb.com
2023-02-22 15:24:18 +01:00
Amit Kapila fce003cfde Add a new wait state and use it when sending data in the apply worker.
d9d7fe68d3 made use of an existing wait event when sending data from the
apply worker, but we should have invented a new wait event since this is a
new place to wait.

This patch corrects the mistake by using a new wait event
"LogicalApplySendData".

Author: Hou Zhijie
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CA+TgmobWzbr9H3yN3dLVckviEZKemPwd+XyCFKEgyZQZhgP66Q@mail.gmail.com
2023-02-16 07:46:31 +05:30
Michael Paquier ef7002dbe0 Fix various typos in code and tests
Most of these are recent, and the documentation portions are new as of
v16 so there is no need for a backpatch.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20230208155644.GM1653@telsasoft.com
2023-02-09 14:43:53 +09:00
Amit Kapila 8c58624df4 Fix the logical replication timeout during large DDLs.
The DDLs like Refresh Materialized views that generate lots of temporary
data due to rewrite rules may not be processed by output plugins (for
example pgoutput). So, we won't send keep-alive messages for a long time
while processing such commands and that can lead the subscriber side to
timeout. We have previously fixed a similar case for large transactions in
commit f95d53eded where the output plugin filters all or most of the
changes but missed to handle the DDLs.

We decided not to backpatch this as this adds a new callback in the
existing exposed structure and moreover, users can increase the
wal_sender_timeout and wal_receiver_timeout to avoid this problem.

Author: Wang wei, Hou Zhijie
Reviewed-by: Peter Smith, Ashutosh Bapat, Shi yu, Amit Kapila
Discussion: https://postgr.es/m/OS3PR01MB6275478E5D29E4A563302D3D9E2B9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
2023-02-08 07:58:25 +05:30
Amit Kapila d9d7fe68d3 Use appropriate wait event when sending data in the apply worker.
Currently, we reuse WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE in the
apply worker while sending data to the parallel apply worker via a shared
memory queue. This is not appropriate as one won't be able to distinguish
whether the worker is waiting for sending data or for the state change.

To patch instead uses the wait event WAIT_EVENT_MQ_SEND which has been
already used in blocking mode while sending data via a shared memory
queue.

Author: Hou Zhijie
Reviewed-by: Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57161C680B22E4C591628EE994DA9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-02-07 09:58:19 +05:30
Peter Eisentraut 54a177a948 Remove useless casts to (void *) in hash_search() calls
Some of these appear to be leftovers from when hash_search() took a
char * argument (changed in 5999e78fc4).

Since after this there is some more horizontal space available, do
some light reformatting where suitable.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
2023-02-06 09:41:01 +01:00
Amit Kapila 3e577ff602 Optimize the origin drop functionality.
To interlock against concurrent drops, we use to hold ExclusiveLock on
pg_replication_origin till xact commit. This blocks even concurrent drops
of different origins by tablesync workers. So, instead, lock the specific
origin to interlock against concurrent drops.

This reduces the test time variability in src/test/subscription where
multiple tables are being synced.

Author: Vignesh C
Reviewed-by: Hou Zhijie, Amit Kapila
Discussion: https://postgr.es/m/1412708.1674417574@sss.pgh.pa.us
2023-02-03 08:29:08 +05:30
Amit Kapila 9f2213a7c5 Allow the logical_replication_mode to be used on the subscriber.
Extend the existing developer option 'logical_replication_mode' to help
test the parallel apply of large transactions on the subscriber.

When set to 'buffered', the leader sends changes to parallel apply workers
via a shared memory queue. When set to 'immediate', the leader serializes
all changes to files and notifies the parallel apply workers to read and
apply them at the end of the transaction.

This helps in adding tests to cover the serialization code path in
parallel streaming mode.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Kuroda Hayato, Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-02-02 08:15:18 +05:30
Amit Kapila 1e8b61735c Rename GUC logical_decoding_mode to logical_replication_mode.
Rename the developer option 'logical_decoding_mode' to the more flexible
name 'logical_replication_mode' because doing so will make it easier to
extend this option in the future to help test other areas of logical
replication.

Currently, it is used on the publisher side to allow streaming or
serializing each change in logical decoding. In the upcoming patch, we are
planning to use it on the subscriber. On the subscriber, it will allow
serializing the changes to file and notifies the parallel apply workers to
read and apply them at the end of the transaction.

We discussed exposing this parameter as a subscription option but
it did not seem advisable since it is primarily used for testing/debugging
and there is no other such parameter. We also discussed having separate
GUCs for publisher and subscriber but for current testing/debugging
requirements, one GUC is sufficient.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Kuroda Hayato, Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoAy2c=Mx=FTCs+EwUsf2kQL5MmU3N18X84k0EmCXntK4g@mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-30 08:02:08 +05:30
Tom Lane 3b4ac33254 Avoid type cheats for invalid dsa_handles and dshash_table_handles.
Invent separate macros for "invalid" values of these types, so that
we needn't embed knowledge of their representations into calling code.
These are all zeroes anyway ATM, so this is not fixing any live bug,
but it makes the code cleaner and more future-proof.

I (tgl) also chose to move DSM_HANDLE_INVALID into dsm_impl.h,
since it seems like it should live beside the typedef for dsm_handle.

Hou Zhijie, Nathan Bossart, Kyotaro Horiguchi, Tom Lane

Discussion: https://postgr.es/m/OS0PR01MB5716860B1454C34E5B179B6694C99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-01-25 11:48:38 -05:00
Amit Kapila 6c6d6ba3ee Fix the Drop Database hang.
The drop database command waits for the logical replication sync worker to
accept ProcSignalBarrier and the worker's slot creation waits for the drop
database to finish which leads to a deadlock. This happens because the
tablesync worker holds interrupts while creating a slot.

We prevent cancel/die interrupts while creating a slot in the table sync
worker because it is possible that before the server finishes this
command, a concurrent drop subscription happens which would complete
without removing this slot and that leads to the slot existing until the
end of walsender. However, the slot will eventually get dropped at the
walsender exit time, so there is no danger of the dangling slot.

This patch reallows cancel/die interrupts while creating a slot and
modifies the test to wait for slots to become zero to prevent finding an
ephemeral slot.

The reported hang doesn't happen in PG14 as the drop database starts to
wait for ProcSignalBarrier with PG15 (commits 4eb2176318 and e2f65f4255)
but it is good to backpatch this till PG14 as it is not a good idea to
prevent interrupts during a network call that could block indefinitely.

Reported-by: Lakshmi Narayanan Sreethar
Diagnosed-by: Andres Freund
Author: Hou Zhijie
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 14, where it was introduced in commit 6b67d72b60
Discussion: https://postgr.es/m/CA+kvmZELXQ4ZD3U=XCXuG3KvFgkuPoN1QrEj8c-rMRodrLOnsg@mail.gmail.com
2023-01-24 09:25:36 +05:30
Tom Lane 5a3a95385b Track logrep apply workers' last start times to avoid useless waits.
Enforce wal_retrieve_retry_interval on a per-subscription basis,
rather than globally, and arrange to skip that delay in case of
an intentional worker exit.  This probably makes little difference
in the field, where apply workers wouldn't be restarted often;
but it has a significant impact on the runtime of our logical
replication regression tests (even though those tests use
artificially-small wal_retrieve_retry_interval settings already).

Nathan Bossart, with mostly-cosmetic editorialization by me

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-22 14:08:46 -05:00
Amit Kapila d540a02a72 Display the leader apply worker's PID for parallel apply workers.
Add leader_pid to pg_stat_subscription. leader_pid is the process ID of
the leader apply worker if this process is a parallel apply worker. If
this field is NULL, it indicates that the process is a leader apply
worker or a synchronization worker. The new column makes it easier to
distinguish parallel apply workers from other kinds of workers and helps
to identify the leader for the parallel workers corresponding to a
particular subscription.

Additionally, update the leader_pid column in pg_stat_activity as well to
display the PID of the leader apply worker for parallel apply workers.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Sawada Masahiko, Amit Kapila, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-18 09:03:12 +05:30
Amit Kapila c981d9145d Improve the code to decide and process the apply action.
The code that decides the apply action missed to handle non-transactional
messages and we didn't catch it in our testing as currently such messages
are simply ignored by the apply worker. This was introduced by changes in
commit 216a784829.

While testing this, I noticed that we forgot to reset stream_xid after
processing the stream stop message which could also result in the wrong
apply action after the fix for non-transactional messages.

In passing, change assert to elog for unexpected apply action in some of
the routines so as to catch the problems in the production environment, if
any.

Reported-by: Tomas Vondra
Author: Amit Kapila
Reviewed-by: Tomas Vondra, Sawada Masahiko, Hou Zhijie
Discussion: https://postgr.es/m/984ff689-adde-9977-affe-cd6029e850be@enterprisedb.com
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-17 11:28:22 +05:30
Peter Eisentraut 20428d344a Add BufFileRead variants with short read and EOF detection
Most callers of BufFileRead() want to check whether they read the full
specified length.  Checking this at every call site is very tedious.
This patch provides additional variants BufFileReadExact() and
BufFileReadMaybeEOF() that include the length checks.

I considered changing BufFileRead() itself, but this function is also
used in extensions, and so changing the behavior like this would
create a lot of problems there.  The new names are analogous to the
existing LogicalTapeReadExact().

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f3501945-c591-8cc3-5ef0-b72a2e0eaa9c@enterprisedb.com
2023-01-16 11:01:31 +01:00