postgresql

Commit Graph

Author	SHA1	Message	Date
Amit Kapila	4b82ed6eca	Fix dangling pointer reference in stream_cleanup_files. We can't access the entry after it is removed from dynahash. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Ps-pL++f6CJwPx2+vUqXuew=Xt-9Bi-6kCyxn+Fwi2M7w@mail.gmail.com	2021-03-23 09:43:33 +05:30
Peter Eisentraut	1657b37d7c	Small debug message tweak This makes the wording of the delete case match the update case.	2021-03-10 08:16:38 +01:00
Peter Eisentraut	6f6f284c7e	Simplify printing of LSNs Add a macro LSN_FORMAT_ARGS for use in printf-style printing of LSNs. Convert all applicable code to use it. Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/CAExHW5ub5NaTELZ3hJUCE6amuvqAtsSxc7O+uK7y4t9Rrk23cw@mail.gmail.com	2021-02-23 10:27:02 +01:00
Amit Kapila	ce0fdbfe97	Allow multiple xacts during table sync in logical replication. For the initial table data synchronization in logical replication, we use a single transaction to copy the entire table and then synchronize the position in the stream with the main apply worker. There are multiple downsides of this approach: (a) We have to perform the entire copy operation again if there is any error (network breakdown, error in the database operation, etc.) while we synchronize the WAL position between tablesync worker and apply worker; this will be onerous especially for large copies, (b) Using a single transaction in the synchronization-phase (where we can receive WAL from multiple transactions) will have the risk of exceeding the CID limit, (c) The slot will hold the WAL till the entire sync is complete because we never commit till the end. This patch solves all the above downsides by allowing multiple transactions during the tablesync phase. The initial copy is done in a single transaction and after that, we commit each transaction as we receive. To allow recovery after any error or crash, we use a permanent slot and origin to track the progress. The slot and origin will be removed once we finish the synchronization of the table. We also remove slot and origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or ALTER SUBSCRIPTION .. REFERESH and some of the table syncs are still not finished. The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true cannot be executed inside a transaction block because they can now drop the slots for which we have no provision to rollback. This will also open up the path for logical replication of 2PC transactions on the subscriber side. Previously, we can't do that because of the requirement of maintaining a single transaction in tablesync workers. Bump catalog version due to change of state in the catalog (pg_subscription_rel). Author: Peter Smith, Amit Kapila, and Takamichi Osumi Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com	2021-02-12 07:41:51 +05:30
Amit Kapila	ed43677e20	pgindent worker.c. This is a leftover from commit `0926e96c49`. Changing this separately because this file is being modified for upcoming patch logical replication of 2PC. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Ps+EgG8KzcmAyAgBUi_vuTps6o9ZA8DG6SdnO0-YuOhPQ@mail.gmail.com	2021-01-19 08:10:13 +05:30
Peter Geoghegan	9dc718bdf2	Pass down "logically unchanged index" hint. Add an executor aminsert() hint mechanism that informs index AMs that the incoming index tuple (the tuple that accompanies the hint) is not being inserted by execution of an SQL statement that logically modifies any of the index's key columns. The hint is received by indexes when an UPDATE takes place that does not apply an optimization like heapam's HOT (though only for indexes where all key columns are logically unchanged). Any index tuple that receives the hint on insert is expected to be a duplicate of at least one existing older version that is needed for the same logical row. Related versions will typically be stored on the same index page, at least within index AMs that apply the hint. Recognizing the difference between MVCC version churn duplicates and true logical row duplicates at the index AM level can help with cleanup of garbage index tuples. Cleanup can intelligently target tuples that are likely to be garbage, without wasting too many cycles on less promising tuples/pages (index pages with little or no version churn). This is infrastructure for an upcoming commit that will teach nbtree to perform bottom-up index deletion. No index AM actually applies the hint just yet. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com	2021-01-13 08:11:00 -08:00
Bruce Momjian	ca3b37487b	Update copyright for 2021 Backpatch-through: 9.5	2021-01-02 13:06:25 -05:00
Noah Misch	a1b8aa1e4e	Use HASH_BLOBS for xidhash. This caused BufFile errors on buildfarm member sungazer, and SIGSEGV was possible. Conditions for reaching those symptoms were more frequent on big-endian systems. Discussion: https://postgr.es/m/20201129214441.GA691200@rfd.leadboat.com	2020-12-12 21:38:36 -08:00
Noah Misch	73aae4522b	Correct behavior descriptions in comments, and correct a test name.	2020-12-12 20:12:25 -08:00
Amit Kapila	0926e96c49	Fix replication of in-progress transactions in tablesync worker. Tablesync worker runs under a single transaction but in streaming mode, we were committing the transaction on stream_stop, stream_abort, and stream_commit. We need to avoid committing the transaction in a streaming mode in tablesync worker. In passing move the call to process_syncing_tables in apply_handle_stream_commit after clean up of stream files. This will allow clean up of files to happen before the exit of tablesync worker which would otherwise be handled by one of the proc exit routines. Author: Dilip Kumar Reviewed-by: Amit Kapila and Peter Smith Tested-by: Peter Smith Discussion: https://postgr.es/m/CAHut+Pt4PyKQCwqzQ=EFF=bpKKJD7XKt_S23F6L20ayQNxg77A@mail.gmail.com	2020-11-27 07:43:34 +05:30
Amit Kapila	f3a8f73ec2	Use Enums for logical replication message types at more places. Commit `644f0d7cc9` added logical replication message type enums to use instead of character literals but some char substitutions were overlooked. Author: Peter Smith Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAHut+PsTG=Vrv8hgrvOnAvCNR21jhqMdPk2n0a1uJPoW0p+UfQ@mail.gmail.com	2020-11-26 09:21:14 +05:30
Amit Kapila	644f0d7cc9	Use Enum for top level logical replication message types. Logical replication protocol uses a single byte character to identify a message type in logical replication protocol. The code uses string literals for the same. Use Enum so that 1. All the string literals used can be found at a single place. This makes it easy to add more types without the risk of conflicts. 2. It's easy to locate the code handling a given message type. 3. When used with switch statements, it is easy to identify the missing cases using -Wswitch. Author: Ashutosh Bapat Reviewed-by: Kyotaro Horiguchi, Andres Freund, Peter Smith and Amit Kapila Discussion: https://postgr.es/m/CAExHW5uPzQ7L0oAd_ENyvaiYMOPgkrAoJpE+ZY5-obdcVT6NPg@mail.gmail.com	2020-11-02 08:18:18 +05:30
Tom Lane	ad77039fad	Calculate extraUpdatedCols in query rewriter, not parser. It's unsafe to do this at parse time because addition of generated columns to a table would not invalidate stored rules containing UPDATEs on the table ... but there might now be dependent generated columns that were not there when the rule was made. This also fixes an oversight that rewriteTargetView failed to update extraUpdatedCols when transforming an UPDATE on an updatable view. (Since the new calculation is downstream of that, rewriteTargetView doesn't actually need to do anything; but before, there was a demonstrable bug there.) In v13 and HEAD, this leads to easily-visible bugs because (since commit `c6679e4fc`) we won't recalculate generated columns that aren't listed in extraUpdatedCols. In v12 this bitmap is mostly just used for trigger-firing decisions, so you'd only notice a problem if a trigger cared whether a generated column had been updated. I'd complained about this back in May, but then forgot about it until bug #16671 from Michael Paul Killian revived the issue. Back-patch to v12 where this field was introduced. If existing stored rules contain any extraUpdatedCols values, they'll be ignored because the rewriter will overwrite them, so the bug will be fixed even for existing rules. (But note that if someone were to update to 13.1 or 12.5, store some rules with UPDATEs on tables having generated columns, and then downgrade to a prior minor version, they might observe issues similar to what this patch fixes. That seems unlikely enough to not be worth going to a lot of effort to fix.) Discussion: https://postgr.es/m/10206.1588964727@sss.pgh.pa.us Discussion: https://postgr.es/m/16671-2fa55851859fb166@postgresql.org	2020-10-28 13:47:02 -04:00
Heikki Linnakangas	fb5883da86	Remove PartitionRoutingInfo struct. The extra indirection neeeded to access its members via its enclosing ResultRelInfo seems pointless. Move all the fields from PartitionRoutingInfo to ResultRelInfo. Author: Amit Langote Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com	2020-10-19 14:42:55 +03:00
Alvaro Herrera	b05fe7b442	Review logical replication tablesync code Most importantly, remove optimization in LogicalRepSyncTableStart that skips the normal walrcv_startstreaming/endstreaming dance. The optimization is not critically important for production uses anyway, since it only fires in cases with no activity, and saves an uninteresting amount of work even then. Critically, it obscures bugs by hiding the interesting code path from test cases. Also: in GetSubscriptionRelState, remove pointless relation open; access pg_subscription_rel->srsubstate with GETSTRUCT as is typical rather than SysCacheGetAttr; remove unused 'missing_ok' argument. In wait_for_relation_state_change, use explicit catalog snapshot invalidation rather than obscurely (and expensively) through GetLatestSnapshot. In various places: sprinkle comments more liberally and rewrite a number of them. Other cosmetic code improvements. No backpatch, since no bug is being fixed here. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com> Discussion: https://postgr.es/m/20201010190637.GA5774@alvherre.pgsql	2020-10-15 11:35:51 -03:00
Alvaro Herrera	4e9821b6fa	Restore replication protocol's duplicate command tags I removed the duplicate command tags for START_REPLICATION inadvertently in commit `07082b08cc`, but the replication protocol requires them. The fact that the replication protocol was broken was not noticed because all our test cases use an optimized code path that exits early, failing to verify that the behavior is correct for non-optimized cases. Put them back. Also document this protocol quirk. Add a test case that shows the failure. It might still succeed even without the patch when run on a fast enough server, but it suffices to show the bug in enough cases that it would be noticed in buildfarm. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Henry Hinze <henry.hinze@gmail.com> Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com> Discussion: https://postgr.es/m/16643-eaadeb2a1a58d28c@postgresql.org	2020-10-14 20:12:26 -03:00
Heikki Linnakangas	a04daa97a4	Remove es_result_relation_info from EState. Maintaining 'es_result_relation_info' correctly at all times has become cumbersome, especially with partitioning where each partition gets its own result relation info. Having to set and reset it across arbitrary operations has caused bugs in the past. This changes all the places that used 'es_result_relation_info', to receive the currently active ResultRelInfo via function parameters instead. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com	2020-10-14 11:41:40 +03:00
Heikki Linnakangas	1375422c78	Create ResultRelInfos later in InitPlan, index them by RT index. Instead of allocating all the ResultRelInfos upfront in one big array, allocate them in ExecInitModifyTable(). es_result_relations is now an array of ResultRelInfo pointers, rather than an array of structs, and it is indexed by the RT index. This simplifies things: we get rid of the separate concept of a "result rel index", and don't need to set it in setrefs.c anymore. This also allows follow-up optimizations (not included in this commit yet) to skip initializing ResultRelInfos for target relations that were not needed at runtime, and removal of the es_result_relation_info pointer. The EState arrays of regular result rels and root result rels are merged into one array. Similarly, the resultRelations and rootResultRelations lists in PlannedStmt are merged into one. It's not actually clear to me why they were kept separate in the first place, but now that the es_result_relations array is indexed by RT index, it certainly seems pointless. The PlannedStmt->resultRelations list is now only needed for ExecRelationIsTargetRelation(). One visible effect of this change is that ExecRelationIsTargetRelation() will now return 'true' also for the partition root, if a partitioned table is updated. That seems like a good thing, although the function isn't used in core code, and I don't see any reason for an FDW to call it on a partition root. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com	2020-10-13 12:57:02 +03:00
Amit Kapila	079d0cacf4	Fix the logical replication from HEAD to lower versions. Commit `464824323e` changed the logical replication protocol to allow the streaming of in-progress transactions and used the new version of protocol irrespective of the server version. Use the appropriate version of the protocol based on the server version. Reported-by: Ashutosh Sharma Author: Dilip Kumar Reviewed-by: Ashutosh Sharma and Amit Kapila Discussion: https://postgr.es/m/CAE9k0P=9OpXcNrcU5Gsvd5MZ8GFpiN833vNHzX6Uc=8+h1ft1Q@mail.gmail.com	2020-09-26 10:13:51 +05:30
Tom Lane	2000b6c10a	Don't fetch partition check expression during InitResultRelInfo. Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com	2020-09-16 14:28:18 -04:00
Tom Lane	c8746f999e	Fix over-eager ping'ing in logical replication receiver. Commit `3f60f690f` only partially fixed the broken-status-tracking issue in LogicalRepApplyLoop: we need ping_sent to have the same lifetime as last_recv_timestamp. The effects are much less serious than what that commit fixed, though. AFAICS this would just lead to extra ping requests being sent, once per second until the sender responds. Still, it's a bug, so backpatch to v10 as before. Discussion: https://postgr.es/m/959627.1599248476@sss.pgh.pa.us	2020-09-04 20:33:36 -04:00
Amit Kapila	ac15b499f7	Fix inline marking introduced in commit `464824323e`. Forgot to add inline marking in changes_filename() declaration. In the passing, add inline marking for a similar function subxact_filename(). Reported-By: Nathan Bossart Discussion: https://postgr.es/m/E98FBE8F-B878-480D-A728-A60C6EED3047@amazon.com	2020-09-04 11:25:16 +05:30
Amit Kapila	464824323e	Add support for streaming to built-in logical replication. To add support for streaming of in-progress transactions into the built-in logical replication, we need to do three things: * Extend the logical replication protocol, so identify in-progress transactions, and allow adding additional bits of information (e.g. XID of subtransactions). * Modify the output plugin (pgoutput) to implement the new stream API callbacks, by leveraging the extended replication protocol. * Modify the replication apply worker, to properly handle streamed in-progress transaction by spilling the data to disk and then replaying them on commit. We however must explicitly disable streaming replication during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover we don't have a replication connection open so we don't have where to send the data anyway. Author: Tomas Vondra, Dilip Kumar and Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2020-09-03 07:54:07 +05:30
Noah Misch	11da97024a	Empty search_path in logical replication apply worker and walsender. This is like CVE-2018-1058 commit `582edc369c`. Today, a malicious user of a publisher or subscriber database can invoke arbitrary SQL functions under an identity running replication, often a superuser. This fix may cause "does not exist" or "no schema has been selected to create in" errors in a replication process. After upgrading, consider watching server logs for these errors. Objects accruing schema qualification in the wake of the earlier commit are unlikely to need further correction. Back-patch to v10, which introduced logical replication. Security: CVE-2020-14349	2020-08-10 09:22:54 -07:00
Tom Lane	d5daae47db	Fix construction of updated-columns bitmap in logical replication. Commit `b9c130a1f` failed to apply the publisher-to-subscriber column mapping while checking which columns were updated. Perhaps less significantly, it didn't exclude dropped columns either. This could result in an incorrect updated-columns bitmap and thus wrong decisions about whether to fire column-specific triggers on the subscriber while applying updates. In HEAD (since commit `9de77b545`), it could also result in accesses off the end of the colstatus array, as detected by buildfarm member skink. Fix the logic, and adjust 003_constraints.pl so that the problem is exposed in unpatched code. In HEAD, also add some assertions to check that we don't access off the ends of these newly variable-sized arrays. Back-patch to v10, as `b9c130a1f` was. Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com	2020-07-20 13:40:16 -04:00
Tom Lane	9de77b5453	Allow logical replication to transfer data in binary format. This patch adds a "binary" option to CREATE/ALTER SUBSCRIPTION. When that's set, the publisher will send data using the data type's typsend function if any, rather than typoutput. This is generally faster, if slightly less robust. As committed, we won't try to transfer user-defined array or composite types in binary, for fear that type OIDs won't match at the subscriber. This might be changed later, but it seems like fit material for a follow-on patch. Dave Cramer, reviewed by Daniel Gustafsson, Petr Jelinek, and others; adjusted some by me Discussion: https://postgr.es/m/CADK3HH+R3xMn=8t3Ct+uD+qJ1KD=Hbif5NFMJ+d5DkoCzp6Vgw@mail.gmail.com	2020-07-18 12:44:51 -04:00
Andres Freund	5e7bbb5286	code: replace 'master' with 'primary' where appropriate. Also changed "in the primary" to "on the primary", and added a few "the" before "primary". Author: Andres Freund Reviewed-By: David Steele Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de	2020-07-08 12:57:23 -07:00
Peter Eisentraut	f1ac27bfda	Add logical replication support to replicate into partitioned tables Mainly, this adds support code in logical/worker.c for applying replicated operations whose target is a partitioned table to its relevant partitions. Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com	2020-04-06 15:15:52 +02:00
Peter Eisentraut	d8653f4687	Refactor code to look up local replication tuple This unifies some duplicate code. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://www.postgresql.org/message-id/CA+HiwqFjYE5anArxvkjr37AQMd52L-LZtz9Ld2QrLQ3YfcYhTw@mail.gmail.com	2020-04-01 15:34:41 +02:00
Peter Eisentraut	cef27ae01a	Fix compiler warning A variable was unused in non-assert builds. Simplify the code to avoid the issue. Reported-by: Erik Rijkers <er@xs4all.nl>	2020-03-24 16:02:01 +01:00
Peter Eisentraut	97ee604d9b	Some refactoring of logical/worker.c This moves the main operations of apply_handle_{insert\|update\|delete}, that of inserting, updating, deleting a tuple into/from a given relation, into corresponding apply_handle_{insert\|update\|delete}_internal functions. This allows performing those operations on relations that are not directly the targets of replication, which is something a later patch will use for targeting partitioned tables. Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com	2020-03-24 15:00:54 +01:00
Peter Eisentraut	ad3ae64770	Fill in extraUpdatedCols in logical replication The extraUpdatedCols field of the target RTE records which generated columns are affected by an update. This is used in a variety of places, including per-column triggers and foreign data wrappers. When an update was initiated by a logical replication subscription, this field was not filled in, so such an update would not affect generated columns in a way that is consistent with normal updates. To fix, factor out some code from analyze.c to fill in extraUpdatedCols in the logical replication worker as well. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b05e781a-fa16-6b52-6738-761181204567@2ndquadrant.com	2020-02-17 15:20:57 +01:00
Peter Eisentraut	b9c130a1fd	Have logical replication subscriber fire column triggers The logical replication apply worker did not fire per-column update triggers because the updatedCols bitmap in the RTE was not populated. This fixes that. Reviewed-by: Euler Taveira <euler@timbira.com.br> Discussion: https://www.postgresql.org/message-id/flat/21673e2d-597c-6afe-637e-e8b10425b240%402ndquadrant.com	2020-01-06 08:40:00 +01:00
Bruce Momjian	7559d8ebfa	Update copyrights for 2020 Backpatch-through: update all files in master, backpatch legal files through 9.4	2020-01-01 12:21:45 -05:00
Michael Paquier	e1551f96e6	Refactor attribute mappings used in logical tuple conversion Tuple conversion support in tupconvert.c is able to convert rowtypes between two relations, inner and outer, which are logically equivalent but have a different ordering or even dropped columns (used mainly for inheritance tree and partitions). This makes use of attribute mappings, which are simple arrays made of AttrNumber elements with a length matching the number of attributes of the outer relation. The length of the attribute mapping has been treated as completely independent of the mapping itself until now, making it easy to pass down an incorrect mapping length. This commit refactors the code related to attribute mappings and moves it into an independent facility called attmap.c, extracted from tupconvert.c. This merges the attribute mapping with its length, avoiding to try to guess what is the length of a mapping to use as this is computed once, when the map is built. This will avoid mistakes like what has been fixed in `dc816e58`, which has used an incorrect mapping length by matching it with the number of attributes of an inner relation (a child partition) instead of an outer relation (a partitioned table). Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/20191121042556.GD153437@paquier.xyz	2019-12-18 16:23:02 +09:00
Robert Haas	7dbfea3c45	Partially deduplicate interrupt handling for background processes. Where possible, share signal handler code and main loop interrupt checking. This saves quite a bit of code and should simplify maintenance, too. This commit intends not to change the way anything works, even though that might allow more code to be unified. It does unify a bunch of individual variables into a ShutdownRequestPending flag that has is now used by a bunch of different process types, though. Patch by me, reviewed by Andres Freund and Daniel Gustafsson. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-17 13:14:28 -05:00
Robert Haas	1e53fe0e70	Use PostgresSigHupHandler in more places. There seems to be no reason for every background process to have its own flag indicating that a config-file reload is needed. Instead, let's just use ConfigFilePending for that purpose everywhere. Patch by me, reviewed by Andres Freund and Daniel Gustafsson. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-17 13:03:57 -05:00
Tom Lane	4d9ceb0018	Fix bogus tuple-slot management in logical replication UPDATE handling. slot_modify_cstrings seriously abused the TupleTableSlot API by relying on a slot's underlying data to stay valid across ExecClearTuple. Since this abuse was also quite undocumented, it's little surprise that the case got broken during the v12 slot rewrites. As reported in bug #16129 from Ondřej Jirman, this could lead to crashes or data corruption when a logical replication subscriber processes a row update. Problems would only arise if the subscriber's table contained columns of pass-by-ref types that were not being copied from the publisher. Fix by explicitly copying the datum/isnull arrays from the source slot that the old row was in already. This ends up being about the same thing that happened pre-v12, but hopefully in a less opaque and fragile way. We might've caught the problem sooner if there were any test cases dealing with updates involving non-replicated or dropped columns. Now there are. Back-patch to v10 where this code came in. Even though the failure does not manifest before v12, IMO this code is too fragile to leave as-is. In any case we certainly want the additional test coverage. Patch by me; thanks to Tomas Vondra for initial investigation. Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org	2019-11-22 11:31:19 -05:00
Michael Paquier	3f60f690fa	Fix timeout handling in logical replication worker The timestamp tracking the last moment a message is received in a logical replication worker was initialized in each loop checking if a message was received or not, causing wal_receiver_timeout to be ignored in basically any logical replication deployments. This also broke the ping sent to the server when reaching half of wal_receiver_timeout. This simply moves the initialization of the timestamp out of the apply loop to the beginning of LogicalRepApplyLoop(). Reported-by: Jehan-Guillaume De Rorthais Author: Julien Rouhaud Discussion: https://postgr.es/m/CAOBaU_ZHESFcWva8jLjtZdCLspMj7vqaB2k++rjHLY897ZxbYw@mail.gmail.com Backpatch-through: 10	2019-10-18 14:26:29 +09:00
Michael Paquier	66bde49d96	Fix inconsistencies and typos in the tree, take 10 This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com	2019-08-13 13:53:41 +09:00
Peter Eisentraut	fc22b6623b	Generated columns This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com	2019-03-30 08:15:57 +01:00
Peter Eisentraut	69039fda83	Add walreceiver API to get remote server version Add a separate walreceiver API function walrcv_server_version() to get the version of the remote server, instead of doing it as part of walrcv_identify_system(). This allows the server version to be available even for uses that don't call IDENTIFY_SYSTEM, and it seems cleaner anyway. This is for an upcoming patch, not currently used. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20190115071359.GF1433@paquier.xyz	2019-03-15 10:16:26 +01:00
Andres Freund	c2fe139c20	tableam: Add and use scan APIs. Too allow table accesses to be not directly dependent on heap, several new abstractions are needed. Specifically: 1) Heap scans need to be generalized into table scans. Do this by introducing TableScanDesc, which will be the "base class" for individual AMs. This contains the AM independent fields from HeapScanDesc. The previous heap_{beginscan,rescan,endscan} et al. have been replaced with a table_ version. There's no direct replacement for heap_getnext(), as that returned a HeapTuple, which is undesirable for a other AMs. Instead there's table_scan_getnextslot(). But note that heap_getnext() lives on, it's still used widely to access catalog tables. This is achieved by new scan_begin, scan_end, scan_rescan, scan_getnextslot callbacks. 2) The portion of parallel scans that's shared between backends need to be able to do so without the user doing per-AM work. To achieve that new parallelscan_{estimate, initialize, reinitialize} callbacks are introduced, which operate on a new ParallelTableScanDesc, which again can be subclassed by AMs. As it is likely that several AMs are going to be block oriented, block oriented callbacks that can be shared between such AMs are provided and used by heap. table_block_parallelscan_{estimate, intiialize, reinitialize} as callbacks, and table_block_parallelscan_{nextpage, init} for use in AMs. These operate on a ParallelBlockTableScanDesc. 3) Index scans need to be able to access tables to return a tuple, and there needs to be state across individual accesses to the heap to store state like buffers. That's now handled by introducing a sort-of-scan IndexFetchTable, which again is intended to be subclassed by individual AMs (for heap IndexFetchHeap). The relevant callbacks for an AM are index_fetch_{end, begin, reset} to create the necessary state, and index_fetch_tuple to retrieve an indexed tuple. Note that index_fetch_tuple implementations need to be smarter than just blindly fetching the tuples for AMs that have optimizations similar to heap's HOT - the currently alive tuple in the update chain needs to be fetched if appropriate. Similar to table_scan_getnextslot(), it's undesirable to continue to return HeapTuples. Thus index_fetch_heap (might want to rename that later) now accepts a slot as an argument. Core code doesn't have a lot of call sites performing index scans without going through the systable_* API (in contrast to loads of heap_getnext calls and working directly with HeapTuples). Index scans now store the result of a search in IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the target is not generally a HeapTuple anymore that seems cleaner. To be able to sensible adapt code to use the above, two further callbacks have been introduced: a) slot_callbacks returns a TupleTableSlotOps* suitable for creating slots capable of holding a tuple of the AMs type. table_slot_callbacks() and table_slot_create() are based upon that, but have additional logic to deal with views, foreign tables, etc. While this change could have been done separately, nearly all the call sites that needed to be adapted for the rest of this commit also would have been needed to be adapted for table_slot_callbacks(), making separation not worthwhile. b) tuple_satisfies_snapshot checks whether the tuple in a slot is currently visible according to a snapshot. That's required as a few places now don't have a buffer + HeapTuple around, but a slot (which in heap's case internally has that information). Additionally a few infrastructure changes were needed: I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now internally uses a slot to keep track of tuples. While systable_getnext() still returns HeapTuples, and will so for the foreseeable future, the index API (see 1) above) now only deals with slots. The remainder, and largest part, of this commit is then adjusting all scans in postgres to use the new APIs. Author: Andres Freund, Haribabu Kommi, Alvaro Herrera Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql	2019-03-11 12:46:41 -07:00
Andres Freund	70b9bda65f	Use a virtual rather than a heap slot in two places where that suffices. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-01 17:26:43 -08:00
Andres Freund	ff11e7f4b9	Use slots in trigger infrastructure, except for the actual invocation. In preparation for abstracting table storage, convert trigger.c to track tuples in slots. Which also happens to make code calling triggers simpler. As the calling interface for triggers themselves is not changed in this patch, HeapTuples still are extracted from the slot at that time. But that's handled solely inside trigger.c, not visible to callers. It's quite likely that we'll want to revise the external trigger interface, but that's a separate large project. As part of this work the slots used for old/new/return tuples are moved from EState into ResultRelInfo, as different updated tables might need different slots. The slots are now also now created on-demand, which is good both from an efficiency POV, but also makes the modifying code simpler. Author: Andres Freund, Amit Khandekar and Ashutosh Bapat Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-02-26 20:31:38 -08:00
Tom Lane	f09346a9c6	Refactor planner's header files. Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us	2019-01-29 15:48:51 -05:00
Andres Freund	e7cc78ad43	Remove superfluous tqual.h includes. Most of these had been obsoleted by `568d4138c` / the SnapshotNow removal. This is is preparation for moving most of tqual.[ch] into either snapmgr.h or heapam.h, which in turn is in preparation for pluggable table AMs. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-01-21 12:15:02 -08:00
Andres Freund	111944c5ee	Replace heapam.h includes with {table, relation}.h where applicable. A lot of files only included heapam.h for relation_open, heap_open etc - replace the heapam.h include in those files with the narrower header. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de	2019-01-21 10:51:37 -08:00
Andres Freund	4c850ecec6	Don't include heapam.h from others headers. heapam.h previously was included in a number of widely used headers (e.g. execnodes.h, indirectly in executor.h, ...). That's problematic on its own, as heapam.h contains a lot of low-level details that don't need to be exposed that widely, but becomes more problematic with the upcoming introduction of pluggable table storage - it seems inappropriate for heapam.h to be included that widely afterwards. heapam.h was largely only included in other headers to get the HeapScanDesc typedef (which was defined in heapam.h, even though HeapScanDescData is defined in relscan.h). The better solution here seems to be to just use the underlying struct (forward declared where necessary). Similar for BulkInsertState. Another problem was that LockTupleMode was used in executor.h - parts of the file tried to cope without heapam.h, but due to the fact that it indirectly included it, several subsequent violations of that goal were not not noticed. We could just reuse the approach of declaring parameters as int, but it seems nicer to move LockTupleMode to lockoptions.h - that's not a perfect location, but also doesn't seem bad. As a number of files relied on implicitly included heapam.h, a significant number of files grew an explicit include. It's quite probably that a few external projects will need to do the same. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de	2019-01-14 16:24:41 -08:00
Bruce Momjian	97c39498e5	Update copyright for 2019 Backpatch-through: certain files through 9.4	2019-01-02 12:44:25 -05:00
Thomas Munro	cfdf4dc4fc	Add WL_EXIT_ON_PM_DEATH pseudo-event. Users of the WaitEventSet and WaitLatch() APIs can now choose between asking for WL_POSTMASTER_DEATH and then handling it explicitly, or asking for WL_EXIT_ON_PM_DEATH to trigger immediate exit on postmaster death. This reduces code duplication, since almost all callers want the latter. Repair all code that was previously ignoring postmaster death completely, or requesting the event but ignoring it, or requesting the event but then doing an unconditional PostmasterIsAlive() call every time through its event loop (which is an expensive syscall on platforms for which we don't have USE_POSTMASTER_DEATH_SIGNAL support). Assert that callers of WaitLatchXXX() under the postmaster remember to ask for either WL_POSTMASTER_DEATH or WL_EXIT_ON_PM_DEATH, to prevent future bugs. The only process that doesn't handle postmaster death is syslogger. It waits until all backends holding the write end of the syslog pipe (including the postmaster) have closed it by exiting, to be sure to capture any parting messages. By using the WaitEventSet API directly it avoids the new assertion, and as a by-product it may be slightly more efficient on platforms that have epoll(). Author: Thomas Munro Reviewed-by: Kyotaro Horiguchi, Heikki Linnakangas, Tom Lane Discussion: https://postgr.es/m/CAEepm%3D1TCviRykkUb69ppWLr_V697rzd1j3eZsRMmbXvETfqbQ%40mail.gmail.com, https://postgr.es/m/CAEepm=2LqHzizbe7muD7-2yHUbTOoF7Q+qkSD5Q41kuhttRTwA@mail.gmail.com	2018-11-23 20:46:34 +13:00
Andres Freund	1a0586de36	Introduce notion of different types of slots (without implementing them). Upcoming work intends to allow pluggable ways to introduce new ways of storing table data. Accessing those table access methods from the executor requires TupleTableSlots to be carry tuples in the native format of such storage methods; otherwise there'll be a significant conversion overhead. Different access methods will require different data to store tuples efficiently (just like virtual, minimal, heap already require fields in TupleTableSlot). To allow that without requiring additional pointer indirections, we want to have different structs (embedding TupleTableSlot) for different types of slots. Thus different types of slots are needed, which requires adapting creators of slots. The slot that most efficiently can represent a type of tuple in an executor node will often depend on the type of slot a child node uses. Therefore we need to track the type of slot is returned by nodes, so parent slots can create slots based on that. Relatedly, JIT compilation of tuple deforming needs to know which type of slot a certain expression refers to, so it can create an appropriate deforming function for the type of tuple in the slot. But not all nodes will only return one type of slot, e.g. an append node will potentially return different types of slots for each of its subplans. Therefore add function that allows to query the type of a node's result slot, and whether it'll always be the same type (whether it's fixed). This can be queried using ExecGetResultSlotOps(). The scan, result, inner, outer type of slots are automatically inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(), left/right subtrees respectively. If that's not correct for a node, that can be overwritten using new fields in PlanState. This commit does not introduce the actually abstracted implementation of different kind of TupleTableSlots, that will be left for a followup commit. The different types of slots introduced will, for now, still use the same backing implementation. While this already partially invalidates the big comment in tuptable.h, it seems to make more sense to update it later, when the different TupleTableSlot implementations actually exist. Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 22:00:30 -08:00
Tom Lane	d73f4c74dd	In the executor, use an array of pointers to access the rangetable. Instead of doing a lot of list_nth() accesses to es_range_table, create a flattened pointer array during executor startup and index into that to get at individual RangeTblEntrys. This eliminates one source of O(N^2) behavior with lots of partitions. (I'm not exactly convinced that it's the most important source, but it's an easy one to fix.) Amit Langote and David Rowley Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-10-04 15:48:17 -04:00
Tom Lane	9ddef36278	Centralize executor's opening/closing of Relations for rangetable entries. Create an array estate->es_relations[] paralleling the es_range_table, and store references to Relations (relcache entries) there, so that any given RT entry is opened and closed just once per executor run. Scan nodes typically still call ExecOpenScanRelation, but ExecCloseScanRelation is no more; relation closing is now done centrally in ExecEndPlan. This is slightly more complex than one would expect because of the interactions with relcache references held in ResultRelInfo nodes. The general convention is now that ResultRelInfo->ri_RelationDesc does not represent a separate relcache reference and so does not need to be explicitly closed; but there is an exception for ResultRelInfos in the es_trig_target_relations list, which are manufactured by ExecGetTriggerResultRel and have to be cleaned up by ExecCleanUpTriggerState. (That much was true all along, but these ResultRelInfos are now more different from others than they used to be.) To allow the partition pruning logic to make use of es_relations[] rather than having its own relcache references, adjust PartitionedRelPruneInfo to store an RT index rather than a relation OID. Amit Langote, reviewed by David Rowley and Jesper Pedersen, some mods by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-10-04 14:03:42 -04:00
Tom Lane	fdba460a26	Create an RTE field to record the query's lock mode for each relation. Add RangeTblEntry.rellockmode, which records the appropriate lock mode for each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock, or RowExclusiveLock depending on the RTE's role in the query). This patch creates the field and makes all creators of RTE nodes fill it in reasonably, but for the moment nothing much is done with it. The plan is to replace assorted post-parser logic that re-determines the right lockmode to use with simple uses of rte->rellockmode. For now, just add Asserts in each of those places that the rellockmode matches what they are computing today. (In some cases the match isn't perfect, so the Asserts are weaker than you might expect; but this seems OK, as per discussion.) This passes check-world for me, but it seems worth pushing in this state to see if the buildfarm finds any problems in cases I failed to test. catversion bump due to change of stored rules. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-09-30 13:55:51 -04:00
Andres Freund	29c94e03c7	Split ExecStoreTuple into ExecStoreHeapTuple and ExecStoreBufferHeapTuple. Upcoming changes introduce further types of tuple table slots, in preparation of making table storage pluggable. New storage methods will have different representation of tuples, therefore the slot accessor should refer explicitly to heap tuples. Instead of just renaming the functions, split it into one function that accepts heap tuples not residing in buffers, and one accepting ones in buffers. Previously one function was used for both, but that was a bit awkward already, and splitting will allow us to represent slot types for tuples in buffers and normal memory separately. This is split out from the patch introducing abstract slots, as this largely consists out of mechanical changes. Author: Ashutosh Bapat Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-09-25 16:27:48 -07:00
Alvaro Herrera	4f10e7ea7b	Set ActiveSnapshot when logically replaying inserts Input functions for the inserted tuples may require a snapshot, when they are replayed by native logical replication. An example is a domain with a constraint using a SQL-language function, which prior to this commit failed to apply on the subscriber side. Reported-by: Mai Peng <maily.peng@webedia-group.com> Co-authored-by: Minh-Quan TRAN <qtran@itscaro.me> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/4EB4BD78-BFC3-4D04-B8DA-D53DF7160354@webedia-group.com Discussion: https://postgr.es/m/153211336163.1404.11721804383024050689@wrigleys.postgresql.org	2018-07-30 16:30:07 -04:00
Tom Lane	3cb646264e	Use a ResourceOwner to track buffer pins in all cases. Historically, we've allowed auxiliary processes to take buffer pins without tracking them in a ResourceOwner. However, that creates problems for error recovery. In particular, we've seen multiple reports of assertion crashes in the startup process when it gets an error while holding a buffer pin, as for example if it gets ENOSPC during a write. In a non-assert build, the process would simply exit without releasing the pin at all. We've gotten away with that so far just because a failure exit of the startup process translates to a database crash anyhow; but any similar behavior in other aux processes could result in stuck pins and subsequent problems in vacuum. To improve this, institute a policy that we must always have a resowner backing any attempt to pin a buffer, which we can enforce just by removing the previous special-case code in resowner.c. Add infrastructure to make it easy to create a process-lifespan AuxProcessResourceOwner and clear out its contents at appropriate times. Replace existing ad-hoc resowner management in bgwriter.c and other aux processes with that. (Thus, while the startup process gains a resowner where it had none at all before, some other aux process types are replacing an ad-hoc resowner with this code.) Also use the AuxProcessResourceOwner to manage buffer pins taken during StartupXLOG and ShutdownXLOG, even when those are being run in a bootstrap process or a standalone backend rather than a true auxiliary process. In passing, remove some other ad-hoc resource owner creations that had gotten cargo-culted into various other places. As far as I can tell that was all unnecessary, and if it had been necessary it was incomplete, due to lacking any provision for clearing those resowners later. (Also worth noting in this connection is that a process that hasn't called InitBufferPoolBackend has no business accessing buffers; so there's more to do than just add the resowner if we want to touch buffers in processes not covered by this patch.) Although this fixes a very old bug, no back-patch, because there's no evidence of any significant problem in non-assert builds. Patch by me, pursuant to a report from Justin Pryzby. Thanks to Robert Haas and Kyotaro Horiguchi for reviews. Discussion: https://postgr.es/m/20180627233939.GA10276@telsasoft.com	2018-07-18 12:15:16 -04:00
Peter Eisentraut	8a07ebb3c1	Convert debug message from ereport to elog	2018-06-12 11:33:39 -04:00
Tom Lane	bdf46af748	Post-feature-freeze pgindent run. Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us	2018-04-26 14:47:16 -04:00
Peter Eisentraut	df044026fc	Fix typo in logical truncate replication This could result in some misbehavior in a cascading replication setup.	2018-04-23 13:38:22 -04:00
Peter Eisentraut	039eb6e92f	Logical replication support for TRUNCATE Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:11 -04:00
Peter Eisentraut	c25304a945	Improve messaging during logical replication worker startup In case the subscription is removed before the worker is fully started, give a specific error message instead of the generic "cache lookup" error. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 09:07:09 -04:00
Magnus Hagander	eed1ce72e1	Allow background workers to bypass datallowconn THis adds a "flags" field to the BackgroundWorkerInitializeConnection() and BackgroundWorkerInitializeConnectionByOid(). For now only one flag, BGWORKER_BYPASS_ALLOWCONN, is defined, which allows the worker to ignore datallowconn.	2018-04-05 19:02:45 +02:00
Alvaro Herrera	24c0a6c649	logical replication: fix OID type mapping mechanism The logical replication type map seems to have been misused by its only caller -- it would try to use the remote OID as input for local type routines, which unsurprisingly could result in bogus "cache lookup failed for type XYZ" errors, or random other type names being picked up if they happened to use the right OID. Fix that, changing Oid logicalrep_typmap_getid(Oid remoteid) to char *logicalrep_typmap_gettypname(Oid remoteid) which is more useful. If the remote type is not part of the typmap, this simply prints "unrecognized type" instead of choking trying to figure out -- a pointless exercise (because the only input for that comes from replication messages, which are not under the local node's control) and dangerous to boot, when called from within an error context callback. Once that is done, it comes to light that the local OID in the typmap entry was not being used for anything; the type/schema names are what we need, so remove local type OID from that struct. Once you do that, it becomes pointless to attach a callback to regular syscache invalidation. So remove that also. Reported-by: Dang Minh Huong Author: Masahiko Sawada Reviewed-by: Álvaro Herrera, Petr Jelínek, Dang Minh Huong, Atsushi Torikoshi Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A6BE964@BPXM05GP.gisp.nec.co.jp Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A6C4B0A@BPXM05GP.gisp.nec.co.jp	2018-03-14 21:34:26 -03:00
Andres Freund	ad7dbee368	Allow tupleslots to have a fixed tupledesc, use in executor nodes. The reason for doing so is that it will allow expression evaluation to optimize based on the underlying tupledesc. In particular it will allow to JIT tuple deforming together with the expression itself. For that expression initialization needs to be moved after the relevant slots are initialized - mostly unproblematic, except in the case of nodeWorktablescan.c. After doing so there's no need for ExecAssignResultType() and ExecAssignResultTypeFromTL() anymore, as all former callers have been converted to create a slot with a fixed descriptor. When creating a slot with a fixed descriptor, tts_values/isnull can be allocated together with the main slot, reducing allocation overhead and increasing cache density a bit. Author: Andres Freund Discussion: https://postgr.es/m/20171206093717.vqdxe5icqttpxs3p@alap3.anarazel.de	2018-02-16 21:17:38 -08:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Simon Riggs	7e17a6889a	Set es_output_cid in replication worker Allows triggers to operate correctly Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru>	2017-11-22 16:28:14 +11:00
Peter Eisentraut	a9fce66729	Don't reset additional columns on subscriber to NULL on UPDATE When a publisher table has fewer columns than a subscriber, the update of a row on the publisher should result in updating of only the columns in common. The previous coding mistakenly reset the values of additional columns on the subscriber to NULL because it failed to skip updates of columns not found in the attribute map. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-11-03 12:27:59 -04:00
Peter Eisentraut	821fb8cdbf	Message style fixes	2017-09-11 11:21:27 -04:00
Andres Freund	2cd7084524	Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). This is a mechanical change in preparation for a later commit that will change the layout of TupleDesc. Introducing a macro to abstract the details of where attributes are stored will allow us to change that in separate step and revise it in future. Author: Thomas Munro, editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com	2017-08-20 11:19:07 -07:00
Peter Eisentraut	fca17a933b	Fix local/remote attribute mix-up in logical replication This would lead to failures if local and remote tables have a different column order. The tests previously didn't catch that because they only tested the initial data copy. So add another test that exercises the apply worker. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-08-07 10:49:08 -04:00
Tom Lane	f32678c016	Reduce delay for last logicalrep feedback message when master goes idle. The regression tests contain numerous cases where we do some activity on a master server and then wait till the slave has ack'd flushing its copy of that transaction. Because WAL flush on the slave is asynchronous to the logicalrep worker process, the worker cannot send such a feedback message during the LogicalRepApplyLoop iteration where it processes the last data from the master. In the previous coding, the feedback message would come out only when the loop's WaitLatchOrSocket call returned WL_TIMEOUT. That requires one full second of delay (NAPTIME_PER_CYCLE); and to add insult to injury, it could take more than that if the WaitLatchOrSocket was interrupted a few times by latch-setting events. In reality we can expect the slave's walwriter process to have flushed the WAL data after, more or less, WalWriterDelay (typically 200ms). Hence, if there are unacked transactions pending, make the wait delay only that long rather than the full NAPTIME_PER_CYCLE. Also, move one of the send_feedback() calls into the loop main line, so that we'll check for the need to send feedback even if we were woken by a latch event and not either socket data or timeout. It's not clear how much this matters for production purposes, but it's definitely helpful for testing. Discussion: https://postgr.es/m/30864.1498861103@sss.pgh.pa.us	2017-07-01 12:15:51 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Peter Eisentraut	033370179a	Set statement timestamp in apply worker This ensures that triggers can see an up-to-date timestamp. Reported-by: Konstantin Evteev <konst583@gmail.com>	2017-06-17 08:54:21 -04:00
Tom Lane	651902deb1	Re-run pgindent. This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included. Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com	2017-06-13 13:05:59 -04:00
Peter Eisentraut	d4bfc06e29	Consistently use subscription name as application name The logical replication apply worker uses the subscription name as application name, except for table sync. This was incorrectly set to use the replication slot name, which might be different, in one case. Also add a comment why the other case is different.	2017-06-06 22:11:22 -04:00
Andres Freund	9206ced1dc	Clean up latch related code. The larger part of this patch replaces usages of MyProc->procLatch with MyLatch. The latter works even early during backend startup, where MyProc->procLatch doesn't yet. While the affected code shouldn't run in cases where it's not initialized, it might get copied into places where it might. Using MyLatch is simpler and a bit faster to boot, so there's little point to stick with the previous coding. While doing so I noticed some weaknesses around newly introduced uses of latches that could lead to missed events, and an omitted CHECK_FOR_INTERRUPTS() call in worker_spi. As all the actual bugs are in v10 code, there doesn't seem to be sufficient reason to backpatch this. Author: Andres Freund Discussion: https://postgr.es/m/20170606195321.sjmenrfgl2nu6j63@alap3.anarazel.de https://postgr.es/m/20170606210405.sim3yl6vpudhmufo@alap3.anarazel.de Backpatch: -	2017-06-06 16:13:00 -07:00
Peter Eisentraut	41a21bf9b4	Don't set application_name in logical replication workers This was bothering some people because it's not the intended use of application_name and it makes the default view of pg_stat_activity bulky.	2017-06-05 22:16:02 -04:00
Peter Eisentraut	66b84fa82f	Receive invalidation messages correctly in tablesync worker We didn't accept any invalidation messages until the whole sync process had finished (because it flattens all the remote transactions in the single one). So the sync worker didn't learn about subscription changes/drop until it has finished. This could lead to "orphaned" sync workers. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-06-03 11:40:05 -04:00
Peter Eisentraut	9fcf670c2e	Fix signal handling in logical replication workers The logical replication worker processes now use the normal die() handler for SIGTERM and CHECK_FOR_INTERRUPTS() instead of custom code. One problem before was that the apply worker would not exit promptly when a subscription was dropped, which could lead to deadlocks. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-06-02 14:49:23 -04:00
Peter Eisentraut	6812330f1c	Reorganize logical replication worker disconnect code Move the walrcv_disconnect() calls into the before_shmem_exit handler. This makes sure the call is always made even during exit by signal, it saves some duplicate code, and it makes the logic more similar to walreceiver.c. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-06-01 23:16:20 -04:00
Peter Eisentraut	92ecb148e5	Improve logical replication worker log messages Reduce some redundant messages to DEBUG1. Be clearer about the distinction between apply workers and table synchronization workers. Add subscription and table name where possible. Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-05-24 18:57:56 -04:00
Peter Eisentraut	6234569851	Improve CREATE SUBSCRIPTION option parsing When creating a subscription with slot_name = NONE, we failed to check that also create_slot = false and enabled = false were set. This created an invalid subscription and could later lead to a crash if a NULL slot name was accessed. Add more checks around that for robustness. Reported-by: tushar <tushar.ahuja@enterprisedb.com>	2017-05-17 20:47:37 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Peter Eisentraut	489b96e80b	Improve memory use in logical replication apply Previously, the memory used by the logical replication apply worker for processing messages would never be freed, so that could end up using a lot of memory. To improve that, change the existing ApplyContext memory context to ApplyMessageContext and reset that after every message (similar to MessageContext used elsewhere). For consistency of naming, rename the ApplyCacheContext to ApplyContext. Author: Stas Kelvich <s.kelvich@postgrespro.ru>	2017-05-09 14:51:49 -04:00
Peter Eisentraut	013c1178fd	Remove the NODROP SLOT option from DROP SUBSCRIPTION It turned out this approach had problems, because a DROP command should not have any options other than CASCADE and RESTRICT. Instead, always attempt to drop the slot if there is one configured, but also add an ALTER SUBSCRIPTION action to set the slot to NONE. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us	2017-05-09 10:20:42 -04:00
Peter Eisentraut	9a591c1bcc	Fix statistics reporting in logical replication workers This new arrangement ensures that statistics are reported right after commit of transactions. The previous arrangement didn't get this quite right and could lead to assertion failures. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Erik Rijkers <er@xs4all.nl>	2017-05-08 12:10:22 -04:00
Peter Eisentraut	6c9bd27aec	Fix typo in comment Author: Masahiko Sawada <sawada.mshk@gmail.com>	2017-04-26 21:13:01 -04:00
Fujii Masao	39a6772d04	Use DatumGetInt32() to extract 32-bit integer value from a datum. Previously DatumGetObjectId() was wrongly used for that. Author: Masahiko Sawada Reviewed-by: Kyotaro Horiguchi Reported-by: Fujii Masao Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com	2017-04-19 00:12:27 +09:00
Peter Eisentraut	139eb9673c	Report statistics in logical replication workers Author: Stas Kelvich <s.kelvich@postgrespro.ru> Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-04-14 14:37:06 -04:00
Peter Eisentraut	887227a1cc	Add option to modify sync commit per subscription This also changes default behaviour of subscription workers to synchronous_commit = off. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-04-14 13:58:46 -04:00
Peter Eisentraut	26ad194cb0	Support configuration reload in logical replication workers Author: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-04-10 13:42:21 -04:00
Peter Eisentraut	d1f103c739	Fix typo Author: Masahiko Sawada <sawada.mshk@gmail.com>	2017-04-04 09:03:24 -04:00
Peter Eisentraut	fe7bbc4ddb	Fix remote position tracking in logical replication We need to set the origin remote position to end_lsn, not commit_lsn, as commit_lsn is the start of commit record, and we use the origin remote position as start position when restarting replication stream. If we'd use commit_lsn, we could request data that we already received from the remote server after a crash of a downstream server. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-04-04 08:24:32 -04:00
Peter Eisentraut	1116108c92	Handle change of slot name in logical replication apply Since change of slot name is a supported operation, handle it more gracefully, instead of in the this-should-not-happen way. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-04-03 11:10:28 -04:00
Peter Eisentraut	7c4f52409a	Logical replication support for initial data copy Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Erik Rijkers <er@xs4all.nl>	2017-03-23 08:55:37 -04:00
Peter Eisentraut	a47b38c9ee	Spelling fixes From: Josh Soref <jsoref@gmail.com>	2017-03-14 12:58:39 -04:00
Peter Eisentraut	6da9759a03	Add RENAME support for PUBLICATIONs and SUBSCRIPTIONs From: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-03-03 10:47:04 -05:00

1 2 3 4

157 Commits