Logical Replication Logical replication is a method of replicating data objects and their changes, based upon their replication identity (usually a primary key). We use the term logical in contrast to physical replication, which uses exact block addresses and byte-by-byte replication. PostgreSQL supports both mechanisms concurrently, see . Logical replication allows fine-grained control over both data replication and security. Logical replication uses a publish and subscribe model with one or more subscribers subscribing to one or more publications on a publisher node. Subscribers pull data from the publications they subscribe to and may subsequently re-publish data to allow cascading replication or more complex configurations. Logical replication sends changes on the publisher to the subscriber as they occur in real-time. The subscriber applies the data in the same order as the publisher so that transactional consistency is guaranteed for publications within a single subscription. This method of data replication is sometimes referred to as transactional replication. The typical use-cases for logical replication are: Sending incremental changes in a single database or a subset of a database to subscribers as they occur. Firing triggers for individual changes as they arrive on the subscriber. Consolidating multiple databases into a single one (for example for analytical purposes). Replicating between different major versions of PostgreSQL. Giving access to replicated data to different groups of users. Sharing a subset of the database between multiple databases. The subscriber database behaves in the same way as any other PostgreSQL instance and can be used as a publisher for other databases by defining its own publications. When the subscriber is treated as read-only by application, there will be no conflicts from a single subscription. On the other hand, if there are other writes done either by an application or by other subscribers to the same set of tables, conflicts can arise. Publication A publication can be defined on any physical replication master. The node where a publication is defined is referred to as publisher. A publication is a set of changes generated from a table or a group of tables, and might also be described as a change set or replication set. Each publication exists in only one database. Publications are different from schemas and do not affect how the table is accessed. Each table can be added to multiple publications if needed. Publications may currently only contain tables. Objects must be added explicitly, except when a publication is created for ALL TABLES. Publications can choose to limit the changes they produce to any combination of INSERT, UPDATE, and DELETE, similar to how triggers are fired by particular event types. If a table without a REPLICA IDENTITY is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will fail on the publisher. Every publication can have multiple subscribers. A publication is created using the command and may later be altered or dropped using corresponding commands. The individual tables can be added and removed dynamically using . Both the ADD TABLE and DROP TABLE operations are transactional; so the table will start or stop replicating at the correct snapshot once the transaction has committed. Subscription A subscription is the downstream side of logical replication. The node where a subscription is defined is referred to as the subscriber. A subscription defines the connection to another database and set of publications (one or more) to which it wants to subscribe. The subscriber database behaves in the same way as any other PostgreSQL instance and can be used as a publisher for other databases by defining its own publications. A subscriber node may have multiple subscriptions if desired. It is possible to define multiple subscriptions between a single publisher-subscriber pair, in which case care must be taken to ensure that the subscribed publication objects don't overlap. Each subscription will receive changes via one replication slot (see ). Subscriptions are not dumped by pg_dump by default, but this can be requested using the command-line option . The subscription is added using and can be stopped/resumed at any time using the command and removed using . When a subscription is dropped and recreated, the synchronization information is lost. This means that the data has to be resynchronized afterwards. The schema definitions are not replicated, and the published tables must exist on the subscriber. Only regular tables may be the target of replication. For example, you can't replicate to a view. The tables are matched between the publisher and the subscriber using the fully qualified table name. Replication to differently-named tables on the subscriber is not supported. Columns of a table are also matched by name. A different order of columns in the target table is allowed, but the column types have to match. Conflicts Logical replication behaves similarly to normal DML operations in that the data will be updated even if it was changed locally on the subscriber node. If incoming data violates any constraints the replication will stop. This is referred to as a conflict. When replicating UPDATE or DELETE operations, missing data will not produce a conflict and such operations will simply be skipped. A conflict will produce an error and will stop the replication; it must be resolved manually by the user. Details about the conflict can be found in the subscriber's server log. The resolution can be done either by changing data on the subscriber so that it does not conflict with the incoming change or by skipping the transaction that conflicts with the existing data. The transaction can be skipped by calling the pg_replication_origin_advance() function with a node_name corresponding to the subscription name, and a position. The current position of origins can be seen in the pg_replication_origin_status system view. Architecture Logical replication starts by copying a snapshot of the data on the publisher database. Once that is done, changes on the publisher are sent to the subscriber as they occur in real time. The subscriber applies data in the order in which commits were made on the publisher so that transactional consistency is guaranteed for the publications within any single subscription. Logical replication is built with an architecture similar to physical streaming replication (see ). It is implemented by walsender and apply processes. The walsender process starts logical decoding (described in ) of the WAL and loads the standard logical decoding plugin (pgoutput). The plugin transforms the changes read from WAL to the logical replication protocol (see ) and filters the data according to the publication specification. The data is then continuously transferred using the streaming replication protocol to the apply worker, which maps the data to local tables and applies the individual changes as they are received, in correct transactional order. The apply process on the subscriber database always runs with session_replication_role set to replica, which produces the usual effects on triggers and constraints. Monitoring Because logical replication is based on a similar architecture as physical streaming replication, the monitoring on a publication node is similar to monitoring of a physical replication master (see ). The monitoring information about subscription is visible in pg_stat_subscription. This view contains one row for every subscription worker. A subscription can have zero or more active subscription workers depending on its state. Normally, there is a single apply process running for an enabled subscription. A disabled subscription or a crashed subscription will have zero rows in this view. Security The role used for the replication connection must have the REPLICATION attribute. Access for the role must be configured in pg_hba.conf. To create a publication, the user must have the CREATE privilege in the database. To add tables to a publication, the user must have ownership rights on the table. To create a publication that publishes all tables automatically, the user must be a superuser. To create a subscription, the user must be a superuser. The subscription apply process will run in the local database with the privileges of a superuser. Privileges are only checked once at the start of a replication connection. They are not re-checked as each change record is read from the publisher, nor are they re-checked for each change when applied. Configuration Settings Logical replication requires several configuration options to be set. On the publisher side, wal_level must be set to logical, and max_replication_slots must be set to at least the number of subscriptions expected to connect. And max_wal_senders should be set to at least the same as max_replication_slots plus the number of physical replicas that are connected at the same time. The subscriber also requires the max_replication_slots to be set. In this case it should be set to at least the number of subscriptions that will be added to the subscriber. max_logical_replication_workers must be set to at least the number of subscriptions. Additionally the max_worker_processes may need to be adjusted to accommodate for replication workers, at least (max_logical_replication_workers + 1). Note that some extensions and parallel queries also take worker slots from max_worker_processes. Quick Setup First set the configuration options in postgresql.conf: wal_level = logical The other required settings have default values that are sufficient for a basic setup. pg_hba.conf needs to be adjusted to allow replication (the values here depend on your actual network configuration and user you want to use for connecting): host replication repuser 0.0.0.0/0 md5 Then on the publisher database: CREATE PUBLICATION mypub FOR TABLE users, departments; And on the subscriber database: CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub; The above will start the replication process of changes to users and departments tables.