mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-09-30 19:01:19 +02:00
e93ca1618b
Michael Paquier
448 lines
16 KiB
Plaintext
448 lines
16 KiB
Plaintext
<!-- doc/src/sgml/postgres-fdw.sgml -->
|
|
|
|
<sect1 id="postgres-fdw" xreflabel="postgres_fdw">
|
|
<title>postgres_fdw</title>
|
|
|
|
<indexterm zone="postgres-fdw">
|
|
<primary>postgres_fdw</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <filename>postgres_fdw</> module provides the foreign-data wrapper
|
|
<literal>postgres_fdw</literal>, which can be used to access data
|
|
stored in external <productname>PostgreSQL</productname> servers.
|
|
</para>
|
|
|
|
<para>
|
|
The functionality provided by this module overlaps substantially
|
|
with the functionality of the older <xref linkend="dblink"> module.
|
|
But <filename>postgres_fdw</> provides more transparent and
|
|
standards-compliant syntax for accessing remote tables, and can give
|
|
better performance in many cases.
|
|
</para>
|
|
|
|
<para>
|
|
To prepare for remote access using <filename>postgres_fdw</>:
|
|
<orderedlist spacing="compact">
|
|
<listitem>
|
|
<para>
|
|
Install the <filename>postgres_fdw</> extension using <xref
|
|
linkend="sql-createextension">.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a foreign server object, using <xref linkend="sql-createserver">,
|
|
to represent each remote database you want to connect to.
|
|
Specify connection information, except <literal>user</literal> and
|
|
<literal>password</literal>, as options of the server object.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a user mapping, using <xref linkend="sql-createusermapping">, for
|
|
each database user you want to allow to access each foreign server.
|
|
Specify the remote user name and password to use as
|
|
<literal>user</literal> and <literal>password</literal> options of the
|
|
user mapping.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a foreign table, using <xref linkend="sql-createforeigntable">,
|
|
for each remote table you want to access. The columns of the foreign
|
|
table must match the referenced remote table. You can, however, use
|
|
table and/or column names different from the remote table's, if you
|
|
specify the correct remote names as options of the foreign table object.
|
|
</para>
|
|
</listitem>
|
|
</orderedlist>
|
|
</para>
|
|
|
|
<para>
|
|
Now you need only <command>SELECT</> from a foreign table to access
|
|
the data stored in its underlying remote table. You can also modify
|
|
the remote table using <command>INSERT</>, <command>UPDATE</>, or
|
|
<command>DELETE</>. (Of course, the remote user you have specified
|
|
in your user mapping must have privileges to do these things.)
|
|
</para>
|
|
|
|
<para>
|
|
It is generally recommended that the columns of a foreign table be declared
|
|
with exactly the same data types, and collations if applicable, as the
|
|
referenced columns of the remote table. Although <filename>postgres_fdw</>
|
|
is currently rather forgiving about performing data type conversions at
|
|
need, surprising semantic anomalies may arise when types or collations do
|
|
not match, due to the remote server interpreting <literal>WHERE</> clauses
|
|
slightly differently from the local server.
|
|
</para>
|
|
|
|
<para>
|
|
Note that a foreign table can be declared with fewer columns, or with a
|
|
different column order, than its underlying remote table has. Matching
|
|
of columns to the remote table is by name, not position.
|
|
</para>
|
|
|
|
<sect2>
|
|
<title>FDW Options of postgres_fdw</title>
|
|
|
|
<sect3>
|
|
<title>Connection Options</title>
|
|
|
|
<para>
|
|
A foreign server using the <filename>postgres_fdw</> foreign data wrapper
|
|
can have the same options that <application>libpq</> accepts in
|
|
connection strings, as described in <xref linkend="libpq-paramkeywords">,
|
|
except that these options are not allowed:
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>
|
|
<literal>user</literal> and <literal>password</literal> (specify these
|
|
for a user mapping, instead)
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>client_encoding</> (this is automatically set from the local
|
|
server encoding)
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>fallback_application_name</> (always set to
|
|
<literal>postgres_fdw</>)
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
Only superusers may connect to foreign servers without password
|
|
authentication, so always specify the <literal>password</literal> option
|
|
for user mappings belonging to non-superusers.
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Object Name Options</title>
|
|
|
|
<para>
|
|
These options can be used to control the names used in SQL statements
|
|
sent to the remote <productname>PostgreSQL</productname> server. These
|
|
options are needed when a foreign table is created with names different
|
|
from the underlying remote table's names.
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>schema_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table, gives the
|
|
schema name to use for the foreign table on the remote server. If this
|
|
option is omitted, the name of the foreign table's schema is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>table_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table, gives the
|
|
table name to use for the foreign table on the remote server. If this
|
|
option is omitted, the foreign table's name is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>column_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a column of a foreign table,
|
|
gives the column name to use for the column on the remote server.
|
|
If this option is omitted, the column's name is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Cost Estimation Options</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</> retrieves remote data by executing queries
|
|
against remote servers, so ideally the estimated cost of scanning a
|
|
foreign table should be whatever it costs to be done on the remote
|
|
server, plus some overhead for communication. The most reliable way to
|
|
get such an estimate is to ask the remote server and then add something
|
|
for overhead — but for simple queries, it may not be worth the cost
|
|
of an additional remote query to get a cost estimate.
|
|
So <filename>postgres_fdw</> provides the following options to control
|
|
how cost estimation is done:
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>use_remote_estimate</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table or a foreign
|
|
server, controls whether <filename>postgres_fdw</> issues remote
|
|
<command>EXPLAIN</command> commands to obtain cost estimates.
|
|
A setting for a foreign table overrides any setting for its server,
|
|
but only for that table.
|
|
The default is <literal>false</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>fdw_startup_cost</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign server, is a numeric
|
|
value that is added to the estimated startup cost of any foreign-table
|
|
scan on that server. This represents the additional overhead of
|
|
establishing a connection, parsing and planning the query on the
|
|
remote side, etc.
|
|
The default value is <literal>100</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>fdw_tuple_cost</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign server, is a numeric
|
|
value that is used as extra cost per-tuple for foreign-table
|
|
scans on that server. This represents the additional overhead of
|
|
data transfer between servers. You might increase or decrease this
|
|
number to reflect higher or lower network delay to the remote server.
|
|
The default value is <literal>0.01</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
<para>
|
|
When <literal>use_remote_estimate</literal> is true,
|
|
<filename>postgres_fdw</> obtains row count and cost estimates from the
|
|
remote server and then adds <literal>fdw_startup_cost</literal> and
|
|
<literal>fdw_tuple_cost</literal> to the cost estimates. When
|
|
<literal>use_remote_estimate</literal> is false,
|
|
<filename>postgres_fdw</> performs local row count and cost estimation
|
|
and then adds <literal>fdw_startup_cost</literal> and
|
|
<literal>fdw_tuple_cost</literal> to the cost estimates. This local
|
|
estimation is unlikely to be very accurate unless local copies of the
|
|
remote table's statistics are available. Running
|
|
<xref linkend="sql-analyze"> on the foreign table is the way to update
|
|
the local statistics; this will perform a scan of the remote table and
|
|
then calculate and store statistics just as though the table were local.
|
|
Keeping local statistics can be a useful way to reduce per-query planning
|
|
overhead for a remote table — but if the remote table is
|
|
frequently updated, the local statistics will soon be obsolete.
|
|
</para>
|
|
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Updatability Options</title>
|
|
|
|
<para>
|
|
By default all foreign tables using <filename>postgres_fdw</> are assumed
|
|
to be updatable. This may be overridden using the following option:
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>updatable</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option controls whether <filename>postgres_fdw</> allows foreign
|
|
tables to be modified using <command>INSERT</>, <command>UPDATE</> and
|
|
<command>DELETE</> commands. It can be specified for a foreign table
|
|
or a foreign server. A table-level option overrides a server-level
|
|
option.
|
|
The default is <literal>true</>.
|
|
</para>
|
|
|
|
<para>
|
|
Of course, if the remote table is not in fact updatable, an error
|
|
would occur anyway. Use of this option primarily allows the error to
|
|
be thrown locally without querying the remote server. Note however
|
|
that the <literal>information_schema</> views will report a
|
|
<filename>postgres_fdw</> foreign table to be updatable (or not)
|
|
according to the setting of this option, without any check of the
|
|
remote server.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Connection Management</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</filename> establishes a connection to a
|
|
foreign server during the first query that uses a foreign table
|
|
associated with the foreign server. This connection is kept and
|
|
re-used for subsequent queries in the same session. However, if
|
|
multiple user identities (user mappings) are used to access the foreign
|
|
server, a connection is established for each user mapping.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Transaction Management</title>
|
|
|
|
<para>
|
|
During a query that references any remote tables on a foreign server,
|
|
<filename>postgres_fdw</filename> opens a transaction on the
|
|
remote server if one is not already open corresponding to the current
|
|
local transaction. The remote transaction is committed or aborted when
|
|
the local transaction commits or aborts. Savepoints are similarly
|
|
managed by creating corresponding remote savepoints.
|
|
</para>
|
|
|
|
<para>
|
|
The remote transaction uses <literal>SERIALIZABLE</>
|
|
isolation level when the local transaction has <literal>SERIALIZABLE</>
|
|
isolation level; otherwise it uses <literal>REPEATABLE READ</>
|
|
isolation level. This choice ensures that if a query performs multiple
|
|
table scans on the remote server, it will get snapshot-consistent results
|
|
for all the scans. A consequence is that successive queries within a
|
|
single transaction will see the same data from the remote server, even if
|
|
concurrent updates are occurring on the remote server due to other
|
|
activities. That behavior would be expected anyway if the local
|
|
transaction uses <literal>SERIALIZABLE</> or <literal>REPEATABLE READ</>
|
|
isolation level, but it might be surprising for a <literal>READ
|
|
COMMITTED</> local transaction. A future
|
|
<productname>PostgreSQL</productname> release might modify these rules.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Remote Query Optimization</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</> attempts to optimize remote queries to reduce
|
|
the amount of data transferred from foreign servers. This is done by
|
|
sending query <literal>WHERE</> clauses to the remote server for
|
|
execution, and by not retrieving table columns that are not needed for
|
|
the current query. To reduce the risk of misexecution of queries,
|
|
<literal>WHERE</> clauses are not sent to the remote server unless they use
|
|
only built-in data types, operators, and functions. Operators and
|
|
functions in the clauses must be <literal>IMMUTABLE</> as well.
|
|
</para>
|
|
|
|
<para>
|
|
The query that is actually sent to the remote server for execution can
|
|
be examined using <command>EXPLAIN VERBOSE</>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Cross-Version Compatibility</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</> can be used with remote servers dating back
|
|
to <productname>PostgreSQL</> 8.3. Read-only capability is available
|
|
back to 8.1. A limitation however is that <filename>postgres_fdw</>
|
|
generally assumes that immutable built-in functions and operators are
|
|
safe to send to the remote server for execution, if they appear in a
|
|
<literal>WHERE</> clause for a foreign table. Thus, a built-in
|
|
function that was added since the remote server's release might be sent
|
|
to it for execution, resulting in <quote>function does not exist</> or
|
|
a similar error. This type of failure can be worked around by
|
|
rewriting the query, for example by embedding the foreign table
|
|
reference in a sub-<literal>SELECT</> with <literal>OFFSET 0</> as an
|
|
optimization fence, and placing the problematic function or operator
|
|
outside the sub-<literal>SELECT</>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Examples</title>
|
|
|
|
<para>
|
|
Here is an example of creating a foreign table with
|
|
<literal>postgres_fdw</>. First install the extension:
|
|
</para>
|
|
|
|
<programlisting>
|
|
CREATE EXTENSION postgres_fdw;
|
|
</programlisting>
|
|
|
|
<para>
|
|
Then create a foreign server using <xref linkend="sql-createserver">.
|
|
In this example we wish to connect to a <productname>PostgreSQL</> server
|
|
on host <literal>192.83.123.89</literal> listening on
|
|
port <literal>5432</literal>. The database to which the connection is made
|
|
is named <literal>foreign_db</literal> on the remote server:
|
|
|
|
<programlisting>
|
|
CREATE SERVER foreign_server
|
|
FOREIGN DATA WRAPPER postgres_fdw
|
|
OPTIONS (host '192.83.123.89', port '5432', dbname 'foreign_db');
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
A user mapping, defined with <xref linkend="sql-createusermapping">, is
|
|
needed as well to identify the role that will be used on the remote
|
|
server:
|
|
|
|
<programlisting>
|
|
CREATE USER MAPPING FOR local_user
|
|
SERVER foreign_server
|
|
OPTIONS (user 'foreign_user', password 'password');
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Now it is possible to create a foreign table with
|
|
<xref linkend="sql-createforeigntable">. In this example we
|
|
wish to access the table named <structname>some_schema.some_table</>
|
|
on the remote server. The local name for it will
|
|
be <structname>foreign_table</>:
|
|
|
|
<programlisting>
|
|
CREATE FOREIGN TABLE foreign_table (
|
|
id serial NOT NULL,
|
|
data text
|
|
)
|
|
SERVER foreign_server
|
|
OPTIONS (schema_name 'some_schema', table_name 'some_table');
|
|
</programlisting>
|
|
|
|
It's essential that the data types and other properties of the columns
|
|
declared in <command>CREATE FOREIGN TABLE</> match the actual remote table.
|
|
Column names must match as well, unless you attach <literal>column_name</>
|
|
options to the individual columns to show how they are named in the remote
|
|
table.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Author</title>
|
|
<para>
|
|
Shigeru Hanada <email>shigeru.hanada@gmail.com</email>
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|