956 lines
39 KiB
Plaintext
956 lines
39 KiB
Plaintext
<!-- doc/src/sgml/fdwhandler.sgml -->
|
|
|
|
<chapter id="fdwhandler">
|
|
<title>Writing A Foreign Data Wrapper</title>
|
|
|
|
<indexterm zone="fdwhandler">
|
|
<primary>foreign data wrapper</primary>
|
|
<secondary>handler for</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
All operations on a foreign table are handled through its foreign data
|
|
wrapper, which consists of a set of functions that the core server
|
|
calls. The foreign data wrapper is responsible for fetching
|
|
data from the remote data source and returning it to the
|
|
<productname>PostgreSQL</productname> executor. If updating foreign
|
|
tables is to be supported, the wrapper must handle that, too.
|
|
This chapter outlines how to write a new foreign data wrapper.
|
|
</para>
|
|
|
|
<para>
|
|
The foreign data wrappers included in the standard distribution are good
|
|
references when trying to write your own. Look into the
|
|
<filename>contrib</> subdirectory of the source tree.
|
|
The <xref linkend="sql-createforeigndatawrapper"> reference page also has
|
|
some useful details.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
The SQL standard specifies an interface for writing foreign data wrappers.
|
|
However, PostgreSQL does not implement that API, because the effort to
|
|
accommodate it into PostgreSQL would be large, and the standard API hasn't
|
|
gained wide adoption anyway.
|
|
</para>
|
|
</note>
|
|
|
|
<sect1 id="fdw-functions">
|
|
<title>Foreign Data Wrapper Functions</title>
|
|
|
|
<para>
|
|
The FDW author needs to implement a handler function, and optionally
|
|
a validator function. Both functions must be written in a compiled
|
|
language such as C, using the version-1 interface.
|
|
For details on C language calling conventions and dynamic loading,
|
|
see <xref linkend="xfunc-c">.
|
|
</para>
|
|
|
|
<para>
|
|
The handler function simply returns a struct of function pointers to
|
|
callback functions that will be called by the planner, executor, and
|
|
various maintenance commands.
|
|
Most of the effort in writing an FDW is in implementing these callback
|
|
functions.
|
|
The handler function must be registered with
|
|
<productname>PostgreSQL</productname> as taking no arguments and
|
|
returning the special pseudo-type <type>fdw_handler</type>. The
|
|
callback functions are plain C functions and are not visible or
|
|
callable at the SQL level. The callback functions are described in
|
|
<xref linkend="fdw-callbacks">.
|
|
</para>
|
|
|
|
<para>
|
|
The validator function is responsible for validating options given in
|
|
<command>CREATE</command> and <command>ALTER</command> commands for its
|
|
foreign data wrapper, as well as foreign servers, user mappings, and
|
|
foreign tables using the wrapper.
|
|
The validator function must be registered as taking two arguments, a
|
|
text array containing the options to be validated, and an OID
|
|
representing the type of object the options are associated with (in
|
|
the form of the OID of the system catalog the object would be stored
|
|
in, either
|
|
<literal>ForeignDataWrapperRelationId</>,
|
|
<literal>ForeignServerRelationId</>,
|
|
<literal>UserMappingRelationId</>,
|
|
or <literal>ForeignTableRelationId</>).
|
|
If no validator function is supplied, options are not checked at object
|
|
creation time or object alteration time.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="fdw-callbacks">
|
|
<title>Foreign Data Wrapper Callback Routines</title>
|
|
|
|
<para>
|
|
The FDW handler function returns a palloc'd <structname>FdwRoutine</>
|
|
struct containing pointers to the callback functions described below.
|
|
The scan-related functions are required, the rest are optional.
|
|
</para>
|
|
|
|
<para>
|
|
The <structname>FdwRoutine</> struct type is declared in
|
|
<filename>src/include/foreign/fdwapi.h</>, which see for additional
|
|
details.
|
|
</para>
|
|
|
|
<sect2 id="fdw-callbacks-scan">
|
|
<title>FDW Routines For Scanning Foreign Tables</title>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
GetForeignRelSize (PlannerInfo *root,
|
|
RelOptInfo *baserel,
|
|
Oid foreigntableid);
|
|
</programlisting>
|
|
|
|
Obtain relation size estimates for a foreign table. This is called
|
|
at the beginning of planning for a query that scans a foreign table.
|
|
<literal>root</> is the planner's global information about the query;
|
|
<literal>baserel</> is the planner's information about this table; and
|
|
<literal>foreigntableid</> is the <structname>pg_class</> OID of the
|
|
foreign table. (<literal>foreigntableid</> could be obtained from the
|
|
planner data structures, but it's passed explicitly to save effort.)
|
|
</para>
|
|
|
|
<para>
|
|
This function should update <literal>baserel->rows</> to be the
|
|
expected number of rows returned by the table scan, after accounting for
|
|
the filtering done by the restriction quals. The initial value of
|
|
<literal>baserel->rows</> is just a constant default estimate, which
|
|
should be replaced if at all possible. The function may also choose to
|
|
update <literal>baserel->width</> if it can compute a better estimate
|
|
of the average result row width.
|
|
</para>
|
|
|
|
<para>
|
|
See <xref linkend="fdw-planning"> for additional information.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
GetForeignPaths (PlannerInfo *root,
|
|
RelOptInfo *baserel,
|
|
Oid foreigntableid);
|
|
</programlisting>
|
|
|
|
Create possible access paths for a scan on a foreign table.
|
|
This is called during query planning.
|
|
The parameters are the same as for <function>GetForeignRelSize</>,
|
|
which has already been called.
|
|
</para>
|
|
|
|
<para>
|
|
This function must generate at least one access path
|
|
(<structname>ForeignPath</> node) for a scan on the foreign table and
|
|
must call <function>add_path</> to add each such path to
|
|
<literal>baserel->pathlist</>. It's recommended to use
|
|
<function>create_foreignscan_path</> to build the
|
|
<structname>ForeignPath</> nodes. The function can generate multiple
|
|
access paths, e.g., a path which has valid <literal>pathkeys</> to
|
|
represent a pre-sorted result. Each access path must contain cost
|
|
estimates, and can contain any FDW-private information that is needed to
|
|
identify the specific scan method intended.
|
|
</para>
|
|
|
|
<para>
|
|
See <xref linkend="fdw-planning"> for additional information.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignScan *
|
|
GetForeignPlan (PlannerInfo *root,
|
|
RelOptInfo *baserel,
|
|
Oid foreigntableid,
|
|
ForeignPath *best_path,
|
|
List *tlist,
|
|
List *scan_clauses);
|
|
</programlisting>
|
|
|
|
Create a <structname>ForeignScan</> plan node from the selected foreign
|
|
access path. This is called at the end of query planning.
|
|
The parameters are as for <function>GetForeignRelSize</>, plus
|
|
the selected <structname>ForeignPath</> (previously produced by
|
|
<function>GetForeignPaths</>), the target list to be emitted by the
|
|
plan node, and the restriction clauses to be enforced by the plan node.
|
|
</para>
|
|
|
|
<para>
|
|
This function must create and return a <structname>ForeignScan</> plan
|
|
node; it's recommended to use <function>make_foreignscan</> to build the
|
|
<structname>ForeignScan</> node.
|
|
</para>
|
|
|
|
<para>
|
|
See <xref linkend="fdw-planning"> for additional information.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
BeginForeignScan (ForeignScanState *node,
|
|
int eflags);
|
|
</programlisting>
|
|
|
|
Begin executing a foreign scan. This is called during executor startup.
|
|
It should perform any initialization needed before the scan can start,
|
|
but not start executing the actual scan (that should be done upon the
|
|
first call to <function>IterateForeignScan</>).
|
|
The <structname>ForeignScanState</> node has already been created, but
|
|
its <structfield>fdw_state</> field is still NULL. Information about
|
|
the table to scan is accessible through the
|
|
<structname>ForeignScanState</> node (in particular, from the underlying
|
|
<structname>ForeignScan</> plan node, which contains any FDW-private
|
|
information provided by <function>GetForeignPlan</>).
|
|
<literal>eflags</> contains flag bits describing the executor's
|
|
operating mode for this plan node.
|
|
</para>
|
|
|
|
<para>
|
|
Note that when <literal>(eflags & EXEC_FLAG_EXPLAIN_ONLY)</> is
|
|
true, this function should not perform any externally-visible actions;
|
|
it should only do the minimum required to make the node state valid
|
|
for <function>ExplainForeignScan</> and <function>EndForeignScan</>.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
TupleTableSlot *
|
|
IterateForeignScan (ForeignScanState *node);
|
|
</programlisting>
|
|
|
|
Fetch one row from the foreign source, returning it in a tuple table slot
|
|
(the node's <structfield>ScanTupleSlot</> should be used for this
|
|
purpose). Return NULL if no more rows are available. The tuple table
|
|
slot infrastructure allows either a physical or virtual tuple to be
|
|
returned; in most cases the latter choice is preferable from a
|
|
performance standpoint. Note that this is called in a short-lived memory
|
|
context that will be reset between invocations. Create a memory context
|
|
in <function>BeginForeignScan</> if you need longer-lived storage, or use
|
|
the <structfield>es_query_cxt</> of the node's <structname>EState</>.
|
|
</para>
|
|
|
|
<para>
|
|
The rows returned must match the column signature of the foreign table
|
|
being scanned. If you choose to optimize away fetching columns that
|
|
are not needed, you should insert nulls in those column positions.
|
|
</para>
|
|
|
|
<para>
|
|
Note that <productname>PostgreSQL</productname>'s executor doesn't care
|
|
whether the rows returned violate any <literal>NOT NULL</literal>
|
|
constraints that were defined on the foreign table columns — but
|
|
the planner does care, and may optimize queries incorrectly if
|
|
<literal>NULL</> values are present in a column declared not to contain
|
|
them. If a <literal>NULL</> value is encountered when the user has
|
|
declared that none should be present, it may be appropriate to raise an
|
|
error (just as you would need to do in the case of a data type mismatch).
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
ReScanForeignScan (ForeignScanState *node);
|
|
</programlisting>
|
|
|
|
Restart the scan from the beginning. Note that any parameters the
|
|
scan depends on may have changed value, so the new scan does not
|
|
necessarily return exactly the same rows.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
EndForeignScan (ForeignScanState *node);
|
|
</programlisting>
|
|
|
|
End the scan and release resources. It is normally not important
|
|
to release palloc'd memory, but for example open files and connections
|
|
to remote servers should be cleaned up.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="fdw-callbacks-update">
|
|
<title>FDW Routines For Updating Foreign Tables</title>
|
|
|
|
<para>
|
|
If an FDW supports writable foreign tables, it should provide
|
|
some or all of the following callback functions depending on
|
|
the needs and capabilities of the FDW:
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
AddForeignUpdateTargets (Query *parsetree,
|
|
RangeTblEntry *target_rte,
|
|
Relation target_relation);
|
|
</programlisting>
|
|
|
|
<command>UPDATE</> and <command>DELETE</> operations are performed
|
|
against rows previously fetched by the table-scanning functions. The
|
|
FDW may need extra information, such as a row ID or the values of
|
|
primary-key columns, to ensure that it can identify the exact row to
|
|
update or delete. To support that, this function can add extra hidden,
|
|
or <quote>junk</>, target columns to the list of columns that are to be
|
|
retrieved from the foreign table during an <command>UPDATE</> or
|
|
<command>DELETE</>.
|
|
</para>
|
|
|
|
<para>
|
|
To do that, add <structname>TargetEntry</> items to
|
|
<literal>parsetree->targetList</>, containing expressions for the
|
|
extra values to be fetched. Each such entry must be marked
|
|
<structfield>resjunk</> = <literal>true</>, and must have a distinct
|
|
<structfield>resname</> that will identify it at execution time.
|
|
Avoid using names matching <literal>ctid<replaceable>N</></literal> or
|
|
<literal>wholerow<replaceable>N</></literal>, as the core system can
|
|
generate junk columns of these names.
|
|
</para>
|
|
|
|
<para>
|
|
This function is called in the rewriter, not the planner, so the
|
|
information available is a bit different from that available to the
|
|
planning routines.
|
|
<literal>parsetree</> is the parse tree for the <command>UPDATE</> or
|
|
<command>DELETE</> command, while <literal>target_rte</> and
|
|
<literal>target_relation</> describe the target foreign table.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>AddForeignUpdateTargets</> pointer is set to
|
|
<literal>NULL</>, no extra target expressions are added.
|
|
(This will make it impossible to implement <command>DELETE</>
|
|
operations, though <command>UPDATE</> may still be feasible if the FDW
|
|
relies on an unchanging primary key to identify rows.)
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
List *
|
|
PlanForeignModify (PlannerInfo *root,
|
|
ModifyTable *plan,
|
|
Index resultRelation,
|
|
int subplan_index);
|
|
</programlisting>
|
|
|
|
Perform any additional planning actions needed for an insert, update, or
|
|
delete on a foreign table. This function generates the FDW-private
|
|
information that will be attached to the <structname>ModifyTable</> plan
|
|
node that performs the update action. This private information must
|
|
have the form of a <literal>List</>, and will be delivered to
|
|
<function>BeginForeignModify</> during the execution stage.
|
|
</para>
|
|
|
|
<para>
|
|
<literal>root</> is the planner's global information about the query.
|
|
<literal>plan</> is the <structname>ModifyTable</> plan node, which is
|
|
complete except for the <structfield>fdwPrivLists</> field.
|
|
<literal>resultRelation</> identifies the target foreign table by its
|
|
rangetable index. <literal>subplan_index</> identifies which target of
|
|
the <structname>ModifyTable</> plan node this is, counting from zero;
|
|
use this if you want to index into <literal>plan->plans</> or other
|
|
substructure of the <literal>plan</> node.
|
|
</para>
|
|
|
|
<para>
|
|
See <xref linkend="fdw-planning"> for additional information.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>PlanForeignModify</> pointer is set to
|
|
<literal>NULL</>, no additional plan-time actions are taken, and the
|
|
<literal>fdw_private</> list delivered to
|
|
<function>BeginForeignModify</> will be NIL.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
BeginForeignModify (ModifyTableState *mtstate,
|
|
ResultRelInfo *rinfo,
|
|
List *fdw_private,
|
|
int subplan_index,
|
|
int eflags);
|
|
</programlisting>
|
|
|
|
Begin executing a foreign table modification operation. This routine is
|
|
called during executor startup. It should perform any initialization
|
|
needed prior to the actual table modifications. Subsequently,
|
|
<function>ExecForeignInsert</>, <function>ExecForeignUpdate</> or
|
|
<function>ExecForeignDelete</> will be called for each tuple to be
|
|
inserted, updated, or deleted.
|
|
</para>
|
|
|
|
<para>
|
|
<literal>mtstate</> is the overall state of the
|
|
<structname>ModifyTable</> plan node being executed; global data about
|
|
the plan and execution state is available via this structure.
|
|
<literal>rinfo</> is the <structname>ResultRelInfo</> struct describing
|
|
the target foreign table. (The <structfield>ri_FdwState</> field of
|
|
<structname>ResultRelInfo</> is available for the FDW to store any
|
|
private state it needs for this operation.)
|
|
<literal>fdw_private</> contains the private data generated by
|
|
<function>PlanForeignModify</>, if any.
|
|
<literal>subplan_index</> identifies which target of
|
|
the <structname>ModifyTable</> plan node this is.
|
|
<literal>eflags</> contains flag bits describing the executor's
|
|
operating mode for this plan node.
|
|
</para>
|
|
|
|
<para>
|
|
Note that when <literal>(eflags & EXEC_FLAG_EXPLAIN_ONLY)</> is
|
|
true, this function should not perform any externally-visible actions;
|
|
it should only do the minimum required to make the node state valid
|
|
for <function>ExplainForeignModify</> and <function>EndForeignModify</>.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>BeginForeignModify</> pointer is set to
|
|
<literal>NULL</>, no action is taken during executor startup.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
TupleTableSlot *
|
|
ExecForeignInsert (EState *estate,
|
|
ResultRelInfo *rinfo,
|
|
TupleTableSlot *slot,
|
|
TupleTableSlot *planSlot);
|
|
</programlisting>
|
|
|
|
Insert one tuple into the foreign table.
|
|
<literal>estate</> is global execution state for the query.
|
|
<literal>rinfo</> is the <structname>ResultRelInfo</> struct describing
|
|
the target foreign table.
|
|
<literal>slot</> contains the tuple to be inserted; it will match the
|
|
rowtype definition of the foreign table.
|
|
<literal>planSlot</> contains the tuple that was generated by the
|
|
<structname>ModifyTable</> plan node's subplan; it differs from
|
|
<literal>slot</> in possibly containing additional <quote>junk</>
|
|
columns. (The <literal>planSlot</> is typically of little interest
|
|
for <command>INSERT</> cases, but is provided for completeness.)
|
|
</para>
|
|
|
|
<para>
|
|
The return value is either a slot containing the data that was actually
|
|
inserted (this might differ from the data supplied, for example as a
|
|
result of trigger actions), or NULL if no row was actually inserted
|
|
(again, typically as a result of triggers). The passed-in
|
|
<literal>slot</> can be re-used for this purpose.
|
|
</para>
|
|
|
|
<para>
|
|
The data in the returned slot is used only if the <command>INSERT</>
|
|
query has a <literal>RETURNING</> clause. Hence, the FDW could choose
|
|
to optimize away returning some or all columns depending on the contents
|
|
of the <literal>RETURNING</> clause. However, some slot must be
|
|
returned to indicate success, or the query's reported row count will be
|
|
wrong.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>ExecForeignInsert</> pointer is set to
|
|
<literal>NULL</>, attempts to insert into the foreign table will fail
|
|
with an error message.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
TupleTableSlot *
|
|
ExecForeignUpdate (EState *estate,
|
|
ResultRelInfo *rinfo,
|
|
TupleTableSlot *slot,
|
|
TupleTableSlot *planSlot);
|
|
</programlisting>
|
|
|
|
Update one tuple in the foreign table.
|
|
<literal>estate</> is global execution state for the query.
|
|
<literal>rinfo</> is the <structname>ResultRelInfo</> struct describing
|
|
the target foreign table.
|
|
<literal>slot</> contains the new data for the tuple; it will match the
|
|
rowtype definition of the foreign table.
|
|
<literal>planSlot</> contains the tuple that was generated by the
|
|
<structname>ModifyTable</> plan node's subplan; it differs from
|
|
<literal>slot</> in possibly containing additional <quote>junk</>
|
|
columns. In particular, any junk columns that were requested by
|
|
<function>AddForeignUpdateTargets</> will be available from this slot.
|
|
</para>
|
|
|
|
<para>
|
|
The return value is either a slot containing the row as it was actually
|
|
updated (this might differ from the data supplied, for example as a
|
|
result of trigger actions), or NULL if no row was actually updated
|
|
(again, typically as a result of triggers). The passed-in
|
|
<literal>slot</> can be re-used for this purpose.
|
|
</para>
|
|
|
|
<para>
|
|
The data in the returned slot is used only if the <command>UPDATE</>
|
|
query has a <literal>RETURNING</> clause. Hence, the FDW could choose
|
|
to optimize away returning some or all columns depending on the contents
|
|
of the <literal>RETURNING</> clause. However, some slot must be
|
|
returned to indicate success, or the query's reported row count will be
|
|
wrong.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>ExecForeignUpdate</> pointer is set to
|
|
<literal>NULL</>, attempts to update the foreign table will fail
|
|
with an error message.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
TupleTableSlot *
|
|
ExecForeignDelete (EState *estate,
|
|
ResultRelInfo *rinfo,
|
|
TupleTableSlot *slot,
|
|
TupleTableSlot *planSlot);
|
|
</programlisting>
|
|
|
|
Delete one tuple from the foreign table.
|
|
<literal>estate</> is global execution state for the query.
|
|
<literal>rinfo</> is the <structname>ResultRelInfo</> struct describing
|
|
the target foreign table.
|
|
<literal>slot</> contains nothing useful upon call, but can be used to
|
|
hold the returned tuple.
|
|
<literal>planSlot</> contains the tuple that was generated by the
|
|
<structname>ModifyTable</> plan node's subplan; in particular, it will
|
|
carry any junk columns that were requested by
|
|
<function>AddForeignUpdateTargets</>. The junk column(s) must be used
|
|
to identify the tuple to be deleted.
|
|
</para>
|
|
|
|
<para>
|
|
The return value is either a slot containing the row that was deleted,
|
|
or NULL if no row was deleted (typically as a result of triggers). The
|
|
passed-in <literal>slot</> can be used to hold the tuple to be returned.
|
|
</para>
|
|
|
|
<para>
|
|
The data in the returned slot is used only if the <command>DELETE</>
|
|
query has a <literal>RETURNING</> clause. Hence, the FDW could choose
|
|
to optimize away returning some or all columns depending on the contents
|
|
of the <literal>RETURNING</> clause. However, some slot must be
|
|
returned to indicate success, or the query's reported row count will be
|
|
wrong.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>ExecForeignDelete</> pointer is set to
|
|
<literal>NULL</>, attempts to delete from the foreign table will fail
|
|
with an error message.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
EndForeignModify (EState *estate,
|
|
ResultRelInfo *rinfo);
|
|
</programlisting>
|
|
|
|
End the table update and release resources. It is normally not important
|
|
to release palloc'd memory, but for example open files and connections
|
|
to remote servers should be cleaned up.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>EndForeignModify</> pointer is set to
|
|
<literal>NULL</>, no action is taken during executor shutdown.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
int
|
|
IsForeignRelUpdatable (Relation rel);
|
|
</programlisting>
|
|
|
|
Report which update operations the specified foreign table supports.
|
|
The return value should be a bitmask of rule event numbers indicating
|
|
which operations are supported by the foreign table, using the
|
|
<literal>CmdType</> enumeration; that is,
|
|
<literal>(1 << CMD_UPDATE) = 4</> for <command>UPDATE</>,
|
|
<literal>(1 << CMD_INSERT) = 8</> for <command>INSERT</>, and
|
|
<literal>(1 << CMD_DELETE) = 16</> for <command>DELETE</>.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>IsForeignRelUpdatable</> pointer is set to
|
|
<literal>NULL</>, foreign tables are assumed to be insertable, updatable,
|
|
or deletable if the FDW provides <function>ExecForeignInsert</>,
|
|
<function>ExecForeignUpdate</>, or <function>ExecForeignDelete</>
|
|
respectively. This function is only needed if the FDW supports some
|
|
tables that are updatable and some that are not. (Even then, it's
|
|
permissible to throw an error in the execution routine instead of
|
|
checking in this function. However, this function is used to determine
|
|
updatability for display in the <literal>information_schema</> views.)
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="fdw-callbacks-explain">
|
|
<title>FDW Routines for <command>EXPLAIN</></title>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
ExplainForeignScan (ForeignScanState *node,
|
|
ExplainState *es);
|
|
</programlisting>
|
|
|
|
Print additional <command>EXPLAIN</> output for a foreign table scan.
|
|
This function can call <function>ExplainPropertyText</> and
|
|
related functions to add fields to the <command>EXPLAIN</> output.
|
|
The flag fields in <literal>es</> can be used to determine what to
|
|
print, and the state of the <structname>ForeignScanState</> node
|
|
can be inspected to provide run-time statistics in the <command>EXPLAIN
|
|
ANALYZE</> case.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>ExplainForeignScan</> pointer is set to
|
|
<literal>NULL</>, no additional information is printed during
|
|
<command>EXPLAIN</>.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
void
|
|
ExplainForeignModify (ModifyTableState *mtstate,
|
|
ResultRelInfo *rinfo,
|
|
List *fdw_private,
|
|
int subplan_index,
|
|
struct ExplainState *es);
|
|
</programlisting>
|
|
|
|
Print additional <command>EXPLAIN</> output for a foreign table update.
|
|
This function can call <function>ExplainPropertyText</> and
|
|
related functions to add fields to the <command>EXPLAIN</> output.
|
|
The flag fields in <literal>es</> can be used to determine what to
|
|
print, and the state of the <structname>ModifyTableState</> node
|
|
can be inspected to provide run-time statistics in the <command>EXPLAIN
|
|
ANALYZE</> case. The first four arguments are the same as for
|
|
<function>BeginForeignModify</>.
|
|
</para>
|
|
|
|
<para>
|
|
If the <function>ExplainForeignModify</> pointer is set to
|
|
<literal>NULL</>, no additional information is printed during
|
|
<command>EXPLAIN</>.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="fdw-callbacks-analyze">
|
|
<title>FDW Routines for <command>ANALYZE</></title>
|
|
|
|
<para>
|
|
<programlisting>
|
|
bool
|
|
AnalyzeForeignTable (Relation relation,
|
|
AcquireSampleRowsFunc *func,
|
|
BlockNumber *totalpages);
|
|
</programlisting>
|
|
|
|
This function is called when <xref linkend="sql-analyze"> is executed on
|
|
a foreign table. If the FDW can collect statistics for this
|
|
foreign table, it should return <literal>true</>, and provide a pointer
|
|
to a function that will collect sample rows from the table in
|
|
<parameter>func</>, plus the estimated size of the table in pages in
|
|
<parameter>totalpages</>. Otherwise, return <literal>false</>.
|
|
</para>
|
|
|
|
<para>
|
|
If the FDW does not support collecting statistics for any tables, the
|
|
<function>AnalyzeForeignTable</> pointer can be set to <literal>NULL</>.
|
|
</para>
|
|
|
|
<para>
|
|
If provided, the sample collection function must have the signature
|
|
<programlisting>
|
|
int
|
|
AcquireSampleRowsFunc (Relation relation, int elevel,
|
|
HeapTuple *rows, int targrows,
|
|
double *totalrows,
|
|
double *totaldeadrows);
|
|
</programlisting>
|
|
|
|
A random sample of up to <parameter>targrows</> rows should be collected
|
|
from the table and stored into the caller-provided <parameter>rows</>
|
|
array. The actual number of rows collected must be returned. In
|
|
addition, store estimates of the total numbers of live and dead rows in
|
|
the table into the output parameters <parameter>totalrows</> and
|
|
<parameter>totaldeadrows</>. (Set <parameter>totaldeadrows</> to zero
|
|
if the FDW does not have any concept of dead rows.)
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="fdw-helpers">
|
|
<title>Foreign Data Wrapper Helper Functions</title>
|
|
|
|
<para>
|
|
Several helper functions are exported from the core server so that
|
|
authors of foreign data wrappers can get easy access to attributes of
|
|
FDW-related objects, such as FDW options.
|
|
To use any of these functions, you need to include the header file
|
|
<filename>foreign/foreign.h</filename> in your source file.
|
|
That header also defines the struct types that are returned by
|
|
these functions.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignDataWrapper *
|
|
GetForeignDataWrapper(Oid fdwid);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>ForeignDataWrapper</structname>
|
|
object for the foreign-data wrapper with the given OID. A
|
|
<structname>ForeignDataWrapper</structname> object contains properties
|
|
of the FDW (see <filename>foreign/foreign.h</filename> for details).
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignServer *
|
|
GetForeignServer(Oid serverid);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>ForeignServer</structname> object
|
|
for the foreign server with the given OID. A
|
|
<structname>ForeignServer</structname> object contains properties
|
|
of the server (see <filename>foreign/foreign.h</filename> for details).
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
UserMapping *
|
|
GetUserMapping(Oid userid, Oid serverid);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>UserMapping</structname> object for
|
|
the user mapping of the given role on the given server. (If there is no
|
|
mapping for the specific user, it will return the mapping for
|
|
<literal>PUBLIC</>, or throw error if there is none.) A
|
|
<structname>UserMapping</structname> object contains properties of the
|
|
user mapping (see <filename>foreign/foreign.h</filename> for details).
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignTable *
|
|
GetForeignTable(Oid relid);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>ForeignTable</structname> object for
|
|
the foreign table with the given OID. A
|
|
<structname>ForeignTable</structname> object contains properties of the
|
|
foreign table (see <filename>foreign/foreign.h</filename> for details).
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
List *
|
|
GetForeignColumnOptions(Oid relid, AttrNumber attnum);
|
|
</programlisting>
|
|
|
|
This function returns the per-column FDW options for the column with the
|
|
given foreign table OID and attribute number, in the form of a list of
|
|
<structname>DefElem</structname>. NIL is returned if the column has no
|
|
options.
|
|
</para>
|
|
|
|
<para>
|
|
Some object types have name-based lookup functions in addition to the
|
|
OID-based ones:
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignDataWrapper *
|
|
GetForeignDataWrapperByName(const char *name, bool missing_ok);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>ForeignDataWrapper</structname>
|
|
object for the foreign-data wrapper with the given name. If the wrapper
|
|
is not found, return NULL if missing_ok is true, otherwise raise an
|
|
error.
|
|
</para>
|
|
|
|
<para>
|
|
<programlisting>
|
|
ForeignServer *
|
|
GetForeignServerByName(const char *name, bool missing_ok);
|
|
</programlisting>
|
|
|
|
This function returns a <structname>ForeignServer</structname> object
|
|
for the foreign server with the given name. If the server is not found,
|
|
return NULL if missing_ok is true, otherwise raise an error.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="fdw-planning">
|
|
<title>Foreign Data Wrapper Query Planning</title>
|
|
|
|
<para>
|
|
The FDW callback functions <function>GetForeignRelSize</>,
|
|
<function>GetForeignPaths</>, <function>GetForeignPlan</>, and
|
|
<function>PlanForeignModify</> must fit into the workings of the
|
|
<productname>PostgreSQL</> planner. Here are some notes about what
|
|
they must do.
|
|
</para>
|
|
|
|
<para>
|
|
The information in <literal>root</> and <literal>baserel</> can be used
|
|
to reduce the amount of information that has to be fetched from the
|
|
foreign table (and therefore reduce the cost).
|
|
<literal>baserel->baserestrictinfo</> is particularly interesting, as
|
|
it contains restriction quals (<literal>WHERE</> clauses) that should be
|
|
used to filter the rows to be fetched. (The FDW itself is not required
|
|
to enforce these quals, as the core executor can check them instead.)
|
|
<literal>baserel->reltargetlist</> can be used to determine which
|
|
columns need to be fetched; but note that it only lists columns that
|
|
have to be emitted by the <structname>ForeignScan</> plan node, not
|
|
columns that are used in qual evaluation but not output by the query.
|
|
</para>
|
|
|
|
<para>
|
|
Various private fields are available for the FDW planning functions to
|
|
keep information in. Generally, whatever you store in FDW private fields
|
|
should be palloc'd, so that it will be reclaimed at the end of planning.
|
|
</para>
|
|
|
|
<para>
|
|
<literal>baserel->fdw_private</> is a <type>void</> pointer that is
|
|
available for FDW planning functions to store information relevant to
|
|
the particular foreign table. The core planner does not touch it except
|
|
to initialize it to NULL when the <literal>baserel</> node is created.
|
|
It is useful for passing information forward from
|
|
<function>GetForeignRelSize</> to <function>GetForeignPaths</> and/or
|
|
<function>GetForeignPaths</> to <function>GetForeignPlan</>, thereby
|
|
avoiding recalculation.
|
|
</para>
|
|
|
|
<para>
|
|
<function>GetForeignPaths</> can identify the meaning of different
|
|
access paths by storing private information in the
|
|
<structfield>fdw_private</> field of <structname>ForeignPath</> nodes.
|
|
<structfield>fdw_private</> is declared as a <type>List</> pointer, but
|
|
could actually contain anything since the core planner does not touch
|
|
it. However, best practice is to use a representation that's dumpable
|
|
by <function>nodeToString</>, for use with debugging support available
|
|
in the backend.
|
|
</para>
|
|
|
|
<para>
|
|
<function>GetForeignPlan</> can examine the <structfield>fdw_private</>
|
|
field of the selected <structname>ForeignPath</> node, and can generate
|
|
<structfield>fdw_exprs</> and <structfield>fdw_private</> lists to be
|
|
placed in the <structname>ForeignScan</> plan node, where they will be
|
|
available at execution time. Both of these lists must be
|
|
represented in a form that <function>copyObject</> knows how to copy.
|
|
The <structfield>fdw_private</> list has no other restrictions and is
|
|
not interpreted by the core backend in any way. The
|
|
<structfield>fdw_exprs</> list, if not NIL, is expected to contain
|
|
expression trees that are intended to be executed at run time. These
|
|
trees will undergo post-processing by the planner to make them fully
|
|
executable.
|
|
</para>
|
|
|
|
<para>
|
|
In <function>GetForeignPlan</>, generally the passed-in target list can
|
|
be copied into the plan node as-is. The passed scan_clauses list
|
|
contains the same clauses as <literal>baserel->baserestrictinfo</>,
|
|
but may be re-ordered for better execution efficiency. In simple cases
|
|
the FDW can just strip <structname>RestrictInfo</> nodes from the
|
|
scan_clauses list (using <function>extract_actual_clauses</>) and put
|
|
all the clauses into the plan node's qual list, which means that all the
|
|
clauses will be checked by the executor at run time. More complex FDWs
|
|
may be able to check some of the clauses internally, in which case those
|
|
clauses can be removed from the plan node's qual list so that the
|
|
executor doesn't waste time rechecking them.
|
|
</para>
|
|
|
|
<para>
|
|
As an example, the FDW might identify some restriction clauses of the
|
|
form <replaceable>foreign_variable</> <literal>=</>
|
|
<replaceable>sub_expression</>, which it determines can be executed on
|
|
the remote server given the locally-evaluated value of the
|
|
<replaceable>sub_expression</>. The actual identification of such a
|
|
clause should happen during <function>GetForeignPaths</>, since it would
|
|
affect the cost estimate for the path. The path's
|
|
<structfield>fdw_private</> field would probably include a pointer to
|
|
the identified clause's <structname>RestrictInfo</> node. Then
|
|
<function>GetForeignPlan</> would remove that clause from scan_clauses,
|
|
but add the <replaceable>sub_expression</> to <structfield>fdw_exprs</>
|
|
to ensure that it gets massaged into executable form. It would probably
|
|
also put control information into the plan node's
|
|
<structfield>fdw_private</> field to tell the execution functions what
|
|
to do at run time. The query transmitted to the remote server would
|
|
involve something like <literal>WHERE <replaceable>foreign_variable</> =
|
|
$1</literal>, with the parameter value obtained at run time from
|
|
evaluation of the <structfield>fdw_exprs</> expression tree.
|
|
</para>
|
|
|
|
<para>
|
|
The FDW should always construct at least one path that depends only on
|
|
the table's restriction clauses. In join queries, it might also choose
|
|
to construct path(s) that depend on join clauses, for example
|
|
<replaceable>foreign_variable</> <literal>=</>
|
|
<replaceable>local_variable</>. Such clauses will not be found in
|
|
<literal>baserel->baserestrictinfo</> but must be sought in the
|
|
relation's join lists. A path using such a clause is called a
|
|
<quote>parameterized path</>. It must identify the other relations
|
|
used in the selected join clause(s) with a suitable value of
|
|
<literal>param_info</>; use <function>get_baserel_parampathinfo</>
|
|
to compute that value. In <function>GetForeignPlan</>, the
|
|
<replaceable>local_variable</> portion of the join clause would be added
|
|
to <structfield>fdw_exprs</>, and then at run time the case works the
|
|
same as for an ordinary restriction clause.
|
|
</para>
|
|
|
|
<para>
|
|
When planning an <command>UPDATE</> or <command>DELETE</>,
|
|
<function>PlanForeignModify</> can look up the <structname>RelOptInfo</>
|
|
struct for the foreign table and make use of the
|
|
<literal>baserel->fdw_private</> data previously created by the
|
|
scan-planning functions. However, in <command>INSERT</> the target
|
|
table is not scanned so there is no <structname>RelOptInfo</> for it.
|
|
The <structname>List</> returned by <function>PlanForeignModify</> has
|
|
the same restrictions as the <structfield>fdw_private</> list of a
|
|
<structname>ForeignScan</> plan node, that is it must contain only
|
|
structures that <function>copyObject</> knows how to copy.
|
|
</para>
|
|
|
|
<para>
|
|
For an <command>UPDATE</> or <command>DELETE</> against an external data
|
|
source that supports concurrent updates, it is recommended that the
|
|
<literal>ForeignScan</> operation lock the rows that it fetches, perhaps
|
|
via the equivalent of <command>SELECT FOR UPDATE</>. The FDW may also
|
|
choose to lock rows at fetch time when the foreign table is referenced
|
|
in a <command>SELECT FOR UPDATE/SHARE</>; if it does not, the
|
|
<literal>FOR UPDATE</> or <literal>FOR SHARE</> option is essentially a
|
|
no-op so far as the foreign table is concerned. This behavior may yield
|
|
semantics slightly different from operations on local tables, where row
|
|
locking is customarily delayed as long as possible: remote rows may get
|
|
locked even though they subsequently fail locally-applied restriction or
|
|
join conditions. However, matching the local semantics exactly would
|
|
require an additional remote access for every row, and might be
|
|
impossible anyway depending on what locking semantics the external data
|
|
source provides.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
</chapter>
|