Commit Graph

213 Commits

Author SHA1 Message Date
Peter Eisentraut 32b507378f postgres_fdw: Remove useless DO block in test
Discussion: https://www.postgresql.org/message-id/flat/b1f9f399-3a1a-b554-283f-4ae7f34608e2@enterprisedb.com
2022-09-16 15:57:34 +02:00
Peter Eisentraut 5ac51c8c9e Adjust assorted hint messages that list all valid options.
Instead of listing all valid options, we now try to provide one
that looks similar.  Since this may be useful elsewhere, this
change introduces a new set of functions that can be reused for
similar purposes.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b1f9f399-3a1a-b554-283f-4ae7f34608e2@enterprisedb.com
2022-09-16 14:53:12 +02:00
Etsuro Fujita 9320cfdd06 postgres_fdw: Avoid 'variable not found in subplan target list' error.
The tlist of the EvalPlanQual outer plan for a ForeignScan node is
adjusted to produce a tuple whose descriptor matches the scan tuple slot
for the ForeignScan node.  But in the case where the outer plan contains
an extra Sort node, if the new tlist contained columns required only for
evaluating PlaceHolderVars or columns required only for evaluating local
conditions, this would cause setrefs.c to fail with the error.

The cause of this is that when creating the outer plan by injecting the
Sort node into an alternative local join plan that could emit such extra
columns as well, we fail to arrange for the outer plan to propagate them
up through the Sort node, causing setrefs.c to fail to match up them in
the new tlist to what is available from the outer plan.  Repair.

Per report from Alexander Pyhalov.

Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane.
Backpatch to all supported versions.

Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
2022-09-14 18:45:00 +09:00
Etsuro Fujita 82593b9a3d postgres_fdw: Disable batch insertion when there are WCO constraints.
When inserting a view referencing a foreign table that has WITH CHECK
OPTION constraints, in single-insert mode postgres_fdw retrieves the
data that was actually inserted on the remote side so that the WITH
CHECK OPTION constraints are enforced with the data locally, but in
batch-insert mode it cannot currently retrieve the data (except for the
row first inserted through the view), resulting in enforcing the WITH
CHECK OPTION constraints with the data passed from the core (except for
the first-inserted row), which led to incorrect results when inserting
into a view referencing a foreign table in which a remote BEFORE ROW
INSERT trigger changes the rows inserted through the view so that they
violate the view's WITH CHECK OPTION constraint.  Also, the query
inserting into the view caused an assertion failure in assert-enabled
builds.

Fix these by disabling batch insertion when inserting into such a view.

Back-patch to v14 where batch insertion was added.

Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
2022-08-05 17:15:00 +09:00
David Rowley 1349d2790b Improve performance of ORDER BY / DISTINCT aggregates
ORDER BY / DISTINCT aggreagtes have, since implemented in Postgres, been
executed by always performing a sort in nodeAgg.c to sort the tuples in
the current group into the correct order before calling the transition
function on the sorted tuples.  This was not great as often there might be
an index that could have provided pre-sorted input and allowed the
transition functions to be called as the rows come in, rather than having
to store them in a tuplestore in order to sort them once all the tuples
for the group have arrived.

Here we change the planner so it requests a path with a sort order which
supports the most amount of ORDER BY / DISTINCT aggregate functions and
add new code to the executor to allow it to support the processing of
ORDER BY / DISTINCT aggregates where the tuples are already sorted in the
correct order.

Since there can be many ORDER BY / DISTINCT aggregates in any given query
level, it's very possible that we can't find an order that suits all of
these aggregates.  The sort order that the planner chooses is simply the
one that suits the most aggregate functions.  We take the most strictly
sorted variation of each order and see how many aggregate functions can
use that, then we try again with the order of the remaining aggregates to
see if another order would suit more aggregate functions.  For example:

SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ...

would request the sort order to be {a, b} because {a} is a subset of the
sort order of {a,b}, but;

SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ...

would just pick a plan ordered by {a} (we give precedence to aggregates
which are earlier in the targetlist).

SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ...

would choose to order by {b} since two aggregates suit that vs just one
that requires input ordered by {a}.

Author: David Rowley
Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane
Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com
2022-08-02 23:11:45 +12:00
Michael Paquier 12c254c99f Tweak detail and hint messages to be consistent with project policy
Detail and hint messages should be full sentences and should end with a
period, but some of the messages newly-introduced in v15 did not follow
that.

Author: Justin Pryzby
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com
Backpatch-through: 15
2022-07-20 09:50:12 +09:00
Tom Lane 31e5b50292 postgres_fdw: be more wary about shippability of reg* constants.
Don't consider a constant of regconfig or other reg* types to be
shippable unless it refers to a built-in object, or an object in
an extension that's been marked shippable.  Without this
restriction, we're too likely to send a constant that will fail
to parse on the remote server.

For the regconfig type only, consider OIDs up to 16383 to be
"built in", rather than the normal cutoff of 9999.  Otherwise
the initdb-created text search configurations will be considered
unshippable, which is unlikely to make anyone happy.

It's possible that this new restriction will de-optimize queries
that were working satisfactorily before.  Users can restore any
lost performance by making sure that objects that can be expected
to exist on the remote side are in shippable extensions.  However,
that's not a change that people are likely to be happy about having
to make after a minor-release update.  Between that consideration
and the lack of field complaints, let's just change this in HEAD.

Noted while fixing bug #17483, although this is not precisely
the problem that that report complained about.

Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
2022-07-17 18:11:22 -04:00
Tom Lane 0a7ccee8fe postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.
The motivation for this is to ensure successful transmission of the
values of constants of regconfig and other reg* types.  The remote
will be reading them with search_path = 'pg_catalog', so schema
qualification is necessary when referencing objects in other schemas.

Per bug #17483 from Emmanuel Quincerot.  Back-patch to all supported
versions.  (There's some other stuff to do here, but it's less
back-patchable.)

Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
2022-07-17 17:27:50 -04:00
Fujii Masao 3b00a944a9 Support TRUNCATE triggers on foreign tables.
Now some foreign data wrappers support TRUNCATE command.
So it's useful to support TRUNCATE triggers on foreign tables for
audit logging or for preventing undesired truncation.

Author: Yugo Nagata
Reviewed-by: Fujii Masao, Ian Lawrence Barwick
Discussion: https://postgr.es/m/20220630193848.5b02e0d6076b86617a915682@sraoss.co.jp
2022-07-12 09:18:02 +09:00
Etsuro Fujita 5c854e7a2c Disable asynchronous execution if using gating Result nodes.
mark_async_capable_plan(), which is called from create_append_plan() to
determine whether subplans are async-capable, failed to take into
account that the given subplan created from a given subpath might
include a gating Result node if the subpath is a SubqueryScanPath or
ForeignPath, causing a segmentation fault there when the subplan created
from a SubqueryScanPath includes the Result node, or causing
ExecAsyncRequest() to throw an error about an unrecognized node type
when the subplan created from a ForeignPath includes the Result node,
because in the latter case the Result node was unintentionally
considered as async-capable, but we don't currently support executing
Result nodes asynchronously.  Fix by modifying mark_async_capable_plan()
to disable asynchronous execution in such cases.  Also, adjust code in
the ProjectionPath case in mark_async_capable_plan(), for consistency
with other cases, and adjust/improve comments there.

is_async_capable_path() added in commit 27e1f1456, which was rewritten
to mark_async_capable_plan() in a later commit, has the same issue,
causing the error at execution mentioned above, so back-patch to v14
where the aforesaid commit went in.

Per report from Justin Pryzby.

Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby.

Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
2022-04-28 15:15:00 +09:00
Etsuro Fujita 4eea2202be postgres_fdw: Disable batch insert when BEFORE ROW INSERT triggers exist.
Previously, we allowed this, but such triggers might query the table to
insert into and act differently if the tuples that have already been
processed and prepared for insertion are not there, so disable it in
such cases.

Back-patch to v14 where batch insert was added.

Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
2022-04-21 15:30:00 +09:00
Etsuro Fujita c2bb02bc2e Allow asynchronous execution in more cases.
In commit 27e1f1456, create_append_plan() only allowed the subplan
created from a given subpath to be executed asynchronously when it was
an async-capable ForeignPath.  To extend coverage, this patch handles
cases when the given subpath includes some other Path types as well that
can be omitted in the plan processing, such as a ProjectionPath directly
atop an async-capable ForeignPath, allowing asynchronous execution in
partitioned-scan/partitioned-join queries with non-Var tlist expressions
and more UNION queries.

Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and
Zhihong Yu.

Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru
2022-04-06 15:45:00 +09:00
Tom Lane f3dd9fe1dd Fix postgres_fdw to check shippability of sort clauses properly.
postgres_fdw would push ORDER BY clauses to the remote side without
verifying that the sort operator is safe to ship.  Moreover, it failed
to print a suitable USING clause if the sort operator isn't default
for the sort expression's type.  The net result of this is that the
remote sort might not have anywhere near the semantics we expect,
which'd be disastrous for locally-performed merge joins in particular.

We addressed similar issues in the context of ORDER BY within an
aggregate function call in commit 7012b132d, but failed to notice
that query-level ORDER BY was broken.  Thus, much of the necessary
logic already existed, but it requires refactoring to be usable
in both cases.

Back-patch to all supported branches.  In HEAD only, remove the
core code's copy of find_em_expr_for_rel, which is no longer used
and really should never have been pushed into equivclass.c in the
first place.

Ronan Dunklau, per report from David Rowley;
reviews by David Rowley, Ranier Vilela, and myself

Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
2022-03-31 14:29:48 -04:00
Tomas Vondra db0d67db24 Optimize order of GROUP BY keys
When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may be heavily dependent on the order in which the keys are
compared when building the groups. Grouping does not imply any ordering,
so we're allowed to compare the keys in arbitrary order, and a Hash Agg
leverages this. But for Group Agg, we simply compared keys in the order
as specified in the query. This commit explores alternative ordering of
the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of
the keys, and leave the rest to the optimizer. But that might get very
expensive, so we try to pick only a couple interesting orderings based
on both local and global information.

When planning the grouping path, we explore statistics (number of
distinct values, cost of the comparison function) for the keys and
reorder them to minimize comparison costs. Intuitively, it may be better
to perform more expensive comparisons (for complex data types etc.)
last, because maybe the cheaper comparisons will be enough. Similarly,
the higher the cardinality of a key, the lower the probability we’ll
need to compare more keys. The patch generates and costs various
orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query,
some of which may not be known while planning the grouping. E.g. there
may be an explicit ORDER BY clause, or some other ordering-dependent
operation, higher up in the query, and using the same ordering may allow
using either incremental sort or even eliminate the sort entirely.

The patch generates orderings and picks those minimizing the comparison
cost (for various pathkeys), and then adds orderings that might be
useful for operations higher up in the plan (ORDER BY, etc.). Finally,
it always keeps the ordering specified in the query, on the assumption
the user might have additional insights.

This introduces a new GUC enable_group_by_reordering, so that the
optimization may be disabled if needed.

The original patch was proposed by Teodor Sigaev, and later improved and
reworked by Dmitry Dolgov. Reviews by a number of people, including me,
Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.

Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
2022-03-31 01:13:33 +02:00
Etsuro Fujita 04e706d423 postgres_fdw: Add support for parallel commit.
postgres_fdw commits remote (sub)transactions opened on remote server(s)
in a local (sub)transaction one by one when the local (sub)transaction
commits.  This patch allows it to commit the remote (sub)transactions in
parallel to improve performance.  This is enabled by the server option
"parallel_commit".  The default is false.

Etsuro Fujita, reviewed by Fujii Masao and David Zhang.

Discussion: http://postgr.es/m/CAPmGK17dAZCXvwnfpr1eTfknTGdt%3DhYTV9405Gt5SqPOX8K84w%40mail.gmail.com
2022-02-24 14:30:00 +09:00
Fujii Masao 94c49d5340 postgres_fdw: Make postgres_fdw.application_name support more escape sequences.
Commit 6e0cb3dec1 allowed postgres_fdw.application_name to include
escape sequences %a (application name), %d (database name), %u (user name)
and %p (pid). In addition to them, this commit makes it support
the escape sequences for session ID (%c) and cluster name (%C).
These are helpful to investigate where each remote transactions came from.

Author: Fujii Masao
Reviewed-by: Ryohei Takahashi, Kyotaro Horiguchi
Discussion: https://postgr.es/m/1041dc9a-c976-049f-9f14-e7d94c29c4b2@oss.nttdata.com
2022-02-18 11:38:12 +09:00
Etsuro Fujita 9e283fc85d postgres_fdw: Fix handling of a pending asynchronous request in postgresReScanForeignScan().
Commit 27e1f1456 failed to process a pending asynchronous request made
for a given ForeignScan node in postgresReScanForeignScan() (if any) in
cases where we would only reset the next_tuple counter in that function,
contradicting the assumption that there should be no pending
asynchronous requests that have been made for async-capable subplans for
the parent Append node after ReScan.  This led to an assert failure in
an assert-enabled build.  I think this would also lead to mis-rewinding
the cursor in that function in the case where we have already fetched
one batch for the ForeignScan node and the asynchronous request has been
made for the second batch, because even in that case we would just reset
the counter when called from that function, so we would fail to execute
MOVE BACKWARD ALL.

To fix, modify that function to process the asynchronous request before
restarting the scan.

While at it, add a comment to a function to match other places.

Per bug #17344 from Alexander Lakhin.  Back-patch to v14 where the
aforesaid commit came in.

Patch by me.  Test case by Alexander Lakhin, adjusted by me.  Reviewed
and tested by Alexander Lakhin and Dmitry Dolgov.

Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
2022-01-27 16:15:00 +09:00
Fujii Masao 353aa01687 postgres_fdw: Add regression test for postgres_fdw.application_name GUC.
Commit 449ab63505 added postgres_fdw.application_name GUC that specifies
a value for application_name configuration parameter used when postgres_fdw
establishes a connection to a foreign server. Also commit 6e0cb3dec1
allowed it to include escape sequences. Both commits added the regression
tests for the GUC, but those tests were reverted by commits 98dbef90eb and
5e64ad3697 because they were unstable and caused some buildfarm members
to report the failure.

This is the third try to add the regression test for
postgres_fdw.application_name GUC.

One of issues to make the test unstable was to have used
postgres_fdw_disconnect_all() to close the existing remote connections.
The test expected that the remote connection and its corresponding backend
at the remote server disappeared just after postgres_fdw_disconnect_all()
was executed, but it could take a bit time for them to disappear.
To make sure that they exit, this commit makes the test use
pg_terminate_backend() with the timeout at the remote server, instead.
If the timeout is set to greater than zero, this function waits until
they are actually terminated (or until the given time has passed).

Another issue was that the test didn't take into consideration the case
where postgres_fdw.application_name containing some escape sequences was
converted to the string larger than NAMEDATALEN. In this case it was
truncated to less than NAMEDATALEN when it's passed to the remote server,
but the test expected wrongly that full string of application_name was
always viewable. This commit changes the test so that it can handle that case.

Author: Fujii Masao
Reviewed-by: Masahiko Sawada, Hayato Kuroda, Kyotaro Horiguchi
Discussion: https://postgr.es/m/3220909.1631054766@sss.pgh.pa.us
Discussion: https://postgr.es/m/20211224.180006.2247635208768233073.horikyota.ntt@gmail.com
Discussion: https://postgr.es/m/e7b61420-a97b-8246-77c4-a0d48fba5a45@oss.nttdata.com
2022-01-07 15:31:56 +09:00
Fujii Masao 5e64ad3697 postgres_fdw: Revert unstable tests for postgres_fdw.application_name.
Commit 6e0cb3dec1 added the tests that check that escape sequences in
postgres_fdw.application_name setting are replaced with status information
expectedly. But they were unstable and caused some buildfarm
members to report the failure. This commit reverts those unstable tests.
2021-12-24 17:39:59 +09:00
Fujii Masao 6e0cb3dec1 postgres_fdw: Allow postgres_fdw.application_name to include escape sequences.
application_name that used when postgres_fdw establishes a connection to
a foreign server can be specified in either or both a connection parameter
of a server object and GUC postgres_fdw.application_name. This commit
allows those parameters to include escape sequences that begins with
% character. Then postgres_fdw replaces those escape sequences with
status information. For example, %d and %u are replaced with user name
and database name in local server, respectively. This feature enables us
to add information more easily to track remote transactions or queries,
into application_name of a remote connection.

Author: Hayato Kuroda
Reviewed-by: Kyotaro Horiguchi, Masahiro Ikeda, Hou Zhijie, Fujii Masao
Discussion: https://postgr.es/m/TYAPR01MB5866FAE71C66547C64616584F5EB9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/TYCPR01MB5870D1E8B949DAF6D3B84E02F5F29@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-12-24 16:55:11 +09:00
Michael Paquier 2e577c9446 Remove assertion for ALTER TABLE .. DETACH PARTITION CONCURRENTLY
One code path related to this flavor of ALTER TABLE was checking that
the relation to detach has to be a normal table or a partitioned table,
which would fail if using the command with a different relation kind.

Views, sequences and materialized views cannot be part of a partition
tree, so these would cause the command to fail anyway, but the assertion
was triggered.  Foreign tables can be part of a partition tree, and
again the assertion would have failed.  The simplest solution is just to
remove this assertion, so as we get the same failure as the
non-concurrent code path.

While on it, add a regression test in postgres_fdw for the concurrent
partition detach of a foreign table, as per a suggestion from Alexander
Lakhin.

Issue introduced in 71f4c8c.

Reported-by: Alexander Lakhin
Author: Michael Paquier, Alexander Lakhin
Reviewed-by: Peter Eisentraut, Kyotaro Horiguchi
Discussion: https://postgr.es/m/17339-a9e09aaf38a3457a@postgresql.org
Backpatch-through: 14
2021-12-22 15:38:00 +09:00
David Rowley e502150f7d Allow Memoize to operate in binary comparison mode
Memoize would always use the hash equality operator for the cache key
types to determine if the current set of parameters were the same as some
previously cached set.  Certain types such as floating points where -0.0
and +0.0 differ in their binary representation but are classed as equal by
the hash equality operator may cause problems as unless the join uses the
same operator it's possible that whichever join operator is being used
would be able to distinguish the two values.  In which case we may
accidentally return in the incorrect rows out of the cache.

To fix this here we add a binary mode to Memoize to allow it to the
current set of parameters to previously cached values by comparing
bit-by-bit rather than logically using the hash equality operator.  This
binary mode is always used for LATERAL joins and it's used for normal
joins when any of the join operators are not hashable.

Reported-by: Tom Lane
Author: David Rowley
Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us
Backpatch-through: 14, where Memoize was added
2021-11-24 10:06:59 +13:00
Daniel Gustafsson aa12781b0d Improve publication error messages
Commit 81d5995b4b introduced more fine-grained errormessages for
incorrect relkinds for publication, while unlogged and temporary
tables were reported with using the same message.  This provides
separate error messages for these types of relpersistence.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://postgr.es/m/CALj2ACW9S=AswyQHjtO6WMcsergMkCBTtzXGrM8DX26DzfeTLQ@mail.gmail.com
2021-11-17 14:40:38 +01:00
Tom Lane f8abb0f5e1 postgres_fdw: suppress casts on constants in limited cases.
When deparsing an expression of the form "remote_var OP constant",
we'd normally apply a cast to the constant to make sure that the
remote parser thinks it's of the same type we do.  However, doing
so is often not necessary, and it causes problems if the user has
intentionally declared the local column as being of a different
type than the remote column.  A plausible use-case for that is
using text to represent a type that's an enum on the remote side.
A comparison on such a column will get shipped as "var = 'foo'::text",
which blows up on the remote side because there's no enum = text
operator.  But if we simply leave off the explicit cast, the
comparison will do exactly what the user wants.

It's possible to do this without major risk of semantic problems, by
relying on the longstanding parser heuristic that "if one operand of
an operator is of type unknown, while the other one has a known type,
assume that the unknown operand is also of that type".  Hence, this
patch leaves off the cast only if (a) the operator inputs have the same
type locally; (b) the constant will print as a string literal or NULL,
both of which are initially taken as type unknown; and (c) the non-Const
input is a plain foreign Var.  Rule (c) guarantees that the remote
parser will know the type of the non-Const input; moreover, it means
that if this cast-omission does cause any semantic surprises, that can
only happen in cases where the local column has a different type than
the remote column.  That wasn't guaranteed to work anyway, and this
patch should represent a net usability gain for such cases.

One point that I (tgl) remain slightly uncomfortable with is that we
will ignore an implicit RelabelType when deciding if the non-Const input
is a plain Var.  That makes it a little squishy to argue that the remote
should resolve the Const as being of the same type as its Var, because
then our Const is not the same type as our Var.  However, if we don't do
that, then this hack won't work as desired if the user chooses to use
varchar rather than text to represent some remote column.  That seems
useful, so do it like this for now.  We might have to give up the
RelabelType-ignoring bit if any problems surface.

Dian Fay, with review and kibitzing by me

Discussion: https://postgr.es/m/C9LU294V7K4F.34LRRDU449O45@lamia
2021-11-12 11:50:47 -05:00
Fujii Masao 5fedf7417b Improve HINT message that FDW reports when there are no valid options.
The foreign data wrapper's validator function provides a HINT message with
list of valid options for the object specified in CREATE or ALTER command,
when the option given in the command is invalid. Previously
postgresql_fdw_validator() and the validator functions for postgres_fdw and
dblink_fdw worked in that way even there were no valid options in the object,
which could lead to the HINT message with empty list (because there were
no valid options). For example, ALTER FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (format 'csv') reported the following ERROR and HINT messages.
This behavior was confusing.

    ERROR: invalid option "format"
    HINT: Valid options in this context are:

There is no such issue in file_fdw. The validator function for file_fdw
reports the HINT message "There are no valid options in this context."
instead in that case.

This commit improves postgresql_fdw_validator() and the validator functions
for postgres_fdw and dblink_fdw so that they do likewise. For example,
this change causes the above ALTER FOREIGN DATA WRAPPER command to
report the following messages.

    ERROR:  invalid option "nonexistent"
    HINT:  There are no valid options in this context.

Author: Kosei Masumura
Reviewed-by: Bharath Rupireddy, Fujii Masao
Discussion: https://postgr.es/m/557d06cebe19081bfcc83ee2affc98d3@oss.nttdata.com
2021-10-27 00:46:52 +09:00
Tom Lane 3071bbfe44 Fix null-pointer crash in postgres_fdw's conversion_error_callback.
Commit c7b7311f6 adjusted conversion_error_callback to always use
information from the query's rangetable, to avoid doing catalog lookups
in an already-failed transaction.  However, as a result of the utterly
inadequate documentation for make_tuple_from_result_row, I failed to
realize that fsstate could be NULL in some contexts.  That led to a
crash if we got a conversion error in such a context.  Fix by falling
back to the previous coding when fsstate is NULL.  Improve the
commentary, too.

Per report from Andrey Borodin.  Back-patch to 9.6, like the previous
patch.

Discussion: https://postgr.es/m/08916396-55E4-4D68-AB3A-BD6066F9E5C0@yandex-team.ru
2021-10-06 15:50:24 -04:00
Noah Misch b073c3ccd0 Revoke PUBLIC CREATE from public schema, now owned by pg_database_owner.
This switches the default ACL to what the documentation has recommended
since CVE-2018-1058.  Upgrades will carry forward any old ownership and
ACL.  Sites that declined the 2018 recommendation should take a fresh
look.  Recipes for commissioning a new database cluster from scratch may
need to create a schema, grant more privileges, etc.  Out-of-tree test
suites may require such updates.

Reviewed by Peter Eisentraut.

Discussion: https://postgr.es/m/20201031163518.GB4039133@rfd.leadboat.com
2021-09-09 23:38:09 -07:00
Fujii Masao 98dbef90eb postgres_fdw: Revert unstable tests for postgres_fdw.application_name.
Commit 449ab63505 added the tests that check that postgres_fdw.application_name
GUC works as expected. But they were unstable and caused some buildfarm
members to report the failure. This commit reverts those unstable tests.

Reported-by: Tom Lane as per buildfarm
Discussion: https://postgr.es/m/3220909.1631054766@sss.pgh.pa.us
2021-09-08 16:28:43 +09:00
Fujii Masao 449ab63505 postgres_fdw: Allow application_name of remote connection to be set via GUC.
This commit adds postgres_fdw.application_name GUC which specifies
a value for application_name configuration parameter used
when postgres_fdw establishes a connection to a foreign server.
This GUC setting always overrides application_name option of
the foreign server object. This GUC is useful when we want to
specify our own application_name per remote connection.

Previously application_name of a remote connection could be set
basically only via options of a server object. But which meant that
every session connecting to the same foreign server basically
should use the same application_name. Also if we want to change
the setting, we had to execute "ALTER SERVER ... OPTIONS ..." command.
It was inconvenient.

Author: Hayato Kuroda
Reviewed-by: Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/TYCPR01MB5870D1E8B949DAF6D3B84E02F5F29@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-09-07 12:27:30 +09:00
Etsuro Fujita aa769f80ed postgres_fdw: Fix issues with generated columns in foreign tables.
postgres_fdw imported generated columns from the remote tables as plain
columns, and caused failures like "ERROR: cannot insert a non-DEFAULT
value into column "foo"" when inserting into the foreign tables, as it
tried to insert values into the generated columns.  To fix, we do the
following under the assumption that generated columns in a postgres_fdw
foreign table are defined so that they represent generated columns in
the underlying remote table:

* Send DEFAULT for the generated columns to the foreign server on insert
  or update, not generated column values computed on the local server.
* Add to postgresImportForeignSchema() an option "import_generated" to
  include column generated expressions in the definitions of foreign
  tables imported from a foreign server.  The option is true by default.

The assumption seems reasonable, because that would make a query of the
postgres_fdw foreign table return values for the generated columns that
are consistent with the generated expression.

While here, fix another issue in postgresImportForeignSchema(): it tried
to include column generated expressions as column default expressions in
the foreign table definitions when the import_default option was enabled.

Per bug #16631 from Daniel Cherniy.  Back-patch to v12 where generated
columns were added.

Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
2021-08-05 20:00:00 +09:00
Tom Lane 5d44fff01e In postgres_fdw, allow CASE expressions to be pushed to the remote server.
This is simple enough except for the need to check whether CaseTestExpr
nodes have a collation that is not derived from a remote Var.  For that,
examine the CASE's "arg" expression and then pass that info down into the
recursive examination of the WHEN expressions.

Alexander Pyhalov, reviewed by Gilles Darold and myself

Discussion: https://postgr.es/m/fda09032e90d85d9b726a41e03f9097f@postgrespro.ru
2021-07-30 13:39:48 -04:00
Etsuro Fujita 1ec7fca859 postgres_fdw: Fix handling of pending asynchronous requests.
A pending asynchronous request is handled by process_pending_request(),
which previously not only processed an in-progress remote query but
performed ExecForeignScan() to produce a tuple to return to the local
server asynchronously from the result of the remote query.  But that led
to a server crash when executing a query or led to an "InstrStartNode
called twice in a row" or "InstrEndLoop called on running node" failure
when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it
contained multiple async-capable nodes accessing the same
initplan/subplan that contained multiple async-capable nodes scanning
the same foreign tables as for the parent async-capable nodes, as
reported by Andrey Lepikhov.  The reason is that the second step in
process_pending_request() invoked when executing the initplan/subplan
for one of the parent async-capable nodes caused recursive execution of
the initplan/subplan for another of the parent async-capable nodes.

To fix, split process_pending_request() into the two steps and postpone
the second step until ForeignAsyncConfigureWait() is called for each of
the pending asynchronous requests.  Also, in ExecAppendAsyncEventWait()
we assumed that FDWs would register at least one wait event in a
WaitEventSet created there when they were called from
ForeignAsyncConfigureWait() in that function, but allow FDWs to register
zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait()
to just return in that case.

Oversight in commit 27e1f1456.  Back-patch to v14 where that commit went
in.

Andrey Lepikhov and Etsuro Fujita

Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru
2021-07-30 17:00:00 +09:00
David Rowley 83f4fcc655 Change the name of the Result Cache node to Memoize
"Result Cache" was never a great name for this node, but nobody managed
to come up with another name that anyone liked enough.  That was until
David Johnston mentioned "Node Memoization", which Tom Lane revised to
just "Memoize".  People seem to like "Memoize", so let's do the rename.

Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us
Backpatch-through: 14, where Result Cache was introduced
2021-07-14 12:43:58 +12:00
Tom Lane d68a003912 Rename debug_invalidate_system_caches_always to debug_discard_caches.
The name introduced by commit 4656e3d66 was agreed to be unreasonably
long.  To match this change, rename initdb's recently-added
--clobber-cache option to --discard-caches.

Discussion: https://postgr.es/m/1374320.1625430433@sss.pgh.pa.us
2021-07-13 15:01:01 -04:00
Tom Lane b9734c13f1 Fix crash in postgres_fdw for provably-empty remote UPDATE/DELETE.
In 86dc90056, I'd written find_modifytable_subplan with the assumption
that if the immediate child of a ModifyTable is a Result, it must be
a projecting Result with a subplan.  However, if the UPDATE or DELETE
has a provably-constant-false WHERE clause, that's not so: we'll
generate a dummy subplan with a childless Result.  Add the missing
null-check so we don't crash on such cases.

Per report from Alexander Pyhalov.

Discussion: https://postgr.es/m/b9a6f53549456b2f3e2fd150dcd79d72@postgrespro.ru
2021-07-07 15:21:25 -04:00
Fujii Masao d854720df6 postgres_fdw: Tighten up allowed values for batch_size, fetch_size options.
Previously the values such as '100$%$#$#', '9,223,372,' were accepted and
treated as valid integers for postgres_fdw options batch_size and fetch_size.
Whereas this is not the case with fdw_startup_cost and fdw_tuple_cost options
for which an error is thrown. This was because endptr was not used
while converting strings to integers using strtol.

This commit changes the logic so that it uses parse_int function
instead of strtol as it serves the purpose by returning false in case
if it is unable to convert the string to integer. Note that
this function also rounds off the values such as '100.456' to 100 and
'100.567' or '100.678' to 101.

While on this, use parse_real for fdw_startup_cost and fdw_tuple_cost options.

Since parse_int and parse_real are being used for reloptions and GUCs,
it is more appropriate to use in postgres_fdw rather than using strtol
and strtod directly.

Back-patch to v14.

Author: Bharath Rupireddy
Reviewed-by: Ashutosh Bapat, Tom Lane, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA@mail.gmail.com
2021-07-07 11:13:40 +09:00
Tom Lane c7b7311f61 Avoid doing catalog lookups in postgres_fdw's conversion_error_callback.
As in 50371df26, this is a bad idea since the callback can't really
know what error is being thrown and thus whether or not it is safe
to attempt catalog accesses.  Rather than pushing said accesses into
the mainline code where they'd usually be a waste of cycles, we can
look at the query's rangetable instead.

This change does mean that we'll be printing query aliases (if any
were used) rather than the table or column's true name.  But that
doesn't seem like a bad thing: it's certainly a more useful definition
in self-join cases, for instance.  In any case, it seems unlikely that
any applications would be depending on this detail, so it seems safe
to change.

Patch by me.  Original complaint by Andres Freund; Bharath Rupireddy
noted the connection to conversion_error_callback.

Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
2021-07-06 12:36:12 -04:00
Tom Lane 8021770909 Further stabilize postgres_fdw test.
The queries involving ft1_nopw don't stably return the same row
anymore.  I surmise that an autovacuum hitting "S 1"."T 1"
right after the updates introduced by f61db909d/5843659d0 freed
some space, changing where subsequent insertions get stored.
It's only by good luck that these results were stable before,
though, since a LIMIT without ORDER BY isn't well defined,
and it's not like we've ever treated that table as append-only
in this test script.

Since we only really care whether these commands succeed or not,
just replace "SELECT *" with "SELECT 1".

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2021-06-23%2019%3A52%3A08
2021-06-24 15:02:13 -04:00
Tom Lane 5843659d09 Stabilize test case added by commit f61db909d.
Buildfarm members ayu and tern have sometimes shown a different
plan than expected for this query.  I'd been unable to reproduce
that before today, but I finally realized what is happening.
If there is a concurrent open transaction (probably an autovacuum
run in the buildfarm, but this can also be arranged manually),
then the index entries for the rows removed by the DELETE a few
lines up are not killed promptly, causing a change in the planner's
estimate of the extremal value of ft2.c1, which moves the rowcount
estimate for "c1 > 1100" by enough to change the join plan from
nestloop to hash.

To fix, change the query condition to "c1 > 1000", causing the
hash plan to be preferred whether or not a concurrent open
transaction exists.  Since this UPDATE is tailored to be a no-op,
nothing else changes.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=ayu&dt=2021-06-09%2022%3A45%3A48
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=ayu&dt=2021-06-13%2022%3A38%3A18
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2021-06-20%2004%3A55%3A36
2021-06-20 11:48:44 -04:00
Tomas Vondra 99cea49d65 Fix copying data into slots with FDW batching
Commit b676ac443b optimized handling of tuple slots with bulk inserts
into foreign tables, so that the slots are initialized only once and
reused for all batches. The data was however copied into the slots only
after the initialization, inserting duplicate values when the slot gets
reused. Fixed by moving the ExecCopySlot outside the init branch.

The existing postgres_fdw tests failed to catch this due to inserting
data into foreign tables without unique indexes, and then checking only
the number of inserted rows. This adds a new test with both a unique
index and a check of inserted values.

Reported-by: Alexander Pyhalov
Discussion: https://postgr.es/m/7a8cf8d56b3d18e5c0bccd6cd42d04ac%40postgrespro.ru
2021-06-16 23:49:25 +02:00
Tomas Vondra cb92703384 Adjust batch size in postgres_fdw to not use too many parameters
The FE/BE protocol identifies parameters with an Int16 index, which
limits the maximum number of parameters per query to 65535. With
batching added to postges_fdw this limit is much easier to hit, as
the whole batch is essentially a single query, making this error much
easier to hit.

The failures are a bit unpredictable, because it also depends on the
number of columns in the query. So instead of just failing, this patch
tweaks the batch_size to not exceed the maximum number of parameters.

Reported-by: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB571603973C0AC2874AD6BF2594299%40OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-06-08 20:28:31 +02:00
Etsuro Fujita f3baaf28a6 Fix rescanning of async-aware Append nodes.
In cases where run-time pruning isn't required, the synchronous and
asynchronous subplans for an async-aware Append node determined using
classify_matching_subplans() should be re-used when rescanning the node,
but the previous code re-determined them using that function repeatedly
each time when rescanning the node, leading to incorrect results in a
normal build and an Assert failure in an Assert-enabled build as that
function doesn't assume that it's called repeatedly in such cases.  Fix
the code as mentioned above.

My oversight in commit 27e1f1456.

While at it, initialize async-related pointers/variables to NULL/zero
explicitly in ExecInitAppend() and ExecReScanAppend(), just to be sure.
(The variables would have been set to zero before we get to the latter
function, but let's do so.)

Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAPmGK16Q4B2_KY%2BJH7rb7wQbw54AUprp7TMekGTd2T1B62yysQ%40mail.gmail.com
2021-06-07 12:45:00 +09:00
Tom Lane f61db909df Fix postgres_fdw failure with whole-row Vars of type RECORD.
Commit 86dc90056 expects that FDWs can cope with whole-row Vars for
their tables, even if the Vars are marked with vartype RECORDOID.
Previously, whole-row Vars generated by the planner had vartype equal
to the relevant table's rowtype OID.  (The point behind this change is
to enable sharing of resjunk columns across inheritance child tables.)

It turns out that postgres_fdw fails to cope with this, though through
bad fortune none of its test cases exposed that.  Things mostly work,
but when we try to read back a value of such a Var, the expected
rowtype is not available to record_in().  Fortunately, it's not
difficult to hack up the tupdesc that controls this process to
substitute the foreign table's rowtype for RECORDOID.  Thus we can
solve the runtime problem while still sharing the resjunk column
with other tables.

Per report from Alexander Pyhalov.

Discussion: https://postgr.es/m/7817fb9ebd6661cdf9b67dec6e129a78@postgrespro.ru
2021-06-04 20:07:08 -04:00
Tom Lane 889592344c Fix planner's row-mark code for inheritance from a foreign table.
Commit 428b260f8 broke planning of cases where row marks are needed
(SELECT FOR UPDATE, etc) and one of the query's tables is a foreign
table that has regular table(s) as inheritance children.  We got the
reverse case right, but apparently were thinking that foreign tables
couldn't be inheritance parents.  Not so; so we need to be able to
add a CTID junk column while adding a new child, not only a wholerow
junk column.

Back-patch to v12 where the faulty code came in.

Amit Langote

Discussion: https://postgr.es/m/CA+HiwqEmo3FV1LAQ4TVyS2h1WM=kMkZUmbNuZSCnfHvMcUcPeA@mail.gmail.com
2021-06-02 14:38:14 -04:00
Etsuro Fujita a784859f44 Prevent asynchronous execution of direct foreign-table modifications.
Commits 27e1f1456 and 86dc90056, which were independently discussed,
cause a crash when executing an inherited foreign UPDATE/DELETE query
with asynchronous execution enabled, where children of an Append node
that is the direct/indirect child of the ModifyTable node are rewritten
so as to modify foreign tables directly by postgresPlanDirectModify();
as in that case the direct modifications are executed asynchronously,
which is not currently supported by asynchronous execution.  Fix by
disabling asynchronous execution of the direct modifications in that
function.

Author: Etsuro Fujita
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/CAPmGK158e9sJOfuWxfn%2B0ynrspXQU3JhNjSCbaoeSzMvnga%2Bbw%40mail.gmail.com
2021-05-13 20:00:00 +09:00
Etsuro Fujita a363bc6da9 Fix EXPLAIN ANALYZE for async-capable nodes.
EXPLAIN ANALYZE for an async-capable ForeignScan node associated with
postgres_fdw is done just by using instrumentation for ExecProcNode()
called from the node's callbacks, causing the following problems:

1) If the remote table to scan is empty, the node is incorrectly
   considered as "never executed" by the command even if the node is
   executed, as ExecProcNode() isn't called from the node's callbacks at
   all in that case.
2) The command fails to collect timings for things other than
   ExecProcNode() done in the node, such as creating a cursor for the
   node's remote query.

To fix these problems, add instrumentation for async-capable nodes, and
modify postgres_fdw accordingly.

My oversight in commit 27e1f1456.

While at it, update a comment for the AsyncRequest struct in execnodes.h
and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml
to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix
typos in comments in nodeAppend.c.

Per report from Andrey Lepikhov, though I didn't use his patch.

Reviewed-by: Andrey Lepikhov
Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
2021-05-12 14:00:00 +09:00
Tom Lane 1273a15bf9 Disable cache clobber to avoid breaking postgres_fdw termination test.
Commit 93f414614 improved a pre-existing test case so that it would
show whether or not termination of the "remote" worker process happened.
This soon exposed that, when debug_invalidate_system_caches_always
(nee CLOBBER_CACHE_ALWAYS) is enabled, no such termination occurs.
That's because cache invalidation forces postgres_fdw connections
to be dropped at end of transaction, so that there's no worker to
terminate.  There's a race condition as to whether the worker will
manage to get out of the BackendStatusArray before we look, but at
least on buildfarm member hyrax, it's failed twice in two attempts.

Rather than re-lobotomizing the test, let's fix this by transiently
disabling debug_invalidate_system_caches_always.  (Hooray for that
being just a GUC nowadays, rather than a compile-time option.)
If this proves not to be enough to make the test stable, we can
do the other thing instead.

Discussion: https://postgr.es/m/3854538.1620081771@sss.pgh.pa.us
2021-05-04 13:36:26 -04:00
Fujii Masao 8e9ea08bae Don't pass "ONLY" options specified in TRUNCATE to foreign data wrapper.
Commit 8ff1c94649 allowed TRUNCATE command to truncate foreign tables.
Previously the information about "ONLY" options specified in TRUNCATE
command were passed to the foreign data wrapper. Then postgres_fdw
constructed the TRUNCATE command to issue the remote server and
included "ONLY" options in it based on the passed information.

On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE
have no effect when accessing or modifying the remote table, i.e.,
are not passed to the foreign data wrapper. So it's inconsistent to
make only TRUNCATE command pass the "ONLY" options to the foreign data
wrapper. Therefore this commit changes the TRUNCATE command so that
it doesn't pass the "ONLY" options to the foreign data wrapper,
for the consistency with other statements. Also this commit changes
postgres_fdw so that it always doesn't include "ONLY" options in
the TRUNCATE command that it constructs.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu
Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com
2021-04-27 14:41:27 +09:00
Michael Paquier 93f4146144 Simplify tests of postgres_fdw terminating connections
The tests introduced in 32a9c0b for connections broken and
re-established rely on pg_terminate_backend() for their logic.  When
these were introduced, this function simply sent a signal to a backend
without waiting for the operation to complete, and the tests repeatedly
looked at pg_stat_activity to check if the operation was completed or
not.  Since aaf0432, it is possible to define a timeout to make
pg_terminate_backend() wait for a certain duration, so make use of it,
with a timeout reasonably large enough (3min) to give enough room for
the tests to pass even on slow machines.

Some measurements show that the tests of postgres_fdw are much faster
with this change.  For example, on my laptop, they now take 4s instead
of 6s.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACXGY_EfGrMTjKjHy2zi-u1u9rdeioU_fro0T6Jo8t56KQ@mail.gmail.com
2021-04-14 14:23:53 +09:00
Fujii Masao 8ff1c94649 Allow TRUNCATE command to truncate foreign tables.
This commit introduces new foreign data wrapper API for TRUNCATE.
It extends TRUNCATE command so that it accepts foreign tables as
the targets to truncate and invokes that API. Also it extends postgres_fdw
so that it can issue TRUNCATE command to foreign servers, by adding
new routine for that TRUNCATE API.

The information about options specified in TRUNCATE command, e.g.,
ONLY, CACADE, etc is passed to FDW via API. The list of foreign tables to
truncate is also passed to FDW. FDW truncates the foreign data sources
that the passed foreign tables specify, based on those information.
For example, postgres_fdw constructs TRUNCATE command using them
and issues it to the foreign server.

For performance, TRUNCATE command invokes the FDW routine for
TRUNCATE once per foreign server that foreign tables to truncate belong to.

Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao
Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com
Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
2021-04-08 20:56:08 +09:00