Commit Graph

153 Commits

Author SHA1 Message Date
Michael Paquier 096bbf7c93 Switch back sslcompression to be a normal input field in libpq
Per buildfarm member crake, any servers including a postgres_fdw server
with this option set would fail to do a pg_upgrade properly as the
option got hidden in f9264d1 by becoming a debug option, making the
restore of the FDW server fail.

This changes back the option in libpq to be visible, but still inactive
to fix this upgrade issue.

Discussion: https://postgr.es/m/YEbq15JKJwIX+S6m@paquier.xyz
2021-03-09 19:52:36 +09:00
Michael Paquier f9264d1524 Remove support for SSL compression
PostgreSQL disabled compression as of e3bdb2d and the documentation
recommends against using it since.  Additionally, SSL compression has
been disabled in OpenSSL since version 1.1.0, and was disabled in many
distributions long before that.  The most recent TLS version, TLSv1.3,
disallows compression at the protocol level.

This commit removes the feature itself, removing support for the libpq
parameter sslcompression (parameter still listed for compatibility
reasons with existing connection strings, just ignored), and removes
the equivalent field in pg_stat_ssl and de facto PgBackendSSLStatus.

Note that, on top of removing the ability to activate compression by
configuration, compression is actively disabled in both frontend and
backend to avoid overrides from local configurations.

A TAP test is added for deprecated SSL parameters to check after
backwards compatibility.

Bump catalog version.

Author: Daniel Gustafsson
Reviewed-by: Peter Eisentraut, Magnus Hagander, Michael Paquier
Discussion:  https://postgr.es/m/7E384D48-11C5-441B-9EC3-F7DB1F8518F6@yesql.se
2021-03-09 11:16:47 +09:00
Peter Eisentraut f5465fade9 Allow specifying CRL directory
Add another method to specify CRLs, hashed directory method, for both
server and client side.  This offers a means for server or libpq to
load only CRLs that are required to verify a certificate.  The CRL
directory is specifed by separate GUC variables or connection options
ssl_crl_dir and sslcrldir, alongside the existing ssl_crl_file and
sslcrl, so both methods can be used at the same time.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/20200731.173911.904649928639357911.horikyota.ntt@gmail.com
2021-02-18 07:59:10 +01:00
Tomas Vondra 927f453a94 Fix tuple routing to initialize batching only for inserts
A cross-partition update on a partitioned table is implemented as a
delete followed by an insert. With foreign partitions, this was however
causing issues, because the FDW and core may disagree on when to enable
batching.  postgres_fdw was only allowing batching for plain inserts
(CMD_INSERT) while core was trying to batch the insert component of the
cross-partition update.  Fix by restricting core to apply batching only
to plain CMD_INSERT queries.

It's possible to allow batching for cross-partition updates, but that
will require more extensive changes, so better to leave that for a
separate patch.

Author: Amit Langote
Reviewed-by: Tomas Vondra, Takayuki Tsunakawa
Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development
2021-02-18 00:03:45 +01:00
Etsuro Fujita 5e7fa189ee postgres_fdw: Fix assertion in estimate_path_cost_size().
Commit 08d2d58a2 added an assertion assuming that the retrieved_rows
estimate for a foreign relation, which is re-used to cost pre-sorted
foreign paths with local stats, is set to at least one row in
estimate_path_cost_size(), which isn't correct because if the relation
is a foreign table with tuples=0, the estimate would be set to 0 there
when not using remote estimates.

Per bug #16807 from Alexander Lakhin.  Back-patch to v13 where the
aforementioned commit went in.

Author: Etsuro Fujita
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/16807-9fe4e08fbaa5c7ce%40postgresql.org
2021-02-05 15:30:00 +09:00
Fujii Masao f77717b298 postgres_fdw: Fix tests for CLOBBER_CACHE_ALWAYS.
The regression tests added in commits 708d165ddb and 411ae64997 caused
buildfarm failures when  CLOBBER_CACHE_ALWAYS was enabled.
This commit stabilizes those tests.

The foreign server connections established by postgres_fdw behaves
differently depending on whether CLOBBER_CACHE_ALWAYS is enabled or not.
If it's not enabled, those connections are cached. On the other hand,
if it's enabled, when the connections are established outside transaction
block, they are not cached (i.e., they are immediately closed at the end of
query that established them). So the subsequent postgres_fdw_get_connections()
cannot list those connections and postgres_fdw_disconnect() cannot close them
(because they are already closed).

When the connections are established inside transaction block, they are
cached whether CLOBBER_CACHE_ALWAYS was enabled or not. But if it's enabled,
they are immediately marked as invalid, otherwise not. This causes the
subsequent postgres_fdw_get_connections() to return different result in
"valid" column depending on whether CLOBBER_CACHE_ALWAYS was enabled or not.

This commit prevents the above differences of behavior from
affecting the regression tests.

Per buildfarm failure on trilobite.

Original patch by Bharath Rupireddy. I (Fujii Masao) extracted
the regression test fix from that and revised it a bit.

Reported-by: Tom Lane
Author: Bharath Rupireddy
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/2688508.1611865371@sss.pgh.pa.us
2021-01-30 10:12:22 +09:00
Fujii Masao 0c3fc09fe3 postgres_fdw: Fix test failure with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
The roles created by regression test should have names starting with
"regress_", and the test introduced in commit 411ae64997 did not do that.

Per buildfarm member longfin.

Discussion: https://postgr.es/m/73fc5ae4-3c54-1262-4533-f8c547de2e60@oss.nttdata.com
2021-01-26 17:16:52 +09:00
Fujii Masao 6adc5376dc postgres_fdw: Stabilize regression test for postgres_fdw_disconnect_all().
The regression test added in commit 411ae64997 caused buildfarm failures.
The cause of them was that the order of warning messages output in the test
was not stable. To fix this, this commit sets client_min_messages to ERROR
temporarily when performing the test generating those warnings.

Per buildfarm failures.

Discussion: https://postgr.es/m/2147113.1611644754@sss.pgh.pa.us
2021-01-26 16:36:21 +09:00
Fujii Masao 411ae64997 postgres_fdw: Add functions to discard cached connections.
This commit introduces two new functions postgres_fdw_disconnect()
and postgres_fdw_disconnect_all(). The former function discards
the cached connections to the specified foreign server. The latter discards
all the cached connections. If the connection is used in the current
transaction, it's not closed and a warning message is emitted.

For example, these functions are useful when users want to explicitly
close the foreign server connections that are no longer necessary and
then to prevent them from eating up the foreign servers connections
capacity.

Author: Bharath Rupireddy, tweaked a bit by Fujii Masao
Reviewed-by: Alexey Kondratov, Zhijie Hou, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVvrp5=AVp2PupEm+nAC8S4buqR3fJMmaCoc7ftT0aD2A@mail.gmail.com
2021-01-26 15:35:54 +09:00
Tomas Vondra b663a41363 Implement support for bulk inserts in postgres_fdw
Extends the FDW API to allow batching inserts into foreign tables. That
is usually much more efficient than inserting individual rows, due to
high latency for each round-trip to the foreign server.

It was possible to implement something similar in the regular FDW API,
but it was inconvenient and there were issues with reporting the number
of actually inserted rows etc. This extends the FDW API with two new
functions:

* GetForeignModifyBatchSize - allows the FDW picking optimal batch size

* ExecForeignBatchInsert - inserts a batch of rows at once

Currently, only INSERT queries support batching. Support for DELETE and
UPDATE may be added in the future.

This also implements batching for postgres_fdw. The batch size may be
specified using "batch_size" option both at the server and table level.

The initial patch version was written by me, but it was rewritten and
improved in many ways by Takayuki Tsunakawa.

Author: Takayuki Tsunakawa
Reviewed-by: Tomas Vondra, Amit Langote
Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development
2021-01-20 23:57:27 +01:00
Fujii Masao 708d165ddb postgres_fdw: Add function to list cached connections to foreign servers.
This commit adds function postgres_fdw_get_connections() to return
the foreign server names of all the open connections that postgres_fdw
established from the local session to the foreign servers. This function
also returns whether each connection is valid or not.

This function is useful when checking all the open foreign server connections.
If we found some connection to drop, from the result of function, probably
we can explicitly close them by the function that upcoming commit will add.

This commit bumps the version of postgres_fdw to 1.1 since it adds
new function.

Author: Bharath Rupireddy, tweaked by Fujii Masao
Reviewed-by: Zhijie Hou, Alexey Kondratov, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/2d5cb0b3-a6e8-9bbb-953f-879f47128faa@oss.nttdata.com
2021-01-18 15:11:08 +09:00
Fujii Masao e3ebcca843 postgres_fdw: Fix connection leak.
In postgres_fdw, the cached connections to foreign servers will not be
closed until the local session exits if the user mappings or foreign servers
that those connections depend on are dropped. Those connections can be
leaked.

To fix that connection leak issue, after a change to a pg_foreign_server
or pg_user_mapping catalog entry, this commit makes postgres_fdw close
the connections depending on that entry immediately if current
transaction has not used those connections yet. Otherwise, mark those
connections as invalid and then close them at the end of current transaction,
since they cannot be closed in the midst of the transaction using them.
Closed connections will be remade at the next opportunity if necessary.

Back-patch to all supported branches.

Author: Bharath Rupireddy
Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com
2020-12-28 19:56:13 +09:00
Tom Lane 85d08b8b72 Band-aid new postgres_fdw test case to remove error text dependency.
Buildfarm member lorikeet is still failing the test from commit
32a9c0bdf, but now it's down to the should-have-foreseen-it problem
that the error message isn't what the expected-output file expects.
Let's see if we can get stable results by printing just the SQLSTATE.
I believe we'll reliably see ERRCODE_CONNECTION_FAILURE, since
pgfdw_report_error() will report that for any libpq-originated error.

There may be a better way to do this, but I'd like to get the
buildfarm back to green before we discuss further improvements.

Discussion: https://postgr.es/m/E1kPc9v-0005L4-2l@gemulon.postgresql.org
Discussion: https://postgr.es/m/2621622.1602184554@sss.pgh.pa.us
2020-10-10 19:57:25 -04:00
Fujii Masao 32a9c0bdf4 postgres_fdw: reestablish new connection if cached one is detected as broken.
In postgres_fdw, once remote connections are established, they are cached
and re-used for subsequent queries and transactions. There can be some
cases where those cached connections are unavaiable, for example,
by the restart of remote server. In these cases, previously an error was
reported and the query accessing to remote server failed if new remote
transaction failed to start because the cached connection was broken.

This commit improves postgres_fdw so that new connection is remade
if broken connection is detected when starting new remote transaction.
This is useful to avoid unnecessary failure of queries when connection is
broken but can be reestablished.

Author: Bharath Rupireddy, tweaked a bit by Fujii Masao
Reviewed-by: Ashutosh Bapat, Tatsuhito Kasahara, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACUAi23vf1WiHNar_LksM9EDOWXcbHCo-fD4Mbr1d=78YQ@mail.gmail.com
2020-10-06 10:51:07 +09:00
Tom Lane 76f412ab31 Remove factorial operators, leaving only the factorial() function.
The "!" operator is our only built-in postfix operator.  Remove it,
on the way to removal of grammar support for postfix operators.

There is also a "!!" prefix operator, but since it's been marked
deprecated for most of its existence, we might as well remove it too.

Also zap the SQL alias function numeric_fac(), which seems to have
equally little reason to live.

Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor

Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17 16:17:27 -04:00
Michael Paquier a6642b3ae0 Add support for partitioned tables and indexes in REINDEX
Until now, REINDEX was not able to work with partitioned tables and
indexes, forcing users to reindex partitions one by one.  This extends
REINDEX INDEX and REINDEX TABLE so as they can accept a partitioned
index and table in input, respectively, to reindex all the partitions
assigned to them with physical storage (foreign tables, partitioned
tables and indexes are then discarded).

This shares some logic with schema and database REINDEX as each
partition gets processed in its own transaction after building a list of
relations to work on.  This choice has the advantage to minimize the
number of invalid indexes to one partition with REINDEX CONCURRENTLY in
the event a cancellation or failure in-flight, as the only indexes
handled at once in a single REINDEX CONCURRENTLY loop are the ones from
the partition being working on.

Isolation tests are added to emulate some cases I bumped into while
developing this feature, particularly with the concurrent drop of a
leaf partition reindexed.  However, this is rather limited as LOCK would
cause REINDEX to block in the first transaction building the list of
partitions.

Per its multi-transaction nature, this new flavor cannot run in a
transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE.

Author: Justin Pryzby, Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com
2020-09-08 10:09:22 +09:00
Jeff Davis 0babd10980 Revert "Use CP_SMALL_TLIST for hash aggregate"
This reverts commit 4cad2534da due to a
performance regression. It will be replaced by a new approach in an
upcoming commit.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20200614181418.mx4bvljmfkkhoqzl@alap3.anarazel.de
Backpatch-through: 13
2020-07-12 22:59:32 -07:00
Tomas Vondra 4cad2534da Use CP_SMALL_TLIST for hash aggregate
Commit 1f39bce021 added disk-based hash aggregation, which may spill
incoming tuples to disk. It however did not request projection to make
the tuples as narrow as possible, which may mean having to spill much
more data than necessary (increasing I/O, pushing other stuff from page
cache, etc.).

This adds CP_SMALL_TLIST in places that may use hash aggregation - we do
that only for AGG_HASHED. It's unnecessary for AGG_SORTED, because that
either uses explicit Sort (which already does projection) or pre-sorted
input (which does not need spilling to disk).

Author: Tomas Vondra
Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/20200519151202.u2p2gpiawoaznsv2%40development
2020-05-31 14:43:13 +02:00
Michael Paquier 401aad6704 Rename connection parameters to control min/max SSL protocol version in libpq
The libpq parameters ssl{max|min}protocolversion are renamed to use
underscores, to become ssl_{max|min}_protocol_version.  The related
environment variables still use the names introduced in commit ff8ca5f
that added the feature.

Per complaint from Peter Eisentraut (this was also mentioned by me in
the original patch review but the issue got discarded).

Author: Daniel Gustafsson
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/b319e449-318d-e691-4997-1327e166fcc4@2ndquadrant.com
2020-04-30 13:39:10 +09:00
Michael Paquier ff8ca5fadd Add connection parameters to control SSL protocol min/max in libpq
These two new parameters, named sslminprotocolversion and
sslmaxprotocolversion, allow to respectively control the minimum and the
maximum version of the SSL protocol used for the SSL connection attempt.
The default setting is to allow any version for both the minimum and the
maximum bounds, causing libpq to rely on the bounds set by the backend
when negotiating the protocol to use for an SSL connection.  The bounds
are checked when the values are set at the earliest stage possible as
this makes the checks independent of any SSL implementation.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier, Cary Huang
Discussion: https://postgr.es/m/4F246AE3-A7AE-471E-BD3D-C799D3748E03@yesql.se
2020-01-28 10:40:48 +09:00
Tom Lane 215824f918 In postgres_fdw, don't try to ship MULTIEXPR updates to remote server.
In a statement like "UPDATE remote_tab SET (x,y) = (SELECT ...)",
we'd conclude that the statement could be directly executed remotely,
because the sub-SELECT is in a resjunk tlist item that's not examined
for shippability.  Currently that ends up crashing if the sub-SELECT
contains any remote Vars.  Prevent the crash by deeming MULTIEXEC
Params to be unshippable.

This is a bit of a brute-force solution, since if the sub-SELECT
*doesn't* contain any remote Vars, the current execution technology
would work; but that's not a terribly common use-case for this syntax,
I think.  In any case, we generally don't try to ship sub-SELECTs, so
it won't surprise anybody that this doesn't end up as a remote direct
update.  I'd be inclined to see if that general limitation can be fixed
before worrying about this case further.

Per report from Lukáš Sobotka.

Back-patch to 9.6.  9.5 had MULTIEXPR, but we didn't try to perform
remote direct updates then, so the case didn't arise anyway.

Discussion: https://postgr.es/m/CAJif3k+iA_ekBB5Zw2hDBaE1wtiQa4LH4_JUXrrMGwTrH0J01Q@mail.gmail.com
2020-01-26 14:31:08 -05:00
Andrew Dunstan cebf9d6e6e Only superuser can set sslcert/sslkey in postgres_fdw user mappings
Othrwise there is a security risk.

Discussion: https://postgr.es/m/20200109103014.GA4192@msg.df7cb.de
2020-01-13 18:08:09 +10:30
Andrew Dunstan f5fd995a1a Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
This allows different users to authenticate with different certificates.

Author: Craig Ringer
2020-01-09 18:39:54 +10:30
Tom Lane 0af0504da9 Adjust test case added by commit 6136e94dc.
Per project policy, transient roles created by regression test cases
should be named "regress_something", to reduce the risks of running
such cases against installed servers.  And no such role should ever
be left behind after running a test.

Discussion: https://postgr.es/m/11297.1576868677@sss.pgh.pa.us
2019-12-20 15:45:37 -05:00
Tom Lane e60b480d39 libpq should expose GSS-related parameters even when not implemented.
We realized years ago that it's better for libpq to accept all
connection parameters syntactically, even if some are ignored or
restricted due to lack of the feature in a particular build.
However, that lesson from the SSL support was for some reason never
applied to the GSSAPI support.  This is causing various buildfarm
members to have problems with a test case added by commit 6136e94dc,
and it's just a bad idea from a user-experience standpoint anyway,
so fix it.

While at it, fix some places where parameter-related infrastructure
was added with the aid of a dartboard, or perhaps with the aid of
the anti-pattern "add new stuff at the end".  It should be safe
to rearrange the contents of struct pg_conn even in released
branches, since that's private to libpq (and we'd have to move
some fields in some builds to fix this, anyway).

Back-patch to all supported branches.

Discussion: https://postgr.es/m/11297.1576868677@sss.pgh.pa.us
2019-12-20 15:34:07 -05:00
Andrew Dunstan 6136e94dcb Superuser can permit passwordless connections on postgres_fdw
Currently postgres_fdw doesn't permit a non-superuser to connect to a
foreign server without specifying a password, or to use an
authentication mechanism that doesn't use the password. This is to avoid
using the settings and identity of the user running Postgres.

However, this doesn't make sense for all authentication methods. We
therefore allow a superuser to set "password_required 'false'" for user
mappings for the postgres_fdw. The superuser must ensure that the
foreign server won't try to rely solely on the server identity (e.g.
trust, peer, ident) or use an authentication mechanism that relies on the
password settings (e.g. md5, scram-sha-256).

This feature is a prelude to better support for sslcert and sslkey
settings in user mappings.

Author: Craig Ringer.
Discussion: https://postgr.es/m/075135da-545c-f958-fed0-5dcb462d6dae@2ndQuadrant.com
2019-12-20 16:23:34 +10:30
Tom Lane 6ef77cf46e Further adjust EXPLAIN's choices of table alias names.
This patch causes EXPLAIN to always assign a separate table alias to the
parent RTE of an append relation (inheritance set); before, such RTEs
were ignored if not actually scanned by the plan.  Since the child RTEs
now always have that same alias to start with (cf. commit 55a1954da),
the net effect is that the parent RTE usually gets the alias used or
implied by the query text, and the children all get that alias with "_N"
appended.  (The exception to "usually" is if there are duplicate aliases
in different subtrees of the original query; then some of those original
RTEs will also have "_N" appended.)

This results in more uniform output for partitioned-table plans than
we had before: the partitioned table itself gets the original alias,
and all child tables have aliases with "_N", rather than the previous
behavior where one of the children would get an alias without "_N".

The reason for giving the parent RTE an alias, even if it isn't scanned
by the plan, is that we now use the parent's alias to qualify Vars that
refer to an appendrel output column and appear above the Append or
MergeAppend that computes the appendrel.  But below the append, Vars
refer to some one of the child relations, and are displayed that way.
This seems clearer than the old behavior where a Var that could carry
values from any child relation was displayed as if it referred to only
one of them.

While at it, change ruleutils.c so that the code paths used by EXPLAIN
deal in Plan trees not PlanState trees.  This effectively reverts a
decision made in commit 1cc29fe7c, which seemed like a good idea at
the time to make ruleutils.c consistent with explain.c.  However,
it's problematic because we'd really like to allow executor startup
pruning to remove all the children of an append node when possible,
leaving no child PlanState to resolve Vars against.  (That's not done
here, but will be in the next patch.)  This requires different handling
of subplans and initplans than before, but is otherwise a pretty
straightforward change.

Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 17:05:18 -05:00
Etsuro Fujita 5a20b0219e Fix handling of multiple AFTER ROW triggers on a foreign table.
AfterTriggerExecute() retrieves a fresh tuple or pair of tuples from a
tuplestore and then stores the tuple(s) in the passed-in slot(s) if
AFTER_TRIGGER_FDW_FETCH, while it uses the most-recently-retrieved
tuple(s) stored in the slot(s) if AFTER_TRIGGER_FDW_REUSE.  This was
done correctly before 12, but commit ff11e7f4b broke it by mistakenly
clearing the tuple(s) stored in the slot(s) in that function, leading to
an assertion failure as reported in bug #16139 from Alexander Lakhin.

Also, fix some other issues with the aforementioned commit in passing:

* For tg_newslot, which is a slot added to the TriggerData struct by the
  commit to store new updated tuples, it didn't ensure the slot was NULL
  if there was no such tuple.
* The commit failed to update the documentation about the trigger
  interface.

Author: Etsuro Fujita
Backpatch-through: 12
Discussion: https://postgr.es/m/16139-94f9ccf0db6119ec%40postgresql.org
2019-12-10 18:00:30 +09:00
Tom Lane bf39b3af6a Further sync postgres_fdw's "Relations" output with the rest of EXPLAIN.
EXPLAIN generally only adds schema qualifications to table names when
VERBOSE is specified.  In postgres_fdw's "Relations" output, table
names were always so qualified, but that was an implementation
restriction: in the original coding, we didn't have access to the
verbose flag at the time the string was generated.  After the code
rearrangement of commit 4526951d5, we do have that info available
at the right time, so make this output follow the normal rule.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-03 12:25:56 -05:00
Tom Lane 55a1954da1 Fix EXPLAIN's column alias output for mismatched child tables.
If an inheritance/partitioning parent table is assigned some column
alias names in the query, EXPLAIN mapped those aliases onto the
child tables' columns by physical position, resulting in bogus output
if a child table's columns aren't one-for-one with the parent's.

To fix, make expand_single_inheritance_child() generate a correctly
re-mapped column alias list, rather than just copying the parent
RTE's alias node.  (We have to fill the alias field, not just
adjust the eref field, because ruleutils.c will ignore eref in
favor of looking at the real column names.)

This means that child tables will now always have alias fields in
plan rtables, where before they might not have.  That results in
a rather substantial set of regression test output changes:
EXPLAIN will now always show child tables with aliases that match
the parent table (usually with "_N" appended for uniqueness).
But that seems like a net positive for understandability, since
the parent alias corresponds to something that actually appeared
in the original query, while the child table names didn't.
(Note that this does not change anything for cases where an explicit
table alias was written in the query for the parent table; it
just makes cases without such aliases behave similarly to that.)
Hence, while we could avoid these subsidiary changes if we made
inherit.c more complicated, we choose not to.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 19:08:10 -05:00
Tom Lane 4526951d56 Make postgres_fdw's "Relations" output agree with the rest of EXPLAIN.
The relation aliases shown in the "Relations" line for a foreign scan
didn't always agree with those used in the rest of EXPLAIN's output.
The regression test result changes appearing here provide examples.

It's really impossible for postgres_fdw to duplicate EXPLAIN's alias
assignment logic during postgresGetForeignRelSize(), because of the
de-duplication that EXPLAIN does on a global basis --- and anyway,
trying to duplicate that would be unmaintainable.  Instead, just put
numeric rangetable indexes into the string, and convert those to
table names/aliases in postgresExplainForeignScan, which does have
access to the results of ruleutils.c's alias assignment logic.
Aside from being more reliable, this shifts some work from planning
to EXPLAIN, which is a good tradeoff for performance.  (I also
changed from using StringInfo to using psprintf, which makes the
code slightly simpler and reduces its memory consumption.)

A kluge required by this solution is that we have to reverse-engineer
the rtoffset applied by setrefs.c.  If that logic ever fails
(presumably because the member tables of a join got offset by
different amounts), we'll need some more cooperation with setrefs.c
to keep things straight.  But for now, there's no need for that.

Arguably this is a back-patchable bug fix, but since this is a mostly
cosmetic issue and there have been no field complaints, I'll refrain
for now.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 16:31:03 -05:00
Michael Paquier 94fec48516 Add regression test for two-phase transaction in postgres_fdw
postgres_fdw does not support two-phase transactions, so let's add a
small negative test case to check after it.  Note that this is checked
using an end-of-xact callback to ensure a proper connection cleanup with
the foreign server, which is called before checking if a server is able
to handle 2PC with max_prepared_xacts, so this test does not need an
alternate output file.

Author: Gilles Darold
Discussion: https://postgr.es/m/20191108090507.GC1768@paquier.xyz
2019-11-13 13:30:14 +09:00
Michael Paquier c74d49d41c Fix many typos and inconsistencies
Author: Alexander Lakhin
Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com
2019-07-01 10:00:23 +09:00
Etsuro Fujita 8b6da83d16 postgres_fdw: Account for triggers in non-direct remote UPDATE planning.
Previously, in postgresPlanForeignModify, we planned an UPDATE operation
on a foreign table so that we transmit only columns that were explicitly
targets of the UPDATE, so as to avoid unnecessary data transmission, but
if there were BEFORE ROW UPDATE triggers on the foreign table, those
triggers might change values for non-target columns, in which case we
would miss sending changed values for those columns.  Prevent optimizing
away transmitting all columns if there are BEFORE ROW UPDATE triggers on
the foreign table.

This is an oversight in commit 7cbe57c34 which added triggers on foreign
tables, so apply the patch all the way back to 9.4 where that came in.

Author: Shohei Mochizuki
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/201905270152.x4R1q3qi014550@toshiba.co.jp
2019-06-13 17:59:09 +09:00
Tom Lane 8cad5adb9c Avoid postgres_fdw crash for a targetlist entry that's just a Param.
foreign_grouping_ok() is willing to put fairly arbitrary expressions into
the targetlist of a remote SELECT that's doing grouping or aggregation on
the remote side, including expressions that have no foreign component to
them at all.  This is possibly a bit dubious from an efficiency standpoint;
but it rises to the level of a crash-causing bug if the expression is just
a Param or non-foreign Var.  In that case, the expression will necessarily
also appear in the fdw_exprs list of values we need to send to the remote
server, and then setrefs.c's set_foreignscan_references will mistakenly
replace the fdw_exprs entry with a Var referencing the targetlist result.

The root cause of this problem is bad design in commit e7cb7ee14: it put
logic into set_foreignscan_references that IMV is postgres_fdw-specific,
and yet this bug shows that it isn't postgres_fdw-specific enough.  The
transformation being done on fdw_exprs assumes that fdw_exprs is to be
evaluated with the fdw_scan_tlist as input, which is not how postgres_fdw
uses it; yet it could be the right thing for some other FDW.  (In the
bigger picture, setrefs.c has no business assuming this for the other
expression fields of a ForeignScan either.)

The right fix therefore would be to expand the FDW API so that the
FDW could inform setrefs.c how it intends to evaluate these various
expressions.  We can't change that in the back branches though, and we
also can't just summarily change setrefs.c's behavior there, or we're
likely to break external FDWs.

As a stopgap, therefore, hack up postgres_fdw so that it won't attempt
to send targetlist entries that look exactly like the fdw_exprs entries
they'd produce.  In most cases this actually produces a superior plan,
IMO, with less data needing to be transmitted and returned; so we probably
ought to think harder about whether we should ship tlist expressions at
all when they don't contain any foreign Vars or Aggs.  But that's an
optimization not a bug fix so I left it for later.  One case where this
produces an inferior plan is where the expression in question is actually
a GROUP BY expression: then the restriction prevents us from using remote
grouping.  It might be possible to work around that (since that would
reduce to group-by-a-constant on the remote side); but it seems like a
pretty unlikely corner case, so I'm not sure it's worth expending effort
solely to improve that.  In any case the right long-term answer is to fix
the API as sketched above, and then revert this hack.

Per bug #15781 from Sean Johnston.  Back-patch to v10 where the problem
was introduced.

Discussion: https://postgr.es/m/15781-2601b1002bad087c@postgresql.org
2019-04-27 13:15:54 -04:00
Etsuro Fujita 5c47049180 postgres_fdw: Fix incorrect handling of row movement for remote partitions.
Commit 3d956d9562 added support for update row movement in postgres_fdw.
This patch fixes the following issues introduced by that commit:

* When a remote partition chosen to insert routed rows into was also an
  UPDATE subplan target rel that would be updated later, the UPDATE that
  used a direct modification plan modified those routed rows incorrectly
  because those routed rows were visible to the later UPDATE command.
  The right fix for this would be to have some way in postgres_fdw in
  which the later UPDATE command ignores those routed rows, but it seems
  hard to do so with the current infrastructure.  For now throw an error
  in that case.

* When a remote partition chosen to insert routed rows into was also an
  UPDATE subplan target rel, fmstate created for the UPDATE that used a
  non-direct modification plan was mistakenly overridden by another
  fmstate created for inserting those routed rows into the partition.
  This caused 1) server crash when the partition would be updated later,
  and 2) resource leak when the partition had been already updated.  To
  avoid that, adjust the treatment of the fmstate for the inserting.  As
  for #1, since we would also have the incorrectness issue as mentioned
  above, error out in that case as well.

Update the docs to mention that postgres_fdw currently does not handle
the case where a remote partition chosen to insert a routed row into is
also an UPDATE subplan target rel that will be updated later.

Author: Amit Langote and Etsuro Fujita
Reviewed-by: Amit Langote
Backpatch-through: 11 where row movement in postgres_fdw was added
Discussion: https://postgr.es/m/21e7eaa4-0d4d-20c2-a1f7-c7e96f4ce440@lab.ntt.co.jp
2019-04-24 18:31:50 +09:00
Michael Paquier 249d649996 Add support TCP user timeout in libpq and the backend server
Similarly to the set of parameters for keepalive, a connection parameter
for libpq is added as well as a backend GUC, called tcp_user_timeout.

Increasing the TCP user timeout is useful to allow a connection to
survive extended periods without end-to-end connection, and decreasing
it allows application to fail faster.  By default, the parameter is 0,
which makes the connection use the system default, and follows a logic
close to the keepalive parameters in its handling.  When connecting
through a Unix-socket domain, the parameters have no effect.

Author: Ryohei Nagaura
Reviewed-by: Fabien Coelho, Robert Haas, Kyotaro Horiguchi, Kirk
Jamison, Mikalai Keida, Takayuki Tsunakawa, Andrei Yahorau
Discussion: https://postgr.es/m/EDA4195584F5064680D8130B1CA91C45367328@G01JPEXMBYT04
2019-04-06 15:23:37 +09:00
Etsuro Fujita d50d172e51 postgres_fdw: Perform the (FINAL, NULL) upperrel operations remotely.
The upper-planner pathification allows FDWs to arrange to push down
different types of upper-stage operations to the remote side.  This
commit teaches postgres_fdw to do it for the (FINAL, NULL) upperrel,
which is responsible for doing LockRows, LIMIT, and/or ModifyTable.
This provides the ability for postgres_fdw to handle SELECT commands
so that it 1) skips the LockRows step (if any) (note that this is
safe since it performs early locking) and 2) pushes down the LIMIT
and/or OFFSET restrictions (if any) to the remote side.  This doesn't
handle the INSERT/UPDATE/DELETE cases.

Author: Etsuro Fujita
Reviewed-By: Antonin Houska and Jeff Janes
Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02 20:30:45 +09:00
Etsuro Fujita 0269edefac postgres_fdw: Modify regression tests for EPQ-related planning problems.
This prevents the tests added by commit 4bbf6edfbd and adjusted by
commit 99f6a17dd6 from being useless by plan changes created by an
upcoming commit.

Author: Etsuro Fujita
Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02 19:38:56 +09:00
Etsuro Fujita ffab494a4d postgres_fdw: Perform the (ORDERED, NULL) upperrel operations remotely.
The upper-planner pathification allows FDWs to arrange to push down
different types of upper-stage operations to the remote side.  This
commit teaches postgres_fdw to do it for the (ORDERED, NULL) upperrel,
which is responsible for evaluating the query's ORDER BY ordering.
Since postgres_fdw is already able to evaluate that ordering remotely
for foreign baserels and foreign joinrels (see commit aa09cd242f et al.),
this adds support for that for foreign grouping relations.

Author: Etsuro Fujita
Reviewed-By: Antonin Houska and Jeff Janes
Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02 19:20:30 +09:00
Tom Lane 428b260f87 Speed up planning when partitions can be pruned at plan time.
Previously, the planner created RangeTblEntry and RelOptInfo structs
for every partition of a partitioned table, even though many of them
might later be deemed uninteresting thanks to partition pruning logic.
This incurred significant overhead when there are many partitions.
Arrange to postpone creation of these data structures until after
we've processed the query enough to identify restriction quals for
the partitioned table, and then apply partition pruning before not
after creation of each partition's data structures.  In this way
we need not open the partition relations at all for partitions that
the planner has no real interest in.

For queries that can be proven at plan time to access only a small
number of partitions, this patch improves the practical maximum
number of partitions from under 100 to perhaps a few thousand.

Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen,
Yoshikazu Imai, and David Rowley

Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
2019-03-30 18:58:55 -04:00
Peter Eisentraut fc22b6623b Generated columns
This is an SQL-standard feature that allows creating columns that are
computed from expressions rather than assigned, similar to a view or
materialized view but on a column basis.

This implements one kind of generated column: stored (computed on
write).  Another kind, virtual (computed on read), is planned for the
future, and some room is left for it.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
2019-03-30 08:15:57 +01:00
Tom Lane e8d5dd6be7 Get rid of duplicate child RTE for a partitioned table.
We've been creating duplicate RTEs for partitioned tables just
because we do so for regular inheritance parent tables.  But unlike
regular-inheritance parents which are themselves regular tables
and thus need to be scanned, partitioned tables don't need the
extra RTE.

This makes the conditions for building a child RTE the same as those
for building an AppendRelInfo, allowing minor simplification in
expand_single_inheritance_child.  Since the planner's actual processing
is driven off the AppendRelInfo list, nothing much changes beyond that,
we just have one fewer useless RTE entry.

Amit Langote, reviewed and hacked a bit by me

Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
2019-03-26 12:03:27 -04:00
Tom Lane 8edd0e7946 Suppress Append and MergeAppend plan nodes that have a single child.
If there's only one child relation, the Append or MergeAppend isn't
doing anything useful, and can be elided.  It does have a purpose
during planning though, which is to serve as a buffer between parent
and child Var numbering.  Therefore we keep it all the way through
to setrefs.c, and get rid of it only after fixing references in the
plan level(s) above it.  This works largely the same as setrefs.c's
ancient hack to get rid of no-op SubqueryScan nodes, and can even
share some code with that.

Note the change to make setrefs.c use apply_tlist_labeling rather than
ad-hoc code.  This has the effect of propagating the child's resjunk
and ressortgroupref labels, which formerly weren't propagated when
removing a SubqueryScan.  Doing that is demonstrably necessary for
the [Merge]Append cases, and seems harmless for SubqueryScan, if only
because trivial_subqueryscan is afraid to collapse cases where the
resjunk marking differs.  (I suspect that restriction could now be
removed, though it's unclear that it'd make any new matches possible,
since the outer query can't have references to a child resjunk column.)

David Rowley, reviewed by Alvaro Herrera and Tomas Vondra

Discussion: https://postgr.es/m/CAKJS1f_7u8ATyJ1JGTMHFoKDvZdeF-iEBhs+sM_SXowOr9cArg@mail.gmail.com
2019-03-25 15:42:35 -04:00
Tom Lane 608b167f9f Allow user control of CTE materialization, and change the default behavior.
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it).  This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.

Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query.  Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.

Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free.  By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED.  Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.

Andreas Karlsson, Andrew Gierth, David Fetter

Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
2019-02-16 16:11:12 -05:00
Tom Lane 34ea1ab7fd Split create_foreignscan_path() into three functions.
Up to now postgres_fdw has been using create_foreignscan_path() to
generate not only base-relation paths, but also paths for foreign joins
and foreign upperrels.  This is wrong, because create_foreignscan_path()
calls get_baserel_parampathinfo() which will only do the right thing for
baserels.  It accidentally fails to fail for unparameterized paths, which
are the only ones postgres_fdw (thought it) was handling, but we really
need different APIs for the baserel and join cases.

In HEAD, the best thing to do seems to be to split up the baserel,
joinrel, and upperrel cases into three functions so that they can
have different APIs.  I haven't actually given create_foreign_join_path
a different API in this commit: we should spend a bit of time thinking
about just what we want to do there, since perhaps FDWs would want to
do something different from the build-up-a-join-pairwise approach that
get_joinrel_parampathinfo expects.  In the meantime, since postgres_fdw
isn't prepared to generate parameterized joins anyway, just give it a
defense against trying to plan joins with lateral refs.

In addition (and this is what triggered this whole mess) fix bug #15613
from Srinivasan S A, by teaching file_fdw and postgres_fdw that plain
baserel foreign paths still have outer refs if the relation has
lateral_relids.  Add some assertions in relnode.c to catch future
occurrences of the same error --- in particular, to catch other FDWs
doing that, but also as backstop against core-code mistakes like the
one fixed by commit bdd9a99aa.

Bug #15613 also needs to be fixed in the back branches, but the
appropriate fix will look quite a bit different there, since we don't
want to assume that existing FDWs get the word right away.

Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org
2019-02-07 13:11:12 -05:00
Tom Lane 4be058fe9e In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner.  It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it.  prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer.  We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about.  Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.

For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall.  However testing says that the
penalty is very small, close to the noise level.  In more complex queries,
this is able to find optimizations that we could not find before.

The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before).  To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)

Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.

Patch by me, reviewed by David Rowley and Mark Dilger

Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 17:54:23 -05:00
Michael Paquier bf491a9073 Disable WAL-skipping optimization for COPY on views and foreign tables
COPY can skip writing WAL when loading data on a table which has been
created in the same transaction as the one loading the data, however
this cannot work on views or foreign table as this would result in
trying to flush relation files which do not exist.  So disable the
optimization so as commands are able to work the same way with any
configuration of wal_level.

Tests are added to cover the different cases, which need to have
wal_level set to minimal to allow the problem to show up, and that is
not the default configuration.

Reported-by: Luis M. Carril, Etsuro Fujita
Author: Amit Langote, Michael Paquier
Reviewed-by: Etsuro Fujita
Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org
Backpatch-through: 10, where support for COPY on views has been added,
while v11 has added support for COPY on foreign tables.
2018-12-23 16:42:22 +09:00
Tom Lane 77d4d88afb Repair bogus EPQ plans generated for postgres_fdw foreign joins.
postgres_fdw's postgresGetForeignPlan() assumes without checking that the
outer_plan it's given for a join relation must have a NestLoop, MergeJoin,
or HashJoin node at the top.  That's been wrong at least since commit
4bbf6edfb (which could cause insertion of a Sort node on top) and it seems
like a pretty unsafe thing to Just Assume even without that.

Through blind good fortune, this doesn't seem to have any worse
consequences today than strange EXPLAIN output, but it's clearly trouble
waiting to happen.

To fix, test the node type explicitly before touching Join-specific
fields, and avoid jamming the new tlist into a node type that can't
do projection.  Export a new support function from createplan.c
to avoid building low-level knowledge about the latter into FDWs.

Back-patch to 9.6 where the faulty coding was added.  Note that the
associated regression test cases don't show any changes before v11,
apparently because the tests back-patched with 4bbf6edfb don't actually
exercise the problem case before then (there's no top-level Sort
in those plans).

Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us
2018-12-12 16:08:30 -05:00
Etsuro Fujita f8f6e44676 postgres_fdw: Improve cost and size estimation for aggregate pushdown.
In commit 7012b132d0, which added aggregate
pushdown to postgres_fdw, we didn't account for the evaluation cost and the
selectivity of HAVING quals attached to ForeignPaths performing aggregate
pushdown, as core had never accounted for that for AggPaths and GroupPaths.
And we didn't set these values of the locally-checked quals (ie, fpinfo's
local_conds_cost and local_conds_sel), which were initialized to zeros, but
since estimate_path_cost_size factors in these to estimate the result size
and the evaluation cost of such a ForeignPath when the use_remote_estimate
option is enabled, this caused it to produce underestimated results in that
case.

By commit 7b6c075471 core was changed so that
it accounts for the evaluation cost and the selectivity of HAVING quals in
aggregation paths, so change the postgres_fdw's aggregate pushdown code as
well as such.  This not only fixes the underestimation issue mentioned
above, but improves the estimation using local statistics in that function
when that option is disabled.

This would be a bug fix rather than an improvement, but apply it to HEAD
only to avoid destabilizing existing plan choices.

Author: Etsuro Fujita
Discussion: https://postgr.es/m/5BFD3EAD.2060301%40lab.ntt.co.jp
2018-12-04 17:18:58 +09:00