2007-03-13 01:33:44 +01:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* plancache.c
|
|
|
|
* Plan cache management.
|
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* The plan cache manager has two principal responsibilities: deciding when
|
|
|
|
* to use a generic plan versus a custom (parameter-value-specific) plan,
|
|
|
|
* and tracking whether cached plans need to be invalidated because of schema
|
|
|
|
* changes in the objects they depend on.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* The logic for choosing generic or custom plans is in choose_custom_plan,
|
|
|
|
* which see for comments.
|
|
|
|
*
|
|
|
|
* Cache invalidation is driven off sinval events. Any CachedPlanSource
|
|
|
|
* that matches the event is marked invalid, as is its generic CachedPlan
|
|
|
|
* if it has one. When (and if) the next demand for a cached plan occurs,
|
|
|
|
* parse analysis and rewrite is repeated to build a new valid query tree,
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
* and then planning is performed as normal. We also force re-analysis and
|
Reset plan->row_security_env and planUserId
In the plancache, we check if the environment we planned the query under
has changed in a way which requires us to re-plan, such as when the user
for whom the plan was prepared changes and RLS is being used (and,
therefore, there may be different policies to apply).
Unfortunately, while those values were set and checked, they were not
being reset when the query was re-planned and therefore, in cases where
we change role, re-plan, and then change role again, we weren't
re-planning again. This leads to potentially incorrect policies being
applied in cases where role-specific policies are used and a given query
is planned under one role and then executed under other roles, which
could happen under security definer functions or when a common user and
query is planned initially and then re-used across multiple SET ROLEs.
Further, extensions which made use of CopyCachedPlan() may suffer from
similar issues as the RLS-related fields were not properly copied as
part of the plan and therefore RevalidateCachedQuery() would copy in the
current settings without invalidating the query.
Fix by using the same approach used for 'search_path', where we set the
correct values in CompleteCachedPlan(), check them early on in
RevalidateCachedQuery() and then properly reset them if re-planning.
Also, copy through the values during CopyCachedPlan().
Pointed out by Ashutosh Bapat. Reviewed by Michael Paquier.
Back-patch to 9.5 where RLS was introduced.
Security: CVE-2016-2193
2016-03-28 15:03:20 +02:00
|
|
|
* re-planning if the active search_path is different from the previous time
|
|
|
|
* or, if RLS is involved, if the user changes or the RLS environment changes.
|
2011-09-16 06:42:53 +02:00
|
|
|
*
|
|
|
|
* Note that if the sinval was a result of user DDL actions, parse analysis
|
|
|
|
* could throw an error, for example if a column referenced by the query is
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
* no longer present. Another possibility is for the query's output tupdesc
|
|
|
|
* to change (for instance "SELECT *" might expand differently than before).
|
|
|
|
* The creator of a cached plan can specify whether it is allowable for the
|
|
|
|
* query to change output tupdesc on replan --- if so, it's up to the
|
2007-03-13 01:33:44 +01:00
|
|
|
* caller to notice changes and cope with them.
|
|
|
|
*
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
* Currently, we track exactly the dependencies of plans on relations,
|
|
|
|
* user-defined functions, and domains. On relcache invalidation events or
|
|
|
|
* pg_proc or pg_type syscache invalidation events, we invalidate just those
|
|
|
|
* plans that depend on the particular object being modified. (Note: this
|
|
|
|
* scheme assumes that any table modification that requires replanning will
|
|
|
|
* generate a relcache inval event.) We also watch for inval events on
|
|
|
|
* certain other system catalogs, such as pg_namespace; but for them, our
|
|
|
|
* response is just to invalidate all plans. We expect updates on those
|
|
|
|
* catalogs to be infrequent enough that more-detailed tracking is not worth
|
|
|
|
* the effort.
|
|
|
|
*
|
|
|
|
* In addition to full-fledged query plans, we provide a facility for
|
|
|
|
* detecting invalidations of simple scalar expressions. This is fairly
|
|
|
|
* bare-bones; it's the caller's responsibility to build a new expression
|
|
|
|
* if the old one gets invalidated.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
|
|
|
*
|
2022-01-08 01:04:57 +01:00
|
|
|
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
|
2007-03-13 01:33:44 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/backend/utils/cache/plancache.c
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#include "postgres.h"
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
#include <limits.h>
|
|
|
|
|
2007-09-20 19:56:33 +02:00
|
|
|
#include "access/transam.h"
|
2007-03-23 20:53:52 +01:00
|
|
|
#include "catalog/namespace.h"
|
2007-03-13 01:33:44 +01:00
|
|
|
#include "executor/executor.h"
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
#include "miscadmin.h"
|
2008-08-26 00:42:34 +02:00
|
|
|
#include "nodes/nodeFuncs.h"
|
2019-01-29 21:48:51 +01:00
|
|
|
#include "optimizer/optimizer.h"
|
2011-09-25 23:33:32 +02:00
|
|
|
#include "parser/analyze.h"
|
Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 03:26:45 +01:00
|
|
|
#include "parser/parsetree.h"
|
2007-03-13 01:33:44 +01:00
|
|
|
#include "storage/lmgr.h"
|
|
|
|
#include "tcop/pquery.h"
|
|
|
|
#include "tcop/utility.h"
|
|
|
|
#include "utils/inval.h"
|
|
|
|
#include "utils/memutils.h"
|
2012-08-29 00:02:07 +02:00
|
|
|
#include "utils/resowner_private.h"
|
Fix column-privilege leak in error-message paths
While building error messages to return to the user,
BuildIndexValueDescription, ExecBuildSlotValueDescription and
ri_ReportViolation would happily include the entire key or entire row in
the result returned to the user, even if the user didn't have access to
view all of the columns being included.
Instead, include only those columns which the user is providing or which
the user has select rights on. If the user does not have any rights
to view the table or any of the columns involved then no detail is
provided and a NULL value is returned from BuildIndexValueDescription
and ExecBuildSlotValueDescription. Note that, for key cases, the user
must have access to all of the columns for the key to be shown; a
partial key will not be returned.
Further, in master only, do not return any data for cases where row
security is enabled on the relation and row security should be applied
for the user. This required a bit of refactoring and moving of things
around related to RLS- note the addition of utils/misc/rls.c.
Back-patch all the way, as column-level privileges are now in all
supported versions.
This has been assigned CVE-2014-8161, but since the issue and the patch
have already been publicized on pgsql-hackers, there's no point in trying
to hide this commit.
2015-01-12 23:04:11 +01:00
|
|
|
#include "utils/rls.h"
|
2008-03-26 19:48:59 +01:00
|
|
|
#include "utils/snapmgr.h"
|
2008-09-09 20:58:09 +02:00
|
|
|
#include "utils/syscache.h"
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
/*
|
|
|
|
* We must skip "overhead" operations that involve database access when the
|
|
|
|
* cached plan's subject statement is a transaction control command.
|
|
|
|
*/
|
|
|
|
#define IsTransactionStmtPlan(plansource) \
|
|
|
|
((plansource)->raw_parse_tree && \
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
IsA((plansource)->raw_parse_tree->stmt, TransactionStmt))
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* This is the head of the backend's list of "saved" CachedPlanSources (i.e.,
|
|
|
|
* those that are in long-lived storage and are examined for sinval events).
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
* We use a dlist instead of separate List cells so that we can guarantee
|
|
|
|
* to save a CachedPlanSource without error.
|
|
|
|
*/
|
|
|
|
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This is the head of the backend's list of CachedExpressions.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
static void ReleaseGenericPlan(CachedPlanSource *plansource);
|
2017-04-01 06:17:18 +02:00
|
|
|
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
|
|
|
|
QueryEnvironment *queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
static bool CheckCachedPlan(CachedPlanSource *plansource);
|
|
|
|
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
|
2017-04-01 06:17:18 +02:00
|
|
|
ParamListInfo boundParams, QueryEnvironment *queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
static bool choose_custom_plan(CachedPlanSource *plansource,
|
|
|
|
ParamListInfo boundParams);
|
2013-08-24 21:14:17 +02:00
|
|
|
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
static Query *QueryListGetPrimaryStmt(List *stmts);
|
2007-03-13 01:33:44 +01:00
|
|
|
static void AcquireExecutorLocks(List *stmt_list, bool acquire);
|
|
|
|
static void AcquirePlannerLocks(List *stmt_list, bool acquire);
|
2008-09-09 20:58:09 +02:00
|
|
|
static void ScanQueryForLocks(Query *parsetree, bool acquire);
|
|
|
|
static bool ScanQueryWalker(Node *node, bool *acquire);
|
2011-09-16 06:42:53 +02:00
|
|
|
static TupleDesc PlanCacheComputeResultDesc(List *stmt_list);
|
2008-09-09 20:58:09 +02:00
|
|
|
static void PlanCacheRelCallback(Datum arg, Oid relid);
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
static void PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue);
|
2011-08-17 01:27:46 +02:00
|
|
|
static void PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2018-07-16 13:35:41 +02:00
|
|
|
/* GUC parameter */
|
Create an RTE field to record the query's lock mode for each relation.
Add RangeTblEntry.rellockmode, which records the appropriate lock mode for
each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock,
or RowExclusiveLock depending on the RTE's role in the query).
This patch creates the field and makes all creators of RTE nodes fill it
in reasonably, but for the moment nothing much is done with it. The plan
is to replace assorted post-parser logic that re-determines the right
lockmode to use with simple uses of rte->rellockmode. For now, just add
Asserts in each of those places that the rellockmode matches what they are
computing today. (In some cases the match isn't perfect, so the Asserts
are weaker than you might expect; but this seems OK, as per discussion.)
This passes check-world for me, but it seems worth pushing in this state
to see if the buildfarm finds any problems in cases I failed to test.
catversion bump due to change of stored rules.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
2018-09-30 19:55:51 +02:00
|
|
|
int plan_cache_mode;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* InitPlanCache: initialize module during InitPostgres.
|
|
|
|
*
|
2008-09-09 20:58:09 +02:00
|
|
|
* All we need to do is hook into inval.c's callback lists.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
InitPlanCache(void)
|
|
|
|
{
|
2008-09-09 20:58:09 +02:00
|
|
|
CacheRegisterRelcacheCallback(PlanCacheRelCallback, (Datum) 0);
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
CacheRegisterSyscacheCallback(PROCOID, PlanCacheObjectCallback, (Datum) 0);
|
|
|
|
CacheRegisterSyscacheCallback(TYPEOID, PlanCacheObjectCallback, (Datum) 0);
|
2008-09-09 20:58:09 +02:00
|
|
|
CacheRegisterSyscacheCallback(NAMESPACEOID, PlanCacheSysCallback, (Datum) 0);
|
|
|
|
CacheRegisterSyscacheCallback(OPEROID, PlanCacheSysCallback, (Datum) 0);
|
|
|
|
CacheRegisterSyscacheCallback(AMOPOPID, PlanCacheSysCallback, (Datum) 0);
|
Invalidate cached plans on FDW option changes.
This fixes problems where a plan must change but fails to do so,
as seen in a bug report from Rajkumar Raghuwanshi.
For ALTER FOREIGN TABLE OPTIONS, do this through the standard method of
forcing a relcache flush on the table. For ALTER FOREIGN DATA WRAPPER
and ALTER SERVER, just flush the whole plan cache on any change in
pg_foreign_data_wrapper or pg_foreign_server. That matches the way
we handle some other low-probability cases such as opclass changes, and
it's unclear that the case arises often enough to be worth working harder.
Besides, that gives a patch that is simple enough to back-patch with
confidence.
Back-patch to 9.3. In principle we could apply the code change to 9.2 as
well, but (a) we lack postgres_fdw to test it with, (b) it's doubtful that
anyone is doing anything exciting enough with FDWs that far back to need
this desperately, and (c) the patch doesn't apply cleanly.
Patch originally by Amit Langote, reviewed by Etsuro Fujita and Ashutosh
Bapat, who each contributed substantial changes as well.
Discussion: https://postgr.es/m/CAKcux6m5cA6rRPTKkqVdJ-R=KKDfe35Q_ZuUqxDSV_4hwga=og@mail.gmail.com
2017-01-06 20:12:52 +01:00
|
|
|
CacheRegisterSyscacheCallback(FOREIGNSERVEROID, PlanCacheSysCallback, (Datum) 0);
|
|
|
|
CacheRegisterSyscacheCallback(FOREIGNDATAWRAPPEROID, PlanCacheSysCallback, (Datum) 0);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* CreateCachedPlan: initially create a plan cache entry.
|
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Creation of a cached plan is divided into two steps, CreateCachedPlan and
|
|
|
|
* CompleteCachedPlan. CreateCachedPlan should be called after running the
|
|
|
|
* query through raw_parser, but before doing parse analysis and rewrite;
|
|
|
|
* CompleteCachedPlan is called after that. The reason for this arrangement
|
|
|
|
* is that it can save one round of copying of the raw parse tree, since
|
|
|
|
* the parser will normally scribble on the raw parse tree. Callers would
|
|
|
|
* otherwise need to make an extra copy of the parse tree to ensure they
|
|
|
|
* still had a clean copy to present at plan cache creation time.
|
|
|
|
*
|
|
|
|
* All arguments presented to CreateCachedPlan are copied into a memory
|
|
|
|
* context created as a child of the call-time CurrentMemoryContext, which
|
|
|
|
* should be a reasonably short-lived working context that will go away in
|
|
|
|
* event of an error. This ensures that the cached plan data structure will
|
|
|
|
* likewise disappear if an error occurs before we have fully constructed it.
|
|
|
|
* Once constructed, the cached plan can be made longer-lived, if needed,
|
|
|
|
* by calling SaveCachedPlan.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
2014-11-12 21:58:37 +01:00
|
|
|
* raw_parse_tree: output of raw_parser(), or NULL if empty query
|
2011-09-16 06:42:53 +02:00
|
|
|
* query_string: original query text
|
2020-03-02 22:19:51 +01:00
|
|
|
* commandTag: command tag for query, or UNKNOWN if empty query
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
CachedPlanSource *
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
CreateCachedPlan(RawStmt *raw_parse_tree,
|
2007-03-13 01:33:44 +01:00
|
|
|
const char *query_string,
|
2020-03-02 22:19:51 +01:00
|
|
|
CommandTag commandTag)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
|
|
|
CachedPlanSource *plansource;
|
|
|
|
MemoryContext source_context;
|
|
|
|
MemoryContext oldcxt;
|
|
|
|
|
Adjust things so that the query_string of a cached plan and the sourceText of
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement. That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text. Fix it to copy the source text over
from the cached plan. Add Asserts in the places that set up cached plans and
portals to reject null source strings, and simplify a bunch of places that
formerly needed to guard against nulls.
There may be a few places that cons up statements for execution without
having any source text at all; I found one such in ConvertTriggerToFK().
It seems sufficient to inject a phony source string in such a case,
for instance
ProcessUtility((Node *) atstmt,
"(generated ALTER TABLE ADD FOREIGN KEY command)",
NULL, false, None_Receiver, NULL);
We should take a second look at the usage of debug_query_string,
particularly the recently added current_query() SQL function.
ITAGAKI Takahiro and Tom Lane
2008-07-18 22:26:06 +02:00
|
|
|
Assert(query_string != NULL); /* required as of 8.4 */
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
|
|
|
* Make a dedicated memory context for the CachedPlanSource and its
|
2011-09-16 06:42:53 +02:00
|
|
|
* permanent subsidiary data. It's probably not going to be large, but
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
* just in case, allow it to grow large. Initially it's a child of the
|
|
|
|
* caller's context (which we assume to be transient), so that it will be
|
|
|
|
* cleaned up on error.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
source_context = AllocSetContextCreate(CurrentMemoryContext,
|
2007-03-13 01:33:44 +01:00
|
|
|
"CachedPlanSource",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2007-03-23 20:53:52 +01:00
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
|
|
|
* Create and fill the CachedPlanSource struct within the new context.
|
2011-09-16 06:42:53 +02:00
|
|
|
* Most fields are just left empty for the moment.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
oldcxt = MemoryContextSwitchTo(source_context);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
|
|
|
|
plansource->magic = CACHEDPLANSOURCE_MAGIC;
|
2007-03-13 01:33:44 +01:00
|
|
|
plansource->raw_parse_tree = copyObject(raw_parse_tree);
|
Adjust things so that the query_string of a cached plan and the sourceText of
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement. That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text. Fix it to copy the source text over
from the cached plan. Add Asserts in the places that set up cached plans and
portals to reject null source strings, and simplify a bunch of places that
formerly needed to guard against nulls.
There may be a few places that cons up statements for execution without
having any source text at all; I found one such in ConvertTriggerToFK().
It seems sufficient to inject a phony source string in such a case,
for instance
ProcessUtility((Node *) atstmt,
"(generated ALTER TABLE ADD FOREIGN KEY command)",
NULL, false, None_Receiver, NULL);
We should take a second look at the usage of debug_query_string,
particularly the recently added current_query() SQL function.
ITAGAKI Takahiro and Tom Lane
2008-07-18 22:26:06 +02:00
|
|
|
plansource->query_string = pstrdup(query_string);
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
MemoryContextSetIdentifier(source_context, plansource->query_string);
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->commandTag = commandTag;
|
|
|
|
plansource->param_types = NULL;
|
|
|
|
plansource->num_params = 0;
|
2009-11-04 23:26:08 +01:00
|
|
|
plansource->parserSetup = NULL;
|
|
|
|
plansource->parserSetupArg = NULL;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->cursor_options = 0;
|
|
|
|
plansource->fixed_result = false;
|
|
|
|
plansource->resultDesc = NULL;
|
2007-03-13 01:33:44 +01:00
|
|
|
plansource->context = source_context;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->query_list = NIL;
|
|
|
|
plansource->relationOids = NIL;
|
|
|
|
plansource->invalItems = NIL;
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
plansource->search_path = NULL;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->query_context = NULL;
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
plansource->rewriteRoleId = InvalidOid;
|
|
|
|
plansource->rewriteRowSecurity = false;
|
|
|
|
plansource->dependsOnRLS = false;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->gplan = NULL;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
plansource->is_oneshot = false;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->is_complete = false;
|
|
|
|
plansource->is_saved = false;
|
|
|
|
plansource->is_valid = false;
|
|
|
|
plansource->generation = 0;
|
|
|
|
plansource->generic_cost = -1;
|
|
|
|
plansource->total_custom_cost = 0;
|
2020-07-20 04:55:50 +02:00
|
|
|
plansource->num_generic_plans = 0;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->num_custom_plans = 0;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
|
|
|
return plansource;
|
|
|
|
}
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/*
|
|
|
|
* CreateOneShotCachedPlan: initially create a one-shot plan cache entry.
|
|
|
|
*
|
|
|
|
* This variant of CreateCachedPlan creates a plan cache entry that is meant
|
|
|
|
* to be used only once. No data copying occurs: all data structures remain
|
|
|
|
* in the caller's memory context (which typically should get cleared after
|
|
|
|
* completing execution). The CachedPlanSource struct itself is also created
|
|
|
|
* in that context.
|
|
|
|
*
|
|
|
|
* A one-shot plan cannot be saved or copied, since we make no effort to
|
|
|
|
* preserve the raw parse tree unmodified. There is also no support for
|
|
|
|
* invalidation, so plan use must be completed in the current transaction,
|
|
|
|
* and DDL that might invalidate the querytree_list must be avoided as well.
|
|
|
|
*
|
2014-11-12 21:58:37 +01:00
|
|
|
* raw_parse_tree: output of raw_parser(), or NULL if empty query
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* query_string: original query text
|
2020-03-02 22:19:51 +01:00
|
|
|
* commandTag: command tag for query, or NULL if empty query
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
*/
|
|
|
|
CachedPlanSource *
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
CreateOneShotCachedPlan(RawStmt *raw_parse_tree,
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
const char *query_string,
|
2020-03-02 22:19:51 +01:00
|
|
|
CommandTag commandTag)
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
{
|
|
|
|
CachedPlanSource *plansource;
|
|
|
|
|
|
|
|
Assert(query_string != NULL); /* required as of 8.4 */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create and fill the CachedPlanSource struct within the caller's memory
|
|
|
|
* context. Most fields are just left empty for the moment.
|
|
|
|
*/
|
|
|
|
plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
|
|
|
|
plansource->magic = CACHEDPLANSOURCE_MAGIC;
|
|
|
|
plansource->raw_parse_tree = raw_parse_tree;
|
|
|
|
plansource->query_string = query_string;
|
|
|
|
plansource->commandTag = commandTag;
|
|
|
|
plansource->param_types = NULL;
|
|
|
|
plansource->num_params = 0;
|
|
|
|
plansource->parserSetup = NULL;
|
|
|
|
plansource->parserSetupArg = NULL;
|
|
|
|
plansource->cursor_options = 0;
|
|
|
|
plansource->fixed_result = false;
|
|
|
|
plansource->resultDesc = NULL;
|
|
|
|
plansource->context = CurrentMemoryContext;
|
|
|
|
plansource->query_list = NIL;
|
|
|
|
plansource->relationOids = NIL;
|
|
|
|
plansource->invalItems = NIL;
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
plansource->search_path = NULL;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
plansource->query_context = NULL;
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
plansource->rewriteRoleId = InvalidOid;
|
|
|
|
plansource->rewriteRowSecurity = false;
|
|
|
|
plansource->dependsOnRLS = false;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
plansource->gplan = NULL;
|
|
|
|
plansource->is_oneshot = true;
|
|
|
|
plansource->is_complete = false;
|
|
|
|
plansource->is_saved = false;
|
|
|
|
plansource->is_valid = false;
|
|
|
|
plansource->generation = 0;
|
|
|
|
plansource->generic_cost = -1;
|
|
|
|
plansource->total_custom_cost = 0;
|
2020-07-20 04:55:50 +02:00
|
|
|
plansource->num_generic_plans = 0;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
plansource->num_custom_plans = 0;
|
|
|
|
|
|
|
|
return plansource;
|
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* CompleteCachedPlan: second step of creating a plan cache entry.
|
|
|
|
*
|
|
|
|
* Pass in the analyzed-and-rewritten form of the query, as well as the
|
|
|
|
* required subsidiary data about parameters and such. All passed values will
|
|
|
|
* be copied into the CachedPlanSource's memory, except as specified below.
|
|
|
|
* After this is called, GetCachedPlan can be called to obtain a plan, and
|
|
|
|
* optionally the CachedPlanSource can be saved using SaveCachedPlan.
|
|
|
|
*
|
|
|
|
* If querytree_context is not NULL, the querytree_list must be stored in that
|
|
|
|
* context (but the other parameters need not be). The querytree_list is not
|
|
|
|
* copied, rather the given context is kept as the initial query_context of
|
|
|
|
* the CachedPlanSource. (It should have been created as a child of the
|
|
|
|
* caller's working memory context, but it will now be reparented to belong
|
|
|
|
* to the CachedPlanSource.) The querytree_context is normally the context in
|
|
|
|
* which the caller did raw parsing and parse analysis. This approach saves
|
|
|
|
* one tree copying step compared to passing NULL, but leaves lots of extra
|
|
|
|
* cruft in the query_context, namely whatever extraneous stuff parse analysis
|
|
|
|
* created, as well as whatever went unused from the raw parse tree. Using
|
|
|
|
* this option is a space-for-time tradeoff that is appropriate if the
|
|
|
|
* CachedPlanSource is not expected to survive long.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* plancache.c cannot know how to copy the data referenced by parserSetupArg,
|
|
|
|
* and it would often be inappropriate to do so anyway. When using that
|
|
|
|
* option, it is caller's responsibility that the referenced data remains
|
|
|
|
* valid for as long as the CachedPlanSource exists.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* If the CachedPlanSource is a "oneshot" plan, then no querytree copying
|
|
|
|
* occurs at all, and querytree_context is ignored; it is caller's
|
|
|
|
* responsibility that the passed querytree_list is sufficiently long-lived.
|
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* plansource: structure returned by CreateCachedPlan
|
|
|
|
* querytree_list: analyzed-and-rewritten form of query (list of Query nodes)
|
|
|
|
* querytree_context: memory context containing querytree_list,
|
|
|
|
* or NULL to copy querytree_list into a fresh context
|
|
|
|
* param_types: array of fixed parameter type OIDs, or NULL if none
|
|
|
|
* num_params: number of fixed parameters
|
|
|
|
* parserSetup: alternate method for handling query parameters
|
|
|
|
* parserSetupArg: data to pass to parserSetup
|
|
|
|
* cursor_options: options bitmask to pass to planner
|
2017-08-16 06:22:32 +02:00
|
|
|
* fixed_result: true to disallow future changes in query's result tupdesc
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
void
|
|
|
|
CompleteCachedPlan(CachedPlanSource *plansource,
|
|
|
|
List *querytree_list,
|
|
|
|
MemoryContext querytree_context,
|
|
|
|
Oid *param_types,
|
|
|
|
int num_params,
|
|
|
|
ParserSetupHook parserSetup,
|
|
|
|
void *parserSetupArg,
|
|
|
|
int cursor_options,
|
|
|
|
bool fixed_result)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContext source_context = plansource->context;
|
|
|
|
MemoryContext oldcxt = CurrentMemoryContext;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Assert caller is doing things in a sane order */
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(!plansource->is_complete);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If caller supplied a querytree_context, reparent it underneath the
|
|
|
|
* CachedPlanSource's context; otherwise, create a suitable context and
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* copy the querytree_list into it. But no data copying should be done
|
|
|
|
* for one-shot plans; for those, assume the passed querytree_list is
|
|
|
|
* sufficiently long-lived.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
if (plansource->is_oneshot)
|
|
|
|
{
|
|
|
|
querytree_context = CurrentMemoryContext;
|
|
|
|
}
|
|
|
|
else if (querytree_context != NULL)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
|
|
|
MemoryContextSetParent(querytree_context, source_context);
|
|
|
|
MemoryContextSwitchTo(querytree_context);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Again, it's a good bet the querytree_context can be small */
|
|
|
|
querytree_context = AllocSetContextCreate(source_context,
|
|
|
|
"CachedPlanQuery",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSwitchTo(querytree_context);
|
2017-03-09 21:18:59 +01:00
|
|
|
querytree_list = copyObject(querytree_list);
|
2011-09-16 06:42:53 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
plansource->query_context = querytree_context;
|
|
|
|
plansource->query_list = querytree_list;
|
Adjust things so that the query_string of a cached plan and the sourceText of
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement. That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text. Fix it to copy the source text over
from the cached plan. Add Asserts in the places that set up cached plans and
portals to reject null source strings, and simplify a bunch of places that
formerly needed to guard against nulls.
There may be a few places that cons up statements for execution without
having any source text at all; I found one such in ConvertTriggerToFK().
It seems sufficient to inject a phony source string in such a case,
for instance
ProcessUtility((Node *) atstmt,
"(generated ALTER TABLE ADD FOREIGN KEY command)",
NULL, false, None_Receiver, NULL);
We should take a second look at the usage of debug_query_string,
particularly the recently added current_query() SQL function.
ITAGAKI Takahiro and Tom Lane
2008-07-18 22:26:06 +02:00
|
|
|
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
if (!plansource->is_oneshot && !IsTransactionStmtPlan(plansource))
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Use the planner machinery to extract dependencies. Data is saved
|
|
|
|
* in query_context. (We assume that not a lot of extra cruft is
|
|
|
|
* created by this call.) We can skip this for one-shot plans, and
|
|
|
|
* transaction control commands have no such dependencies anyway.
|
|
|
|
*/
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
extract_query_dependencies((Node *) querytree_list,
|
|
|
|
&plansource->relationOids,
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
&plansource->invalItems,
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
&plansource->dependsOnRLS);
|
|
|
|
|
|
|
|
/* Update RLS info as well. */
|
|
|
|
plansource->rewriteRoleId = GetUserId();
|
|
|
|
plansource->rewriteRowSecurity = row_security;
|
2007-03-23 20:53:52 +01:00
|
|
|
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
/*
|
|
|
|
* Also save the current search_path in the query_context. (This
|
|
|
|
* should not generate much extra cruft either, since almost certainly
|
|
|
|
* the path is already valid.) Again, we don't really need this for
|
|
|
|
* one-shot plans; and we *must* skip this for transaction control
|
|
|
|
* commands, because this could result in catalog accesses.
|
|
|
|
*/
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
plansource->search_path = GetOverrideSearchPath(querytree_context);
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
}
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Save the final parameter types (or other parameter specification data)
|
|
|
|
* into the source_context, as well as our other parameters. Also save
|
|
|
|
* the result tuple descriptor.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSwitchTo(source_context);
|
|
|
|
|
|
|
|
if (num_params > 0)
|
|
|
|
{
|
|
|
|
plansource->param_types = (Oid *) palloc(num_params * sizeof(Oid));
|
|
|
|
memcpy(plansource->param_types, param_types, num_params * sizeof(Oid));
|
|
|
|
}
|
|
|
|
else
|
|
|
|
plansource->param_types = NULL;
|
2007-03-13 01:33:44 +01:00
|
|
|
plansource->num_params = num_params;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->parserSetup = parserSetup;
|
|
|
|
plansource->parserSetupArg = parserSetupArg;
|
2007-04-16 20:21:07 +02:00
|
|
|
plansource->cursor_options = cursor_options;
|
2007-03-13 01:33:44 +01:00
|
|
|
plansource->fixed_result = fixed_result;
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->resultDesc = PlanCacheComputeResultDesc(querytree_list);
|
|
|
|
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->is_complete = true;
|
|
|
|
plansource->is_valid = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SaveCachedPlan: save a cached plan permanently
|
|
|
|
*
|
|
|
|
* This function moves the cached plan underneath CacheMemoryContext (making
|
|
|
|
* it live for the life of the backend, unless explicitly dropped), and adds
|
|
|
|
* it to the list of cached plans that are checked for invalidation when an
|
|
|
|
* sinval event occurs.
|
|
|
|
*
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* This is guaranteed not to throw error, except for the caller-error case
|
|
|
|
* of trying to save a one-shot plan. Callers typically depend on that
|
2011-09-16 06:42:53 +02:00
|
|
|
* since this is called just before or just after adding a pointer to the
|
|
|
|
* CachedPlanSource to some permanent data structure of their own. Up until
|
|
|
|
* this is done, a CachedPlanSource is just transient data that will go away
|
|
|
|
* automatically on transaction abort.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
SaveCachedPlan(CachedPlanSource *plansource)
|
|
|
|
{
|
|
|
|
/* Assert caller is doing things in a sane order */
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plansource->is_complete);
|
|
|
|
Assert(!plansource->is_saved);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/* This seems worth a real test, though */
|
|
|
|
if (plansource->is_oneshot)
|
|
|
|
elog(ERROR, "cannot save one-shot cached plan");
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* In typical use, this function would be called before generating any
|
|
|
|
* plans from the CachedPlanSource. If there is a generic plan, moving it
|
|
|
|
* into CacheMemoryContext would be pretty risky since it's unclear
|
|
|
|
* whether the caller has taken suitable care with making references
|
|
|
|
* long-lived. Best thing to do seems to be to discard the plan.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
ReleaseGenericPlan(plansource);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Reparent the source memory context under CacheMemoryContext so that it
|
|
|
|
* will live indefinitely. The query_context follows along since it's
|
|
|
|
* already a child of the other one.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSetParent(plansource->context, CacheMemoryContext);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Add the entry to the global list of cached plans.
|
|
|
|
*/
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_push_tail(&saved_plan_list, &plansource->node);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->is_saved = true;
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
2009-11-04 23:26:08 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* DropCachedPlan: destroy a cached plan.
|
2009-11-04 23:26:08 +01:00
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Actually this only destroys the CachedPlanSource: any referenced CachedPlan
|
|
|
|
* is released, but not destroyed until its refcount goes to zero. That
|
|
|
|
* handles the situation where DropCachedPlan is called while the plan is
|
|
|
|
* still in use.
|
2009-11-04 23:26:08 +01:00
|
|
|
*/
|
|
|
|
void
|
2011-09-16 06:42:53 +02:00
|
|
|
DropCachedPlan(CachedPlanSource *plansource)
|
2009-11-04 23:26:08 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
|
|
|
|
/* If it's been saved, remove it from the list */
|
|
|
|
if (plansource->is_saved)
|
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_delete(&plansource->node);
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->is_saved = false;
|
|
|
|
}
|
|
|
|
|
2019-07-01 03:00:23 +02:00
|
|
|
/* Decrement generic CachedPlan's refcount and drop if no longer needed */
|
2011-09-16 06:42:53 +02:00
|
|
|
ReleaseGenericPlan(plansource);
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/* Mark it no longer valid */
|
|
|
|
plansource->magic = 0;
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Remove the CachedPlanSource and all subsidiary data (including the
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* query_context if any). But if it's a one-shot we can't free anything.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
if (!plansource->is_oneshot)
|
|
|
|
MemoryContextDelete(plansource->context);
|
2009-11-04 23:26:08 +01:00
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* ReleaseGenericPlan: release a CachedPlanSource's generic plan, if any.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
static void
|
2011-09-16 06:42:53 +02:00
|
|
|
ReleaseGenericPlan(CachedPlanSource *plansource)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Be paranoid about the possibility that ReleaseCachedPlan fails */
|
|
|
|
if (plansource->gplan)
|
|
|
|
{
|
|
|
|
CachedPlan *plan = plansource->gplan;
|
|
|
|
|
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
|
|
|
plansource->gplan = NULL;
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
ReleaseCachedPlan(plan, NULL);
|
2011-09-16 06:42:53 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* RevalidateCachedQuery: ensure validity of analyzed-and-rewritten query tree.
|
|
|
|
*
|
|
|
|
* What we do here is re-acquire locks and redo parse analysis if necessary.
|
|
|
|
* On return, the query_list is valid and we have sufficient locks to begin
|
|
|
|
* planning.
|
|
|
|
*
|
|
|
|
* If any parse analysis activity is required, the caller's memory context is
|
|
|
|
* used for that work.
|
|
|
|
*
|
|
|
|
* The result value is the transient analyzed-and-rewritten query tree if we
|
|
|
|
* had to do re-analysis, and NIL otherwise. (This is returned just to save
|
|
|
|
* a tree copying step in a subsequent BuildCachedPlan call.)
|
|
|
|
*/
|
|
|
|
static List *
|
2017-04-01 06:17:18 +02:00
|
|
|
RevalidateCachedQuery(CachedPlanSource *plansource,
|
|
|
|
QueryEnvironment *queryEnv)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
|
|
|
bool snapshot_set;
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
RawStmt *rawtree;
|
2011-09-16 06:42:53 +02:00
|
|
|
List *tlist; /* transient query-tree list */
|
|
|
|
List *qlist; /* permanent query-tree list */
|
|
|
|
TupleDesc resultDesc;
|
|
|
|
MemoryContext querytree_context;
|
2007-03-13 01:33:44 +01:00
|
|
|
MemoryContext oldcxt;
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/*
|
|
|
|
* For one-shot plans, we do not support revalidation checking; it's
|
|
|
|
* assumed the query is parsed, planned, and executed in one transaction,
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
* so that no lock re-acquisition is necessary. Also, there is never any
|
|
|
|
* need to revalidate plans for transaction control commands (and we
|
|
|
|
* mustn't risk any catalog accesses when handling those).
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
*/
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
if (plansource->is_oneshot || IsTransactionStmtPlan(plansource))
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
{
|
|
|
|
Assert(plansource->is_valid);
|
|
|
|
return NIL;
|
|
|
|
}
|
|
|
|
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
/*
|
|
|
|
* If the query is currently valid, we should have a saved search_path ---
|
|
|
|
* check to see if that matches the current environment. If not, we want
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
* to force replan.
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
*/
|
|
|
|
if (plansource->is_valid)
|
|
|
|
{
|
|
|
|
Assert(plansource->search_path != NULL);
|
|
|
|
if (!OverrideSearchPathMatchesCurrent(plansource->search_path))
|
|
|
|
{
|
|
|
|
/* Invalidate the querytree and generic plan */
|
|
|
|
plansource->is_valid = false;
|
|
|
|
if (plansource->gplan)
|
|
|
|
plansource->gplan->is_valid = false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
/*
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
* If the query rewrite phase had a possible RLS dependency, we must redo
|
|
|
|
* it if either the role or the row_security setting has changed.
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
*/
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
if (plansource->is_valid && plansource->dependsOnRLS &&
|
|
|
|
(plansource->rewriteRoleId != GetUserId() ||
|
|
|
|
plansource->rewriteRowSecurity != row_security))
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
plansource->is_valid = false;
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* If the query is currently valid, acquire locks on the referenced
|
|
|
|
* objects; then check again. We need to do it this way to cover the race
|
|
|
|
* condition that an invalidation message arrives before we get the locks.
|
|
|
|
*/
|
|
|
|
if (plansource->is_valid)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
AcquirePlannerLocks(plansource->query_list, true);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* By now, if any invalidation has happened, the inval callback
|
|
|
|
* functions will have marked the query invalid.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
if (plansource->is_valid)
|
|
|
|
{
|
|
|
|
/* Successfully revalidated and locked the query. */
|
|
|
|
return NIL;
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2017-03-14 16:38:30 +01:00
|
|
|
/* Oops, the race case happened. Release useless locks. */
|
2011-09-16 06:42:53 +02:00
|
|
|
AcquirePlannerLocks(plansource->query_list, false);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Discard the no-longer-useful query tree. (Note: we don't want to do
|
|
|
|
* this any earlier, else we'd not have been able to release locks
|
|
|
|
* correctly in the race condition case.)
|
|
|
|
*/
|
|
|
|
plansource->is_valid = false;
|
|
|
|
plansource->query_list = NIL;
|
|
|
|
plansource->relationOids = NIL;
|
|
|
|
plansource->invalItems = NIL;
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
plansource->search_path = NULL;
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Free the query_context. We don't really expect MemoryContextDelete to
|
|
|
|
* fail, but just in case, make sure the CachedPlanSource is left in a
|
|
|
|
* reasonably sane state. (The generic plan won't get unlinked yet, but
|
|
|
|
* that's acceptable.)
|
|
|
|
*/
|
|
|
|
if (plansource->query_context)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContext qcxt = plansource->query_context;
|
|
|
|
|
|
|
|
plansource->query_context = NULL;
|
|
|
|
MemoryContextDelete(qcxt);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Drop the generic plan reference if any */
|
|
|
|
ReleaseGenericPlan(plansource);
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Now re-do parse analysis and rewrite. This not incidentally acquires
|
|
|
|
* the locks we need to do planning safely.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(plansource->is_complete);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If a snapshot is already set (the normal case), we can just use that
|
|
|
|
* for parsing/planning. But if it isn't, install one. Note: no point in
|
|
|
|
* checking whether parse analysis requires a snapshot; utility commands
|
|
|
|
* don't have invalidatable plans, so we'd not get here for such a
|
|
|
|
* command.
|
|
|
|
*/
|
|
|
|
snapshot_set = false;
|
|
|
|
if (!ActiveSnapshotSet())
|
2007-09-20 19:56:33 +02:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
PushActiveSnapshot(GetTransactionSnapshot());
|
|
|
|
snapshot_set = true;
|
2007-09-20 19:56:33 +02:00
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Run parse analysis and rule rewriting. The parser tends to scribble on
|
|
|
|
* its input, so we must copy the raw parse tree to prevent corruption of
|
|
|
|
* the cache.
|
|
|
|
*/
|
|
|
|
rawtree = copyObject(plansource->raw_parse_tree);
|
2014-11-12 21:58:37 +01:00
|
|
|
if (rawtree == NULL)
|
|
|
|
tlist = NIL;
|
|
|
|
else if (plansource->parserSetup != NULL)
|
2022-03-04 14:49:37 +01:00
|
|
|
tlist = pg_analyze_and_rewrite_withcb(rawtree,
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->query_string,
|
|
|
|
plansource->parserSetup,
|
2017-04-01 06:17:18 +02:00
|
|
|
plansource->parserSetupArg,
|
|
|
|
queryEnv);
|
2007-09-20 19:56:33 +02:00
|
|
|
else
|
2022-03-04 14:49:37 +01:00
|
|
|
tlist = pg_analyze_and_rewrite_fixedparams(rawtree,
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->query_string,
|
|
|
|
plansource->param_types,
|
2017-04-01 06:17:18 +02:00
|
|
|
plansource->num_params,
|
|
|
|
queryEnv);
|
2010-01-15 23:36:35 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Release snapshot if we got one */
|
|
|
|
if (snapshot_set)
|
|
|
|
PopActiveSnapshot();
|
2010-01-15 23:36:35 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Check or update the result tupdesc. XXX should we use a weaker
|
|
|
|
* condition than equalTupleDescs() here?
|
|
|
|
*
|
|
|
|
* We assume the parameter types didn't change from the first time, so no
|
|
|
|
* need to update that.
|
|
|
|
*/
|
|
|
|
resultDesc = PlanCacheComputeResultDesc(tlist);
|
|
|
|
if (resultDesc == NULL && plansource->resultDesc == NULL)
|
|
|
|
{
|
|
|
|
/* OK, doesn't return tuples */
|
2008-09-09 20:58:09 +02:00
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
else if (resultDesc == NULL || plansource->resultDesc == NULL ||
|
|
|
|
!equalTupleDescs(resultDesc, plansource->resultDesc))
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* can we give a better error message? */
|
|
|
|
if (plansource->fixed_result)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
|
|
errmsg("cached plan must not change result type")));
|
|
|
|
oldcxt = MemoryContextSwitchTo(plansource->context);
|
|
|
|
if (resultDesc)
|
|
|
|
resultDesc = CreateTupleDescCopy(resultDesc);
|
|
|
|
if (plansource->resultDesc)
|
|
|
|
FreeTupleDesc(plansource->resultDesc);
|
|
|
|
plansource->resultDesc = resultDesc;
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
2008-09-09 20:58:09 +02:00
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Allocate new query_context and copy the completed querytree into it.
|
|
|
|
* It's transient until we complete the copying and dependency extraction.
|
|
|
|
*/
|
|
|
|
querytree_context = AllocSetContextCreate(CurrentMemoryContext,
|
|
|
|
"CachedPlanQuery",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2011-09-16 06:42:53 +02:00
|
|
|
oldcxt = MemoryContextSwitchTo(querytree_context);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2017-03-09 21:18:59 +01:00
|
|
|
qlist = copyObject(tlist);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Use the planner machinery to extract dependencies. Data is saved in
|
|
|
|
* query_context. (We assume that not a lot of extra cruft is created by
|
|
|
|
* this call.)
|
|
|
|
*/
|
|
|
|
extract_query_dependencies((Node *) qlist,
|
|
|
|
&plansource->relationOids,
|
Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 17:18:35 +02:00
|
|
|
&plansource->invalItems,
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
&plansource->dependsOnRLS);
|
|
|
|
|
|
|
|
/* Update RLS info as well. */
|
|
|
|
plansource->rewriteRoleId = GetUserId();
|
|
|
|
plansource->rewriteRowSecurity = row_security;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
/*
|
|
|
|
* Also save the current search_path in the query_context. (This should
|
|
|
|
* not generate much extra cruft either, since almost certainly the path
|
|
|
|
* is already valid.)
|
|
|
|
*/
|
|
|
|
plansource->search_path = GetOverrideSearchPath(querytree_context);
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
|
|
|
/* Now reparent the finished query_context and save the links */
|
|
|
|
MemoryContextSetParent(querytree_context, plansource->context);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
plansource->query_context = querytree_context;
|
|
|
|
plansource->query_list = qlist;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Note: we do not reset generic_cost or total_custom_cost, although we
|
|
|
|
* could choose to do so. If the DDL or statistics change that prompted
|
|
|
|
* the invalidation meant a significant change in the cost estimates, it
|
|
|
|
* would be better to reset those variables and start fresh; but often it
|
|
|
|
* doesn't, and we're better retaining our hard-won knowledge about the
|
|
|
|
* relative costs.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
plansource->is_valid = true;
|
|
|
|
|
|
|
|
/* Return transient copy of querytrees for possible use in planning */
|
|
|
|
return tlist;
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* CheckCachedPlan: see if the CachedPlanSource's generic plan is valid.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Caller must have already called RevalidateCachedQuery to verify that the
|
|
|
|
* querytree is up to date.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
2011-09-16 06:42:53 +02:00
|
|
|
* On a "true" return, we have acquired the locks needed to run the plan.
|
|
|
|
* (We must do this for the "true" result to be race-condition-free.)
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
static bool
|
|
|
|
CheckCachedPlan(CachedPlanSource *plansource)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
CachedPlan *plan = plansource->gplan;
|
|
|
|
|
|
|
|
/* Assert that caller checked the querytree */
|
|
|
|
Assert(plansource->is_valid);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* If there's no generic plan, just say "false" */
|
|
|
|
if (!plan)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/* Generic plans are never one-shot */
|
|
|
|
Assert(!plan->is_oneshot);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
/*
|
|
|
|
* If plan isn't valid for current role, we can't use it.
|
|
|
|
*/
|
|
|
|
if (plan->is_valid && plan->dependsOnRole &&
|
|
|
|
plan->planRoleId != GetUserId())
|
|
|
|
plan->is_valid = false;
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* If it appears valid, acquire locks and recheck; this is much the same
|
|
|
|
* logic as in RevalidateCachedQuery, but for a plan.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
if (plan->is_valid)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Plan must have positive refcount because it is referenced by
|
|
|
|
* plansource; so no need to fear it disappears under us here.
|
|
|
|
*/
|
|
|
|
Assert(plan->refcount > 0);
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
AcquireExecutorLocks(plan->stmt_list, true);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2007-09-20 19:56:33 +02:00
|
|
|
/*
|
|
|
|
* If plan was transient, check to see if TransactionXmin has
|
|
|
|
* advanced, and if so invalidate it.
|
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
if (plan->is_valid &&
|
2007-09-20 19:56:33 +02:00
|
|
|
TransactionIdIsValid(plan->saved_xmin) &&
|
|
|
|
!TransactionIdEquals(plan->saved_xmin, TransactionXmin))
|
2011-09-16 06:42:53 +02:00
|
|
|
plan->is_valid = false;
|
2007-09-20 19:56:33 +02:00
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* By now, if any invalidation has happened, the inval callback
|
2011-09-16 06:42:53 +02:00
|
|
|
* functions will have marked the plan invalid.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
if (plan->is_valid)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Successfully revalidated and locked the query. */
|
|
|
|
return true;
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
|
2017-03-14 16:38:30 +01:00
|
|
|
/* Oops, the race case happened. Release useless locks. */
|
2011-09-16 06:42:53 +02:00
|
|
|
AcquireExecutorLocks(plan->stmt_list, false);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Plan has been invalidated, so unlink it from the parent and release it.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
ReleaseGenericPlan(plansource);
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* BuildCachedPlan: construct a new CachedPlan from a CachedPlanSource.
|
|
|
|
*
|
2011-09-26 18:44:17 +02:00
|
|
|
* qlist should be the result value from a previous RevalidateCachedQuery,
|
|
|
|
* or it can be set to NIL if we need to re-copy the plansource's query_list.
|
2011-09-16 06:42:53 +02:00
|
|
|
*
|
|
|
|
* To build a generic, parameter-value-independent plan, pass NULL for
|
|
|
|
* boundParams. To build a custom plan, pass the actual parameter values via
|
|
|
|
* boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
|
|
|
|
* each parameter value; otherwise the planner will treat the value as a
|
|
|
|
* hint rather than a hard constant.
|
|
|
|
*
|
|
|
|
* Planning work is done in the caller's memory context. The finished plan
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* is in a child memory context, which typically should get reparented
|
|
|
|
* (unless this is a one-shot plan, in which case we don't copy the plan).
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
|
|
|
static CachedPlan *
|
|
|
|
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
|
2017-04-01 06:17:18 +02:00
|
|
|
ParamListInfo boundParams, QueryEnvironment *queryEnv)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
|
|
|
CachedPlan *plan;
|
|
|
|
List *plist;
|
|
|
|
bool snapshot_set;
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
bool is_transient;
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContext plan_context;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
MemoryContext oldcxt = CurrentMemoryContext;
|
2016-01-28 20:05:36 +01:00
|
|
|
ListCell *lc;
|
2011-09-16 06:42:53 +02:00
|
|
|
|
2011-09-17 07:47:33 +02:00
|
|
|
/*
|
|
|
|
* Normally the querytree should be valid already, but if it's not,
|
|
|
|
* rebuild it.
|
|
|
|
*
|
|
|
|
* NOTE: GetCachedPlan should have called RevalidateCachedQuery first, so
|
|
|
|
* we ought to be holding sufficient locks to prevent any invalidation.
|
|
|
|
* However, if we're building a custom plan after having built and
|
|
|
|
* rejected a generic plan, it's possible to reach here with is_valid
|
|
|
|
* false due to an invalidation while making the generic plan. In theory
|
|
|
|
* the invalidation must be a false positive, perhaps a consequence of an
|
2021-07-13 21:01:01 +02:00
|
|
|
* sinval reset event or the debug_discard_caches code. But for safety,
|
|
|
|
* let's treat it as real and redo the RevalidateCachedQuery call.
|
2011-09-17 07:47:33 +02:00
|
|
|
*/
|
|
|
|
if (!plansource->is_valid)
|
2017-04-01 06:17:18 +02:00
|
|
|
qlist = RevalidateCachedQuery(plansource, queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If we don't already have a copy of the querytree list that can be
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* scribbled on by the planner, make one. For a one-shot plan, we assume
|
|
|
|
* it's okay to scribble on the original query_list.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
|
|
|
if (qlist == NIL)
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
{
|
|
|
|
if (!plansource->is_oneshot)
|
2017-03-09 21:18:59 +01:00
|
|
|
qlist = copyObject(plansource->query_list);
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
else
|
|
|
|
qlist = plansource->query_list;
|
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
2011-09-25 23:33:32 +02:00
|
|
|
* If a snapshot is already set (the normal case), we can just use that
|
|
|
|
* for planning. But if it isn't, and we need one, install one.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
|
|
|
snapshot_set = false;
|
2011-09-25 23:33:32 +02:00
|
|
|
if (!ActiveSnapshotSet() &&
|
2014-11-12 21:58:37 +01:00
|
|
|
plansource->raw_parse_tree &&
|
2011-09-25 23:33:32 +02:00
|
|
|
analyze_requires_snapshot(plansource->raw_parse_tree))
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
PushActiveSnapshot(GetTransactionSnapshot());
|
|
|
|
snapshot_set = true;
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Generate the plan.
|
|
|
|
*/
|
2020-03-30 06:51:05 +02:00
|
|
|
plist = pg_plan_queries(qlist, plansource->query_string,
|
|
|
|
plansource->cursor_options, boundParams);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/* Release snapshot if we got one */
|
|
|
|
if (snapshot_set)
|
|
|
|
PopActiveSnapshot();
|
|
|
|
|
|
|
|
/*
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
* Normally we make a dedicated memory context for the CachedPlan and its
|
|
|
|
* subsidiary data. (It's probably not going to be large, but just in
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
* case, allow it to grow large. It's transient for the moment.) But for
|
|
|
|
* a one-shot plan, we just leave it in the caller's memory context.
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
if (!plansource->is_oneshot)
|
|
|
|
{
|
|
|
|
plan_context = AllocSetContextCreate(CurrentMemoryContext,
|
|
|
|
"CachedPlan",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2018-04-06 18:10:00 +02:00
|
|
|
MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/*
|
|
|
|
* Copy plan into the new context.
|
|
|
|
*/
|
|
|
|
MemoryContextSwitchTo(plan_context);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
2017-03-09 21:18:59 +01:00
|
|
|
plist = copyObject(plist);
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
}
|
|
|
|
else
|
|
|
|
plan_context = CurrentMemoryContext;
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Create and fill the CachedPlan struct within the new context.
|
|
|
|
*/
|
|
|
|
plan = (CachedPlan *) palloc(sizeof(CachedPlan));
|
|
|
|
plan->magic = CACHEDPLAN_MAGIC;
|
|
|
|
plan->stmt_list = plist;
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* CachedPlan is dependent on role either if RLS affected the rewrite
|
|
|
|
* phase or if a role dependency was injected during planning. And it's
|
|
|
|
* transient if any plan is marked so.
|
|
|
|
*/
|
|
|
|
plan->planRoleId = GetUserId();
|
|
|
|
plan->dependsOnRole = plansource->dependsOnRLS;
|
|
|
|
is_transient = false;
|
|
|
|
foreach(lc, plist)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
continue; /* Ignore utility statements */
|
|
|
|
|
|
|
|
if (plannedstmt->transientPlan)
|
|
|
|
is_transient = true;
|
|
|
|
if (plannedstmt->dependsOnRole)
|
|
|
|
plan->dependsOnRole = true;
|
|
|
|
}
|
|
|
|
if (is_transient)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(TransactionIdIsNormal(TransactionXmin));
|
|
|
|
plan->saved_xmin = TransactionXmin;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
plan->saved_xmin = InvalidTransactionId;
|
|
|
|
plan->refcount = 0;
|
|
|
|
plan->context = plan_context;
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
plan->is_oneshot = plansource->is_oneshot;
|
2011-09-16 06:42:53 +02:00
|
|
|
plan->is_saved = false;
|
|
|
|
plan->is_valid = true;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* assign generation number to new plan */
|
|
|
|
plan->generation = ++(plansource->generation);
|
2007-03-23 20:53:52 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSwitchTo(oldcxt);
|
2008-12-13 03:00:20 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
return plan;
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* choose_custom_plan: choose whether to use custom or generic plan
|
|
|
|
*
|
|
|
|
* This defines the policy followed by GetCachedPlan.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
choose_custom_plan(CachedPlanSource *plansource, ParamListInfo boundParams)
|
|
|
|
{
|
|
|
|
double avg_custom_cost;
|
2009-07-14 17:37:50 +02:00
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/* One-shot plans will always be considered custom */
|
|
|
|
if (plansource->is_oneshot)
|
|
|
|
return true;
|
|
|
|
|
|
|
|
/* Otherwise, never any point in a custom plan if there's no parameters */
|
2011-09-16 06:42:53 +02:00
|
|
|
if (boundParams == NULL)
|
|
|
|
return false;
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
/* ... nor for transaction control statements */
|
|
|
|
if (IsTransactionStmtPlan(plansource))
|
|
|
|
return false;
|
2009-07-14 17:37:50 +02:00
|
|
|
|
2018-07-16 13:35:41 +02:00
|
|
|
/* Let settings force the decision */
|
|
|
|
if (plan_cache_mode == PLAN_CACHE_MODE_FORCE_GENERIC_PLAN)
|
|
|
|
return false;
|
|
|
|
if (plan_cache_mode == PLAN_CACHE_MODE_FORCE_CUSTOM_PLAN)
|
|
|
|
return true;
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* See if caller wants to force the decision */
|
|
|
|
if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
|
|
|
|
return false;
|
|
|
|
if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
|
|
|
|
return true;
|
2009-07-14 17:37:50 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Generate custom plans until we have done at least 5 (arbitrary) */
|
|
|
|
if (plansource->num_custom_plans < 5)
|
|
|
|
return true;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
avg_custom_cost = plansource->total_custom_cost / plansource->num_custom_plans;
|
|
|
|
|
|
|
|
/*
|
2013-08-24 21:14:17 +02:00
|
|
|
* Prefer generic plan if it's less expensive than the average custom
|
|
|
|
* plan. (Because we include a charge for cost of planning in the
|
|
|
|
* custom-plan costs, this means the generic plan only has to be less
|
|
|
|
* expensive than the execution cost plus replan cost of the custom
|
|
|
|
* plans.)
|
2011-09-16 06:42:53 +02:00
|
|
|
*
|
|
|
|
* Note that if generic_cost is -1 (indicating we've not yet determined
|
|
|
|
* the generic plan cost), we'll always prefer generic at this point.
|
|
|
|
*/
|
2013-08-24 21:14:17 +02:00
|
|
|
if (plansource->generic_cost < avg_custom_cost)
|
2011-09-16 06:42:53 +02:00
|
|
|
return false;
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* cached_plan_cost: calculate estimated cost of a plan
|
2013-08-24 21:14:17 +02:00
|
|
|
*
|
|
|
|
* If include_planner is true, also include the estimated cost of constructing
|
|
|
|
* the plan. (We must factor that into the cost of using a custom plan, but
|
|
|
|
* we don't count it for a generic plan.)
|
2011-09-16 06:42:53 +02:00
|
|
|
*/
|
|
|
|
static double
|
2013-08-24 21:14:17 +02:00
|
|
|
cached_plan_cost(CachedPlan *plan, bool include_planner)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
|
|
|
double result = 0;
|
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
foreach(lc, plan->stmt_list)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
2011-09-16 06:42:53 +02:00
|
|
|
continue; /* Ignore utility statements */
|
|
|
|
|
|
|
|
result += plannedstmt->planTree->total_cost;
|
2013-08-24 21:14:17 +02:00
|
|
|
|
|
|
|
if (include_planner)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Currently we use a very crude estimate of planning effort based
|
|
|
|
* on the number of relations in the finished plan's rangetable.
|
|
|
|
* Join planning effort actually scales much worse than linearly
|
|
|
|
* in the number of relations --- but only until the join collapse
|
|
|
|
* limits kick in. Also, while inheritance child relations surely
|
|
|
|
* add to planning effort, they don't make the join situation
|
|
|
|
* worse. So the actual shape of the planning cost curve versus
|
|
|
|
* number of relations isn't all that obvious. It will take
|
|
|
|
* considerable work to arrive at a less crude estimate, and for
|
|
|
|
* now it's not clear that's worth doing.
|
|
|
|
*
|
|
|
|
* The other big difficulty here is that we don't have any very
|
|
|
|
* good model of how planning cost compares to execution costs.
|
|
|
|
* The current multiplier of 1000 * cpu_operator_cost is probably
|
|
|
|
* on the low side, but we'll try this for awhile before making a
|
|
|
|
* more aggressive correction.
|
|
|
|
*
|
|
|
|
* If we ever do write a more complicated estimator, it should
|
|
|
|
* probably live in src/backend/optimizer/ not here.
|
|
|
|
*/
|
|
|
|
int nrelations = list_length(plannedstmt->rtable);
|
|
|
|
|
|
|
|
result += 1000.0 * cpu_operator_cost * (nrelations + 1);
|
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* GetCachedPlan: get a cached plan from a CachedPlanSource.
|
|
|
|
*
|
|
|
|
* This function hides the logic that decides whether to use a generic
|
|
|
|
* plan or a custom plan for the given parameters: the caller does not know
|
|
|
|
* which it will get.
|
|
|
|
*
|
|
|
|
* On return, the plan is valid and we have sufficient locks to begin
|
|
|
|
* execution.
|
|
|
|
*
|
|
|
|
* On return, the refcount of the plan has been incremented; a later
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
* ReleaseCachedPlan() call is expected. If "owner" is not NULL then
|
|
|
|
* the refcount has been reported to that ResourceOwner (note that this
|
|
|
|
* is only supported for "saved" CachedPlanSources).
|
2011-09-16 06:42:53 +02:00
|
|
|
*
|
|
|
|
* Note: if any replanning activity is required, the caller's memory context
|
|
|
|
* is used for that work.
|
|
|
|
*/
|
|
|
|
CachedPlan *
|
|
|
|
GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
ResourceOwner owner, QueryEnvironment *queryEnv)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
2016-12-07 05:02:38 +01:00
|
|
|
CachedPlan *plan = NULL;
|
2011-09-16 06:42:53 +02:00
|
|
|
List *qlist;
|
|
|
|
bool customplan;
|
|
|
|
|
|
|
|
/* Assert caller is doing things in a sane order */
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plansource->is_complete);
|
|
|
|
/* This seems worth a real test, though */
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
if (owner && !plansource->is_saved)
|
2011-09-16 06:42:53 +02:00
|
|
|
elog(ERROR, "cannot apply ResourceOwner to non-saved cached plan");
|
|
|
|
|
|
|
|
/* Make sure the querytree list is valid and we have parse-time locks */
|
2017-04-01 06:17:18 +02:00
|
|
|
qlist = RevalidateCachedQuery(plansource, queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/* Decide whether to use a custom plan */
|
|
|
|
customplan = choose_custom_plan(plansource, boundParams);
|
|
|
|
|
|
|
|
if (!customplan)
|
|
|
|
{
|
|
|
|
if (CheckCachedPlan(plansource))
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* We want a generic plan, and we already have a valid one */
|
|
|
|
plan = plansource->gplan;
|
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
else
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Build a new generic plan */
|
2017-04-01 06:17:18 +02:00
|
|
|
plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Just make real sure plansource->gplan is clear */
|
|
|
|
ReleaseGenericPlan(plansource);
|
|
|
|
/* Link the new generic plan into the plansource */
|
|
|
|
plansource->gplan = plan;
|
|
|
|
plan->refcount++;
|
|
|
|
/* Immediately reparent into appropriate context */
|
|
|
|
if (plansource->is_saved)
|
|
|
|
{
|
|
|
|
/* saved plans all live under CacheMemoryContext */
|
|
|
|
MemoryContextSetParent(plan->context, CacheMemoryContext);
|
|
|
|
plan->is_saved = true;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* otherwise, it should be a sibling of the plansource */
|
|
|
|
MemoryContextSetParent(plan->context,
|
|
|
|
MemoryContextGetParent(plansource->context));
|
|
|
|
}
|
|
|
|
/* Update generic_cost whenever we make a new generic plan */
|
2013-08-24 21:14:17 +02:00
|
|
|
plansource->generic_cost = cached_plan_cost(plan, false);
|
2007-03-23 20:53:52 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* If, based on the now-known value of generic_cost, we'd not have
|
|
|
|
* chosen to use a generic plan, then forget it and make a custom
|
|
|
|
* plan. This is a bit of a wart but is necessary to avoid a
|
|
|
|
* glitch in behavior when the custom plans are consistently big
|
|
|
|
* winners; at some point we'll experiment with a generic plan and
|
|
|
|
* find it's a loser, but we don't want to actually execute that
|
|
|
|
* plan.
|
|
|
|
*/
|
|
|
|
customplan = choose_custom_plan(plansource, boundParams);
|
2011-09-26 18:44:17 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If we choose to plan again, we need to re-copy the query_list,
|
|
|
|
* since the planner probably scribbled on it. We can force
|
|
|
|
* BuildCachedPlan to do that by passing NIL.
|
|
|
|
*/
|
|
|
|
qlist = NIL;
|
2011-09-16 06:42:53 +02:00
|
|
|
}
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
if (customplan)
|
|
|
|
{
|
|
|
|
/* Build a custom plan */
|
2017-04-01 06:17:18 +02:00
|
|
|
plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
|
2020-07-20 04:55:50 +02:00
|
|
|
/* Accumulate total costs of custom plans */
|
|
|
|
plansource->total_custom_cost += cached_plan_cost(plan, true);
|
|
|
|
|
|
|
|
plansource->num_custom_plans++;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
plansource->num_generic_plans++;
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
2016-12-07 05:02:38 +01:00
|
|
|
Assert(plan != NULL);
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Flag the plan as in use by caller */
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
if (owner)
|
|
|
|
ResourceOwnerEnlargePlanCacheRefs(owner);
|
2007-03-13 01:33:44 +01:00
|
|
|
plan->refcount++;
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
if (owner)
|
|
|
|
ResourceOwnerRememberPlanCacheRef(owner, plan);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* Saved plans should be under CacheMemoryContext so they will not go away
|
|
|
|
* until their reference count goes to zero. In the generic-plan cases we
|
|
|
|
* already took care of that, but for a custom plan, do it as soon as we
|
|
|
|
* have created a reference-counted link.
|
|
|
|
*/
|
|
|
|
if (customplan && plansource->is_saved)
|
|
|
|
{
|
|
|
|
MemoryContextSetParent(plan->context, CacheMemoryContext);
|
|
|
|
plan->is_saved = true;
|
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
return plan;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* ReleaseCachedPlan: release active use of a cached plan.
|
|
|
|
*
|
|
|
|
* This decrements the reference count, and frees the plan if the count
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
* has thereby gone to zero. If "owner" is not NULL, it is assumed that
|
|
|
|
* the reference count is managed by that ResourceOwner.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
* Note: owner == NULL is used for releasing references that are in
|
2007-03-13 01:33:44 +01:00
|
|
|
* persistent data structures, such as the parent CachedPlanSource or a
|
|
|
|
* Portal. Transient references should be protected by a resource owner.
|
|
|
|
*/
|
|
|
|
void
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
if (owner)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
|
|
|
Assert(plan->is_saved);
|
Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
|
|
|
ResourceOwnerForgetPlanCacheRef(owner, plan);
|
2011-09-16 06:42:53 +02:00
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
Assert(plan->refcount > 0);
|
|
|
|
plan->refcount--;
|
|
|
|
if (plan->refcount == 0)
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
{
|
|
|
|
/* Mark it no longer valid */
|
|
|
|
plan->magic = 0;
|
|
|
|
|
|
|
|
/* One-shot plans do not own their context, so we can't free them */
|
|
|
|
if (!plan->is_oneshot)
|
|
|
|
MemoryContextDelete(plan->context);
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
/*
|
|
|
|
* CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
|
|
|
|
*
|
|
|
|
* This function, together with CachedPlanIsSimplyValid, provides a fast path
|
|
|
|
* for revalidating "simple" generic plans. The core requirement to be simple
|
|
|
|
* is that the plan must not require taking any locks, which translates to
|
|
|
|
* not touching any tables; this happens to match up well with an important
|
|
|
|
* use-case in PL/pgSQL. This function tests whether that's true, along
|
|
|
|
* with checking some other corner cases that we'd rather not bother with
|
|
|
|
* handling in the fast path. (Note that it's still possible for such a plan
|
|
|
|
* to be invalidated, for example due to a change in a function that was
|
|
|
|
* inlined into the plan.)
|
|
|
|
*
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
* If the plan is simply valid, and "owner" is not NULL, record a refcount on
|
|
|
|
* the plan in that resowner before returning. It is caller's responsibility
|
|
|
|
* to be sure that a refcount is held on any plan that's being actively used.
|
|
|
|
*
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
* This must only be called on known-valid generic plans (eg, ones just
|
|
|
|
* returned by GetCachedPlan). If it returns true, the caller may re-use
|
|
|
|
* the cached plan as long as CachedPlanIsSimplyValid returns true; that
|
|
|
|
* check is much cheaper than the full revalidation done by GetCachedPlan.
|
|
|
|
* Nonetheless, no required checks are omitted.
|
|
|
|
*/
|
|
|
|
bool
|
|
|
|
CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
CachedPlan *plan, ResourceOwner owner)
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
{
|
|
|
|
ListCell *lc;
|
|
|
|
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
/*
|
|
|
|
* Sanity-check that the caller gave us a validated generic plan. Notice
|
|
|
|
* that we *don't* assert plansource->is_valid as you might expect; that's
|
|
|
|
* because it's possible that that's already false when GetCachedPlan
|
|
|
|
* returns, e.g. because ResetPlanCache happened partway through. We
|
|
|
|
* should accept the plan as long as plan->is_valid is true, and expect to
|
|
|
|
* replan after the next CachedPlanIsSimplyValid call.
|
|
|
|
*/
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
|
|
|
Assert(plan->is_valid);
|
|
|
|
Assert(plan == plansource->gplan);
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
Assert(plansource->search_path != NULL);
|
|
|
|
Assert(OverrideSearchPathMatchesCurrent(plansource->search_path));
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
|
|
|
|
/* We don't support oneshot plans here. */
|
|
|
|
if (plansource->is_oneshot)
|
|
|
|
return false;
|
|
|
|
Assert(!plan->is_oneshot);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the plan is dependent on RLS considerations, or it's transient,
|
|
|
|
* reject. These things probably can't ever happen for table-free
|
|
|
|
* queries, but for safety's sake let's check.
|
|
|
|
*/
|
|
|
|
if (plansource->dependsOnRLS)
|
|
|
|
return false;
|
|
|
|
if (plan->dependsOnRole)
|
|
|
|
return false;
|
|
|
|
if (TransactionIdIsValid(plan->saved_xmin))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reject if AcquirePlannerLocks would have anything to do. This is
|
|
|
|
* simplistic, but there's no need to inquire any more carefully; indeed,
|
|
|
|
* for current callers it shouldn't even be possible to hit any of these
|
|
|
|
* checks.
|
|
|
|
*/
|
|
|
|
foreach(lc, plansource->query_list)
|
|
|
|
{
|
|
|
|
Query *query = lfirst_node(Query, lc);
|
|
|
|
|
|
|
|
if (query->commandType == CMD_UTILITY)
|
|
|
|
return false;
|
|
|
|
if (query->rtable || query->cteList || query->hasSubLinks)
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reject if AcquireExecutorLocks would have anything to do. This is
|
|
|
|
* probably unnecessary given the previous check, but let's be safe.
|
|
|
|
*/
|
|
|
|
foreach(lc, plan->stmt_list)
|
|
|
|
{
|
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
|
|
|
|
ListCell *lc2;
|
|
|
|
|
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We have to grovel through the rtable because it's likely to contain
|
|
|
|
* an RTE_RESULT relation, rather than being totally empty.
|
|
|
|
*/
|
|
|
|
foreach(lc2, plannedstmt->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
|
|
|
|
|
|
|
|
if (rte->rtekind == RTE_RELATION)
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Okay, it's simple. Note that what we've primarily established here is
|
|
|
|
* that no locks need be taken before checking the plan's is_valid flag.
|
|
|
|
*/
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
|
|
|
|
/* Bump refcount if requested. */
|
|
|
|
if (owner)
|
|
|
|
{
|
|
|
|
ResourceOwnerEnlargePlanCacheRefs(owner);
|
|
|
|
plan->refcount++;
|
|
|
|
ResourceOwnerRememberPlanCacheRef(owner, plan);
|
|
|
|
}
|
|
|
|
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* CachedPlanIsSimplyValid: quick check for plan still being valid
|
|
|
|
*
|
|
|
|
* This function must not be used unless CachedPlanAllowsSimpleValidityCheck
|
|
|
|
* previously said it was OK.
|
|
|
|
*
|
|
|
|
* If the plan is valid, and "owner" is not NULL, record a refcount on
|
|
|
|
* the plan in that resowner before returning. It is caller's responsibility
|
|
|
|
* to be sure that a refcount is held on any plan that's being actively used.
|
|
|
|
*
|
|
|
|
* The code here is unconditionally safe as long as the only use of this
|
|
|
|
* CachedPlanSource is in connection with the particular CachedPlan pointer
|
|
|
|
* that's passed in. If the plansource were being used for other purposes,
|
|
|
|
* it's possible that its generic plan could be invalidated and regenerated
|
|
|
|
* while the current caller wasn't looking, and then there could be a chance
|
|
|
|
* collision of address between this caller's now-stale plan pointer and the
|
|
|
|
* actual address of the new generic plan. For current uses, that scenario
|
|
|
|
* can't happen; but with a plansource shared across multiple uses, it'd be
|
|
|
|
* advisable to also save plan->generation and verify that that still matches.
|
|
|
|
*/
|
|
|
|
bool
|
|
|
|
CachedPlanIsSimplyValid(CachedPlanSource *plansource, CachedPlan *plan,
|
|
|
|
ResourceOwner owner)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Careful here: since the caller doesn't necessarily hold a refcount on
|
|
|
|
* the plan to start with, it's possible that "plan" is a dangling
|
|
|
|
* pointer. Don't dereference it until we've verified that it still
|
|
|
|
* matches the plansource's gplan (which is either valid or NULL).
|
|
|
|
*/
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Has cache invalidation fired on this plan? We can check this right
|
Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time. We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill. (We can assume that the plan is valid, it's just
possibly a bit stale. Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.) Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.
Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.
Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 19:47:34 +01:00
|
|
|
* away since there are no locks that we'd need to acquire first. Note
|
|
|
|
* that here we *do* check plansource->is_valid, so as to force plan
|
|
|
|
* rebuild if that's become false.
|
Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.
First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).
The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session. This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.
Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation. The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.
Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.
Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here. Also thanks to Andres Freund for review.
Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
|
|
|
*/
|
|
|
|
if (!plansource->is_valid || plan != plansource->gplan || !plan->is_valid)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
Assert(plan->magic == CACHEDPLAN_MAGIC);
|
|
|
|
|
|
|
|
/* Is the search_path still the same as when we made it? */
|
|
|
|
Assert(plansource->search_path != NULL);
|
|
|
|
if (!OverrideSearchPathMatchesCurrent(plansource->search_path))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/* It's still good. Bump refcount if requested. */
|
|
|
|
if (owner)
|
|
|
|
{
|
|
|
|
ResourceOwnerEnlargePlanCacheRefs(owner);
|
|
|
|
plan->refcount++;
|
|
|
|
ResourceOwnerRememberPlanCacheRef(owner, plan);
|
|
|
|
}
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2008-09-16 01:37:40 +02:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* CachedPlanSetParentContext: move a CachedPlanSource to a new memory context
|
|
|
|
*
|
|
|
|
* This can only be applied to unsaved plans; once saved, a plan always
|
|
|
|
* lives underneath CacheMemoryContext.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
CachedPlanSetParentContext(CachedPlanSource *plansource,
|
|
|
|
MemoryContext newcontext)
|
|
|
|
{
|
|
|
|
/* Assert caller is doing things in a sane order */
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plansource->is_complete);
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/* These seem worth real tests, though */
|
2011-09-16 06:42:53 +02:00
|
|
|
if (plansource->is_saved)
|
|
|
|
elog(ERROR, "cannot move a saved cached plan to another context");
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
if (plansource->is_oneshot)
|
|
|
|
elog(ERROR, "cannot move a one-shot cached plan to another context");
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/* OK, let the caller keep the plan where he wishes */
|
|
|
|
MemoryContextSetParent(plansource->context, newcontext);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The query_context needs no special handling, since it's a child of
|
|
|
|
* plansource->context. But if there's a generic plan, it should be
|
|
|
|
* maintained as a sibling of plansource->context.
|
|
|
|
*/
|
|
|
|
if (plansource->gplan)
|
|
|
|
{
|
|
|
|
Assert(plansource->gplan->magic == CACHEDPLAN_MAGIC);
|
|
|
|
MemoryContextSetParent(plansource->gplan->context, newcontext);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* CopyCachedPlan: make a copy of a CachedPlanSource
|
|
|
|
*
|
|
|
|
* This is a convenience routine that does the equivalent of
|
|
|
|
* CreateCachedPlan + CompleteCachedPlan, using the data stored in the
|
|
|
|
* input CachedPlanSource. The result is therefore "unsaved" (regardless
|
|
|
|
* of the state of the source), and we don't copy any generic plan either.
|
|
|
|
* The result will be currently valid, or not, the same as the source.
|
|
|
|
*/
|
|
|
|
CachedPlanSource *
|
|
|
|
CopyCachedPlan(CachedPlanSource *plansource)
|
|
|
|
{
|
|
|
|
CachedPlanSource *newsource;
|
|
|
|
MemoryContext source_context;
|
|
|
|
MemoryContext querytree_context;
|
|
|
|
MemoryContext oldcxt;
|
|
|
|
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plansource->is_complete);
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
/*
|
|
|
|
* One-shot plans can't be copied, because we haven't taken care that
|
|
|
|
* parsing/planning didn't scribble on the raw parse tree or querytrees.
|
|
|
|
*/
|
|
|
|
if (plansource->is_oneshot)
|
|
|
|
elog(ERROR, "cannot copy a one-shot cached plan");
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
source_context = AllocSetContextCreate(CurrentMemoryContext,
|
|
|
|
"CachedPlanSource",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
oldcxt = MemoryContextSwitchTo(source_context);
|
|
|
|
|
|
|
|
newsource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
|
|
|
|
newsource->magic = CACHEDPLANSOURCE_MAGIC;
|
|
|
|
newsource->raw_parse_tree = copyObject(plansource->raw_parse_tree);
|
|
|
|
newsource->query_string = pstrdup(plansource->query_string);
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
MemoryContextSetIdentifier(source_context, newsource->query_string);
|
2011-09-16 06:42:53 +02:00
|
|
|
newsource->commandTag = plansource->commandTag;
|
|
|
|
if (plansource->num_params > 0)
|
|
|
|
{
|
|
|
|
newsource->param_types = (Oid *)
|
|
|
|
palloc(plansource->num_params * sizeof(Oid));
|
|
|
|
memcpy(newsource->param_types, plansource->param_types,
|
|
|
|
plansource->num_params * sizeof(Oid));
|
|
|
|
}
|
|
|
|
else
|
|
|
|
newsource->param_types = NULL;
|
|
|
|
newsource->num_params = plansource->num_params;
|
|
|
|
newsource->parserSetup = plansource->parserSetup;
|
|
|
|
newsource->parserSetupArg = plansource->parserSetupArg;
|
|
|
|
newsource->cursor_options = plansource->cursor_options;
|
|
|
|
newsource->fixed_result = plansource->fixed_result;
|
|
|
|
if (plansource->resultDesc)
|
|
|
|
newsource->resultDesc = CreateTupleDescCopy(plansource->resultDesc);
|
|
|
|
else
|
|
|
|
newsource->resultDesc = NULL;
|
|
|
|
newsource->context = source_context;
|
|
|
|
|
|
|
|
querytree_context = AllocSetContextCreate(source_context,
|
|
|
|
"CachedPlanQuery",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_START_SMALL_SIZES);
|
2011-09-16 06:42:53 +02:00
|
|
|
MemoryContextSwitchTo(querytree_context);
|
2017-03-09 21:18:59 +01:00
|
|
|
newsource->query_list = copyObject(plansource->query_list);
|
|
|
|
newsource->relationOids = copyObject(plansource->relationOids);
|
|
|
|
newsource->invalItems = copyObject(plansource->invalItems);
|
Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 20:14:41 +01:00
|
|
|
if (plansource->search_path)
|
|
|
|
newsource->search_path = CopyOverrideSearchPath(plansource->search_path);
|
2011-09-16 06:42:53 +02:00
|
|
|
newsource->query_context = querytree_context;
|
Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 23:22:56 +02:00
|
|
|
newsource->rewriteRoleId = plansource->rewriteRoleId;
|
|
|
|
newsource->rewriteRowSecurity = plansource->rewriteRowSecurity;
|
|
|
|
newsource->dependsOnRLS = plansource->dependsOnRLS;
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
newsource->gplan = NULL;
|
|
|
|
|
Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
2013-01-04 23:42:19 +01:00
|
|
|
newsource->is_oneshot = false;
|
2011-09-16 06:42:53 +02:00
|
|
|
newsource->is_complete = true;
|
|
|
|
newsource->is_saved = false;
|
|
|
|
newsource->is_valid = plansource->is_valid;
|
|
|
|
newsource->generation = plansource->generation;
|
|
|
|
|
|
|
|
/* We may as well copy any acquired cost knowledge */
|
|
|
|
newsource->generic_cost = plansource->generic_cost;
|
|
|
|
newsource->total_custom_cost = plansource->total_custom_cost;
|
2020-07-20 04:55:50 +02:00
|
|
|
newsource->num_generic_plans = plansource->num_generic_plans;
|
2011-09-16 06:42:53 +02:00
|
|
|
newsource->num_custom_plans = plansource->num_custom_plans;
|
|
|
|
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
|
|
|
return newsource;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* CachedPlanIsValid: test whether the rewritten querytree within a
|
|
|
|
* CachedPlanSource is currently valid (that is, not marked as being in need
|
|
|
|
* of revalidation).
|
2008-09-16 01:37:40 +02:00
|
|
|
*
|
|
|
|
* This result is only trustworthy (ie, free from race conditions) if
|
|
|
|
* the caller has acquired locks on all the relations used in the plan.
|
|
|
|
*/
|
|
|
|
bool
|
|
|
|
CachedPlanIsValid(CachedPlanSource *plansource)
|
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
return plansource->is_valid;
|
|
|
|
}
|
2008-09-16 01:37:40 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* CachedPlanGetTargetList: return tlist, if any, describing plan's output
|
|
|
|
*
|
|
|
|
* The result is guaranteed up-to-date. However, it is local storage
|
|
|
|
* within the cached plan, and may disappear next time the plan is updated.
|
|
|
|
*/
|
|
|
|
List *
|
2017-04-01 06:17:18 +02:00
|
|
|
CachedPlanGetTargetList(CachedPlanSource *plansource,
|
|
|
|
QueryEnvironment *queryEnv)
|
2011-09-16 06:42:53 +02:00
|
|
|
{
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
Query *pstmt;
|
2008-09-16 01:37:40 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Assert caller is doing things in a sane order */
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
|
|
|
Assert(plansource->is_complete);
|
2008-09-16 01:37:40 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* No work needed if statement doesn't return tuples (we assume this
|
|
|
|
* feature cannot be changed by an invalidation)
|
|
|
|
*/
|
|
|
|
if (plansource->resultDesc == NULL)
|
|
|
|
return NIL;
|
2008-09-16 01:37:40 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Make sure the querytree list is valid and we have parse-time locks */
|
2017-04-01 06:17:18 +02:00
|
|
|
RevalidateCachedQuery(plansource, queryEnv);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
|
|
|
/* Get the primary statement and find out what it returns */
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
pstmt = QueryListGetPrimaryStmt(plansource->query_list);
|
2011-09-16 06:42:53 +02:00
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
return FetchStatementTargetList((Node *) pstmt);
|
|
|
|
}
|
|
|
|
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
/*
|
|
|
|
* GetCachedExpression: construct a CachedExpression for an expression.
|
|
|
|
*
|
|
|
|
* This performs the same transformations on the expression as
|
|
|
|
* expression_planner(), ie, convert an expression as emitted by parse
|
|
|
|
* analysis to be ready to pass to the executor.
|
|
|
|
*
|
|
|
|
* The result is stashed in a private, long-lived memory context.
|
|
|
|
* (Note that this might leak a good deal of memory in the caller's
|
|
|
|
* context before that.) The passed-in expr tree is not modified.
|
|
|
|
*/
|
|
|
|
CachedExpression *
|
|
|
|
GetCachedExpression(Node *expr)
|
|
|
|
{
|
|
|
|
CachedExpression *cexpr;
|
|
|
|
List *relationOids;
|
|
|
|
List *invalItems;
|
|
|
|
MemoryContext cexpr_context;
|
|
|
|
MemoryContext oldcxt;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Pass the expression through the planner, and collect dependencies.
|
|
|
|
* Everything built here is leaked in the caller's context; that's
|
|
|
|
* intentional to minimize the size of the permanent data structure.
|
|
|
|
*/
|
|
|
|
expr = (Node *) expression_planner_with_deps((Expr *) expr,
|
|
|
|
&relationOids,
|
|
|
|
&invalItems);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make a private memory context, and copy what we need into that. To
|
|
|
|
* avoid leaking a long-lived context if we fail while copying data, we
|
|
|
|
* initially make the context under the caller's context.
|
|
|
|
*/
|
|
|
|
cexpr_context = AllocSetContextCreate(CurrentMemoryContext,
|
|
|
|
"CachedExpression",
|
|
|
|
ALLOCSET_SMALL_SIZES);
|
|
|
|
|
|
|
|
oldcxt = MemoryContextSwitchTo(cexpr_context);
|
|
|
|
|
|
|
|
cexpr = (CachedExpression *) palloc(sizeof(CachedExpression));
|
|
|
|
cexpr->magic = CACHEDEXPR_MAGIC;
|
|
|
|
cexpr->expr = copyObject(expr);
|
|
|
|
cexpr->is_valid = true;
|
|
|
|
cexpr->relationOids = copyObject(relationOids);
|
|
|
|
cexpr->invalItems = copyObject(invalItems);
|
|
|
|
cexpr->context = cexpr_context;
|
|
|
|
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reparent the expr's memory context under CacheMemoryContext so that it
|
|
|
|
* will live indefinitely.
|
|
|
|
*/
|
|
|
|
MemoryContextSetParent(cexpr_context, CacheMemoryContext);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Add the entry to the global list of cached expressions.
|
|
|
|
*/
|
|
|
|
dlist_push_tail(&cached_expression_list, &cexpr->node);
|
|
|
|
|
|
|
|
return cexpr;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* FreeCachedExpression
|
|
|
|
* Delete a CachedExpression.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
FreeCachedExpression(CachedExpression *cexpr)
|
|
|
|
{
|
|
|
|
/* Sanity check */
|
|
|
|
Assert(cexpr->magic == CACHEDEXPR_MAGIC);
|
|
|
|
/* Unlink from global list */
|
|
|
|
dlist_delete(&cexpr->node);
|
|
|
|
/* Free all storage associated with CachedExpression */
|
|
|
|
MemoryContextDelete(cexpr->context);
|
|
|
|
}
|
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
/*
|
|
|
|
* QueryListGetPrimaryStmt
|
|
|
|
* Get the "primary" stmt within a list, ie, the one marked canSetTag.
|
|
|
|
*
|
|
|
|
* Returns NULL if no such stmt. If multiple queries within the list are
|
|
|
|
* marked canSetTag, returns the first one. Neither of these cases should
|
|
|
|
* occur in present usages of this function.
|
|
|
|
*/
|
|
|
|
static Query *
|
|
|
|
QueryListGetPrimaryStmt(List *stmts)
|
|
|
|
{
|
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
foreach(lc, stmts)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
Query *stmt = lfirst_node(Query, lc);
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
|
|
|
|
if (stmt->canSetTag)
|
|
|
|
return stmt;
|
|
|
|
}
|
|
|
|
return NULL;
|
2008-09-16 01:37:40 +02:00
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* AcquireExecutorLocks: acquire locks needed for execution of a cached plan;
|
|
|
|
* or release them if acquire is false.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
AcquireExecutorLocks(List *stmt_list, bool acquire)
|
|
|
|
{
|
|
|
|
ListCell *lc1;
|
|
|
|
|
|
|
|
foreach(lc1, stmt_list)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
|
2007-03-13 01:33:44 +01:00
|
|
|
ListCell *lc2;
|
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
2010-01-15 23:36:35 +01:00
|
|
|
{
|
|
|
|
/*
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
* Ignore utility statements, except those (such as EXPLAIN) that
|
|
|
|
* contain a parsed-but-not-planned query. Note: it's okay to use
|
2010-01-15 23:36:35 +01:00
|
|
|
* ScanQueryForLocks, even though the query hasn't been through
|
|
|
|
* rule rewriting, because rewriting doesn't change the query
|
|
|
|
* representation.
|
|
|
|
*/
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
Query *query = UtilityContainsQuery(plannedstmt->utilityStmt);
|
2010-01-15 23:36:35 +01:00
|
|
|
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
if (query)
|
2010-01-15 23:36:35 +01:00
|
|
|
ScanQueryForLocks(query, acquire);
|
|
|
|
continue;
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
foreach(lc2, plannedstmt->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
|
|
|
|
|
|
|
|
if (rte->rtekind != RTE_RELATION)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Acquire the appropriate type of lock on each relation OID. Note
|
|
|
|
* that we don't actually try to open the rel, and hence will not
|
|
|
|
* fail if it's been dropped entirely --- we'll just transiently
|
|
|
|
* acquire a non-conflicting lock.
|
|
|
|
*/
|
|
|
|
if (acquire)
|
Change rewriter/planner/executor/plancache to depend on RTE rellockmode.
Instead of recomputing the required lock levels in all these places,
just use what commit fdba460a2 made the parser store in the RTE fields.
This already simplifies the code measurably in these places, and
follow-on changes will remove a bunch of no-longer-needed infrastructure.
In a few cases, this change causes us to acquire a higher lock level
than we did before. This is OK primarily because said higher lock level
should've been acquired already at query parse time; thus, we're saving
a useless extra trip through the shared lock manager to acquire a lesser
lock alongside the original lock. The only known exception to this is
that re-execution of a previously planned SELECT FOR UPDATE/SHARE query,
for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might
have gotten only AccessShareLock before. Now it will get RowShareLock
like the first execution did, which seems fine.
While there's more to do, push it in this state anyway, to let the
buildfarm help verify that nothing bad happened.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
2018-10-02 20:43:01 +02:00
|
|
|
LockRelationOid(rte->relid, rte->rellockmode);
|
2007-03-13 01:33:44 +01:00
|
|
|
else
|
Change rewriter/planner/executor/plancache to depend on RTE rellockmode.
Instead of recomputing the required lock levels in all these places,
just use what commit fdba460a2 made the parser store in the RTE fields.
This already simplifies the code measurably in these places, and
follow-on changes will remove a bunch of no-longer-needed infrastructure.
In a few cases, this change causes us to acquire a higher lock level
than we did before. This is OK primarily because said higher lock level
should've been acquired already at query parse time; thus, we're saving
a useless extra trip through the shared lock manager to acquire a lesser
lock alongside the original lock. The only known exception to this is
that re-execution of a previously planned SELECT FOR UPDATE/SHARE query,
for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might
have gotten only AccessShareLock before. Now it will get RowShareLock
like the first execution did, which seems fine.
While there's more to do, push it in this state anyway, to let the
buildfarm help verify that nothing bad happened.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
2018-10-02 20:43:01 +02:00
|
|
|
UnlockRelationOid(rte->relid, rte->rellockmode);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* AcquirePlannerLocks: acquire locks needed for planning of a querytree list;
|
|
|
|
* or release them if acquire is false.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
|
|
|
* Note that we don't actually try to open the relations, and hence will not
|
|
|
|
* fail if one has been dropped entirely --- we'll just transiently acquire
|
|
|
|
* a non-conflicting lock.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
AcquirePlannerLocks(List *stmt_list, bool acquire)
|
|
|
|
{
|
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
foreach(lc, stmt_list)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
Query *query = lfirst_node(Query, lc);
|
2010-01-15 23:36:35 +01:00
|
|
|
|
|
|
|
if (query->commandType == CMD_UTILITY)
|
|
|
|
{
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
/* Ignore utility statements, unless they contain a Query */
|
|
|
|
query = UtilityContainsQuery(query->utilityStmt);
|
|
|
|
if (query)
|
2010-01-15 23:36:35 +01:00
|
|
|
ScanQueryForLocks(query, acquire);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2008-09-09 20:58:09 +02:00
|
|
|
ScanQueryForLocks(query, acquire);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* ScanQueryForLocks: recursively scan one Query for AcquirePlannerLocks.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
static void
|
2008-09-09 20:58:09 +02:00
|
|
|
ScanQueryForLocks(Query *parsetree, bool acquire)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
|
|
|
ListCell *lc;
|
|
|
|
|
2010-01-15 23:36:35 +01:00
|
|
|
/* Shouldn't get called on utility commands */
|
|
|
|
Assert(parsetree->commandType != CMD_UTILITY);
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
|
|
|
* First, process RTEs of the current query level.
|
|
|
|
*/
|
|
|
|
foreach(lc, parsetree->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
|
|
|
|
|
|
|
|
switch (rte->rtekind)
|
|
|
|
{
|
|
|
|
case RTE_RELATION:
|
2008-09-09 20:58:09 +02:00
|
|
|
/* Acquire or release the appropriate type of lock */
|
|
|
|
if (acquire)
|
Change rewriter/planner/executor/plancache to depend on RTE rellockmode.
Instead of recomputing the required lock levels in all these places,
just use what commit fdba460a2 made the parser store in the RTE fields.
This already simplifies the code measurably in these places, and
follow-on changes will remove a bunch of no-longer-needed infrastructure.
In a few cases, this change causes us to acquire a higher lock level
than we did before. This is OK primarily because said higher lock level
should've been acquired already at query parse time; thus, we're saving
a useless extra trip through the shared lock manager to acquire a lesser
lock alongside the original lock. The only known exception to this is
that re-execution of a previously planned SELECT FOR UPDATE/SHARE query,
for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might
have gotten only AccessShareLock before. Now it will get RowShareLock
like the first execution did, which seems fine.
While there's more to do, push it in this state anyway, to let the
buildfarm help verify that nothing bad happened.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
2018-10-02 20:43:01 +02:00
|
|
|
LockRelationOid(rte->relid, rte->rellockmode);
|
2008-09-09 20:58:09 +02:00
|
|
|
else
|
Change rewriter/planner/executor/plancache to depend on RTE rellockmode.
Instead of recomputing the required lock levels in all these places,
just use what commit fdba460a2 made the parser store in the RTE fields.
This already simplifies the code measurably in these places, and
follow-on changes will remove a bunch of no-longer-needed infrastructure.
In a few cases, this change causes us to acquire a higher lock level
than we did before. This is OK primarily because said higher lock level
should've been acquired already at query parse time; thus, we're saving
a useless extra trip through the shared lock manager to acquire a lesser
lock alongside the original lock. The only known exception to this is
that re-execution of a previously planned SELECT FOR UPDATE/SHARE query,
for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might
have gotten only AccessShareLock before. Now it will get RowShareLock
like the first execution did, which seems fine.
While there's more to do, push it in this state anyway, to let the
buildfarm help verify that nothing bad happened.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
2018-10-02 20:43:01 +02:00
|
|
|
UnlockRelationOid(rte->relid, rte->rellockmode);
|
2007-03-13 01:33:44 +01:00
|
|
|
break;
|
|
|
|
|
|
|
|
case RTE_SUBQUERY:
|
2008-09-09 20:58:09 +02:00
|
|
|
/* Recurse into subquery-in-FROM */
|
|
|
|
ScanQueryForLocks(rte->subquery, acquire);
|
2007-03-13 01:33:44 +01:00
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
/* ignore other types of RTEs */
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2008-10-04 23:56:55 +02:00
|
|
|
/* Recurse into subquery-in-WITH */
|
|
|
|
foreach(lc, parsetree->cteList)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
CommonTableExpr *cte = lfirst_node(CommonTableExpr, lc);
|
2008-10-04 23:56:55 +02:00
|
|
|
|
2017-01-27 04:09:34 +01:00
|
|
|
ScanQueryForLocks(castNode(Query, cte->ctequery), acquire);
|
2008-10-04 23:56:55 +02:00
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
|
|
|
* Recurse into sublink subqueries, too. But we already did the ones in
|
2008-10-04 23:56:55 +02:00
|
|
|
* the rtable and cteList.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
if (parsetree->hasSubLinks)
|
|
|
|
{
|
|
|
|
query_tree_walker(parsetree, ScanQueryWalker,
|
2008-09-09 20:58:09 +02:00
|
|
|
(void *) &acquire,
|
2008-10-04 23:56:55 +02:00
|
|
|
QTW_IGNORE_RC_SUBQUERIES);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* Walker to find sublink subqueries for ScanQueryForLocks
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
static bool
|
2008-09-09 20:58:09 +02:00
|
|
|
ScanQueryWalker(Node *node, bool *acquire)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
|
|
|
if (node == NULL)
|
|
|
|
return false;
|
|
|
|
if (IsA(node, SubLink))
|
|
|
|
{
|
|
|
|
SubLink *sub = (SubLink *) node;
|
|
|
|
|
|
|
|
/* Do what we came for */
|
2017-01-27 04:09:34 +01:00
|
|
|
ScanQueryForLocks(castNode(Query, sub->subselect), *acquire);
|
2007-03-13 01:33:44 +01:00
|
|
|
/* Fall through to process lefthand args of SubLink */
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* Do NOT recurse into Query nodes, because ScanQueryForLocks already
|
2007-03-13 01:33:44 +01:00
|
|
|
* processed subselects of subselects for us.
|
|
|
|
*/
|
|
|
|
return expression_tree_walker(node, ScanQueryWalker,
|
2008-09-09 20:58:09 +02:00
|
|
|
(void *) acquire);
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* PlanCacheComputeResultDesc: given a list of analyzed-and-rewritten Queries,
|
|
|
|
* determine the result tupledesc it will produce. Returns NULL if the
|
|
|
|
* execution will not return tuples.
|
2007-03-13 01:33:44 +01:00
|
|
|
*
|
|
|
|
* Note: the result is created or copied into current memory context.
|
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
static TupleDesc
|
2007-03-16 00:12:07 +01:00
|
|
|
PlanCacheComputeResultDesc(List *stmt_list)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
|
|
|
Query *query;
|
|
|
|
|
|
|
|
switch (ChoosePortalStrategy(stmt_list))
|
|
|
|
{
|
|
|
|
case PORTAL_ONE_SELECT:
|
2011-02-26 00:56:23 +01:00
|
|
|
case PORTAL_ONE_MOD_WITH:
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
query = linitial_node(Query, stmt_list);
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
|
|
|
return ExecCleanTypeFromTL(query->targetList);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
case PORTAL_ONE_RETURNING:
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
query = QueryListGetPrimaryStmt(stmt_list);
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(query->returningList);
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
|
|
|
return ExecCleanTypeFromTL(query->returningList);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
case PORTAL_UTIL_SELECT:
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
query = linitial_node(Query, stmt_list);
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(query->utilityStmt);
|
|
|
|
return UtilityTupleDescriptor(query->utilityStmt);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
case PORTAL_MULTI_QUERY:
|
|
|
|
/* will not return tuples */
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* PlanCacheRelCallback
|
2007-03-13 01:33:44 +01:00
|
|
|
* Relcache inval callback function
|
2007-03-26 02:36:19 +02:00
|
|
|
*
|
|
|
|
* Invalidate all plans mentioning the given rel, or all plans mentioning
|
|
|
|
* any rel at all if relid == InvalidOid.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
|
|
|
static void
|
2008-09-09 20:58:09 +02:00
|
|
|
PlanCacheRelCallback(Datum arg, Oid relid)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_iter iter;
|
2007-03-13 01:33:44 +01:00
|
|
|
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_foreach(iter, &saved_plan_list)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
CachedPlanSource *plansource = dlist_container(CachedPlanSource,
|
|
|
|
node, iter.cur);
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
|
|
|
/* No work if it's already invalidated */
|
2011-09-16 06:42:53 +02:00
|
|
|
if (!plansource->is_valid)
|
2007-03-13 01:33:44 +01:00
|
|
|
continue;
|
2010-01-15 23:36:35 +01:00
|
|
|
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
/* Never invalidate transaction control commands */
|
|
|
|
if (IsTransactionStmtPlan(plansource))
|
|
|
|
continue;
|
|
|
|
|
2010-01-15 23:36:35 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Check the dependency list for the rewritten querytree.
|
2010-01-15 23:36:35 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
if ((relid == InvalidOid) ? plansource->relationOids != NIL :
|
|
|
|
list_member_oid(plansource->relationOids, relid))
|
|
|
|
{
|
|
|
|
/* Invalidate the querytree and generic plan */
|
|
|
|
plansource->is_valid = false;
|
|
|
|
if (plansource->gplan)
|
|
|
|
plansource->gplan->is_valid = false;
|
|
|
|
}
|
2010-01-15 23:36:35 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* The generic plan, if any, could have more dependencies than the
|
|
|
|
* querytree does, so we have to check it too.
|
|
|
|
*/
|
|
|
|
if (plansource->gplan && plansource->gplan->is_valid)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
ListCell *lc;
|
2008-09-09 20:58:09 +02:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
foreach(lc, plansource->gplan->stmt_list)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
2007-03-13 01:33:44 +01:00
|
|
|
continue; /* Ignore utility statements */
|
2007-10-11 20:05:27 +02:00
|
|
|
if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
|
2021-03-24 06:40:12 +01:00
|
|
|
list_member_oid(plannedstmt->relationOids, relid))
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Invalidate the generic plan only */
|
|
|
|
plansource->gplan->is_valid = false;
|
2007-03-13 01:33:44 +01:00
|
|
|
break; /* out of stmt_list scan */
|
2007-10-11 20:05:27 +02:00
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
}
|
2008-09-09 20:58:09 +02:00
|
|
|
}
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
|
|
|
|
/* Likewise check cached expressions */
|
|
|
|
dlist_foreach(iter, &cached_expression_list)
|
|
|
|
{
|
|
|
|
CachedExpression *cexpr = dlist_container(CachedExpression,
|
|
|
|
node, iter.cur);
|
|
|
|
|
|
|
|
Assert(cexpr->magic == CACHEDEXPR_MAGIC);
|
|
|
|
|
|
|
|
/* No work if it's already invalidated */
|
|
|
|
if (!cexpr->is_valid)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
if ((relid == InvalidOid) ? cexpr->relationOids != NIL :
|
|
|
|
list_member_oid(cexpr->relationOids, relid))
|
|
|
|
{
|
|
|
|
cexpr->is_valid = false;
|
|
|
|
}
|
|
|
|
}
|
2008-09-09 20:58:09 +02:00
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2008-09-09 20:58:09 +02:00
|
|
|
/*
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
* PlanCacheObjectCallback
|
|
|
|
* Syscache inval callback function for PROCOID and TYPEOID caches
|
2008-09-09 20:58:09 +02:00
|
|
|
*
|
2011-08-17 01:27:46 +02:00
|
|
|
* Invalidate all plans mentioning the object with the specified hash value,
|
|
|
|
* or all plans mentioning any member of this cache if hashvalue == 0.
|
2008-09-09 20:58:09 +02:00
|
|
|
*/
|
|
|
|
static void
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_iter iter;
|
2008-09-09 20:58:09 +02:00
|
|
|
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_foreach(iter, &saved_plan_list)
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
CachedPlanSource *plansource = dlist_container(CachedPlanSource,
|
|
|
|
node, iter.cur);
|
2011-09-16 06:42:53 +02:00
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
2008-09-09 20:58:09 +02:00
|
|
|
|
|
|
|
/* No work if it's already invalidated */
|
2011-09-16 06:42:53 +02:00
|
|
|
if (!plansource->is_valid)
|
2008-09-09 20:58:09 +02:00
|
|
|
continue;
|
2010-01-15 23:36:35 +01:00
|
|
|
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
/* Never invalidate transaction control commands */
|
|
|
|
if (IsTransactionStmtPlan(plansource))
|
|
|
|
continue;
|
|
|
|
|
2010-01-15 23:36:35 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* Check the dependency list for the rewritten querytree.
|
2010-01-15 23:36:35 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
foreach(lc, plansource->invalItems)
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
PlanInvalItem *item = (PlanInvalItem *) lfirst(lc);
|
2007-03-13 01:33:44 +01:00
|
|
|
|
2010-01-15 23:36:35 +01:00
|
|
|
if (item->cacheId != cacheid)
|
|
|
|
continue;
|
2011-08-17 01:27:46 +02:00
|
|
|
if (hashvalue == 0 ||
|
|
|
|
item->hashValue == hashvalue)
|
2010-01-15 23:36:35 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Invalidate the querytree and generic plan */
|
|
|
|
plansource->is_valid = false;
|
|
|
|
if (plansource->gplan)
|
|
|
|
plansource->gplan->is_valid = false;
|
2010-01-15 23:36:35 +01:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
/*
|
|
|
|
* The generic plan, if any, could have more dependencies than the
|
|
|
|
* querytree does, so we have to check it too.
|
|
|
|
*/
|
|
|
|
if (plansource->gplan && plansource->gplan->is_valid)
|
2010-01-15 23:36:35 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
foreach(lc, plansource->gplan->stmt_list)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
|
2008-09-09 20:58:09 +02:00
|
|
|
ListCell *lc3;
|
|
|
|
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
if (plannedstmt->commandType == CMD_UTILITY)
|
2008-09-09 20:58:09 +02:00
|
|
|
continue; /* Ignore utility statements */
|
|
|
|
foreach(lc3, plannedstmt->invalItems)
|
|
|
|
{
|
|
|
|
PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
|
|
|
|
|
|
|
|
if (item->cacheId != cacheid)
|
|
|
|
continue;
|
2011-08-17 01:27:46 +02:00
|
|
|
if (hashvalue == 0 ||
|
|
|
|
item->hashValue == hashvalue)
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* Invalidate the generic plan only */
|
|
|
|
plansource->gplan->is_valid = false;
|
2008-09-09 20:58:09 +02:00
|
|
|
break; /* out of invalItems scan */
|
|
|
|
}
|
|
|
|
}
|
2011-09-16 06:42:53 +02:00
|
|
|
if (!plansource->gplan->is_valid)
|
2008-09-09 20:58:09 +02:00
|
|
|
break; /* out of stmt_list scan */
|
|
|
|
}
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
|
|
|
|
/* Likewise check cached expressions */
|
|
|
|
dlist_foreach(iter, &cached_expression_list)
|
|
|
|
{
|
|
|
|
CachedExpression *cexpr = dlist_container(CachedExpression,
|
|
|
|
node, iter.cur);
|
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
Assert(cexpr->magic == CACHEDEXPR_MAGIC);
|
|
|
|
|
|
|
|
/* No work if it's already invalidated */
|
|
|
|
if (!cexpr->is_valid)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
foreach(lc, cexpr->invalItems)
|
|
|
|
{
|
|
|
|
PlanInvalItem *item = (PlanInvalItem *) lfirst(lc);
|
|
|
|
|
|
|
|
if (item->cacheId != cacheid)
|
|
|
|
continue;
|
|
|
|
if (hashvalue == 0 ||
|
|
|
|
item->hashValue == hashvalue)
|
|
|
|
{
|
|
|
|
cexpr->is_valid = false;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|
|
|
|
|
2007-04-12 08:53:49 +02:00
|
|
|
/*
|
2008-09-09 20:58:09 +02:00
|
|
|
* PlanCacheSysCallback
|
|
|
|
* Syscache inval callback function for other caches
|
|
|
|
*
|
|
|
|
* Just invalidate everything...
|
2007-04-12 08:53:49 +02:00
|
|
|
*/
|
2008-09-09 20:58:09 +02:00
|
|
|
static void
|
2011-08-17 01:27:46 +02:00
|
|
|
PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue)
|
2007-04-12 08:53:49 +02:00
|
|
|
{
|
2008-09-09 20:58:09 +02:00
|
|
|
ResetPlanCache();
|
2007-04-12 08:53:49 +02:00
|
|
|
}
|
|
|
|
|
2007-03-13 01:33:44 +01:00
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* ResetPlanCache: invalidate all cached plans.
|
2007-03-13 01:33:44 +01:00
|
|
|
*/
|
2008-09-09 20:58:09 +02:00
|
|
|
void
|
|
|
|
ResetPlanCache(void)
|
2007-03-13 01:33:44 +01:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_iter iter;
|
2008-09-09 20:58:09 +02:00
|
|
|
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
dlist_foreach(iter, &saved_plan_list)
|
2008-09-09 20:58:09 +02:00
|
|
|
{
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
CachedPlanSource *plansource = dlist_container(CachedPlanSource,
|
|
|
|
node, iter.cur);
|
2011-09-16 06:42:53 +02:00
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
|
2008-09-09 20:58:09 +02:00
|
|
|
|
2010-01-13 17:56:56 +01:00
|
|
|
/* No work if it's already invalidated */
|
2011-09-16 06:42:53 +02:00
|
|
|
if (!plansource->is_valid)
|
2010-01-13 17:56:56 +01:00
|
|
|
continue;
|
|
|
|
|
|
|
|
/*
|
2011-09-16 06:42:53 +02:00
|
|
|
* We *must not* mark transaction control statements as invalid,
|
2010-01-13 17:56:56 +01:00
|
|
|
* particularly not ROLLBACK, because they may need to be executed in
|
|
|
|
* aborted transactions when we can't revalidate them (cf bug #5269).
|
Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 22:59:21 +02:00
|
|
|
*/
|
|
|
|
if (IsTransactionStmtPlan(plansource))
|
|
|
|
continue;
|
|
|
|
|
|
|
|
/*
|
2010-01-13 17:56:56 +01:00
|
|
|
* In general there is no point in invalidating utility statements
|
2011-09-16 06:42:53 +02:00
|
|
|
* since they have no plans anyway. So invalidate it only if it
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
* contains at least one non-utility statement, or contains a utility
|
|
|
|
* statement that contains a pre-analyzed query (which could have
|
|
|
|
* dependencies.)
|
2010-01-13 17:56:56 +01:00
|
|
|
*/
|
2011-09-16 06:42:53 +02:00
|
|
|
foreach(lc, plansource->query_list)
|
2010-01-13 17:56:56 +01:00
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
|
|
|
Query *query = lfirst_node(Query, lc);
|
2010-01-13 17:56:56 +01:00
|
|
|
|
2011-09-16 06:42:53 +02:00
|
|
|
if (query->commandType != CMD_UTILITY ||
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
UtilityContainsQuery(query->utilityStmt))
|
2010-01-13 17:56:56 +01:00
|
|
|
{
|
2011-09-16 06:42:53 +02:00
|
|
|
/* non-utility statement, so invalidate */
|
|
|
|
plansource->is_valid = false;
|
|
|
|
if (plansource->gplan)
|
|
|
|
plansource->gplan->is_valid = false;
|
Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
2012-03-20 02:37:19 +01:00
|
|
|
/* no need to look further */
|
2011-09-16 06:42:53 +02:00
|
|
|
break;
|
2010-01-13 17:56:56 +01:00
|
|
|
}
|
|
|
|
}
|
2008-09-09 20:58:09 +02:00
|
|
|
}
|
Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType. This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column). However, we do have to support invalidating the plan later if
a constraint gets added to the domain. That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions. We
need only duplicate or share that logic for domains.
ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)
Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache. Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.
We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs. But I've left that idea for later.)
Other stuff fixed in passing:
* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains. We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.
* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list. That eliminates a potential
performance problem in DropCachedPlan.
* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type. Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.
Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 19:24:43 +01:00
|
|
|
|
|
|
|
/* Likewise invalidate cached expressions */
|
|
|
|
dlist_foreach(iter, &cached_expression_list)
|
|
|
|
{
|
|
|
|
CachedExpression *cexpr = dlist_container(CachedExpression,
|
|
|
|
node, iter.cur);
|
|
|
|
|
|
|
|
Assert(cexpr->magic == CACHEDEXPR_MAGIC);
|
|
|
|
|
|
|
|
cexpr->is_valid = false;
|
|
|
|
}
|
2007-03-13 01:33:44 +01:00
|
|
|
}
|