postgresql/src/pl/plpgsql/src/pl_exec.c

/*-------------------------------------------------------------------------
*
* pl_exec.c - Executor for the PL/pgSQL
* procedural language
*
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
* IDENTIFICATION
* src/pl/plpgsql/src/pl_exec.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <ctype.h>
#include "access/detoast.h"
#include "access/htup_details.h"
#include "access/transam.h"
#include "access/tupconvert.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
#include "executor/execExpr.h"
#include "executor/spi.h"
#include "executor/tstoreReceiver.h"
#include "funcapi.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "parser/parse_type.h"
#include "parser/scansup.h"
#include "plpgsql.h"
#include "storage/proc.h"
#include "tcop/cmdtag.h"
#include "tcop/pquery.h"
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/datum.h"
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
/*
* All plpgsql function executions within a single transaction share the same
* executor EState for evaluating "simple" expressions. Each function call
* creates its own "eval_econtext" ExprContext within this estate for
* per-evaluation workspace. eval_econtext is freed at normal function exit,
* and the EState is freed at transaction end (in case of error, we assume
* that the abort mechanisms clean it all up). Furthermore, any exception
* block within a function has to have its own eval_econtext separate from
* the containing function's, so that we can clean up ExprContext callbacks
* properly at subtransaction exit. We maintain a stack that tracks the
* individual econtexts so that we can clean up correctly at subxact exit.
*
* This arrangement is a bit tedious to maintain, but it's worth the trouble
* so that we don't have to re-prepare simple expressions on each trip through
* a function. (We assume the case to optimize is many repetitions of a
* function within a transaction.)
*
* However, there's no value in trying to amortize simple expression setup
* across multiple executions of a DO block (inline code block), since there
* can never be any. If we use the shared EState for a DO block, the expr
* state trees are effectively leaked till end of transaction, and that can
* add up if the user keeps on submitting DO blocks. Therefore, each DO block
* has its own simple-expression EState, which is cleaned up at exit from
* plpgsql_inline_handler(). DO blocks still use the simple_econtext_stack,
* though, so that subxact abort cleanup does the right thing.
*
* (However, if a DO block executes COMMIT or ROLLBACK, then exec_stmt_commit
* or exec_stmt_rollback will unlink it from the DO's simple-expression EState
* and create a new shared EState that will be used thenceforth. The original
* EState will be cleaned up when we get back to plpgsql_inline_handler. This
* is a bit ugly, but it isn't worth doing better, since scenarios like this
* can't result in indefinite accumulation of state trees.)
*/

typedef struct SimpleEcontextStackEntry
{
    ExprContext *stack_econtext;    /* a stacked econtext */
    SubTransactionId xact_subxid;   /* ID for current subxact */
    struct SimpleEcontextStackEntry *next;  /* next stack entry up */
} SimpleEcontextStackEntry;
static EState *shared_simple_eval_estate = NULL;
static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
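
/*
* For illustration only: a minimal sketch of how an econtext gets pushed
* onto this stack.  The real logic lives in plpgsql_create_econtext(),
* later in this file; the stack entry is allocated in TopTransactionContext
* so that it survives until transaction end.
*
*     SimpleEcontextStackEntry *entry;
*
*     entry = (SimpleEcontextStackEntry *)
*         MemoryContextAlloc(TopTransactionContext,
*                            sizeof(SimpleEcontextStackEntry));
*     entry->stack_econtext = estate->eval_econtext;
*     entry->xact_subxid = GetCurrentSubTransactionId();
*     entry->next = simple_econtext_stack;
*     simple_econtext_stack = entry;
*/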
/*
* In addition to the shared simple-eval EState, we have a shared resource
* owner that holds refcounts on the CachedPlans for any "simple" expressions
* we have evaluated in the current transaction. This allows us to avoid
* continually grabbing and releasing a plan refcount when a simple expression
* is used over and over. (DO blocks use their own resowner, in exactly the
* same way described above for shared_simple_eval_estate.)
*/
static ResourceOwner shared_simple_eval_resowner = NULL;
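
/*
* A sketch of the corresponding transaction-end cleanup (compare
* plpgsql_xact_cb(), which does this for real): release all the plancache
* refcounts held by the shared resowner in one operation, then forget it.
* Deleting the resowner itself is left to normal transaction cleanup.
*
*     if (shared_simple_eval_resowner)
*         ReleaseAllPlanCacheRefsInOwner(shared_simple_eval_resowner);
*     shared_simple_eval_resowner = NULL;
*/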
/*
* Memory management within a plpgsql function generally works with three
* contexts:
*
* 1. Function-call-lifespan data, such as variable values, is kept in the
* "main" context, a/k/a the "SPI Proc" context established by SPI_connect().
* This is usually the CurrentMemoryContext while running code in this module
* (which is not good, because careless coding can easily cause
* function-lifespan memory leaks, but we live with it for now).
*
* 2. Some statement-execution routines need statement-lifespan workspace.
* A suitable context is created on-demand by get_stmt_mcontext(), and must
* be reset at the end of the requesting routine. Error recovery will clean
* it up automatically. Nested statements requiring statement-lifespan
* workspace will result in a stack of such contexts, see push_stmt_mcontext().
*
* 3. We use the eval_econtext's per-tuple memory context for expression
* evaluation, and as a general-purpose workspace for short-lived allocations.
* Such allocations usually aren't explicitly freed, but are left to be
* cleaned up by a context reset, typically done by exec_eval_cleanup().
*
* These macros are for use in making short-lived allocations:
*/
#define get_eval_mcontext(estate) \
    ((estate)->eval_econtext->ecxt_per_tuple_memory)
#define eval_mcontext_alloc(estate, sz) \
    MemoryContextAlloc(get_eval_mcontext(estate), sz)
#define eval_mcontext_alloc0(estate, sz) \
    MemoryContextAllocZero(get_eval_mcontext(estate), sz)
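
/*
* Typical (illustrative) use of these macros: scratch space allocated this
* way needs no explicit pfree, since the next exec_eval_cleanup() resets
* the eval_mcontext and reclaims everything in it wholesale.  Here "src"
* and "len" are placeholders, not variables from this file:
*
*     char *buf = (char *) eval_mcontext_alloc(estate, len + 1);
*
*     memcpy(buf, src, len);
*     buf[len] = '\0';
*     ... use buf until the current expression evaluation is cleaned up ...
*/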
/*
* We use a session-wide hash table for caching cast information.
*
* Once built, the compiled expression trees (cast_expr fields) survive for
* the life of the session. At some point it might be worth invalidating
* those after pg_cast changes, but for the moment we don't bother.
*
* The evaluation state trees (cast_exprstate) are managed in the same way as
* simple expressions (i.e., we assume cast expressions are always simple).
*
* As with simple expressions, DO blocks don't use the shared hash table but
* must have their own. This isn't ideal, but we don't want to deal with
* multiple simple_eval_estates within a DO block.
*/
typedef struct                  /* lookup key for cast info */
{
    /* NB: we assume this struct contains no padding bytes */
    Oid         srctype;        /* source type for cast */
    Oid         dsttype;        /* destination type for cast */
    int32       srctypmod;      /* source typmod for cast */
    int32       dsttypmod;      /* destination typmod for cast */
} plpgsql_CastHashKey;

typedef struct                  /* cast_hash table entry */
{
    plpgsql_CastHashKey key;    /* hash key --- MUST BE FIRST */
    Expr       *cast_expr;      /* cast expression, or NULL if no-op cast */
    CachedExpression *cast_cexpr;   /* cached expression backing the above */
    /* ExprState is valid only when cast_lxid matches current LXID */
    ExprState  *cast_exprstate; /* expression's eval tree */
    bool        cast_in_use;    /* true while we're executing eval tree */
    LocalTransactionId cast_lxid;
} plpgsql_CastHashEntry;

static MemoryContext shared_cast_context = NULL;
static HTAB *shared_cast_hash = NULL;
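
/*
* A minimal sketch of the lookup pattern this cache supports (the real
* code is in get_cast_hashentry()): fill in a key, probe the hash, and
* create the entry on first use.  Variable names here are illustrative.
*
*     plpgsql_CastHashKey cast_key;
*     plpgsql_CastHashEntry *cast_entry;
*     bool found;
*
*     cast_key.srctype = srctype;
*     cast_key.dsttype = dsttype;
*     cast_key.srctypmod = srctypmod;
*     cast_key.dsttypmod = dsttypmod;
*     cast_entry = (plpgsql_CastHashEntry *)
*         hash_search(estate->cast_hash, &cast_key, HASH_ENTER, &found);
*     if (!found)
*         ... compile the cast and fill in the new entry ...
*/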
/*
* LOOP_RC_PROCESSING encapsulates common logic for looping statements to
* handle return/exit/continue result codes from the loop body statement(s).
* It's meant to be used like this:
*
*     int rc = PLPGSQL_RC_OK;
*     for (...)
*     {
*         ...
*         rc = exec_stmts(estate, stmt->body);
*         LOOP_RC_PROCESSING(stmt->label, break);
*         ...
*     }
*     return rc;
*
* If execution of the loop should terminate, LOOP_RC_PROCESSING will execute
* "exit_action" (typically a "break" or "goto"), after updating "rc" to the
* value the current statement should return. If execution should continue,
* LOOP_RC_PROCESSING will do nothing except reset "rc" to PLPGSQL_RC_OK.
*
* estate and rc are implicit arguments to the macro.
* estate->exitlabel is examined and possibly updated.
*/
#define LOOP_RC_PROCESSING(looplabel, exit_action) \
    if (rc == PLPGSQL_RC_RETURN) \
    { \
        /* RETURN, so propagate RC_RETURN out */ \
        exit_action; \
    } \
    else if (rc == PLPGSQL_RC_EXIT) \
    { \
        if (estate->exitlabel == NULL) \
        { \
            /* unlabeled EXIT terminates this loop */ \
            rc = PLPGSQL_RC_OK; \
            exit_action; \
        } \
        else if ((looplabel) != NULL && \
                 strcmp(looplabel, estate->exitlabel) == 0) \
        { \
            /* labeled EXIT matching this loop, so terminate loop */ \
            estate->exitlabel = NULL; \
            rc = PLPGSQL_RC_OK; \
            exit_action; \
        } \
        else \
        { \
            /* non-matching labeled EXIT, propagate RC_EXIT out */ \
            exit_action; \
        } \
    } \
    else if (rc == PLPGSQL_RC_CONTINUE) \
    { \
        if (estate->exitlabel == NULL) \
        { \
            /* unlabeled CONTINUE matches this loop, so continue in loop */ \
            rc = PLPGSQL_RC_OK; \
        } \
        else if ((looplabel) != NULL && \
                 strcmp(looplabel, estate->exitlabel) == 0) \
        { \
            /* labeled CONTINUE matching this loop, so continue in loop */ \
            estate->exitlabel = NULL; \
            rc = PLPGSQL_RC_OK; \
        } \
        else \
        { \
            /* non-matching labeled CONTINUE, propagate RC_CONTINUE out */ \
            exit_action; \
        } \
    } \
    else \
        Assert(rc == PLPGSQL_RC_OK)
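
/*
* As a concrete illustration of the pattern described above, an
* unconditional LOOP executor reduces to just this (compare
* exec_stmt_loop() later in this file):
*
*     static int
*     exec_stmt_loop(PLpgSQL_execstate *estate, PLpgSQL_stmt_loop *stmt)
*     {
*         int rc = PLPGSQL_RC_OK;
*
*         for (;;)
*         {
*             rc = exec_stmts(estate, stmt->body);
*             LOOP_RC_PROCESSING(stmt->label, break);
*         }
*
*         return rc;
*     }
*/
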
/************************************************************
* Local function forward declarations
************************************************************/
static void coerce_function_result_tuple(PLpgSQL_execstate *estate,
TupleDesc tupdesc);
static void plpgsql_exec_error_callback(void *arg);
static void copy_plpgsql_datums(PLpgSQL_execstate *estate,
PLpgSQL_function *func);
static void plpgsql_fulfill_promise(PLpgSQL_execstate *estate,
PLpgSQL_var *var);
Improve plpgsql's memory management to fix some function-lifespan leaks. In some cases, exiting out of a plpgsql statement due to an error, then catching the error in a surrounding exception block, led to leakage of temporary data the statement was working with, because we kept all such data in the function-lifespan SPI Proc context. Iterating such behavior many times within one function call thus led to noticeable memory bloat. To fix, create an additional memory context meant to have statement lifespan. Since many plpgsql statements, particularly the simpler/more common ones, don't need this, create it only on demand. Reset this context at the end of any statement that uses it, and arrange for exception cleanup to reset it too, thereby fixing the memory-leak issue. Allow a stack of such contexts to exist to handle cases where a compound statement needs statement-lifespan data that persists across calls of inner statements. While at it, clean up code and improve comments referring to the existing short-term memory context, which by plpgsql convention is the per-tuple context of the eval_econtext ExprContext. We now uniformly refer to that as the eval_mcontext, whereas the new statement-lifespan memory contexts are called stmt_mcontext. This change adds some context-creation overhead, but on the other hand it allows removal of some retail pfree's in favor of context resets. On balance it seems to be about a wash performance-wise. In principle this is a bug fix, but it seems too invasive for a back-patch, and the infrequency of complaints weighs against taking the risk in the back branches. So we'll fix it only in HEAD, at least for now. Tom Lane, reviewed by Pavel Stehule Discussion: <17863.1469142152@sss.pgh.pa.us>
2016-08-17 20:51:10 +02:00
static MemoryContext get_stmt_mcontext(PLpgSQL_execstate *estate);
static void push_stmt_mcontext(PLpgSQL_execstate *estate);
static void pop_stmt_mcontext(PLpgSQL_execstate *estate);
2005-10-15 04:49:52 +02:00
static int exec_toplevel_block(PLpgSQL_execstate *estate,
PLpgSQL_stmt_block *block);
static int exec_stmt_block(PLpgSQL_execstate *estate,
PLpgSQL_stmt_block *block);
static int exec_stmts(PLpgSQL_execstate *estate,
This patch changes makes some significant changes to how compilation and parsing work in PL/PgSQL: - memory management is now done via palloc(). The compiled representation of each function now has its own memory context. Therefore, the storage consumed by a function can be reclaimed via MemoryContextDelete(). During compilation, the CurrentMemoryContext is the function's memory context. This means that a palloc() is sufficient to allocate memory that will have the same lifetime as the function itself. As a result, code invoked during compilation should be careful to pfree() temporary allocations to avoid leaking memory. Since a lot of the code in the backend is not careful about releasing palloc'ed memory, that means we should switch into a temporary memory context before invoking backend functions. A temporary context appropriate for such allocations is `compile_tmp_cxt'. - The ability to use palloc() allows us to simply a lot of the code in the parser. Rather than representing lists of elements via ad hoc linked lists or arrays, we can use the List type. Rather than doing malloc followed by memset(0), we can just use palloc0(). - We now check that the user has supplied the right number of parameters to a RAISE statement. Supplying either too few or too many results in an error (at runtime). - PL/PgSQL's parser needs to accept arbitrary SQL statements. Since we do not want to duplicate the SQL grammar in the PL/PgSQL grammar, this means we need to be quite lax in what the PL/PgSQL grammar considers a "SQL statement". This can lead to misleading behavior if there is a syntax error in the function definition, since we assume a malformed PL/PgSQL construct is a SQL statement. Furthermore, these errors were only detected at runtime (when we tried to execute the alleged "SQL statement" via SPI). To rectify this, the patch changes the parser to invoke the main SQL parser when it sees a string it believes to be a SQL expression. This means that synctically-invalid SQL will be rejected during the compilation of the PL/PgSQL function. This is only done when compiling for "validation" purposes (i.e. at CREATE FUNCTION time), so it should not impose a runtime overhead. - Fixes for the various buffer overruns I've patched in stable branches in the past few weeks. I've rewritten code where I thought it was warranted (unlike the patches applied to older branches, which were minimally invasive). - Various other minor changes and cleanups. - Updates to the regression tests.
2005-02-22 08:18:27 +01:00
List *stmts);
static int exec_stmt_assign(PLpgSQL_execstate *estate,
PLpgSQL_stmt_assign *stmt);
static int exec_stmt_perform(PLpgSQL_execstate *estate,
PLpgSQL_stmt_perform *stmt);
static int exec_stmt_call(PLpgSQL_execstate *estate,
PLpgSQL_stmt_call *stmt);
static int exec_stmt_getdiag(PLpgSQL_execstate *estate,
PLpgSQL_stmt_getdiag *stmt);
static int exec_stmt_if(PLpgSQL_execstate *estate,
PLpgSQL_stmt_if *stmt);
static int exec_stmt_case(PLpgSQL_execstate *estate,
PLpgSQL_stmt_case *stmt);
static int exec_stmt_loop(PLpgSQL_execstate *estate,
PLpgSQL_stmt_loop *stmt);
static int exec_stmt_while(PLpgSQL_execstate *estate,
PLpgSQL_stmt_while *stmt);
static int exec_stmt_fori(PLpgSQL_execstate *estate,
PLpgSQL_stmt_fori *stmt);
static int exec_stmt_fors(PLpgSQL_execstate *estate,
PLpgSQL_stmt_fors *stmt);
static int exec_stmt_forc(PLpgSQL_execstate *estate,
PLpgSQL_stmt_forc *stmt);
static int exec_stmt_foreach_a(PLpgSQL_execstate *estate,
PLpgSQL_stmt_foreach_a *stmt);
static int exec_stmt_open(PLpgSQL_execstate *estate,
PLpgSQL_stmt_open *stmt);
static int exec_stmt_fetch(PLpgSQL_execstate *estate,
PLpgSQL_stmt_fetch *stmt);
static int exec_stmt_close(PLpgSQL_execstate *estate,
PLpgSQL_stmt_close *stmt);
static int exec_stmt_exit(PLpgSQL_execstate *estate,
PLpgSQL_stmt_exit *stmt);
static int exec_stmt_return(PLpgSQL_execstate *estate,
PLpgSQL_stmt_return *stmt);
static int exec_stmt_return_next(PLpgSQL_execstate *estate,
PLpgSQL_stmt_return_next *stmt);
static int exec_stmt_return_query(PLpgSQL_execstate *estate,
PLpgSQL_stmt_return_query *stmt);
static int exec_stmt_raise(PLpgSQL_execstate *estate,
PLpgSQL_stmt_raise *stmt);
static int exec_stmt_assert(PLpgSQL_execstate *estate,
PLpgSQL_stmt_assert *stmt);
static int exec_stmt_execsql(PLpgSQL_execstate *estate,
PLpgSQL_stmt_execsql *stmt);
static int exec_stmt_dynexecute(PLpgSQL_execstate *estate,
PLpgSQL_stmt_dynexecute *stmt);
static int exec_stmt_dynfors(PLpgSQL_execstate *estate,
PLpgSQL_stmt_dynfors *stmt);
Transaction control in PL procedures In each of the supplied procedural languages (PL/pgSQL, PL/Perl, PL/Python, PL/Tcl), add language-specific commit and rollback functions/commands to control transactions in procedures in that language. Add similar underlying functions to SPI. Some additional cleanup so that transaction commit or abort doesn't blow away data structures still used by the procedure call. Add execution context tracking to CALL and DO statements so that transaction control commands can only be issued in top-level procedure and block calls, not function calls or other procedure or block calls. - SPI Add a new function SPI_connect_ext() that is like SPI_connect() but allows passing option flags. The only option flag right now is SPI_OPT_NONATOMIC. A nonatomic SPI connection can execute transaction control commands, otherwise it's not allowed. This is meant to be passed down from CALL and DO statements which themselves know in which context they are called. A nonatomic SPI connection uses different memory management. A normal SPI connection allocates its memory in TopTransactionContext. For nonatomic connections we use PortalContext instead. As the comment in SPI_connect_ext() (previously SPI_connect()) indicates, one could potentially use PortalContext in all cases, but it seems safest to leave the existing uses alone, because this stuff is complicated enough already. SPI also gets new functions SPI_start_transaction(), SPI_commit(), and SPI_rollback(), which can be used by PLs to implement their transaction control logic. - portalmem.c Some adjustments were made in the code that cleans up portals at transaction abort. The portal code could already handle a command *committing* a transaction and continuing (e.g., VACUUM), but it was not quite prepared for a command *aborting* a transaction and continuing. In AtAbort_Portals(), remove the code that marks an active portal as failed. As the comment there already predicted, this doesn't work if the running command wants to keep running after transaction abort. And it's actually not necessary, because pquery.c is careful to run all portal code in a PG_TRY block and explicitly runs MarkPortalFailed() if there is an exception. So the code in AtAbort_Portals() is never used anyway. In AtAbort_Portals() and AtCleanup_Portals(), we need to be careful not to clean up active portals too much. This mirrors similar code in PreCommit_Portals(). - PL/Perl Gets new functions spi_commit() and spi_rollback() - PL/pgSQL Gets new commands COMMIT and ROLLBACK. Update the PL/SQL porting example in the documentation to reflect that transactions are now possible in procedures. - PL/Python Gets new functions plpy.commit and plpy.rollback. - PL/Tcl Gets new commands commit and rollback. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
2018-01-22 14:30:16 +01:00
static int exec_stmt_commit(PLpgSQL_execstate *estate,
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
PLpgSQL_stmt_commit *stmt);
Transaction control in PL procedures In each of the supplied procedural languages (PL/pgSQL, PL/Perl, PL/Python, PL/Tcl), add language-specific commit and rollback functions/commands to control transactions in procedures in that language. Add similar underlying functions to SPI. Some additional cleanup so that transaction commit or abort doesn't blow away data structures still used by the procedure call. Add execution context tracking to CALL and DO statements so that transaction control commands can only be issued in top-level procedure and block calls, not function calls or other procedure or block calls. - SPI Add a new function SPI_connect_ext() that is like SPI_connect() but allows passing option flags. The only option flag right now is SPI_OPT_NONATOMIC. A nonatomic SPI connection can execute transaction control commands, otherwise it's not allowed. This is meant to be passed down from CALL and DO statements which themselves know in which context they are called. A nonatomic SPI connection uses different memory management. A normal SPI connection allocates its memory in TopTransactionContext. For nonatomic connections we use PortalContext instead. As the comment in SPI_connect_ext() (previously SPI_connect()) indicates, one could potentially use PortalContext in all cases, but it seems safest to leave the existing uses alone, because this stuff is complicated enough already. SPI also gets new functions SPI_start_transaction(), SPI_commit(), and SPI_rollback(), which can be used by PLs to implement their transaction control logic. - portalmem.c Some adjustments were made in the code that cleans up portals at transaction abort. The portal code could already handle a command *committing* a transaction and continuing (e.g., VACUUM), but it was not quite prepared for a command *aborting* a transaction and continuing. In AtAbort_Portals(), remove the code that marks an active portal as failed. As the comment there already predicted, this doesn't work if the running command wants to keep running after transaction abort. And it's actually not necessary, because pquery.c is careful to run all portal code in a PG_TRY block and explicitly runs MarkPortalFailed() if there is an exception. So the code in AtAbort_Portals() is never used anyway. In AtAbort_Portals() and AtCleanup_Portals(), we need to be careful not to clean up active portals too much. This mirrors similar code in PreCommit_Portals(). - PL/Perl Gets new functions spi_commit() and spi_rollback() - PL/pgSQL Gets new commands COMMIT and ROLLBACK. Update the PL/SQL porting example in the documentation to reflect that transactions are now possible in procedures. - PL/Python Gets new functions plpy.commit and plpy.rollback. - PL/Tcl Gets new commands commit and rollback. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
2018-01-22 14:30:16 +01:00
static int exec_stmt_rollback(PLpgSQL_execstate *estate,
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
PLpgSQL_stmt_rollback *stmt);
static void plpgsql_estate_setup(PLpgSQL_execstate *estate,
PLpgSQL_function *func,
2013-11-15 19:52:03 +01:00
ReturnSetInfo *rsi,
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
EState *simple_eval_estate,
ResourceOwner simple_eval_resowner);
static void exec_eval_cleanup(PLpgSQL_execstate *estate);
static void exec_prepare_plan(PLpgSQL_execstate *estate,
Improve performance of repeated CALLs within plpgsql procedures. This patch essentially is cleaning up technical debt left behind by the original implementation of plpgsql procedures, particularly commit d92bc83c4. That patch (or more precisely, follow-on patches fixing its worst bugs) forced us to re-plan CALL and DO statements each time through, if we're in a non-atomic context. That wasn't for any fundamental reason, but just because use of a saved plan requires having a ResourceOwner to hold a reference count for the plan, and we had no suitable resowner at hand, nor would the available APIs support using one if we did. While it's not that expensive to create a "plan" for CALL/DO, the cycles do add up in repeated executions. This patch therefore makes the following API changes: * GetCachedPlan/ReleaseCachedPlan are modified to let the caller specify which resowner to use to pin the plan, rather than forcing use of CurrentResourceOwner. * spi.c gains a "SPI_execute_plan_extended" entry point that lets callers say which resowner to use to pin the plan. This borrows the idea of an options struct from the recently added SPI_prepare_extended, hopefully allowing future options to be added without more API breaks. This supersedes SPI_execute_plan_with_paramlist (which I've marked deprecated) as well as SPI_execute_plan_with_receiver (which is new in v14, so I just took it out altogether). * I also took the opportunity to remove the crude hack of letting plpgsql reach into SPI private data structures to mark SPI plans as "no_snapshot". It's better to treat that as an option of SPI_prepare_extended. Now, when running a non-atomic procedure or DO block that contains any CALL or DO commands, plpgsql creates a ResourceOwner that will be used to pin the plans of the CALL/DO commands. (In an atomic context, we just use CurrentResourceOwner, as before.) Having done this, we can just save CALL/DO plans normally, whether or not they are used across transaction boundaries. This seems to be good for something like 2X speedup of a CALL of a trivial procedure with a few simple argument expressions. By restricting the creation of an extra ResourceOwner like this, there's essentially zero penalty in cases that can't benefit. Pavel Stehule, with some further hacking by me Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 04:28:29 +01:00
PLpgSQL_expr *expr, int cursorOptions);
static void exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr);
static void exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan);
static void exec_check_rw_parameter(PLpgSQL_expr *expr);
static bool exec_eval_simple_expr(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
Datum *result,
bool *isNull,
Oid *rettype,
int32 *rettypmod);
static void exec_assign_expr(PLpgSQL_execstate *estate,
PLpgSQL_datum *target,
PLpgSQL_expr *expr);
static void exec_assign_c_string(PLpgSQL_execstate *estate,
PLpgSQL_datum *target,
const char *str);
static void exec_assign_value(PLpgSQL_execstate *estate,
PLpgSQL_datum *target,
Datum value, bool isNull,
Oid valtype, int32 valtypmod);
static void exec_eval_datum(PLpgSQL_execstate *estate,
PLpgSQL_datum *datum,
Oid *typeid,
int32 *typetypmod,
Datum *value,
bool *isnull);
static int exec_eval_integer(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull);
static bool exec_eval_boolean(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull);
static Datum exec_eval_expr(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull,
Oid *rettype,
int32 *rettypmod);
static int exec_run_select(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr, long maxtuples, Portal *portalP);
static int exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
Portal portal, bool prefetch_ok);
static ParamListInfo setup_param_list(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr);
static ParamExternData *plpgsql_param_fetch(ParamListInfo params,
int paramid, bool speculative,
ParamExternData *workspace);
static void plpgsql_param_compile(ParamListInfo params, Param *param,
ExprState *state,
Datum *resv, bool *resnull);
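/* evaluation callback functions that plpgsql_param_compile selects among */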
static void plpgsql_param_eval_var(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
static void plpgsql_param_eval_var_ro(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
static void plpgsql_param_eval_recfield(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
static void plpgsql_param_eval_generic(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
static void plpgsql_param_eval_generic_ro(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
static void exec_move_row(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
HeapTuple tup, TupleDesc tupdesc);
static void revalidate_rectypeid(PLpgSQL_rec *rec);
static ExpandedRecordHeader *make_expanded_record_for_rec(PLpgSQL_execstate *estate,
PLpgSQL_rec *rec,
TupleDesc srctupdesc,
ExpandedRecordHeader *srcerh);
static void exec_move_row_from_fields(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
ExpandedRecordHeader *newerh,
Datum *values, bool *nulls,
TupleDesc tupdesc);
static bool compatible_tupdescs(TupleDesc src_tupdesc, TupleDesc dst_tupdesc);
static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate,
PLpgSQL_row *row,
TupleDesc tupdesc);
static TupleDesc deconstruct_composite_datum(Datum value,
HeapTupleData *tmptup);
static void exec_move_row_from_datum(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
Datum value);
static void instantiate_empty_record_variable(PLpgSQL_execstate *estate,
PLpgSQL_rec *rec);
static char *convert_value_to_string(PLpgSQL_execstate *estate,
Datum value, Oid valtype);
static inline Datum exec_cast_value(PLpgSQL_execstate *estate,
Datum value, bool *isnull,
Oid valtype, int32 valtypmod,
Oid reqtype, int32 reqtypmod);
static Datum do_cast_value(PLpgSQL_execstate *estate,
Datum value, bool *isnull,
Oid valtype, int32 valtypmod,
Oid reqtype, int32 reqtypmod);
static plpgsql_CastHashEntry *get_cast_hashentry(PLpgSQL_execstate *estate,
Oid srctype, int32 srctypmod,
Oid dsttype, int32 dsttypmod);
static void exec_init_tuple_store(PLpgSQL_execstate *estate);
static void exec_set_found(PLpgSQL_execstate *estate, bool state);
static void plpgsql_create_econtext(PLpgSQL_execstate *estate);
static void plpgsql_destroy_econtext(PLpgSQL_execstate *estate);
static void assign_simple_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
Datum newvalue, bool isnull, bool freeable);
static void assign_text_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
const char *str);
static void assign_record_var(PLpgSQL_execstate *estate, PLpgSQL_rec *rec,
ExpandedRecordHeader *erh);
static ParamListInfo exec_eval_using_params(PLpgSQL_execstate *estate,
List *params);
static Portal exec_dynquery_with_params(PLpgSQL_execstate *estate,
PLpgSQL_expr *dynquery, List *params,
const char *portalname, int cursorOptions);
static char *format_expr_params(PLpgSQL_execstate *estate,
const PLpgSQL_expr *expr);
static char *format_preparedparamsdata(PLpgSQL_execstate *estate,
ParamListInfo paramLI);
static PLpgSQL_variable *make_callstmt_target(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr);
/* ----------
* plpgsql_exec_function Called by the call handler for
* function execution.
*
* This is also used to execute inline code blocks (DO blocks). The only
* difference that this code is aware of is that for a DO block, we want
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* to use a private simple_eval_estate and a private simple_eval_resowner,
* which are created and passed in by the caller. For regular functions,
* pass NULL, which implies using shared_simple_eval_estate and
* shared_simple_eval_resowner. (When using a private simple_eval_estate,
* we must also use a private cast hashtable, but that's taken care of
* within plpgsql_estate_setup.)
*
* procedure_resowner is a resowner that will survive for the duration
* of execution of this function/procedure. It is needed only if we
* are doing non-atomic execution and there are CALL or DO statements
* in the function; otherwise it can be NULL. We use it to hold refcounts
* on the CALL/DO statements' plans.
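*
* For example (a sketch only, not an actual call site): executing an
* ordinary function in an atomic context, with all optional state left
* shared, would look like
*
*	retval = plpgsql_exec_function(func, fcinfo, NULL, NULL, NULL, true);
*
* while a DO block would pass its privately created simple_eval_estate
* and simple_eval_resowner in place of the first two NULLs.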
* ----------
*/
Datum
plpgsql_exec_function(PLpgSQL_function *func, FunctionCallInfo fcinfo,
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
EState *simple_eval_estate,
ResourceOwner simple_eval_resowner,
					  ResourceOwner procedure_resowner,
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
					  bool atomic)
{
	PLpgSQL_execstate estate;
	ErrorContextCallback plerrcontext;
	int			i;
	int			rc;

	/*
	 * Setup the execution state
	 */
	plpgsql_estate_setup(&estate, func, (ReturnSetInfo *) fcinfo->resultinfo,
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
						 simple_eval_estate, simple_eval_resowner);
	estate.procedure_resowner = procedure_resowner;
	estate.atomic = atomic;
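	/*
	 * estate.atomic is true when we're executing in an atomic context, i.e.
	 * one where transaction control statements are disallowed.
	 */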

	/*
	 * Setup error traceback support for ereport()
	 */
	plerrcontext.callback = plpgsql_exec_error_callback;
	plerrcontext.arg = &estate;
	plerrcontext.previous = error_context_stack;
	error_context_stack = &plerrcontext;

	/*
	 * Make local execution copies of all the datums
	 */
	estate.err_text = gettext_noop("during initialization of execution state");
	copy_plpgsql_datums(&estate, func);

	/*
	 * Store the actual call argument values into the appropriate variables
	 */
	estate.err_text = gettext_noop("while storing call arguments into local variables");
	for (i = 0; i < func->fn_nargs; i++)
	{
		int			n = func->fn_argvarnos[i];
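
		/* n is the datum-array index of the i'th call argument */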
		switch (estate.datums[n]->dtype)
		{
			case PLPGSQL_DTYPE_VAR:
				{
					PLpgSQL_var *var = (PLpgSQL_var *) estate.datums[n];

					assign_simple_var(&estate, var,
									  fcinfo->args[i].value,
									  fcinfo->args[i].isnull,
									  false);
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00

					/*
					 * Force any array-valued parameter to be stored in
					 * expanded form in our local variable, in hopes of
					 * improving efficiency of uses of the variable. (This is
					 * a hack, really: why only arrays? Need more thought
					 * about which cases are likely to win. See also
					 * typisarray-specific heuristic in exec_assign_value.)
					 *
					 * Special cases: If passed a R/W expanded pointer, assume
					 * we can commandeer the object rather than having to copy
					 * it. If passed a R/O expanded pointer, just keep it as
					 * the value of the variable for the moment. (We'll force
					 * it to R/W if the variable gets modified, but that may
					 * very well never happen.)
					 */
					if (!var->isnull && var->datatype->typisarray)
					{
						if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value)))
						{
							/* take ownership of R/W object */
							assign_simple_var(&estate, var,
											  TransferExpandedObject(var->value,
																	 estate.datum_context),
											  false,
											  true);
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
						}
						else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value)))
						{
							/* R/O pointer, keep it as-is until assigned to */
						}
						else
						{
							/* flat array, so force to expanded form */
							assign_simple_var(&estate, var,
											  expand_array(var->value,
														   estate.datum_context,
														   NULL),
											  false,
											  true);
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
						}
					}
				}
				break;

			case PLPGSQL_DTYPE_REC:
				{
					PLpgSQL_rec *rec = (PLpgSQL_rec *) estate.datums[n];

					if (!fcinfo->args[i].isnull)
					{
						/* Assign row value from composite datum */
						exec_move_row_from_datum(&estate,
												 (PLpgSQL_variable *) rec,
												 fcinfo->args[i].value);
					}
					else
					{
						/* If arg is null, set variable to null */
						exec_move_row(&estate, (PLpgSQL_variable *) rec,
									  NULL, NULL);
					}

					/* clean up after exec_move_row() */
					exec_eval_cleanup(&estate);
				}
				break;

			default:
				/* Anything else should not be an argument variable */
				elog(ERROR, "unrecognized dtype: %d", estate.datums[n]->dtype);
		}
	}

	estate.err_text = gettext_noop("during function entry");

	/*
	 * Set the magic variable FOUND to false
	 */
	exec_set_found(&estate, false);

	/*
	 * Let the instrumentation plugin peek at this function
	 */
	if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_beg)
		((*plpgsql_plugin_ptr)->func_beg) (&estate, func);

	/*
	 * Now call the toplevel block of statements
	 */
	estate.err_text = NULL;
	rc = exec_toplevel_block(&estate, func->action);
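
	/*
	 * exec_toplevel_block() yields PLPGSQL_RC_RETURN only if a RETURN
	 * statement was executed; any other return code means control fell off
	 * the end of the function body.
	 */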
	if (rc != PLPGSQL_RC_RETURN)
	{
		estate.err_text = NULL;
		ereport(ERROR,
				(errcode(ERRCODE_S_R_E_FUNCTION_EXECUTED_NO_RETURN_STATEMENT),
				 errmsg("control reached end of function without RETURN")));
	}

	/*
	 * We got a return value - process it
	 */
	estate.err_text = gettext_noop("while casting return value to function's return type");

	fcinfo->isnull = estate.retisnull;

	if (estate.retisset)
	{
		ReturnSetInfo *rsi = estate.rsi;

		/* Check caller can handle a set result */
		if (!rsi || !IsA(rsi, ReturnSetInfo) ||
			(rsi->allowedModes & SFRM_Materialize) == 0)
			ereport(ERROR,
					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
					 errmsg("set-valued function called in context that cannot accept a set")));
		rsi->returnMode = SFRM_Materialize;
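
		/*
		 * plpgsql always materializes the complete result set of a
		 * set-returning function into a tuplestore; rows are not streamed
		 * back one at a time.
		 */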

		/* If we produced any tuples, send back the result */
		if (estate.tuple_store)
		{
MemoryContext oldcxt;
rsi->setResult = estate.tuple_store;
oldcxt = MemoryContextSwitchTo(estate.tuple_store_cxt);
rsi->setDesc = CreateTupleDescCopy(estate.tuple_store_desc);
MemoryContextSwitchTo(oldcxt);
}
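/*
 * If the function never executed RETURN NEXT or RETURN QUERY, tuple_store
 * is NULL; leaving rsi->setResult unset (NULL) tells the caller that the
 * set result is empty.
 */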
estate.retval = (Datum) 0;
fcinfo->isnull = true;
}
else if (!estate.retisnull)
{
/*
* Cast result value to function's declared result type, and copy it
* out to the upper executor memory context. We must treat tuple
* results specially in order to deal with cases like rowtypes
* involving dropped columns.
*/
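/*
 * Illustration (not part of the original comments): a function declared
 * RETURNS numeric whose RETURN expression yields an integer takes the
 * scalar branch below, while one declared to return a composite type
 * takes the tuple branch.
 */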
if (estate.retistuple)
{
/* Don't need coercion if rowtype is known to match */
if (func->fn_rettype == estate.rettype &&
func->fn_rettype != RECORDOID)
{
/*
* Copy the tuple result into upper executor memory context.
* However, if we have a R/W expanded datum, we can just
* transfer its ownership out to the upper context.
*/
estate.retval = SPI_datumTransfer(estate.retval,
false,
-1);
}
else
{
/*
* Need to look up the expected result type. XXX would be
* better to cache the tupdesc instead of repeating
* get_call_result_type(), but the only easy place to save it
* is in the PLpgSQL_function struct, and that's too
* long-lived: composite types could change during the
* existence of a PLpgSQL_function.
*/
Oid resultTypeId;
TupleDesc tupdesc;
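/*
 * Illustration (not part of the original comments): for a function
 * declared RETURNS record, get_call_result_type() reports
 * TYPEFUNC_COMPOSITE when the call site supplies a concrete rowtype,
 * e.g. a column definition list as in
 *     SELECT * FROM f() AS t(a int, b text);
 * and TYPEFUNC_RECORD when no such rowtype can be determined.
 */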
switch (get_call_result_type(fcinfo, &resultTypeId, &tupdesc))
{
case TYPEFUNC_COMPOSITE:
/* got the expected result rowtype, now coerce it */
coerce_function_result_tuple(&estate, tupdesc);
break;
case TYPEFUNC_COMPOSITE_DOMAIN:
/* got the expected result rowtype, now coerce it */
coerce_function_result_tuple(&estate, tupdesc);
/* and check domain constraints */
/* XXX allowing caching here would be good, too */
domain_check(estate.retval, false, resultTypeId,
NULL, NULL);
break;
case TYPEFUNC_RECORD:
/*
* Failed to determine actual type of RECORD. We
* could raise an error here, but what this means in
* practice is that the caller is expecting any old
* generic rowtype, so we don't really need to be
* restrictive. Pass back the generated result as-is.
*/
estate.retval = SPI_datumTransfer(estate.retval,
false,
-1);
break;
default:
/* shouldn't get here if retistuple is true ... */
elog(ERROR, "return type must be a row type");
break;
}
}
}
else
{
/* Scalar case: use exec_cast_value */
estate.retval = exec_cast_value(&estate,
estate.retval,
&fcinfo->isnull,
estate.rettype,
-1,
func->fn_rettype,
-1);
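/*
 * For instance (illustration): if the function is declared RETURNS bigint
 * but the RETURN expression produced an int4, exec_cast_value performs
 * the coercion here; an unsupported coercion raises an error.
 */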
/*
* If the function's return type isn't by value, copy the value
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
* into upper executor memory context. However, if we have a R/W
* expanded datum, we can just transfer its ownership out to the
* upper executor context.
*/
if (!fcinfo->isnull && !func->fn_retbyval)
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
estate.retval = SPI_datumTransfer(estate.retval,
false,
func->fn_rettyplen);
}
}
else
{
/*
* We're returning a NULL, which normally requires no conversion work
* regardless of datatypes. But, if we are casting it to a domain
* return type, we'd better check that the domain's constraints pass.
*/
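/*
 * Hypothetical example (not from the original source):
 *     CREATE DOMAIN positive_int AS int NOT NULL CHECK (VALUE > 0);
 * A function declared RETURNS positive_int that executes RETURN NULL
 * fails this cast with a domain constraint violation.
 */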
if (func->fn_retisdomain)
estate.retval = exec_cast_value(&estate,
estate.retval,
&fcinfo->isnull,
estate.rettype,
-1,
func->fn_rettype,
-1);
}
estate.err_text = gettext_noop("during function exit");
/*
* Let the instrumentation plugin peek at this function
*/
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_end)
((*plpgsql_plugin_ptr)->func_end) (&estate, func);
/* Clean up any leftover temporary memory */
plpgsql_destroy_econtext(&estate);
exec_eval_cleanup(&estate);
/* stmt_mcontext will be destroyed when function's main context is */
/*
* Pop the error context stack
*/
error_context_stack = plerrcontext.previous;
/*
* Return the function's result
*/
return estate.retval;
}
/*
* Helper for plpgsql_exec_function: coerce composite result to the specified
* tuple descriptor, and copy it out to upper executor memory. This is split
* out mostly for cosmetic reasons --- the logic would be very deeply nested
* otherwise.
*
* estate->retval is updated in-place.
*/
static void
coerce_function_result_tuple(PLpgSQL_execstate *estate, TupleDesc tupdesc)
{
HeapTuple rettup;
TupleDesc retdesc;
TupleConversionMap *tupmap;
/* We assume exec_stmt_return verified that result is composite */
Assert(type_is_rowtype(estate->rettype));
/* We can special-case expanded records for speed */
if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(estate->retval)))
{
ExpandedRecordHeader *erh = (ExpandedRecordHeader *) DatumGetEOHP(estate->retval);
Assert(erh->er_magic == ER_MAGIC);
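/*
 * An expanded record is the in-memory form PL/pgSQL uses for composite
 * variables; the branches below avoid flattening it to a plain tuple
 * whenever the caller's expectations permit.
 */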
/* Extract record's TupleDesc */
retdesc = expanded_record_get_tupdesc(erh);
/* check rowtype compatibility */
tupmap = convert_tuples_by_position(retdesc,
tupdesc,
gettext_noop("returned record type does not match expected record type"));
/* it might need conversion */
if (tupmap)
{
rettup = expanded_record_get_tuple(erh);
Assert(rettup);
rettup = execute_attr_map_tuple(rettup, tupmap);
/*
* Copy tuple to upper executor memory, as a tuple Datum. Make
* sure it is labeled with the caller-supplied tuple type.
*/
estate->retval = PointerGetDatum(SPI_returntuple(rettup, tupdesc));
/* no need to free map, we're about to return anyway */
}
else if (!(tupdesc->tdtypeid == erh->er_decltypeid ||
(tupdesc->tdtypeid == RECORDOID &&
!ExpandedRecordIsDomain(erh))))
{
/*
* The expanded record has the right physical tupdesc, but the
* wrong type ID. (Typically, the expanded record is RECORDOID
* but the function is declared to return a named composite type.
* As in exec_move_row_from_datum, we don't allow returning a
* composite-domain record from a function declared to return
* RECORD.) So we must flatten the record to a tuple datum and
* overwrite its type fields with the right thing. spi.c doesn't
* provide any easy way to deal with this case, so we end up
* duplicating the guts of datumCopy() :-(
*/
Size resultsize;
HeapTupleHeader tuphdr;
resultsize = EOH_get_flat_size(&erh->hdr);
tuphdr = (HeapTupleHeader) SPI_palloc(resultsize);
EOH_flatten_into(&erh->hdr, (void *) tuphdr, resultsize);
HeapTupleHeaderSetTypeId(tuphdr, tupdesc->tdtypeid);
HeapTupleHeaderSetTypMod(tuphdr, tupdesc->tdtypmod);
estate->retval = PointerGetDatum(tuphdr);
}
else
{
/*
* We need only copy result into upper executor memory context.
* However, if we have a R/W expanded datum, we can just transfer
* its ownership out to the upper executor context.
*/
estate->retval = SPI_datumTransfer(estate->retval,
false,
-1);
}
}
else
{
/* Convert composite datum to a HeapTuple and TupleDesc */
HeapTupleData tmptup;
retdesc = deconstruct_composite_datum(estate->retval, &tmptup);
rettup = &tmptup;
/* check rowtype compatibility */
tupmap = convert_tuples_by_position(retdesc,
tupdesc,
gettext_noop("returned record type does not match expected record type"));
/* it might need conversion */
if (tupmap)
rettup = execute_attr_map_tuple(rettup, tupmap);
/*
* Copy tuple to upper executor memory, as a tuple Datum. Make sure
* it is labeled with the caller-supplied tuple type.
*/
estate->retval = PointerGetDatum(SPI_returntuple(rettup, tupdesc));
/* no need to free map, we're about to return anyway */
ReleaseTupleDesc(retdesc);
}
}
/* ----------
* plpgsql_exec_trigger Called by the call handler for
* trigger execution.
* ----------
*/
HeapTuple
plpgsql_exec_trigger(PLpgSQL_function *func,
TriggerData *trigdata)
{
PLpgSQL_execstate estate;
ErrorContextCallback plerrcontext;
int rc;
TupleDesc tupdesc;
PLpgSQL_rec *rec_new,
*rec_old;
HeapTuple rettup;
/*
* Setup the execution state
*/
plpgsql_estate_setup(&estate, func, NULL, NULL, NULL);
estate.trigdata = trigdata;
/*
* Setup error traceback support for ereport()
*/
plerrcontext.callback = plpgsql_exec_error_callback;
plerrcontext.arg = &estate;
plerrcontext.previous = error_context_stack;
error_context_stack = &plerrcontext;
/*
* Make local execution copies of all the datums
*/
estate.err_text = gettext_noop("during initialization of execution state");
copy_plpgsql_datums(&estate, func);
/*
* Put the OLD and NEW tuples into record variables
*
* We set up expanded records for both variables even though only one may
* have a value. This allows record references to succeed in functions
* that are used for multiple trigger types. For example, we might have a
* test like "if (TG_OP = 'INSERT' and NEW.foo = 'xyz')", which should
* work regardless of the current trigger type. If a value is actually
* fetched from an unsupplied tuple, it will read as NULL.
*/
tupdesc = RelationGetDescr(trigdata->tg_relation);
rec_new = (PLpgSQL_rec *) (estate.datums[func->new_varno]);
rec_old = (PLpgSQL_rec *) (estate.datums[func->old_varno]);
rec_new->erh = make_expanded_record_from_tupdesc(tupdesc,
estate.datum_context);
rec_old->erh = make_expanded_record_from_exprecord(rec_new->erh,
estate.datum_context);
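/*
 * Install the row(s) that fired the trigger into the NEW and/or OLD
 * record variables, as appropriate for the trigger event.
 */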
if (!TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
{
/*
* Per-statement triggers don't use OLD/NEW variables
*/
}
else if (TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
{
expanded_record_set_tuple(rec_new->erh, trigdata->tg_trigtuple,
false, false);
}
else if (TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event))
{
expanded_record_set_tuple(rec_new->erh, trigdata->tg_newtuple,
false, false);
expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple,
false, false);
/*
 * In a BEFORE trigger, stored generated columns are not computed yet,
 * so make them null in the NEW row.  (This is only needed in the UPDATE
 * branch; in the INSERT case they are already null, but in UPDATE the
 * field still contains the old value.)  Alternatively, we could
 * construct a whole new row structure without the generated columns,
 * but this way seems more efficient and potentially less confusing.
 */
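/*
 * For example (hypothetical table): given
 *   CREATE TABLE t (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
 * a BEFORE UPDATE trigger on t will see NEW.b as NULL after this step;
 * the stored value is recomputed only after the BEFORE triggers have run.
 */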
if (tupdesc->constr && tupdesc->constr->has_generated_stored &&
TRIGGER_FIRED_BEFORE(trigdata->tg_event))
{
for (int i = 0; i < tupdesc->natts; i++)
if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
expanded_record_set_field_internal(rec_new->erh,
i + 1,
(Datum) 0,
true, /* isnull */
false, false);
}
}
else if (TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
{
expanded_record_set_tuple(rec_old->erh, trigdata->tg_trigtuple,
false, false);
}
else
elog(ERROR, "unrecognized trigger action: not INSERT, DELETE, or UPDATE");
/* Make transition tables visible to this SPI connection */
rc = SPI_register_trigger_data(trigdata);
Assert(rc >= 0);
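/*
 * E.g., for a trigger declared with REFERENCING NEW TABLE AS newtab,
 * this makes "newtab" queryable from SQL statements in the function body.
 */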
estate.err_text = gettext_noop("during function entry");
/*
* Set the magic variable FOUND to false
*/
exec_set_found(&estate, false);
/*
* Let the instrumentation plugin peek at this function
*/
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_beg)
((*plpgsql_plugin_ptr)->func_beg) (&estate, func);
/*
* Now call the toplevel block of statements
*/
estate.err_text = NULL;
rc = exec_toplevel_block(&estate, func->action);
if (rc != PLPGSQL_RC_RETURN)
{
estate.err_text = NULL;
ereport(ERROR,
(errcode(ERRCODE_S_R_E_FUNCTION_EXECUTED_NO_RETURN_STATEMENT),
errmsg("control reached end of trigger procedure without RETURN")));
}
estate.err_text = gettext_noop("during function exit");
if (estate.retisset)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("trigger procedure cannot return a set")));
/*
 * Check that the returned tuple structure has the same attributes as the
 * relation that fired the trigger.  A per-statement trigger always needs
 * to return NULL, so we ignore any return value the function itself
 * produces (XXX: is this a good idea?)
 *
 * XXX This way it is possible that the trigger returns a tuple whose
 * attributes don't have the correct atttypmod lengths.  It's up to the
 * trigger programmer to ensure that this doesn't happen.  Jan
 */
if (estate.retisnull || !TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
rettup = NULL;
else
{
TupleDesc retdesc;
TupleConversionMap *tupmap;
/* We assume exec_stmt_return verified that result is composite */
Assert(type_is_rowtype(estate.rettype));
/* We can special-case expanded records for speed */
if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(estate.retval)))
{
ExpandedRecordHeader *erh = (ExpandedRecordHeader *) DatumGetEOHP(estate.retval);
Assert(erh->er_magic == ER_MAGIC);
/* Extract HeapTuple and TupleDesc */
rettup = expanded_record_get_tuple(erh);
Assert(rettup);
retdesc = expanded_record_get_tupdesc(erh);
if (retdesc != RelationGetDescr(trigdata->tg_relation))
{
/* check rowtype compatibility */
tupmap = convert_tuples_by_position(retdesc,
RelationGetDescr(trigdata->tg_relation),
gettext_noop("returned row structure does not match the structure of the triggering table"));
/* it might need conversion */
if (tupmap)
rettup = execute_attr_map_tuple(rettup, tupmap);
/* no need to free map, we're about to return anyway */
}
/*
 * Copy tuple to upper executor memory.  But if the user just did
 * "return new" or "return old" without changing anything, there's
 * no need to copy; we can return the original tuple (which will
 * save a few cycles in trigger.c as well as here).
 */
if (rettup != trigdata->tg_newtuple &&
rettup != trigdata->tg_trigtuple)
rettup = SPI_copytuple(rettup);
}
else
{
/* Convert composite datum to a HeapTuple and TupleDesc */
HeapTupleData tmptup;
retdesc = deconstruct_composite_datum(estate.retval, &tmptup);
rettup = &tmptup;
/* check rowtype compatibility */
tupmap = convert_tuples_by_position(retdesc,
RelationGetDescr(trigdata->tg_relation),
gettext_noop("returned row structure does not match the structure of the triggering table"));
/* it might need conversion */
if (tupmap)
rettup = execute_attr_map_tuple(rettup, tupmap);
ReleaseTupleDesc(retdesc);
/* no need to free map, we're about to return anyway */
/* Copy tuple to upper executor memory */
rettup = SPI_copytuple(rettup);
}
}
/*
* Let the instrumentation plugin peek at this function
*/
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_end)
((*plpgsql_plugin_ptr)->func_end) (&estate, func);
/* Clean up any leftover temporary memory */
plpgsql_destroy_econtext(&estate);
exec_eval_cleanup(&estate);
/* stmt_mcontext will be destroyed when function's main context is */
/*
* Pop the error context stack
*/
error_context_stack = plerrcontext.previous;
/*
* Return the trigger's result
*/
return rettup;
}
/* ----------
* plpgsql_exec_event_trigger Called by the call handler for
* event trigger execution.
* ----------
*/
void
plpgsql_exec_event_trigger(PLpgSQL_function *func, EventTriggerData *trigdata)
{
PLpgSQL_execstate estate;
ErrorContextCallback plerrcontext;
int rc;
/*
 * Set up the execution state
*/
plpgsql_estate_setup(&estate, func, NULL, NULL, NULL);
estate.evtrigdata = trigdata;
/*
 * Set up error traceback support for ereport()
*/
plerrcontext.callback = plpgsql_exec_error_callback;
plerrcontext.arg = &estate;
plerrcontext.previous = error_context_stack;
error_context_stack = &plerrcontext;
/*
* Make local execution copies of all the datums
*/
estate.err_text = gettext_noop("during initialization of execution state");
copy_plpgsql_datums(&estate, func);
/*
* Let the instrumentation plugin peek at this function
*/
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_beg)
((*plpgsql_plugin_ptr)->func_beg) (&estate, func);
/*
* Now call the toplevel block of statements
*/
estate.err_text = NULL;
rc = exec_toplevel_block(&estate, func->action);
if (rc != PLPGSQL_RC_RETURN)
{
estate.err_text = NULL;
ereport(ERROR,
(errcode(ERRCODE_S_R_E_FUNCTION_EXECUTED_NO_RETURN_STATEMENT),
errmsg("control reached end of trigger procedure without RETURN")));
}
estate.err_text = gettext_noop("during function exit");
/*
* Let the instrumentation plugin peek at this function
*/
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->func_end)
((*plpgsql_plugin_ptr)->func_end) (&estate, func);
/* Clean up any leftover temporary memory */
plpgsql_destroy_econtext(&estate);
exec_eval_cleanup(&estate);
/* stmt_mcontext will be destroyed when function's main context is */
/*
* Pop the error context stack
*/
error_context_stack = plerrcontext.previous;
}
/*
* error context callback to let us supply a call-stack traceback
*/
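/*
 * For example, a RAISE at line 3 of a function would produce a context
 * line like (illustrative):
 *   CONTEXT:  PL/pgSQL function myfunc(integer) line 3 at RAISE
 */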
static void
plpgsql_exec_error_callback(void *arg)
{
PLpgSQL_execstate *estate = (PLpgSQL_execstate *) arg;
int err_lineno;
/*
* If err_var is set, report the variable's declaration line number.
* Otherwise, if err_stmt is set, report the err_stmt's line number. When
* err_stmt is not set, we're in function entry/exit, or some such place
* not attached to a specific line number.
*/
if (estate->err_var != NULL)
err_lineno = estate->err_var->lineno;
else if (estate->err_stmt != NULL)
err_lineno = estate->err_stmt->lineno;
else
err_lineno = 0;
if (estate->err_text != NULL)
{
/*
* We don't expend the cycles to run gettext() on err_text unless we
* actually need it. Therefore, places that set up err_text should
* use gettext_noop() to ensure the strings get recorded in the
* message dictionary.
*/
if (err_lineno > 0)
{
/*
* translator: last %s is a phrase such as "during statement block
* local variable initialization"
*/
errcontext("PL/pgSQL function %s line %d %s",
estate->func->fn_signature,
err_lineno,
_(estate->err_text));
}
else
{
/*
* translator: last %s is a phrase such as "while storing call
* arguments into local variables"
*/
errcontext("PL/pgSQL function %s %s",
estate->func->fn_signature,
_(estate->err_text));
}
}
else if (estate->err_stmt != NULL && err_lineno > 0)
{
/* translator: last %s is a plpgsql statement type name */
errcontext("PL/pgSQL function %s line %d at %s",
estate->func->fn_signature,
err_lineno,
plpgsql_stmt_typename(estate->err_stmt));
}
else
errcontext("PL/pgSQL function %s",
estate->func->fn_signature);
}
/* ----------
* Support function for initializing local execution variables
* ----------
*/
static void
copy_plpgsql_datums(PLpgSQL_execstate *estate,
PLpgSQL_function *func)
{
int ndatums = estate->ndatums;
PLpgSQL_datum **indatums;
PLpgSQL_datum **outdatums;
char *workspace;
char *ws_next;
int i;
/* Allocate local datum-pointer array */
estate->datums = (PLpgSQL_datum **)
palloc(sizeof(PLpgSQL_datum *) * ndatums);
/*
* To reduce palloc overhead, we make a single palloc request for all the
* space needed for locally-instantiated datums.
*/
workspace = palloc(func->copiable_size);
ws_next = workspace;
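/*
 * The workspace is consumed sequentially: each copiable datum is copied
 * to ws_next, which then advances by the MAXALIGN'ed size of that
 * datum's struct (see the switch below).
 */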
/* Fill datum-pointer array, copying datums into workspace as needed */
indatums = func->datums;
outdatums = estate->datums;
for (i = 0; i < ndatums; i++)
{
PLpgSQL_datum *indatum = indatums[i];
PLpgSQL_datum *outdatum;
/* This must agree with plpgsql_finish_datums on what is copiable */
switch (indatum->dtype)
{
case PLPGSQL_DTYPE_VAR:
case PLPGSQL_DTYPE_PROMISE:
outdatum = (PLpgSQL_datum *) ws_next;
memcpy(outdatum, indatum, sizeof(PLpgSQL_var));
ws_next += MAXALIGN(sizeof(PLpgSQL_var));
break;
case PLPGSQL_DTYPE_REC:
outdatum = (PLpgSQL_datum *) ws_next;
memcpy(outdatum, indatum, sizeof(PLpgSQL_rec));
ws_next += MAXALIGN(sizeof(PLpgSQL_rec));
break;
case PLPGSQL_DTYPE_ROW:
case PLPGSQL_DTYPE_RECFIELD:
/*
* These datum records are read-only at runtime, so no need to
* copy them (well, RECFIELD contains cached data, but we'd
* just as soon centralize the caching anyway).
*/
outdatum = indatum;
break;
default:
elog(ERROR, "unrecognized dtype: %d", indatum->dtype);
outdatum = NULL; /* keep compiler quiet */
break;
}
outdatums[i] = outdatum;
}
Assert(ws_next == workspace + func->copiable_size);
}
/*
* If the variable has an armed "promise", compute the promised value
* and assign it to the variable.
* The assignment automatically disarms the promise.
*/
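/*
 * For example, the first reference to TG_OP inside a trigger body is
 * resolved here, assigning "INSERT", "UPDATE", "DELETE", or "TRUNCATE"
 * to the variable; later reads reuse the stored value.
 */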
static void
plpgsql_fulfill_promise(PLpgSQL_execstate *estate,
PLpgSQL_var *var)
{
MemoryContext oldcontext;
if (var->promise == PLPGSQL_PROMISE_NONE)
return; /* nothing to do */
/*
* This will typically be invoked in a short-lived context such as the
* mcontext. We must create variable values in the estate's datum
* context. This quick-and-dirty solution risks leaking some additional
* cruft there, but since any one promise is honored at most once per
* function call, it's probably not worth being more careful.
*/
oldcontext = MemoryContextSwitchTo(estate->datum_context);
switch (var->promise)
{
case PLPGSQL_PROMISE_TG_NAME:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
assign_simple_var(estate, var,
DirectFunctionCall1(namein,
CStringGetDatum(estate->trigdata->tg_trigger->tgname)),
false, true);
break;
case PLPGSQL_PROMISE_TG_WHEN:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
if (TRIGGER_FIRED_BEFORE(estate->trigdata->tg_event))
assign_text_var(estate, var, "BEFORE");
else if (TRIGGER_FIRED_AFTER(estate->trigdata->tg_event))
assign_text_var(estate, var, "AFTER");
else if (TRIGGER_FIRED_INSTEAD(estate->trigdata->tg_event))
assign_text_var(estate, var, "INSTEAD OF");
else
elog(ERROR, "unrecognized trigger execution time: not BEFORE, AFTER, or INSTEAD OF");
break;
case PLPGSQL_PROMISE_TG_LEVEL:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
if (TRIGGER_FIRED_FOR_ROW(estate->trigdata->tg_event))
assign_text_var(estate, var, "ROW");
else if (TRIGGER_FIRED_FOR_STATEMENT(estate->trigdata->tg_event))
assign_text_var(estate, var, "STATEMENT");
else
elog(ERROR, "unrecognized trigger event type: not ROW or STATEMENT");
break;
case PLPGSQL_PROMISE_TG_OP:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
if (TRIGGER_FIRED_BY_INSERT(estate->trigdata->tg_event))
assign_text_var(estate, var, "INSERT");
else if (TRIGGER_FIRED_BY_UPDATE(estate->trigdata->tg_event))
assign_text_var(estate, var, "UPDATE");
else if (TRIGGER_FIRED_BY_DELETE(estate->trigdata->tg_event))
assign_text_var(estate, var, "DELETE");
else if (TRIGGER_FIRED_BY_TRUNCATE(estate->trigdata->tg_event))
assign_text_var(estate, var, "TRUNCATE");
else
elog(ERROR, "unrecognized trigger action: not INSERT, DELETE, UPDATE, or TRUNCATE");
break;
case PLPGSQL_PROMISE_TG_RELID:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
assign_simple_var(estate, var,
ObjectIdGetDatum(estate->trigdata->tg_relation->rd_id),
false, false);
break;
case PLPGSQL_PROMISE_TG_TABLE_NAME:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
assign_simple_var(estate, var,
DirectFunctionCall1(namein,
CStringGetDatum(RelationGetRelationName(estate->trigdata->tg_relation))),
false, true);
break;
case PLPGSQL_PROMISE_TG_TABLE_SCHEMA:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
assign_simple_var(estate, var,
DirectFunctionCall1(namein,
CStringGetDatum(get_namespace_name(RelationGetNamespace(estate->trigdata->tg_relation)))),
false, true);
break;
case PLPGSQL_PROMISE_TG_NARGS:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
assign_simple_var(estate, var,
Int16GetDatum(estate->trigdata->tg_trigger->tgnargs),
false, false);
break;
case PLPGSQL_PROMISE_TG_ARGV:
if (estate->trigdata == NULL)
elog(ERROR, "trigger promise is not in a trigger function");
if (estate->trigdata->tg_trigger->tgnargs > 0)
{
/*
* For historical reasons, tg_argv[] subscripts start at zero
* not one. So we can't use construct_array().
*/
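/*
 * E.g. for CREATE TRIGGER ... EXECUTE FUNCTION f('a', 'b'), this
 * yields a zero-based text array with TG_ARGV[0] = 'a' and
 * TG_ARGV[1] = 'b'.
 */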
int nelems = estate->trigdata->tg_trigger->tgnargs;
Datum *elems;
int dims[1];
int lbs[1];
int i;
elems = palloc(sizeof(Datum) * nelems);
for (i = 0; i < nelems; i++)
elems[i] = CStringGetTextDatum(estate->trigdata->tg_trigger->tgargs[i]);
dims[0] = nelems;
lbs[0] = 0;
assign_simple_var(estate, var,
PointerGetDatum(construct_md_array(elems, NULL,
1, dims, lbs,
TEXTOID,
-1, false, TYPALIGN_INT)),
false, true);
}
else
{
assign_simple_var(estate, var, (Datum) 0, true, false);
}
break;
case PLPGSQL_PROMISE_TG_EVENT:
if (estate->evtrigdata == NULL)
elog(ERROR, "event trigger promise is not in an event trigger function");
assign_text_var(estate, var, estate->evtrigdata->event);
break;
case PLPGSQL_PROMISE_TG_TAG:
if (estate->evtrigdata == NULL)
elog(ERROR, "event trigger promise is not in an event trigger function");
assign_text_var(estate, var, GetCommandTagName(estate->evtrigdata->tag));
break;
default:
elog(ERROR, "unrecognized promise type: %d", var->promise);
}
MemoryContextSwitchTo(oldcontext);
}
/*
* Create a memory context for statement-lifespan variables, if we don't
* have one already. It will be a child of stmt_mcontext_parent, which is
* either the function's main context or a pushed-down outer stmt_mcontext.
*/
static MemoryContext
get_stmt_mcontext(PLpgSQL_execstate *estate)
{
if (estate->stmt_mcontext == NULL)
{
estate->stmt_mcontext =
AllocSetContextCreate(estate->stmt_mcontext_parent,
"PLpgSQL per-statement data",
ALLOCSET_DEFAULT_SIZES);
}
return estate->stmt_mcontext;
}
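/*
 * A minimal usage sketch (hypothetical caller, not part of this file):
 * a statement handler that needs statement-lifespan workspace allocates it
 * in this context, then resets the context when the statement completes:
 *
 *		MemoryContext stmt_mcontext = get_stmt_mcontext(estate);
 *		char	   *buf = MemoryContextAlloc(stmt_mcontext, len);
 *
 *		... use buf while executing the statement ...
 *		MemoryContextReset(stmt_mcontext);
 */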
/*
* Push down the current stmt_mcontext so that called statements won't use it.
* This is needed by statements that have statement-lifespan data and need to
* preserve it across some inner statements. The caller should eventually do
* pop_stmt_mcontext().
*/
static void
push_stmt_mcontext(PLpgSQL_execstate *estate)
{
/* Should have done get_stmt_mcontext() first */
Assert(estate->stmt_mcontext != NULL);
/* Assert we've not messed up the stack linkage */
Assert(MemoryContextGetParent(estate->stmt_mcontext) == estate->stmt_mcontext_parent);
/* Push it down to become the parent of any nested stmt mcontext */
estate->stmt_mcontext_parent = estate->stmt_mcontext;
/* And make it not available for use directly */
estate->stmt_mcontext = NULL;
}
/*
* Undo push_stmt_mcontext(). We assume this is done just before or after
* resetting the caller's stmt_mcontext; since that action will also delete
* any child contexts, there's no need to explicitly delete whatever context
* might currently be estate->stmt_mcontext.
*/
static void
pop_stmt_mcontext(PLpgSQL_execstate *estate)
{
/* We need only pop the stack */
estate->stmt_mcontext = estate->stmt_mcontext_parent;
estate->stmt_mcontext_parent = MemoryContextGetParent(estate->stmt_mcontext);
}
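/*
 * Sketch of the intended protocol (hypothetical compound-statement handler;
 * "inner_body" is a placeholder): data that must survive execution of inner
 * statements goes into our stmt_mcontext, which is then pushed down so the
 * inner statements allocate in a fresh child context instead:
 *
 *		MemoryContext stmt_mcontext = get_stmt_mcontext(estate);
 *
 *		... allocate this statement's long-lived data in stmt_mcontext ...
 *		push_stmt_mcontext(estate);
 *		rc = exec_stmts(estate, inner_body);
 *		pop_stmt_mcontext(estate);
 *		MemoryContextReset(stmt_mcontext);
 */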
/*
* Subroutine for exec_stmt_block: does any condition in the condition list
* match the current exception?
*/
static bool
exception_matches_conditions(ErrorData *edata, PLpgSQL_condition *cond)
{
for (; cond != NULL; cond = cond->next)
{
int sqlerrstate = cond->sqlerrstate;
/*
* OTHERS matches everything *except* query-canceled and
* assert-failure. If you're foolish enough, you can match those
* explicitly.
*/
if (sqlerrstate == 0)
{
if (edata->sqlerrcode != ERRCODE_QUERY_CANCELED &&
edata->sqlerrcode != ERRCODE_ASSERT_FAILURE)
return true;
}
/* Exact match? */
else if (edata->sqlerrcode == sqlerrstate)
return true;
/* Category match? */
else if (ERRCODE_IS_CATEGORY(sqlerrstate) &&
ERRCODE_TO_CATEGORY(edata->sqlerrcode) == sqlerrstate)
return true;
}
return false;
}
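/*
 * Illustrative example of these matching rules: given handlers declared as
 *
 *		EXCEPTION
 *			WHEN division_by_zero THEN ...	-- exact match on SQLSTATE 22012
 *			WHEN data_exception THEN ...	-- category match on class 22
 *			WHEN OTHERS THEN ...
 *
 * an error with SQLSTATE 22012 selects the first arm and 22003 the second,
 * while query_canceled (57014) and assert_failure (P0004) are caught only
 * by naming them explicitly, never by OTHERS.
 */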
/* ----------
* exec_toplevel_block Execute the toplevel block
*
* This is intentionally equivalent to executing exec_stmts() with a
* list consisting of the one statement. One tiny difference is that
* we do not bother to save the entry value of estate->err_stmt;
* that's assumed to be NULL.
* ----------
*/
static int
exec_toplevel_block(PLpgSQL_execstate *estate, PLpgSQL_stmt_block *block)
{
int rc;
estate->err_stmt = (PLpgSQL_stmt *) block;
/* Let the plugin know that we are about to execute this statement */
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg)
((*plpgsql_plugin_ptr)->stmt_beg) (estate, (PLpgSQL_stmt *) block);
CHECK_FOR_INTERRUPTS();
rc = exec_stmt_block(estate, block);
/* Let the plugin know that we have finished executing this statement */
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_end)
((*plpgsql_plugin_ptr)->stmt_end) (estate, (PLpgSQL_stmt *) block);
estate->err_stmt = NULL;
return rc;
}
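/*
 * A debugger or profiler can observe the stmt_beg/stmt_end calls above by
 * installing a plugin. Rough sketch (hypothetical plugin struct "my_plugin";
 * assumes the usual rendezvous-variable mechanism):
 *
 *		PLpgSQL_plugin **plugin_ptr = (PLpgSQL_plugin **)
 *			find_rendezvous_variable("PLpgSQL_plugin");
 *
 *		*plugin_ptr = &my_plugin;	(my_plugin supplies stmt_beg/stmt_end)
 */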
/* ----------
* exec_stmt_block Execute a block of statements
* ----------
*/
static int
exec_stmt_block(PLpgSQL_execstate *estate, PLpgSQL_stmt_block *block)
{
volatile int rc = -1;
int i;
/*
* First initialize all variables declared in this block
*/
estate->err_text = gettext_noop("during statement block local variable initialization");
for (i = 0; i < block->n_initvars; i++)
{
int n = block->initvarnos[i];
PLpgSQL_datum *datum = estate->datums[n];
/*
* The set of dtypes handled here must match plpgsql_add_initdatums().
*
* Note that we currently don't support promise datums within blocks,
* only at a function's outermost scope, so we needn't handle those
* here.
*
* Since RECFIELD isn't a supported case either, it's okay to cast the
* PLpgSQL_datum to PLpgSQL_variable.
*/
estate->err_var = (PLpgSQL_variable *) datum;
switch (datum->dtype)
{
case PLPGSQL_DTYPE_VAR:
{
PLpgSQL_var *var = (PLpgSQL_var *) datum;
/*
* Free any old value, in case re-entering block, and
* initialize to NULL
*/
assign_simple_var(estate, var, (Datum) 0, true, false);
if (var->default_val == NULL)
{
/*
* If needed, give the datatype a chance to reject
* NULLs, by assigning a NULL to the variable. We
* claim the value is of type UNKNOWN, not the var's
* datatype, else coercion will be skipped.
*/
if (var->datatype->typtype == TYPTYPE_DOMAIN)
exec_assign_value(estate,
(PLpgSQL_datum *) var,
(Datum) 0,
true,
UNKNOWNOID,
-1);
/* parser should have rejected NOT NULL */
Assert(!var->notnull);
}
else
{
exec_assign_expr(estate, (PLpgSQL_datum *) var,
var->default_val);
}
}
break;
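/*
 * Illustrative example of the domain case above: given a hypothetical
 *
 *		CREATE DOMAIN dnn AS int NOT NULL;
 *
 * a block declaring "v dnn;" with no DEFAULT fails at block entry,
 * because the NULL assignment above gives the domain's NOT NULL
 * constraint its chance to reject the value.
 */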
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) datum;
/*
* Deletion of any existing object will be handled during
* the assignments below, and in some cases it's more
* efficient for us not to get rid of it beforehand.
*/
if (rec->default_val == NULL)
{
/*
* If needed, give the datatype a chance to reject
* NULLs, by assigning a NULL to the variable.
*/
exec_move_row(estate, (PLpgSQL_variable *) rec,
NULL, NULL);
/* parser should have rejected NOT NULL */
Assert(!rec->notnull);
}
else
{
exec_assign_expr(estate, (PLpgSQL_datum *) rec,
rec->default_val);
}
}
break;
default:
elog(ERROR, "unrecognized dtype: %d", datum->dtype);
}
}
estate->err_var = NULL;
if (block->exceptions)
{
/*
* Execute the statements in the block's body inside a sub-transaction
*/
MemoryContext oldcontext = CurrentMemoryContext;
ResourceOwner oldowner = CurrentResourceOwner;
ExprContext *old_eval_econtext = estate->eval_econtext;
ErrorData *save_cur_error = estate->cur_error;
MemoryContext stmt_mcontext;
estate->err_text = gettext_noop("during statement block entry");
/*
* We will need a stmt_mcontext to hold the error data if an error
* occurs. It seems best to force it to exist before entering the
* subtransaction, so that we reduce the risk of out-of-memory during
* error recovery, and because this greatly simplifies restoring the
* stmt_mcontext stack to the correct state after an error. We can
* ameliorate the cost of this by allowing the called statements to
* use this mcontext too; so we don't push it down here.
*/
stmt_mcontext = get_stmt_mcontext(estate);
BeginInternalSubTransaction(NULL);
/* Want to run statements inside function's memory context */
MemoryContextSwitchTo(oldcontext);
PG_TRY();
{
/*
* We need to run the block's statements with a new eval_econtext
* that belongs to the current subtransaction; if we try to use
* the outer econtext then ExprContext shutdown callbacks will be
* called at the wrong times.
*/
plpgsql_create_econtext(estate);
estate->err_text = NULL;
/* Run the block's statements */
rc = exec_stmts(estate, block->body);
estate->err_text = gettext_noop("during statement block exit");
/*
* If the block ended with RETURN, we may need to copy the return
* value out of the subtransaction eval_context. We can avoid a
* physical copy if the value happens to be a R/W expanded object.
*/
if (rc == PLPGSQL_RC_RETURN &&
!estate->retisset &&
!estate->retisnull)
{
int16 resTypLen;
bool resTypByVal;
get_typlenbyval(estate->rettype, &resTypLen, &resTypByVal);
estate->retval = datumTransfer(estate->retval,
resTypByVal, resTypLen);
}
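/*
 * Illustrative case needing that copy (sketch): in
 *
 *		BEGIN
 *			RETURN lower(sometext);
 *		EXCEPTION WHEN OTHERS THEN ...
 *		END;
 *
 * the computed result lives in the subtransaction's eval_econtext and
 * would be freed at subxact exit if not transferred first.
 */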
/* Commit the inner transaction, return to outer xact context */
ReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
/* Assert that the stmt_mcontext stack is unchanged */
Assert(stmt_mcontext == estate->stmt_mcontext);
/*
* Revert to outer eval_econtext. (The inner one was
* automatically cleaned up during subxact exit.)
*/
estate->eval_econtext = old_eval_econtext;
}
PG_CATCH();
{
ErrorData *edata;
ListCell *e;
estate->err_text = gettext_noop("during exception cleanup");
/* Save error info in our stmt_mcontext */
MemoryContextSwitchTo(stmt_mcontext);
edata = CopyErrorData();
FlushErrorState();
/* Abort the inner transaction */
RollbackAndReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
/*
* Set up the stmt_mcontext stack as though we had restored our
* previous state and then done push_stmt_mcontext(). The push is
* needed so that statements in the exception handler won't
* clobber the error data that's in our stmt_mcontext.
*/
estate->stmt_mcontext_parent = stmt_mcontext;
estate->stmt_mcontext = NULL;
/*
* Now we can delete any nested stmt_mcontexts that might have
* been created as children of ours. (Note: we do not immediately
* release any statement-lifespan data that might have been left
* behind in stmt_mcontext itself. We could attempt that by doing
* a MemoryContextReset on it before collecting the error data
* above, but it seems too risky to do any significant amount of
* work before collecting the error.)
*/
MemoryContextDeleteChildren(stmt_mcontext);
/* Revert to outer eval_econtext */
estate->eval_econtext = old_eval_econtext;
/*
* Must clean up the econtext too. However, any tuple table made
* in the subxact will have been thrown away by SPI during subxact
* abort, so we don't need to (and mustn't try to) free the
* eval_tuptable.
*/
estate->eval_tuptable = NULL;
exec_eval_cleanup(estate);
/* Look for a matching exception handler */
foreach(e, block->exceptions->exc_list)
{
PLpgSQL_exception *exception = (PLpgSQL_exception *) lfirst(e);
if (exception_matches_conditions(edata, exception->conditions))
{
/*
* Initialize the magic SQLSTATE and SQLERRM variables for
* the exception block; this also frees values from any
* prior use of the same exception. We needn't do this
* until we have found a matching exception.
*/
PLpgSQL_var *state_var;
PLpgSQL_var *errm_var;
state_var = (PLpgSQL_var *)
estate->datums[block->exceptions->sqlstate_varno];
errm_var = (PLpgSQL_var *)
estate->datums[block->exceptions->sqlerrm_varno];
assign_text_var(estate, state_var,
unpack_sql_state(edata->sqlerrcode));
assign_text_var(estate, errm_var, edata->message);
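/*
 * These are the values user code reads inside the handler, e.g.
 * (illustrative):
 *
 *		EXCEPTION WHEN division_by_zero THEN
 *			RAISE NOTICE 'caught %: %', SQLSTATE, SQLERRM;
 *
 * which would report "caught 22012: division by zero".
 */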
/*
* Also set up cur_error so the error data is accessible
* inside the handler.
*/
estate->cur_error = edata;
estate->err_text = NULL;
rc = exec_stmts(estate, exception->action);
break;
}
}
/*
* Restore previous state of cur_error, whether or not we executed
* a handler. This is needed in case an error got thrown from
* some inner block's exception handler.
*/
estate->cur_error = save_cur_error;
/* If no match found, re-throw the error */
if (e == NULL)
ReThrowError(edata);
/* Restore stmt_mcontext stack and release the error data */
pop_stmt_mcontext(estate);
MemoryContextReset(stmt_mcontext);
}
PG_END_TRY();
Assert(save_cur_error == estate->cur_error);
}
else
{
/*
* Just execute the statements in the block's body
*/
estate->err_text = NULL;
rc = exec_stmts(estate, block->body);
}
estate->err_text = NULL;
/*
* Handle the return code. This is intentionally different from
* LOOP_RC_PROCESSING(): CONTINUE never matches a block, and EXIT matches
* a block only if there is a label match.
*/
switch (rc)
{
case PLPGSQL_RC_OK:
case PLPGSQL_RC_RETURN:
case PLPGSQL_RC_CONTINUE:
return rc;
case PLPGSQL_RC_EXIT:
if (estate->exitlabel == NULL)
return PLPGSQL_RC_EXIT;
if (block->label == NULL)
return PLPGSQL_RC_EXIT;
if (strcmp(block->label, estate->exitlabel) != 0)
return PLPGSQL_RC_EXIT;
estate->exitlabel = NULL;
return PLPGSQL_RC_OK;
default:
elog(ERROR, "unrecognized rc: %d", rc);
}
return PLPGSQL_RC_OK;
}
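/*
 * Illustrative example of the EXIT rule above: in
 *
 *		<<outer>>
 *		BEGIN
 *			BEGIN
 *				EXIT outer;
 *			END;
 *			RAISE NOTICE 'not reached';
 *		END;
 *
 * the unlabeled inner block propagates PLPGSQL_RC_EXIT upward (no label
 * match), and the <<outer>> block consumes it and returns PLPGSQL_RC_OK.
 */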
/* ----------
* exec_stmts Iterate over a list of statements
* as long as their return code is OK
* ----------
*/
static int
exec_stmts(PLpgSQL_execstate *estate, List *stmts)
{
PLpgSQL_stmt *save_estmt = estate->err_stmt;
ListCell *s;
if (stmts == NIL)
{
/*
* Ensure we do a CHECK_FOR_INTERRUPTS() even though there is no
* statement. This prevents hangup in a tight loop if, for instance,
* there is a LOOP construct with an empty body.
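*
* A (hypothetical) body illustrating the case: the grammar discards NULL
* statements, so this loop compiles to an empty (NIL) statement list, and
* the check here is the only point at which an interrupt can be serviced:
*
*     LOOP
*         NULL;
*     END LOOP;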
*/
CHECK_FOR_INTERRUPTS();
return PLPGSQL_RC_OK;
}
foreach(s, stmts)
{
PLpgSQL_stmt *stmt = (PLpgSQL_stmt *) lfirst(s);
int rc;
estate->err_stmt = stmt;
/* Let the plugin know that we are about to execute this statement */
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg)
((*plpgsql_plugin_ptr)->stmt_beg) (estate, stmt);
CHECK_FOR_INTERRUPTS();
switch (stmt->cmd_type)
{
case PLPGSQL_STMT_BLOCK:
rc = exec_stmt_block(estate, (PLpgSQL_stmt_block *) stmt);
break;
case PLPGSQL_STMT_ASSIGN:
rc = exec_stmt_assign(estate, (PLpgSQL_stmt_assign *) stmt);
break;
case PLPGSQL_STMT_PERFORM:
rc = exec_stmt_perform(estate, (PLpgSQL_stmt_perform *) stmt);
break;
case PLPGSQL_STMT_CALL:
rc = exec_stmt_call(estate, (PLpgSQL_stmt_call *) stmt);
break;
case PLPGSQL_STMT_GETDIAG:
rc = exec_stmt_getdiag(estate, (PLpgSQL_stmt_getdiag *) stmt);
break;
case PLPGSQL_STMT_IF:
rc = exec_stmt_if(estate, (PLpgSQL_stmt_if *) stmt);
break;
case PLPGSQL_STMT_CASE:
rc = exec_stmt_case(estate, (PLpgSQL_stmt_case *) stmt);
break;
case PLPGSQL_STMT_LOOP:
rc = exec_stmt_loop(estate, (PLpgSQL_stmt_loop *) stmt);
break;
case PLPGSQL_STMT_WHILE:
rc = exec_stmt_while(estate, (PLpgSQL_stmt_while *) stmt);
break;
case PLPGSQL_STMT_FORI:
rc = exec_stmt_fori(estate, (PLpgSQL_stmt_fori *) stmt);
break;
case PLPGSQL_STMT_FORS:
rc = exec_stmt_fors(estate, (PLpgSQL_stmt_fors *) stmt);
break;
case PLPGSQL_STMT_FORC:
rc = exec_stmt_forc(estate, (PLpgSQL_stmt_forc *) stmt);
break;
case PLPGSQL_STMT_FOREACH_A:
rc = exec_stmt_foreach_a(estate, (PLpgSQL_stmt_foreach_a *) stmt);
break;
case PLPGSQL_STMT_EXIT:
rc = exec_stmt_exit(estate, (PLpgSQL_stmt_exit *) stmt);
break;
case PLPGSQL_STMT_RETURN:
rc = exec_stmt_return(estate, (PLpgSQL_stmt_return *) stmt);
break;
case PLPGSQL_STMT_RETURN_NEXT:
rc = exec_stmt_return_next(estate, (PLpgSQL_stmt_return_next *) stmt);
break;
case PLPGSQL_STMT_RETURN_QUERY:
rc = exec_stmt_return_query(estate, (PLpgSQL_stmt_return_query *) stmt);
break;
case PLPGSQL_STMT_RAISE:
rc = exec_stmt_raise(estate, (PLpgSQL_stmt_raise *) stmt);
break;
case PLPGSQL_STMT_ASSERT:
rc = exec_stmt_assert(estate, (PLpgSQL_stmt_assert *) stmt);
break;
case PLPGSQL_STMT_EXECSQL:
rc = exec_stmt_execsql(estate, (PLpgSQL_stmt_execsql *) stmt);
break;
case PLPGSQL_STMT_DYNEXECUTE:
rc = exec_stmt_dynexecute(estate, (PLpgSQL_stmt_dynexecute *) stmt);
break;
case PLPGSQL_STMT_DYNFORS:
rc = exec_stmt_dynfors(estate, (PLpgSQL_stmt_dynfors *) stmt);
break;
case PLPGSQL_STMT_OPEN:
rc = exec_stmt_open(estate, (PLpgSQL_stmt_open *) stmt);
break;
case PLPGSQL_STMT_FETCH:
rc = exec_stmt_fetch(estate, (PLpgSQL_stmt_fetch *) stmt);
break;
case PLPGSQL_STMT_CLOSE:
rc = exec_stmt_close(estate, (PLpgSQL_stmt_close *) stmt);
break;
case PLPGSQL_STMT_COMMIT:
rc = exec_stmt_commit(estate, (PLpgSQL_stmt_commit *) stmt);
break;
case PLPGSQL_STMT_ROLLBACK:
rc = exec_stmt_rollback(estate, (PLpgSQL_stmt_rollback *) stmt);
break;
default:
/* point err_stmt to parent, since this one seems corrupt */
estate->err_stmt = save_estmt;
elog(ERROR, "unrecognized cmd_type: %d", stmt->cmd_type);
rc = -1; /* keep compiler quiet */
}
/* Let the plugin know that we have finished executing this statement */
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_end)
((*plpgsql_plugin_ptr)->stmt_end) (estate, stmt);
if (rc != PLPGSQL_RC_OK)
{
estate->err_stmt = save_estmt;
return rc;
}
} /* end of loop over statements */
estate->err_stmt = save_estmt;
return PLPGSQL_RC_OK;
}
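/*
 * Illustrative sketch (not part of this file): an instrumentation plugin
 * can observe the stmt_beg/stmt_end hooks invoked from exec_stmts() above.
 * Registration goes through the "PLpgSQL_plugin" rendezvous variable; the
 * function and variable names below are hypothetical.
 *
 *     static void
 *     my_stmt_beg(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt)
 *     {
 *         elog(LOG, "plpgsql: entering statement at line %d", stmt->lineno);
 *     }
 *
 *     static PLpgSQL_plugin my_plugin = {.stmt_beg = my_stmt_beg};
 *
 *     void
 *     _PG_init(void)
 *     {
 *         PLpgSQL_plugin **plugin_ptr = (PLpgSQL_plugin **)
 *             find_rendezvous_variable("PLpgSQL_plugin");
 *
 *         *plugin_ptr = &my_plugin;
 *     }
 */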
/* ----------
* exec_stmt_assign Evaluate an expression and
* put the result into a variable.
* ----------
*/
static int
exec_stmt_assign(PLpgSQL_execstate *estate, PLpgSQL_stmt_assign *stmt)
{
Assert(stmt->varno >= 0);
exec_assign_expr(estate, estate->datums[stmt->varno], stmt->expr);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_perform Evaluate query and discard result (but set
* FOUND depending on whether at least one row
* was returned).
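*
* For example, in a (hypothetical) PL/pgSQL fragment:
*
*     PERFORM 1 FROM pg_class WHERE relname = 'pg_class';
*     IF FOUND THEN ... END IF;
*
* FOUND ends up true because the discarded query returned a row.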
* ----------
*/
static int
exec_stmt_perform(PLpgSQL_execstate *estate, PLpgSQL_stmt_perform *stmt)
{
PLpgSQL_expr *expr = stmt->expr;
(void) exec_run_select(estate, expr, 0, NULL);
exec_set_found(estate, (estate->eval_processed != 0));
exec_eval_cleanup(estate);
return PLPGSQL_RC_OK;
}
/*
* exec_stmt_call
*
* NOTE: this is used for both CALL and DO statements.
*/
static int
exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
{
PLpgSQL_expr *expr = stmt->expr;
LocalTransactionId before_lxid;
LocalTransactionId after_lxid;
ParamListInfo paramLI;
SPIExecuteOptions options;
int rc;
/*
* Make a plan if we don't have one already.
*/
if (expr->plan == NULL)
exec_prepare_plan(estate, expr, 0);
/*
* A CALL or DO can never be a simple expression.
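* (CallStmt and DoStmt are utility statements, so the simple-expression
* fast path used elsewhere in this file cannot apply to them.)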
*/
Assert(!expr->expr_simple_expr);
/*
* Also construct a DTYPE_ROW datum representing the plpgsql variables
* associated with the procedure's output arguments. Then we can use
* exec_move_row() to do the assignments.
*/
if (stmt->is_call && stmt->target == NULL)
stmt->target = make_callstmt_target(estate, expr);
paramLI = setup_param_list(estate, expr);
before_lxid = MyProc->lxid;
/*
* If we have a procedure-lifespan resowner, use that to hold the refcount
* for the plan. This avoids refcount leakage complaints if the called
* procedure ends the current transaction.
*
* Also, tell SPI to allow non-atomic execution.
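*
* A minimal sketch of the hazard (hypothetical procedure, not from the
* regression tests):
*
*     CREATE PROCEDURE p() LANGUAGE plpgsql AS $$
*     BEGIN
*         COMMIT;  -- ends the transaction in which the CALL began
*     END $$;
*
* If the plan's refcount were pinned by CurrentResourceOwner, that COMMIT
* would release it mid-execution; procedure_resowner outlives it.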
*/
memset(&options, 0, sizeof(options));
options.params = paramLI;
options.read_only = estate->readonly_func;
options.allow_nonatomic = true;
options.owner = estate->procedure_resowner;
rc = SPI_execute_plan_extended(expr->plan, &options);
if (rc < 0)
elog(ERROR, "SPI_execute_plan_extended failed executing query \"%s\": %s",
expr->query, SPI_result_code_string(rc));
after_lxid = MyProc->lxid;
if (before_lxid != after_lxid)
{
/*
* If we are in a new transaction after the call, we need to build new
* simple-expression infrastructure.
*/
estate->simple_eval_estate = NULL;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
estate->simple_eval_resowner = NULL;
plpgsql_create_econtext(estate);
}
/*
* Check result rowcount; if there's one row, assign procedure's output
* values back to the appropriate variables.
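* For instance (hypothetical), with "CREATE PROCEDURE p(INOUT x int) ..."
* a "CALL p(v);" in PL/pgSQL produces a one-row, one-column result that
* exec_move_row() assigns back into the variable v.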
*/
if (SPI_processed == 1)
{
SPITupleTable *tuptab = SPI_tuptable;
if (!stmt->is_call)
elog(ERROR, "DO statement returned a row");
        exec_move_row(estate, stmt->target, tuptab->vals[0], tuptab->tupdesc);
    }
    else if (SPI_processed > 1)
        elog(ERROR, "procedure call returned more than one row");

    exec_eval_cleanup(estate);
    SPI_freetuptable(SPI_tuptable);

    return PLPGSQL_RC_OK;
}

/*
 * We construct a DTYPE_ROW datum representing the plpgsql variables
 * associated with the procedure's output arguments.  Then we can use
 * exec_move_row() to do the assignments.
 */
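/*
 * Illustrative example (procedure and variable names are hypothetical,
 * not from this file): given
 *
 *     CREATE PROCEDURE p(IN a int, INOUT b int, OUT c int) ...
 *
 * a plpgsql caller writing "CALL p(1, x, y);" must supply plain plpgsql
 * variables for the INOUT/OUT positions.  We record x's and y's datum
 * numbers in the row, and exec_move_row() later assigns the procedure's
 * two output columns to them.  Supplying a non-variable expression in
 * such a position draws the "argument is not writable" error below.
 */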
static PLpgSQL_variable *
make_callstmt_target(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
{
    List       *plansources;
    CachedPlanSource *plansource;
    CallStmt   *stmt;
    FuncExpr   *funcexpr;
    HeapTuple   func_tuple;
    Oid        *argtypes;
    char      **argnames;
    char       *argmodes;
    int         numargs;
    MemoryContext oldcontext;
    PLpgSQL_row *row;
    int         nfields;
    int         i;

    /* Use eval_mcontext for any cruft accumulated here */
    oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));

    /*
     * Get the parsed CallStmt, and look up the called procedure
     */
    plansources = SPI_plan_get_plan_sources(expr->plan);
    if (list_length(plansources) != 1)
        elog(ERROR, "query for CALL statement is not a CallStmt");
    plansource = (CachedPlanSource *) linitial(plansources);
    if (list_length(plansource->query_list) != 1)
        elog(ERROR, "query for CALL statement is not a CallStmt");
    stmt = (CallStmt *) linitial_node(Query,
                                      plansource->query_list)->utilityStmt;
    if (stmt == NULL || !IsA(stmt, CallStmt))
elog(ERROR, "query for CALL statement is not a CallStmt");
    funcexpr = stmt->funcexpr;

    func_tuple = SearchSysCache1(PROCOID,
                                 ObjectIdGetDatum(funcexpr->funcid));
    if (!HeapTupleIsValid(func_tuple))
        elog(ERROR, "cache lookup failed for function %u",
             funcexpr->funcid);

    /*
     * Get the argument names and modes, so that we can deliver on-point error
     * messages when something is wrong.
     */
    numargs = get_func_arg_info(func_tuple, &argtypes, &argnames, &argmodes);
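
    /*
     * The arrays returned by get_func_arg_info() are palloc'd in the current
     * context (the eval_mcontext at this point), so they are reclaimed
     * automatically along with the rest of the per-evaluation cruft.
     */
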
    ReleaseSysCache(func_tuple);

    /*
     * Begin constructing row Datum; keep it in fn_cxt so it's adequately
     * long-lived.
     */
    MemoryContextSwitchTo(estate->func->fn_cxt);

    row = (PLpgSQL_row *) palloc0(sizeof(PLpgSQL_row));
    row->dtype = PLPGSQL_DTYPE_ROW;
    row->refname = "(unnamed row)";
    row->lineno = -1;
    row->varnos = (int *) palloc(numargs * sizeof(int));
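
    /*
     * varnos[] is sized for every argument, but only OUT/INOUT positions are
     * filled in below; nfields counts how many slots end up used.
     */
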
    MemoryContextSwitchTo(get_eval_mcontext(estate));

    /*
     * Examine procedure's argument list.  Each output arg position should be
     * an unadorned plpgsql variable (Datum), which we can insert into the row
     * Datum.
     */
    nfields = 0;
    for (i = 0; i < numargs; i++)
    {
        if (argmodes &&
            (argmodes[i] == PROARGMODE_INOUT ||
             argmodes[i] == PROARGMODE_OUT))
        {
            Node       *n = list_nth(stmt->outargs, nfields);

            if (IsA(n, Param))
            {
                Param      *param = (Param *) n;

                /* paramid is offset by 1 (see make_datum_param()) */
                row->varnos[nfields++] = param->paramid - 1;
            }
            else
            {
                /* report error using parameter name, if available */
                if (argnames && argnames[i] && argnames[i][0])
                    ereport(ERROR,
                            (errcode(ERRCODE_SYNTAX_ERROR),
                             errmsg("procedure parameter \"%s\" is an output parameter but corresponding argument is not writable",
                                    argnames[i])));
                else
                    ereport(ERROR,
                            (errcode(ERRCODE_SYNTAX_ERROR),
                             errmsg("procedure parameter %d is an output parameter but corresponding argument is not writable",
                                    i + 1)));
            }
        }
    }
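
    /*
     * The loop above must have consumed exactly the output arguments the
     * parser collected into stmt->outargs.
     */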
    Assert(nfields == list_length(stmt->outargs));

    row->nfields = nfields;

    MemoryContextSwitchTo(oldcontext);

    return (PLpgSQL_variable *) row;
}

/* ----------
 * exec_stmt_getdiag			Put internal PG information into
 *								specified variables.
 * ----------
 */
static int
exec_stmt_getdiag(PLpgSQL_execstate *estate, PLpgSQL_stmt_getdiag *stmt)
{
    ListCell   *lc;

    /*
     * GET STACKED DIAGNOSTICS is only valid inside an exception handler.
     *
     * Note: we trust the grammar to have disallowed the relevant item kinds
     * if not is_stacked, otherwise we'd dump core below.
     */
    if (stmt->is_stacked && estate->cur_error == NULL)
        ereport(ERROR,
                (errcode(ERRCODE_STACKED_DIAGNOSTICS_ACCESSED_WITHOUT_ACTIVE_HANDLER),
                 errmsg("GET STACKED DIAGNOSTICS cannot be used outside an exception handler")));
    foreach(lc, stmt->diag_items)
    {
PLpgSQL_diag_item *diag_item = (PLpgSQL_diag_item *) lfirst(lc);
PLpgSQL_datum *var = estate->datums[diag_item->target];
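/*
* Illustrative sketch (not from the original source; "rc" is a
* hypothetical integer variable): a PL/pgSQL statement such as
*		GET DIAGNOSTICS rc = ROW_COUNT;
* arrives here as a single diag_item of kind PLPGSQL_GETDIAG_ROW_COUNT,
* so the row count accumulated in estate->eval_processed is assigned
* to rc by the corresponding switch arm below.
*/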
switch (diag_item->kind)
{
case PLPGSQL_GETDIAG_ROW_COUNT:
exec_assign_value(estate, var,
UInt64GetDatum(estate->eval_processed),
false, INT8OID, -1);
break;
case PLPGSQL_GETDIAG_ERROR_CONTEXT:
exec_assign_c_string(estate, var,
estate->cur_error->context);
break;
case PLPGSQL_GETDIAG_ERROR_DETAIL:
exec_assign_c_string(estate, var,
estate->cur_error->detail);
break;
case PLPGSQL_GETDIAG_ERROR_HINT:
exec_assign_c_string(estate, var,
estate->cur_error->hint);
break;
case PLPGSQL_GETDIAG_RETURNED_SQLSTATE:
exec_assign_c_string(estate, var,
unpack_sql_state(estate->cur_error->sqlerrcode));
break;
case PLPGSQL_GETDIAG_COLUMN_NAME:
exec_assign_c_string(estate, var,
estate->cur_error->column_name);
break;
case PLPGSQL_GETDIAG_CONSTRAINT_NAME:
exec_assign_c_string(estate, var,
estate->cur_error->constraint_name);
break;
case PLPGSQL_GETDIAG_DATATYPE_NAME:
exec_assign_c_string(estate, var,
estate->cur_error->datatype_name);
break;
case PLPGSQL_GETDIAG_MESSAGE_TEXT:
exec_assign_c_string(estate, var,
estate->cur_error->message);
break;
case PLPGSQL_GETDIAG_TABLE_NAME:
exec_assign_c_string(estate, var,
estate->cur_error->table_name);
break;
case PLPGSQL_GETDIAG_SCHEMA_NAME:
exec_assign_c_string(estate, var,
estate->cur_error->schema_name);
break;
case PLPGSQL_GETDIAG_CONTEXT:
{
char *contextstackstr;
MemoryContext oldcontext;
/* Use eval_mcontext for short-lived string */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
contextstackstr = GetErrorContextStack();
MemoryContextSwitchTo(oldcontext);
exec_assign_c_string(estate, var, contextstackstr);
}
break;
default:
elog(ERROR, "unrecognized diagnostic item kind: %d",
diag_item->kind);
}
}
exec_eval_cleanup(estate);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_if Evaluate a bool expression and
* execute the true or false body
* conditionally.
* ----------
*/
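/*
* For illustration (hypothetical fragment, not from the original
* source), code such as
*		IF n > 0 THEN ...
*		ELSIF n < 0 THEN ...
*		ELSE ...
*		END IF;
* is executed here: the THEN body runs if the main condition is true;
* otherwise each ELSIF condition is tried in turn, and control falls
* through to the ELSE body (which may be an empty statement list).
*/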
static int
exec_stmt_if(PLpgSQL_execstate *estate, PLpgSQL_stmt_if *stmt)
{
bool value;
bool isnull;
ListCell *lc;
value = exec_eval_boolean(estate, stmt->cond, &isnull);
exec_eval_cleanup(estate);
if (!isnull && value)
return exec_stmts(estate, stmt->then_body);
foreach(lc, stmt->elsif_list)
{
PLpgSQL_if_elsif *elif = (PLpgSQL_if_elsif *) lfirst(lc);
value = exec_eval_boolean(estate, elif->cond, &isnull);
exec_eval_cleanup(estate);
if (!isnull && value)
return exec_stmts(estate, elif->stmts);
}
return exec_stmts(estate, stmt->else_body);
}
/*-----------
* exec_stmt_case
*-----------
*/
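/*
* For illustration, a hypothetical simple CASE such as
*		CASE x
*			WHEN 1, 2 THEN ...
*			ELSE ...
*		END CASE;
* has t_expr set to "x"; its value is stashed in the hidden datum
* named by t_varno so that each WHEN clause can be evaluated as a
* boolean test against it.  A searched CASE (no operand expression)
* arrives with t_expr == NULL, and its WHEN expressions are simply
* evaluated as booleans in order.
*/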
static int
exec_stmt_case(PLpgSQL_execstate *estate, PLpgSQL_stmt_case *stmt)
{
PLpgSQL_var *t_var = NULL;
bool isnull;
ListCell *l;
if (stmt->t_expr != NULL)
{
/* simple case */
Datum t_val;
Oid t_typoid;
int32 t_typmod;
t_val = exec_eval_expr(estate, stmt->t_expr,
&isnull, &t_typoid, &t_typmod);
t_var = (PLpgSQL_var *) estate->datums[stmt->t_varno];
/*
* When the expected datatype differs from the actual one, update it. Note
* that what we're modifying here is an execution copy of the datum, so
* this doesn't affect the originally stored function parse tree. (In
* theory, if the expression datatype keeps changing during execution,
* this could cause a function-lifespan memory leak. Doesn't seem
* worth worrying about though.)
*/
if (t_var->datatype->typoid != t_typoid ||
t_var->datatype->atttypmod != t_typmod)
t_var->datatype = plpgsql_build_datatype(t_typoid,
t_typmod,
estate->func->fn_input_collation,
NULL);
/* now we can assign to the variable */
exec_assign_value(estate,
(PLpgSQL_datum *) t_var,
t_val,
isnull,
t_typoid,
t_typmod);
exec_eval_cleanup(estate);
}
/* Now search for a successful WHEN clause */
foreach(l, stmt->case_when_list)
{
PLpgSQL_case_when *cwt = (PLpgSQL_case_when *) lfirst(l);
bool value;
value = exec_eval_boolean(estate, cwt->expr, &isnull);
exec_eval_cleanup(estate);
if (!isnull && value)
{
/* Found it */
/* We can now discard any value we had for the temp variable */
if (t_var != NULL)
assign_simple_var(estate, t_var, (Datum) 0, true, false);
/* Evaluate the statement(s), and we're done */
return exec_stmts(estate, cwt->stmts);
}
}
/* We can now discard any value we had for the temp variable */
if (t_var != NULL)
assign_simple_var(estate, t_var, (Datum) 0, true, false);
/* SQL2003 mandates this error if there was no ELSE clause */
if (!stmt->have_else)
ereport(ERROR,
(errcode(ERRCODE_CASE_NOT_FOUND),
errmsg("case not found"),
errhint("CASE statement is missing ELSE part.")));
/* Evaluate the ELSE statements, and we're done */
return exec_stmts(estate, stmt->else_stmts);
}
/* ----------
* exec_stmt_loop Loop over statements until
* an exit occurs.
* ----------
*/
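/*
* For illustration, this executes a hypothetical unconditional loop like
*		<<mylabel>>
*		LOOP
*			EXIT mylabel WHEN done;
*		END LOOP;
* where the only way out is an EXIT (or an error or RETURN); the
* LOOP_RC_PROCESSING macro decides whether an EXIT/CONTINUE return code
* belongs to this loop's label or must propagate to an outer level.
*/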
static int
exec_stmt_loop(PLpgSQL_execstate *estate, PLpgSQL_stmt_loop *stmt)
{
int rc = PLPGSQL_RC_OK;
for (;;)
{
rc = exec_stmts(estate, stmt->body);
LOOP_RC_PROCESSING(stmt->label, break);
}
return rc;
}
/* ----------
* exec_stmt_while Loop over statements as long
* as an expression evaluates to
* true or an exit occurs.
* ----------
*/
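/*
* For illustration, a hypothetical loop such as
*		WHILE n < 100 LOOP ... END LOOP;
* re-evaluates the condition before every iteration and stops as soon
* as it yields false or NULL.
*/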
static int
exec_stmt_while(PLpgSQL_execstate *estate, PLpgSQL_stmt_while *stmt)
{
int rc = PLPGSQL_RC_OK;
for (;;)
{
bool value;
bool isnull;
value = exec_eval_boolean(estate, stmt->cond, &isnull);
exec_eval_cleanup(estate);
if (isnull || !value)
break;
rc = exec_stmts(estate, stmt->body);
LOOP_RC_PROCESSING(stmt->label, break);
}
return rc;
}
/* ----------
* exec_stmt_fori Iterate an integer variable
* from a lower to an upper value
* incrementing or decrementing by the BY value
* ----------
*/
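/*
* For illustration, a hypothetical loop such as
*		FOR i IN REVERSE 10..1 BY 2 LOOP ... END LOOP;
* evaluates the bounds and step once, casts them to the loop variable's
* integer type, and then visits 10, 8, 6, 4, 2; without REVERSE the
* variable counts upward from the lower bound to the upper bound
* instead.
*/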
static int
exec_stmt_fori(PLpgSQL_execstate *estate, PLpgSQL_stmt_fori *stmt)
{
PLpgSQL_var *var;
Datum value;
bool isnull;
Oid valtype;
int32 valtypmod;
int32 loop_value;
int32 end_value;
int32 step_value;
bool found = false;
int rc = PLPGSQL_RC_OK;
var = (PLpgSQL_var *) (estate->datums[stmt->var->dno]);
/*
* Get the value of the lower bound
*/
value = exec_eval_expr(estate, stmt->lower,
&isnull, &valtype, &valtypmod);
value = exec_cast_value(estate, value, &isnull,
valtype, valtypmod,
var->datatype->typoid,
var->datatype->atttypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("lower bound of FOR loop cannot be null")));
loop_value = DatumGetInt32(value);
exec_eval_cleanup(estate);
/*
* Get the value of the upper bound
*/
value = exec_eval_expr(estate, stmt->upper,
&isnull, &valtype, &valtypmod);
value = exec_cast_value(estate, value, &isnull,
valtype, valtypmod,
var->datatype->typoid,
var->datatype->atttypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("upper bound of FOR loop cannot be null")));
end_value = DatumGetInt32(value);
exec_eval_cleanup(estate);
/*
* Get the step value
*/
if (stmt->step)
{
value = exec_eval_expr(estate, stmt->step,
&isnull, &valtype, &valtypmod);
value = exec_cast_value(estate, value, &isnull,
valtype, valtypmod,
var->datatype->typoid,
var->datatype->atttypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("BY value of FOR loop cannot be null")));
step_value = DatumGetInt32(value);
exec_eval_cleanup(estate);
if (step_value <= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("BY value of FOR loop must be greater than zero")));
}
else
step_value = 1;
/*
* Now do the loop
*/
for (;;)
{
/*
* Check against upper bound
*/
if (stmt->reverse)
{
if (loop_value < end_value)
break;
}
else
{
if (loop_value > end_value)
break;
}
found = true; /* looped at least once */
/*
* Assign current value to loop var
*/
assign_simple_var(estate, var, Int32GetDatum(loop_value), false, false);
/*
* Execute the statements
*/
rc = exec_stmts(estate, stmt->body);
LOOP_RC_PROCESSING(stmt->label, break);
/*
* Increase/decrease loop value, unless it would overflow, in which
* case exit the loop.
*/
if (stmt->reverse)
{
if (loop_value < (PG_INT32_MIN + step_value))
break;
loop_value -= step_value;
}
else
{
if (loop_value > (PG_INT32_MAX - step_value))
break;
loop_value += step_value;
}
}
/*
* Set the FOUND variable to indicate the result of executing the loop
* (namely, whether we looped one or more times). This must be set here so
* that it does not interfere with the value of the FOUND variable inside
* the loop processing itself.
*/
exec_set_found(estate, found);
return rc;
}
/* ----------
* exec_stmt_fors Execute a query, assign each
* tuple to a record or row and
* execute a group of statements
* for it.
* ----------
*/
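/*
* For illustration, a hypothetical loop such as
*		FOR r IN SELECT * FROM mytable LOOP ... END LOOP;
* (with "mytable" standing in for any query source) runs the query
* through an implicit cursor and assigns each result row to the
* record/row target before executing the body.
*/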
static int
exec_stmt_fors(PLpgSQL_execstate *estate, PLpgSQL_stmt_fors *stmt)
{
Portal portal;
int rc;
/*
* Open the implicit cursor for the statement using exec_run_select
*/
exec_run_select(estate, stmt->query, 0, &portal);
/*
* Execute the loop
*/
rc = exec_for_query(estate, (PLpgSQL_stmt_forq *) stmt, portal, true);
/*
* Close the implicit cursor
*/
SPI_cursor_close(portal);
return rc;
}
/* ----------
* exec_stmt_forc Execute a loop for each row from a cursor.
* ----------
*/
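/*
* For illustration, given a hypothetical declaration
*		c CURSOR (key int) FOR SELECT ... WHERE id = key;
* a loop like
*		FOR r IN c(42) LOOP ... END LOOP;
* opens the cursor's query as a portal (after checking that the cursor
* name is not already in use) and fetches rows one at a time, since the
* body may reference the open cursor, e.g. via
* UPDATE ... WHERE CURRENT OF c.
*/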
static int
exec_stmt_forc(PLpgSQL_execstate *estate, PLpgSQL_stmt_forc *stmt)
{
PLpgSQL_var *curvar;
MemoryContext stmt_mcontext = NULL;
char *curname = NULL;
PLpgSQL_expr *query;
ParamListInfo paramLI;
Portal portal;
int rc;
/* ----------
* Get the cursor variable and if it has an assigned name, check
* that it's not currently in use.
* ----------
*/
curvar = (PLpgSQL_var *) (estate->datums[stmt->curvar]);
if (!curvar->isnull)
{
MemoryContext oldcontext;
/* We only need stmt_mcontext to hold the cursor name string */
stmt_mcontext = get_stmt_mcontext(estate);
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
curname = TextDatumGetCString(curvar->value);
MemoryContextSwitchTo(oldcontext);
if (SPI_cursor_find(curname) != NULL)
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_CURSOR),
errmsg("cursor \"%s\" already in use", curname)));
}
/* ----------
* Open the cursor just like an OPEN command
*
* Note: parser should already have checked that statement supplies
* args iff cursor needs them, but we check again to be safe.
* ----------
*/
if (stmt->argquery != NULL)
{
/* ----------
* OPEN CURSOR with args. We fake a SELECT ... INTO ...
* statement to evaluate the args and put 'em into the
* internal row.
* ----------
*/
PLpgSQL_stmt_execsql set_args;
if (curvar->cursor_explicit_argrow < 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("arguments given for cursor without arguments")));
memset(&set_args, 0, sizeof(set_args));
set_args.cmd_type = PLPGSQL_STMT_EXECSQL;
set_args.lineno = stmt->lineno;
set_args.sqlstmt = stmt->argquery;
set_args.into = true;
/* XXX historically this has not been STRICT */
set_args.target = (PLpgSQL_variable *)
(estate->datums[curvar->cursor_explicit_argrow]);
if (exec_stmt_execsql(estate, &set_args) != PLPGSQL_RC_OK)
elog(ERROR, "open cursor failed during argument processing");
}
else
{
if (curvar->cursor_explicit_argrow >= 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("arguments required for cursor")));
}
query = curvar->cursor_explicit_expr;
Assert(query);
if (query->plan == NULL)
exec_prepare_plan(estate, query, curvar->cursor_options);
/*
* Set up ParamListInfo for this query
*/
paramLI = setup_param_list(estate, query);
/*
* Open the cursor (the paramlist will get copied into the portal)
*/
portal = SPI_cursor_open_with_paramlist(curname, query->plan,
paramLI,
estate->readonly_func);
if (portal == NULL)
elog(ERROR, "could not open cursor: %s",
SPI_result_code_string(SPI_result));
/*
* If cursor variable was NULL, store the generated portal name in it
*/
if (curname == NULL)
assign_text_var(estate, curvar, portal->name);
/*
* Clean up before entering exec_for_query
*/
exec_eval_cleanup(estate);
if (stmt_mcontext)
MemoryContextReset(stmt_mcontext);
/*
* Execute the loop. We can't prefetch because the cursor is accessible
* to the user, for instance via UPDATE WHERE CURRENT OF within the loop.
*/
rc = exec_for_query(estate, (PLpgSQL_stmt_forq *) stmt, portal, false);
/* ----------
* Close portal, and restore cursor variable if it was initially NULL.
* ----------
*/
SPI_cursor_close(portal);
if (curname == NULL)
assign_simple_var(estate, curvar, (Datum) 0, true, false);
return rc;
}
/* ----------
* exec_stmt_foreach_a Loop over elements or slices of an array
*
* When looping over elements, the loop variable has the same type as the
* array's elements (e.g., integer); when looping over slices, the loop
* variable is an array whose size and dimensions match those of the slice.
* ----------
*/
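/*
* For illustration, hypothetical loops such as
*		FOREACH x IN ARRAY arr LOOP ... END LOOP;
*		FOREACH s SLICE 1 IN ARRAY arr LOOP ... END LOOP;
* assign, respectively, each element of arr to x and each
* one-dimensional slice of arr to the array variable s.
*/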
static int
exec_stmt_foreach_a(PLpgSQL_execstate *estate, PLpgSQL_stmt_foreach_a *stmt)
{
ArrayType *arr;
Oid arrtype;
int32 arrtypmod;
PLpgSQL_datum *loop_var;
Oid loop_var_elem_type;
bool found = false;
int rc = PLPGSQL_RC_OK;
MemoryContext stmt_mcontext;
MemoryContext oldcontext;
ArrayIterator array_iterator;
Oid iterator_result_type;
int32 iterator_result_typmod;
Datum value;
bool isnull;
/* get the value of the array expression */
value = exec_eval_expr(estate, stmt->expr, &isnull, &arrtype, &arrtypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("FOREACH expression must not be null")));
/*
* Do as much as possible of the code below in stmt_mcontext, to avoid any
* leaks from called subroutines. We need a private stmt_mcontext since
* we'll be calling arbitrary statement code.
*/
stmt_mcontext = get_stmt_mcontext(estate);
push_stmt_mcontext(estate);
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
/* check the type of the expression - must be an array */
if (!OidIsValid(get_element_type(arrtype)))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("FOREACH expression must yield an array, not type %s",
format_type_be(arrtype))));
/*
* We must copy the array into stmt_mcontext, else it will disappear in
* exec_eval_cleanup. This is annoying, but cleanup will certainly happen
* while running the loop body, so we have little choice.
*/
arr = DatumGetArrayTypePCopy(value);
/* Clean up any leftover temporary memory */
exec_eval_cleanup(estate);
/* Slice dimension must be less than or equal to array dimension */
if (stmt->slice < 0 || stmt->slice > ARR_NDIM(arr))
ereport(ERROR,
(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
errmsg("slice dimension (%d) is out of the valid range 0..%d",
stmt->slice, ARR_NDIM(arr))));
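/* For instance, FOREACH ... SLICE 3 over a 2-D array fails the check above */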
/* Set up the loop variable and see if it is of an array type */
loop_var = estate->datums[stmt->varno];
if (loop_var->dtype == PLPGSQL_DTYPE_REC ||
loop_var->dtype == PLPGSQL_DTYPE_ROW)
{
/*
* Record/row variable is certainly not of array type, and might not
* be initialized at all yet, so don't try to get its type
*/
loop_var_elem_type = InvalidOid;
}
else
loop_var_elem_type = get_element_type(plpgsql_exec_get_datum_type(estate,
loop_var));
/*
* Sanity-check the loop variable type. We don't try very hard here, and
* should not be too picky since it's possible that exec_assign_value can
* coerce values of different types. But it seems worthwhile to complain
* if the array-ness of the loop variable is not right.
*/
if (stmt->slice > 0 && loop_var_elem_type == InvalidOid)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("FOREACH ... SLICE loop variable must be of an array type")));
if (stmt->slice == 0 && loop_var_elem_type != InvalidOid)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("FOREACH loop variable must not be of an array type")));
/* Create an iterator to step through the array */
array_iterator = array_create_iterator(arr, stmt->slice, NULL);
/* Identify iterator result type */
if (stmt->slice > 0)
{
/* When slicing, nominal type of result is same as array type */
iterator_result_type = arrtype;
iterator_result_typmod = arrtypmod;
}
else
{
/* Without slicing, results are individual array elements */
iterator_result_type = ARR_ELEMTYPE(arr);
iterator_result_typmod = arrtypmod;
}
/* Iterate over the array elements or slices */
while (array_iterate(array_iterator, &value, &isnull))
{
found = true; /* looped at least once */
/* exec_assign_value and exec_stmts must run in the main context */
MemoryContextSwitchTo(oldcontext);
/* Assign current element/slice to the loop variable */
exec_assign_value(estate, loop_var, value, isnull,
iterator_result_type, iterator_result_typmod);
/* In slice case, value is temporary; must free it to avoid leakage */
if (stmt->slice > 0)
pfree(DatumGetPointer(value));
/*
* Execute the statements
*/
rc = exec_stmts(estate, stmt->body);
LOOP_RC_PROCESSING(stmt->label, break);
MemoryContextSwitchTo(stmt_mcontext);
}
/* Restore memory context state */
MemoryContextSwitchTo(oldcontext);
pop_stmt_mcontext(estate);
/* Release temporary memory, including the array value */
MemoryContextReset(stmt_mcontext);
/*
* Set the FOUND variable to indicate the result of executing the loop
* (namely, whether we looped one or more times). This must be set here so
* that it does not interfere with the value of the FOUND variable inside
* the loop processing itself.
*/
exec_set_found(estate, found);
return rc;
}
/* ----------
* exec_stmt_exit Implements EXIT and CONTINUE
*
* This begins the process of exiting / restarting a loop.
* ----------
*/
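/*
 * Illustrative sketch (label and condition are optional in both forms):
 *
 *		EXIT [ label ] [ WHEN condition ];
 *		CONTINUE [ label ] [ WHEN condition ];
 *
 * A null or false WHEN condition makes the statement a no-op, per the
 * exec_eval_boolean test below.
 */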
static int
exec_stmt_exit(PLpgSQL_execstate *estate, PLpgSQL_stmt_exit *stmt)
{
/*
* If the exit / continue has a condition, evaluate it
*/
if (stmt->cond != NULL)
{
bool value;
bool isnull;
value = exec_eval_boolean(estate, stmt->cond, &isnull);
exec_eval_cleanup(estate);
if (isnull || value == false)
return PLPGSQL_RC_OK;
}
estate->exitlabel = stmt->label;
if (stmt->is_exit)
return PLPGSQL_RC_EXIT;
else
return PLPGSQL_RC_CONTINUE;
}
/* ----------
* exec_stmt_return Evaluate an expression and start
* returning from the function.
*
* Note: The result may be in the eval_mcontext. Therefore, we must not
* do exec_eval_cleanup while unwinding the control stack.
* ----------
*/
static int
exec_stmt_return(PLpgSQL_execstate *estate, PLpgSQL_stmt_return *stmt)
{
/*
* If processing a set-returning PL/pgSQL function, the final RETURN
* indicates that the function is finished producing tuples. The rest of
* the work will be done at the top level.
*/
if (estate->retisset)
return PLPGSQL_RC_RETURN;
/* initialize for null result */
estate->retval = (Datum) 0;
estate->retisnull = true;
estate->rettype = InvalidOid;
/*
* Special case path when the RETURN expression is a simple variable
* reference; in particular, this path is always taken in functions with
* one or more OUT parameters.
*
* This special case is especially efficient for returning variables that
* have R/W expanded values: we can put the R/W pointer directly into
* estate->retval, leading to transferring the value to the caller's
* context cheaply. If we went through exec_eval_expr we'd end up with a
* R/O pointer. It's okay to skip MakeExpandedObjectReadOnly here since
* we know we won't need the variable's value within the function anymore.
*/
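/*
 * Hypothetical example of a function whose RETURN always takes this path:
 *
 *		CREATE FUNCTION f(OUT x int) AS $$
 *		BEGIN
 *			x := 42;
 *			RETURN;		-- returns the OUT parameter via stmt->retvarno
 *		END $$ LANGUAGE plpgsql;
 */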
if (stmt->retvarno >= 0)
{
PLpgSQL_datum *retvar = estate->datums[stmt->retvarno];
switch (retvar->dtype)
{
case PLPGSQL_DTYPE_PROMISE:
/* fulfill promise if needed, then handle like regular var */
plpgsql_fulfill_promise(estate, (PLpgSQL_var *) retvar);
/* FALL THRU */
case PLPGSQL_DTYPE_VAR:
{
PLpgSQL_var *var = (PLpgSQL_var *) retvar;
estate->retval = var->value;
estate->retisnull = var->isnull;
estate->rettype = var->datatype->typoid;
/*
* A PLpgSQL_var cannot be of composite type, so the
* conversion must fail if retistuple. We throw a custom
* error mainly for consistency with historical behavior.
* For the same reason, we don't throw error if the result
* is NULL. (Note that plpgsql_exec_trigger assumes that
* any non-null result has been verified to be composite.)
*/
if (estate->retistuple && !estate->retisnull)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("cannot return non-composite value from function returning composite type")));
}
break;
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) retvar;
/* If record is empty, we return NULL not a row of nulls */
if (rec->erh && !ExpandedRecordIsEmpty(rec->erh))
{
estate->retval = ExpandedRecordGetDatum(rec->erh);
estate->retisnull = false;
estate->rettype = rec->rectypeid;
}
}
break;
case PLPGSQL_DTYPE_ROW:
{
PLpgSQL_row *row = (PLpgSQL_row *) retvar;
int32 rettypmod;
/* We get here if there are multiple OUT parameters */
exec_eval_datum(estate,
(PLpgSQL_datum *) row,
&estate->rettype,
&rettypmod,
&estate->retval,
&estate->retisnull);
}
break;
default:
elog(ERROR, "unrecognized dtype: %d", retvar->dtype);
}
return PLPGSQL_RC_RETURN;
}
if (stmt->expr != NULL)
{
int32 rettypmod;
estate->retval = exec_eval_expr(estate, stmt->expr,
&(estate->retisnull),
&(estate->rettype),
&rettypmod);
/*
* As in the DTYPE_VAR case above, throw a custom error if a non-null,
* non-composite value is returned in a function returning tuple.
*/
if (estate->retistuple && !estate->retisnull &&
!type_is_rowtype(estate->rettype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("cannot return non-composite value from function returning composite type")));
return PLPGSQL_RC_RETURN;
}
/*
* Special hack for function returning VOID: instead of NULL, return a
* non-null VOID value. This is of dubious importance but is kept for
* backwards compatibility. We don't do it for procedures, though.
*/
if (estate->fn_rettype == VOIDOID &&
estate->func->fn_prokind != PROKIND_PROCEDURE)
{
estate->retval = (Datum) 0;
estate->retisnull = false;
estate->rettype = VOIDOID;
}
return PLPGSQL_RC_RETURN;
}
/* ----------
* exec_stmt_return_next Evaluate an expression and add it to the
* list of tuples returned by the current
* SRF.
* ----------
*/
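/*
 * Illustrative sketch (hypothetical function): each RETURN NEXT appends one
 * row to the SRF's tuplestore, to be returned when the function finishes:
 *
 *		CREATE FUNCTION seq(n int) RETURNS SETOF int AS $$
 *		BEGIN
 *			FOR i IN 1..n LOOP
 *				RETURN NEXT i;
 *			END LOOP;
 *		END $$ LANGUAGE plpgsql;
 */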
static int
exec_stmt_return_next(PLpgSQL_execstate *estate,
PLpgSQL_stmt_return_next *stmt)
{
TupleDesc tupdesc;
int natts;
HeapTuple tuple;
MemoryContext oldcontext;
if (!estate->retisset)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("cannot use RETURN NEXT in a non-SETOF function")));
if (estate->tuple_store == NULL)
exec_init_tuple_store(estate);
/* tuple_store_desc will be filled by exec_init_tuple_store */
tupdesc = estate->tuple_store_desc;
natts = tupdesc->natts;
/*
* Special case path when the RETURN NEXT expression is a simple variable
* reference; in particular, this path is always taken in functions with
* one or more OUT parameters.
*
* Unlike exec_stmt_return, there's no special win here for R/W expanded
* values, since they'll have to get flattened to go into the tuplestore.
* Indeed, we'd better make them R/O to avoid any risk of the casting step
* changing them in-place.
*/
if (stmt->retvarno >= 0)
{
PLpgSQL_datum *retvar = estate->datums[stmt->retvarno];
switch (retvar->dtype)
{
case PLPGSQL_DTYPE_PROMISE:
/* fulfill promise if needed, then handle like regular var */
plpgsql_fulfill_promise(estate, (PLpgSQL_var *) retvar);
/* FALL THRU */
case PLPGSQL_DTYPE_VAR:
{
PLpgSQL_var *var = (PLpgSQL_var *) retvar;
Datum retval = var->value;
bool isNull = var->isnull;
Form_pg_attribute attr = TupleDescAttr(tupdesc, 0);
if (natts != 1)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("wrong result type supplied in RETURN NEXT")));
/* let's be very paranoid about the cast step */
retval = MakeExpandedObjectReadOnly(retval,
isNull,
var->datatype->typlen);
/* coerce type if needed */
retval = exec_cast_value(estate,
retval,
&isNull,
var->datatype->typoid,
var->datatype->atttypmod,
attr->atttypid,
attr->atttypmod);
tuplestore_putvalues(estate->tuple_store, tupdesc,
&retval, &isNull);
}
break;
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) retvar;
TupleDesc rec_tupdesc;
TupleConversionMap *tupmap;
/* If rec is null, try to convert it to a row of nulls */
if (rec->erh == NULL)
instantiate_empty_record_variable(estate, rec);
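/*
 * An empty expanded record has neither a valid flat tuple nor
 * deconstructed field values; force deconstruction so that
 * expanded_record_get_tuple() below has something to work from.
 */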
if (ExpandedRecordIsEmpty(rec->erh))
deconstruct_expanded_record(rec->erh);
/* Use eval_mcontext for tuple conversion work */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
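/*
 * Build a positional conversion map from the record's rowtype to the
 * function's result rowtype, or get NULL if no conversion is needed;
 * an incompatible rowtype is reported via the message given here.
 */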
rec_tupdesc = expanded_record_get_tupdesc(rec->erh);
tupmap = convert_tuples_by_position(rec_tupdesc,
tupdesc,
gettext_noop("wrong record type supplied in RETURN NEXT"));
tuple = expanded_record_get_tuple(rec->erh);
if (tupmap)
tuple = execute_attr_map_tuple(tuple, tupmap);
tuplestore_puttuple(estate->tuple_store, tuple);
MemoryContextSwitchTo(oldcontext);
}
break;
case PLPGSQL_DTYPE_ROW:
{
PLpgSQL_row *row = (PLpgSQL_row *) retvar;
/* We get here if there are multiple OUT parameters */
/* Use eval_mcontext for tuple conversion work */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
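/*
 * make_tuple_from_row() evaluates each variable in the row's field
 * list and builds a tuple matching tupdesc; it returns NULL if the
 * column counts don't match.
 */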
tuple = make_tuple_from_row(estate, row, tupdesc);
if (tuple == NULL) /* should not happen */
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("wrong record type supplied in RETURN NEXT")));
tuplestore_puttuple(estate->tuple_store, tuple);
MemoryContextSwitchTo(oldcontext);
}
break;
default:
elog(ERROR, "unrecognized dtype: %d", retvar->dtype);
break;
}
}
else if (stmt->expr)
{
Datum retval;
bool isNull;
Oid rettype;
int32 rettypmod;
retval = exec_eval_expr(estate,
stmt->expr,
&isNull,
&rettype,
&rettypmod);
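/*
 * Dispose of the result according to the function's result type:
 * store a whole tuple for a composite result, or a single coerced
 * column for a scalar result.
 */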
if (estate->retistuple)
{
/* Expression should be of RECORD or composite type */
if (!isNull)
{
HeapTupleData tmptup;
TupleDesc retvaldesc;
TupleConversionMap *tupmap;
if (!type_is_rowtype(rettype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("cannot return non-composite value from function returning composite type")));
/* Use eval_mcontext for tuple conversion work */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
retvaldesc = deconstruct_composite_datum(retval, &tmptup);
tuple = &tmptup;
tupmap = convert_tuples_by_position(retvaldesc, tupdesc,
gettext_noop("returned record type does not match expected record type"));
if (tupmap)
tuple = execute_attr_map_tuple(tuple, tupmap);
tuplestore_puttuple(estate->tuple_store, tuple);
ReleaseTupleDesc(retvaldesc);
MemoryContextSwitchTo(oldcontext);
}
else
{
/* Composite NULL --- store a row of nulls */
Datum *nulldatums;
bool *nullflags;
nulldatums = (Datum *)
eval_mcontext_alloc0(estate, natts * sizeof(Datum));
nullflags = (bool *)
eval_mcontext_alloc(estate, natts * sizeof(bool));
memset(nullflags, true, natts * sizeof(bool));
tuplestore_putvalues(estate->tuple_store, tupdesc,
nulldatums, nullflags);
}
}
else
{
Form_pg_attribute attr = TupleDescAttr(tupdesc, 0);
/* Simple scalar result */
if (natts != 1)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("wrong result type supplied in RETURN NEXT")));
/* coerce type if needed */
retval = exec_cast_value(estate,
retval,
&isNull,
rettype,
rettypmod,
attr->atttypid,
attr->atttypmod);
tuplestore_putvalues(estate->tuple_store, tupdesc,
&retval, &isNull);
}
}
else
{
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("RETURN NEXT must have a parameter")));
}
exec_eval_cleanup(estate);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_return_query Evaluate a query and add it to the
* list of tuples returned by the current
* SRF.
* ----------
*/
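/*
 * For example (PL/pgSQL level, illustrative only):
 *
 *		RETURN QUERY SELECT i, i * 2 FROM generate_series(1, 10) i;
 *
 * appends the query's result rows to the function's tuplestore and
 * then continues execution of the function.
 */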
static int
exec_stmt_return_query(PLpgSQL_execstate *estate,
PLpgSQL_stmt_return_query *stmt)
{
int64 tcount;
DestReceiver *treceiver;
int rc;
uint64 processed;
MemoryContext stmt_mcontext = get_stmt_mcontext(estate);
MemoryContext oldcontext;
if (!estate->retisset)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("cannot use RETURN QUERY in a non-SETOF function")));
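/* The result tuplestore is created on demand, at the first RETURN NEXT/QUERY */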
if (estate->tuple_store == NULL)
exec_init_tuple_store(estate);
/*
 * There might be some tuples in the tuplestore already; remember the
 * current count so that the number of rows added by this statement
 * can be determined afterwards.
 */
tcount = tuplestore_tuple_count(estate->tuple_store);
/*
* Set up DestReceiver to transfer results directly to tuplestore,
* converting rowtype if necessary. DestReceiver lives in mcontext.
*/
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
treceiver = CreateDestReceiver(DestTuplestore);
SetTuplestoreDestReceiverParams(treceiver,
estate->tuple_store,
estate->tuple_store_cxt,
false,
estate->tuple_store_desc,
gettext_noop("structure of query does not match function result type"));
MemoryContextSwitchTo(oldcontext);
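/*
 * RETURN QUERY has two forms: a query fixed at parse time
 * (stmt->query) and RETURN QUERY EXECUTE with a dynamically computed
 * string (stmt->dynquery), which is handled in the else branch.
 */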
if (stmt->query != NULL)
{
/* static query */
PLpgSQL_expr *expr = stmt->query;
ParamListInfo paramLI;
SPIExecuteOptions options;
/*
* On the first call for this expression generate the plan.
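* Pass CURSOR_OPT_PARALLEL_OK: because results are fed directly into
* the tuplestore rather than fetched through a cursor, the planner is
* free to choose a parallel plan.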
*/
if (expr->plan == NULL)
exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK);
Improve plpgsql's memory management to fix some function-lifespan leaks. In some cases, exiting out of a plpgsql statement due to an error, then catching the error in a surrounding exception block, led to leakage of temporary data the statement was working with, because we kept all such data in the function-lifespan SPI Proc context. Iterating such behavior many times within one function call thus led to noticeable memory bloat. To fix, create an additional memory context meant to have statement lifespan. Since many plpgsql statements, particularly the simpler/more common ones, don't need this, create it only on demand. Reset this context at the end of any statement that uses it, and arrange for exception cleanup to reset it too, thereby fixing the memory-leak issue. Allow a stack of such contexts to exist to handle cases where a compound statement needs statement-lifespan data that persists across calls of inner statements. While at it, clean up code and improve comments referring to the existing short-term memory context, which by plpgsql convention is the per-tuple context of the eval_econtext ExprContext. We now uniformly refer to that as the eval_mcontext, whereas the new statement-lifespan memory contexts are called stmt_mcontext. This change adds some context-creation overhead, but on the other hand it allows removal of some retail pfree's in favor of context resets. On balance it seems to be about a wash performance-wise. In principle this is a bug fix, but it seems too invasive for a back-patch, and the infrequency of complaints weighs against taking the risk in the back branches. So we'll fix it only in HEAD, at least for now. Tom Lane, reviewed by Pavel Stehule Discussion: <17863.1469142152@sss.pgh.pa.us>
2016-08-17 20:51:10 +02:00
Avoid using a cursor in plpgsql's RETURN QUERY statement. plpgsql has always executed the query given in a RETURN QUERY command by opening it as a cursor and then fetching a few rows at a time, which it turns around and dumps into the function's result tuplestore. The point of this was to keep from blowing out memory with an oversized SPITupleTable result (note that while a tuplestore can spill tuples to disk, SPITupleTable cannot). However, it's rather inefficient, both because of extra data copying and because of executor entry/exit overhead. In recent versions, a new performance problem has emerged: use of a cursor prevents use of a parallel plan for the executed query. We can improve matters by skipping use of a cursor and having the executor push result tuples directly into the function's result tuplestore. However, a moderate amount of new infrastructure is needed to make that idea work: * We can use the existing tstoreReceiver.c DestReceiver code to funnel executor output to the tuplestore, but it has to be extended to support plpgsql's requirement for possibly applying a tuple conversion map. * SPI needs to be extended to allow use of a caller-supplied DestReceiver instead of its usual receiver that puts tuples into a SPITupleTable. Two new API calls are needed to handle both the RETURN QUERY and RETURN QUERY EXECUTE cases. I also felt that I didn't want these new API calls to use the legacy method of specifying query parameter values with "char" null flags (the old ' '/'n' convention); rather they should accept ParamListInfo objects containing the parameter type and value info. This required a bit of additional new infrastructure since we didn't yet have any parse analysis callback that would interpret $N parameter symbols according to type data supplied in a ParamListInfo. There seems to be no harm in letting makeParamList install that callback by default, rather than leaving a new ParamListInfo's parserSetup hook as NULL. (Indeed, as of HEAD, I couldn't find anyplace that was using the parserSetup field at all; plpgsql was using parserSetupArg for its own purposes, but parserSetup seemed to be write-only.) We can actually get plpgsql out of the business of using legacy null flags altogether, and using ParamListInfo instead of its ad-hoc PreparedParamsData structure; but this requires inventing one more SPI API call that can replace SPI_cursor_open_with_args. That seems worth doing, though. SPI_execute_with_args and SPI_cursor_open_with_args are now unused anywhere in the core PG distribution. Perhaps someday we could deprecate/remove them. But cleaning up the crufty bits of the SPI API is a task for a different patch. Per bug #16040 from Jeremy Smith. This is unfortunately too invasive to consider back-patching. Patch by me; thanks to Hamid Akhtar for review. Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
2020-06-12 18:14:32 +02:00
/*
* Set up ParamListInfo to pass to executor
*/
paramLI = setup_param_list(estate, expr);
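		/*
		 * (setup_param_list builds the ParamListInfo through which the
		 * executor fetches the current values of this function's variables;
		 * it returns NULL when the query references none of them.)
		 */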

		/*
		 * Execute the query
		 */
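		/*
		 * The DestReceiver created earlier in this function (treceiver)
		 * pushes each result tuple straight into the function's tuplestore,
		 * so no cursor and no intermediate SPITupleTable are involved; with
		 * no cursor in the way, the planner is also free to choose a
		 * parallel plan (CURSOR_OPT_PARALLEL_OK above).
		 */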
		memset(&options, 0, sizeof(options));
		options.params = paramLI;
		options.read_only = estate->readonly_func;
		options.must_return_tuples = true;
		options.dest = treceiver;

		rc = SPI_execute_plan_extended(expr->plan, &options);
		if (rc < 0)
			elog(ERROR, "SPI_execute_plan_extended failed executing query \"%s\": %s",
				 expr->query, SPI_result_code_string(rc));
	}
	else
	{
		/* RETURN QUERY EXECUTE */
		Datum		query;
		bool		isnull;
		Oid			restype;
		int32		restypmod;
		char	   *querystr;
		SPIExecuteOptions options;

		/*
		 * Evaluate the string expression after the EXECUTE keyword. Its
		 * result is the querystring we have to execute.
		 */
		Assert(stmt->dynquery != NULL);
		query = exec_eval_expr(estate, stmt->dynquery,
							   &isnull, &restype, &restypmod);
		if (isnull)
			ereport(ERROR,
					(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
					 errmsg("query string argument of EXECUTE is null")));

		/* Get the C-String representation */
		querystr = convert_value_to_string(estate, query, restype);

		/* copy it into the stmt_mcontext before we clean up */
		querystr = MemoryContextStrdup(stmt_mcontext, querystr);
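		/*
		 * (The string produced by convert_value_to_string lives in the
		 * short-lived evaluation context that exec_eval_cleanup is about to
		 * release, hence the copy first.)
		 */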

		exec_eval_cleanup(estate);

		/* Execute query, passing params if necessary */
		memset(&options, 0, sizeof(options));
		options.params = exec_eval_using_params(estate,
												stmt->params);
		options.read_only = estate->readonly_func;
		options.must_return_tuples = true;
		options.dest = treceiver;

		rc = SPI_execute_extended(querystr, &options);
		if (rc < 0)
			elog(ERROR, "SPI_execute_extended failed executing query \"%s\": %s",
				 querystr, SPI_result_code_string(rc));
	}

	/* Clean up */
	treceiver->rDestroy(treceiver);
	exec_eval_cleanup(estate);
	MemoryContextReset(stmt_mcontext);

	/* Count how many tuples we got */
	processed = tuplestore_tuple_count(estate->tuple_store) - tcount;
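	/*
	 * (tcount holds the tuplestore's row count from before this statement
	 * ran, so the difference is the number of rows RETURN QUERY added.)
	 */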
	estate->eval_processed = processed;
	exec_set_found(estate, processed != 0);

	return PLPGSQL_RC_OK;
}

static void
exec_init_tuple_store(PLpgSQL_execstate *estate)
{
	ReturnSetInfo *rsi = estate->rsi;
	MemoryContext oldcxt;
	ResourceOwner oldowner;

	/*
	 * Check caller can handle a set result in the way we want
	 */
	if (!rsi || !IsA(rsi, ReturnSetInfo) ||
		(rsi->allowedModes & SFRM_Materialize) == 0 ||
		rsi->expectedDesc == NULL)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("set-valued function called in context that cannot accept a set")));

	/*
	 * Switch to the right memory context and resource owner for storing the
	 * tuplestore for return set. If we're within a subtransaction opened for
	 * an exception-block, for example, we must still create the tuplestore in
	 * the resource owner that was active when this function was entered, and
	 * not in the subtransaction resource owner.
	 */
	oldcxt = MemoryContextSwitchTo(estate->tuple_store_cxt);
	oldowner = CurrentResourceOwner;
	CurrentResourceOwner = estate->tuple_store_owner;

	estate->tuple_store =
		tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
							  false, work_mem);

	CurrentResourceOwner = oldowner;
	MemoryContextSwitchTo(oldcxt);

	estate->tuple_store_desc = rsi->expectedDesc;
}
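
/*
 * Helper macro for exec_stmt_raise below: copy one RAISE option's text
 * value into stmt_mcontext, erroring out if the same option was already
 * given.  Illustrative note (not part of the original comments): a
 * statement such as
 *     RAISE EXCEPTION 'boom' USING HINT = 'a', HINT = 'b';
 * would trip this duplicate-option check at runtime.
 */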
#define SET_RAISE_OPTION_TEXT(opt, name) \
do { \
	if (opt) \
		ereport(ERROR, \
				(errcode(ERRCODE_SYNTAX_ERROR), \
				 errmsg("RAISE option already specified: %s", \
						name))); \
	opt = MemoryContextStrdup(stmt_mcontext, extval); \
} while (0)

/* ----------
 * exec_stmt_raise			Build a message and throw it with elog()
 * ----------
 */
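/*
 * Illustrative example of the statement forms handled here (not part of
 * the original comments):
 *     RAISE EXCEPTION 'no order with id %', v_id
 *         USING ERRCODE = 'no_data_found', HINT = 'check the id';
 * The format string, the USING options, or both may be absent.
 */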
static int
exec_stmt_raise(PLpgSQL_execstate *estate, PLpgSQL_stmt_raise *stmt)
{
	int			err_code = 0;
	char	   *condname = NULL;
	char	   *err_message = NULL;
	char	   *err_detail = NULL;
	char	   *err_hint = NULL;
	char	   *err_column = NULL;
	char	   *err_constraint = NULL;
	char	   *err_datatype = NULL;
	char	   *err_table = NULL;
	char	   *err_schema = NULL;
	MemoryContext stmt_mcontext;
	ListCell   *lc;

	/* RAISE with no parameters: re-throw current exception */
	if (stmt->condname == NULL && stmt->message == NULL &&
		stmt->options == NIL)
	{
		if (estate->cur_error != NULL)
			ReThrowError(estate->cur_error);
		/* oops, we're not inside a handler */
		ereport(ERROR,
				(errcode(ERRCODE_STACKED_DIAGNOSTICS_ACCESSED_WITHOUT_ACTIVE_HANDLER),
				 errmsg("RAISE without parameters cannot be used outside an exception handler")));
	}
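
	/*
	 * Illustrative example (not part of the original comments): inside
	 *     EXCEPTION WHEN division_by_zero THEN RAISE;
	 * the bare RAISE re-throws the division_by_zero error via the
	 * ReThrowError() call above.
	 */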

	/* We'll need to accumulate the various strings in stmt_mcontext */
	stmt_mcontext = get_stmt_mcontext(estate);

	if (stmt->condname)
	{
		err_code = plpgsql_recognize_err_condition(stmt->condname, true);
		condname = MemoryContextStrdup(stmt_mcontext, stmt->condname);
	}

	if (stmt->message)
	{
		StringInfoData ds;
		ListCell   *current_param;
		char	   *cp;
		MemoryContext oldcontext;

		/* build string in stmt_mcontext */
		oldcontext = MemoryContextSwitchTo(stmt_mcontext);
		initStringInfo(&ds);
		MemoryContextSwitchTo(oldcontext);

		current_param = list_head(stmt->params);

		for (cp = stmt->message; *cp; cp++)
		{
			/*
			 * Occurrences of a single % are replaced by the next parameter's
			 * external representation. Double %'s are converted to one %.
			 */
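			/*
			 * Illustrative example (not part of the original comments):
			 *     RAISE NOTICE 'processed % rows (100%% done)', n;
			 * substitutes n's external text for the single % and emits one
			 * literal % for the doubled %%.
			 */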
			if (cp[0] == '%')
			{
				Oid			paramtypeid;
				int32		paramtypmod;
				Datum		paramvalue;
				bool		paramisnull;
				char	   *extval;

				if (cp[1] == '%')
				{
					appendStringInfoChar(&ds, '%');
					cp++;
					continue;
				}

				/* should have been checked at compile time */
				if (current_param == NULL)
					elog(ERROR, "unexpected RAISE parameter list length");

				paramvalue = exec_eval_expr(estate,
											(PLpgSQL_expr *) lfirst(current_param),
											&paramisnull,
											&paramtypeid,
											&paramtypmod);

				if (paramisnull)
					extval = "<NULL>";
				else
					extval = convert_value_to_string(estate,
													 paramvalue,
													 paramtypeid);
				appendStringInfoString(&ds, extval);
				current_param = lnext(stmt->params, current_param);
				exec_eval_cleanup(estate);
			}
			else
				appendStringInfoChar(&ds, cp[0]);
		}

		/* should have been checked at compile time */
		if (current_param != NULL)
			elog(ERROR, "unexpected RAISE parameter list length");

		err_message = ds.data;
	}

	foreach(lc, stmt->options)
	{
		PLpgSQL_raise_option *opt = (PLpgSQL_raise_option *) lfirst(lc);
		Datum		optionvalue;
		bool		optionisnull;
		Oid			optiontypeid;
		int32		optiontypmod;
		char	   *extval;

		optionvalue = exec_eval_expr(estate, opt->expr,
									 &optionisnull,
									 &optiontypeid,
									 &optiontypmod);
		if (optionisnull)
			ereport(ERROR,
					(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("RAISE statement option cannot be null")));
extval = convert_value_to_string(estate, optionvalue, optiontypeid);
switch (opt->opt_type)
{
case PLPGSQL_RAISEOPTION_ERRCODE:
if (err_code)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("RAISE option already specified: %s",
"ERRCODE")));
err_code = plpgsql_recognize_err_condition(extval, true);
				condname = MemoryContextStrdup(stmt_mcontext, extval);
				break;
			case PLPGSQL_RAISEOPTION_MESSAGE:
				SET_RAISE_OPTION_TEXT(err_message, "MESSAGE");
				break;
			case PLPGSQL_RAISEOPTION_DETAIL:
				SET_RAISE_OPTION_TEXT(err_detail, "DETAIL");
				break;
			case PLPGSQL_RAISEOPTION_HINT:
				SET_RAISE_OPTION_TEXT(err_hint, "HINT");
				break;
			case PLPGSQL_RAISEOPTION_COLUMN:
				SET_RAISE_OPTION_TEXT(err_column, "COLUMN");
				break;
			case PLPGSQL_RAISEOPTION_CONSTRAINT:
				SET_RAISE_OPTION_TEXT(err_constraint, "CONSTRAINT");
				break;
			case PLPGSQL_RAISEOPTION_DATATYPE:
				SET_RAISE_OPTION_TEXT(err_datatype, "DATATYPE");
				break;
			case PLPGSQL_RAISEOPTION_TABLE:
				SET_RAISE_OPTION_TEXT(err_table, "TABLE");
				break;
			case PLPGSQL_RAISEOPTION_SCHEMA:
				SET_RAISE_OPTION_TEXT(err_schema, "SCHEMA");
				break;
			default:
				elog(ERROR, "unrecognized raise option: %d", opt->opt_type);
		}

		exec_eval_cleanup(estate);
	}

	/* Default code if nothing specified */
	if (err_code == 0 && stmt->elog_level >= ERROR)
		err_code = ERRCODE_RAISE_EXCEPTION;

	/* Default error message if nothing specified */
	if (err_message == NULL)
	{
		if (condname)
		{
			err_message = condname;
			condname = NULL;
		}
		else
			err_message = MemoryContextStrdup(stmt_mcontext,
											  unpack_sql_state(err_code));
	}

	/*
	 * Throw the error (may or may not come back)
	 */
	ereport(stmt->elog_level,
			(err_code ? errcode(err_code) : 0,
			 errmsg_internal("%s", err_message),
			 (err_detail != NULL) ? errdetail_internal("%s", err_detail) : 0,
			 (err_hint != NULL) ? errhint("%s", err_hint) : 0,
			 (err_column != NULL) ?
			 err_generic_string(PG_DIAG_COLUMN_NAME, err_column) : 0,
			 (err_constraint != NULL) ?
			 err_generic_string(PG_DIAG_CONSTRAINT_NAME, err_constraint) : 0,
			 (err_datatype != NULL) ?
			 err_generic_string(PG_DIAG_DATATYPE_NAME, err_datatype) : 0,
			 (err_table != NULL) ?
			 err_generic_string(PG_DIAG_TABLE_NAME, err_table) : 0,
			 (err_schema != NULL) ?
			 err_generic_string(PG_DIAG_SCHEMA_NAME, err_schema) : 0));

	/* Clean up transient strings */
	MemoryContextReset(stmt_mcontext);

	return PLPGSQL_RC_OK;
}

/* ----------
 * exec_stmt_assert			Assert statement
 * ----------
 */
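/*
 * Illustrative example (not part of the original comments):
 *     ASSERT balance >= 0, 'balance went negative';
 * raises an error with ERRCODE_ASSERT_FAILURE when the condition is false
 * or null; the whole statement is skipped unless the plpgsql.check_asserts
 * GUC is enabled.
 */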
static int
exec_stmt_assert(PLpgSQL_execstate *estate, PLpgSQL_stmt_assert *stmt)
{
	bool		value;
	bool		isnull;

	/* do nothing when asserts are not enabled */
	if (!plpgsql_check_asserts)
		return PLPGSQL_RC_OK;

	value = exec_eval_boolean(estate, stmt->cond, &isnull);
	exec_eval_cleanup(estate);

	if (isnull || !value)
	{
		char	   *message = NULL;

		if (stmt->message != NULL)
		{
			Datum		val;
			Oid			typeid;
			int32		typmod;

			val = exec_eval_expr(estate, stmt->message,
								 &isnull, &typeid, &typmod);
			if (!isnull)
				message = convert_value_to_string(estate, val, typeid);
			/* we mustn't do exec_eval_cleanup here */
		}

		ereport(ERROR,
				(errcode(ERRCODE_ASSERT_FAILURE),
				 message ? errmsg_internal("%s", message) :
				 errmsg("assertion failed")));
	}

	return PLPGSQL_RC_OK;
}

/* ----------
 * Initialize a mostly empty execution state
 * ----------
 */
static void
plpgsql_estate_setup(PLpgSQL_execstate *estate,
					 PLpgSQL_function *func,
					 ReturnSetInfo *rsi,
					 EState *simple_eval_estate,
					 ResourceOwner simple_eval_resowner)
{
	HASHCTL		ctl;

	/* this link will be restored at exit from plpgsql_call_handler */
	func->cur_estate = estate;

	estate->func = func;
	estate->trigdata = NULL;
	estate->evtrigdata = NULL;
	estate->retval = (Datum) 0;
	estate->retisnull = true;
	estate->rettype = InvalidOid;

	estate->fn_rettype = func->fn_rettype;
	estate->retistuple = func->fn_retistuple;
	estate->retisset = func->fn_retset;

	estate->readonly_func = func->fn_readonly;
	estate->atomic = true;

	estate->exitlabel = NULL;
	estate->cur_error = NULL;

	estate->tuple_store = NULL;
	estate->tuple_store_desc = NULL;
	if (rsi)
	{
		estate->tuple_store_cxt = rsi->econtext->ecxt_per_query_memory;
		estate->tuple_store_owner = CurrentResourceOwner;
	}
	else
	{
		estate->tuple_store_cxt = NULL;
		estate->tuple_store_owner = NULL;
	}
	estate->rsi = rsi;

	estate->found_varno = func->found_varno;
	estate->ndatums = func->ndatums;
	estate->datums = NULL;
	/* the datums array will be filled by copy_plpgsql_datums() */
	estate->datum_context = CurrentMemoryContext;

	/* initialize our ParamListInfo with appropriate hook functions */
	estate->paramLI = makeParamList(0);
	estate->paramLI->paramFetch = plpgsql_param_fetch;
	estate->paramLI->paramFetchArg = (void *) estate;
estate->paramLI->paramCompile = plpgsql_param_compile;
estate->paramLI->paramCompileArg = NULL; /* not needed */
estate->paramLI->parserSetup = (ParserSetupHook) plpgsql_parser_setup;
estate->paramLI->parserSetupArg = NULL; /* filled during use */
estate->paramLI->numParams = estate->ndatums;
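/*
 * For reference, the two hook types installed above are declared in
 * nodes/params.h (signatures as of this writing):
 *
 *     ParamExternData *(*ParamFetchHook) (ParamListInfo params, int paramid,
 *                                         bool speculative,
 *                                         ParamExternData *workspace);
 *     void (*ParamCompileHook) (ParamListInfo params, struct Param *param,
 *                               struct ExprState *state,
 *                               Datum *resv, bool *resnull);
 *
 * During normal expression execution only the paramCompile path is used;
 * plpgsql_param_fetch runs only when the list is materialized, e.g. by
 * copyParamList() into a cursor portal.
 */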
/* set up for use of appropriate simple-expression EState and cast hash */
if (simple_eval_estate)
{
estate->simple_eval_estate = simple_eval_estate;
/* Private cast hash just lives in function's main context */
ctl.keysize = sizeof(plpgsql_CastHashKey);
ctl.entrysize = sizeof(plpgsql_CastHashEntry);
ctl.hcxt = CurrentMemoryContext;
estate->cast_hash = hash_create("PLpgSQL private cast cache",
16, /* start small and extend */
&ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
estate->cast_hash_context = CurrentMemoryContext;
}
else
{
estate->simple_eval_estate = shared_simple_eval_estate;
/* Create the session-wide cast-info hash table if we didn't already */
if (shared_cast_hash == NULL)
{
shared_cast_context = AllocSetContextCreate(TopMemoryContext,
"PLpgSQL cast info",
ALLOCSET_DEFAULT_SIZES);
ctl.keysize = sizeof(plpgsql_CastHashKey);
ctl.entrysize = sizeof(plpgsql_CastHashEntry);
ctl.hcxt = shared_cast_context;
shared_cast_hash = hash_create("PLpgSQL cast cache",
16, /* start small and extend */
&ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
}
estate->cast_hash = shared_cast_hash;
estate->cast_hash_context = shared_cast_context;
}
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/* likewise for the simple-expression resource owner */
if (simple_eval_resowner)
estate->simple_eval_resowner = simple_eval_resowner;
else
estate->simple_eval_resowner = shared_simple_eval_resowner;
/* if there's a procedure resowner, it'll be filled in later */
estate->procedure_resowner = NULL;
/*
* We start with no stmt_mcontext; one will be created only if needed.
* That context will be a direct child of the function's main execution
* context. Additional stmt_mcontexts might be created as children of it.
*/
estate->stmt_mcontext = NULL;
estate->stmt_mcontext_parent = CurrentMemoryContext;
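/*
 * For illustration, the lazy creation performed by get_stmt_mcontext()
 * later in this file looks roughly like this (a sketch; details may
 * differ slightly):
 *
 *     static MemoryContext
 *     get_stmt_mcontext(PLpgSQL_execstate *estate)
 *     {
 *         if (estate->stmt_mcontext == NULL)
 *             estate->stmt_mcontext =
 *                 AllocSetContextCreate(estate->stmt_mcontext_parent,
 *                                       "stmt_mcontext",
 *                                       ALLOCSET_DEFAULT_SIZES);
 *         return estate->stmt_mcontext;
 *     }
 */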
estate->eval_tuptable = NULL;
estate->eval_processed = 0;
estate->eval_econtext = NULL;
estate->err_stmt = NULL;
estate->err_var = NULL;
estate->err_text = NULL;
estate->plugin_info = NULL;
/*
* Create an EState and ExprContext for evaluation of simple expressions.
*/
plpgsql_create_econtext(estate);
/*
* Let the plugin see this function before we initialize any local
* PL/pgSQL variables - note that we also give the plugin a few function
* pointers so it can call back into PL/pgSQL for doing things like
* variable assignments and stack traces
*/
if (*plpgsql_plugin_ptr)
{
(*plpgsql_plugin_ptr)->error_callback = plpgsql_exec_error_callback;
(*plpgsql_plugin_ptr)->assign_expr = exec_assign_expr;
if ((*plpgsql_plugin_ptr)->func_setup)
((*plpgsql_plugin_ptr)->func_setup) (estate, func);
}
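/*
 * For illustration, a minimal plugin could attach itself like this
 * (hypothetical add-on code; all names invented):
 *
 *     static void
 *     my_func_setup(PLpgSQL_execstate *estate, PLpgSQL_function *func)
 *     {
 *         elog(LOG, "entering plpgsql function %s", func->fn_signature);
 *     }
 *
 *     static PLpgSQL_plugin my_plugin = {.func_setup = my_func_setup};
 *
 *     PLpgSQL_plugin **plugin_ptr = (PLpgSQL_plugin **)
 *         find_rendezvous_variable("PLpgSQL_plugin");
 *     *plugin_ptr = &my_plugin;
 */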
}
/* ----------
* Release temporary memory used by expression/subselect evaluation
*
* NB: the result of the evaluation is no longer valid after this is done,
* unless it is a pass-by-value datatype.
* ----------
*/
static void
exec_eval_cleanup(PLpgSQL_execstate *estate)
{
/* Clear result of a full SPI_execute */
if (estate->eval_tuptable != NULL)
SPI_freetuptable(estate->eval_tuptable);
estate->eval_tuptable = NULL;
/*
* Clear result of exec_eval_simple_expr (but keep the econtext). This
* also clears any short-lived allocations done via get_eval_mcontext.
*/
if (estate->eval_econtext != NULL)
ResetExprContext(estate->eval_econtext);
}
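/*
 * Typical caller pattern (a sketch): evaluate an expression, copy the
 * result out of short-lived storage if it must survive, then clean up:
 *
 *     val = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod);
 *     if (!isnull)
 *         val = datumCopy(val, typbyval, typlen);
 *     exec_eval_cleanup(estate);
 */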
/* ----------
* Generate a prepared plan
*
* CAUTION: it is possible for this function to throw an error after it has
* built a SPIPlan and saved it in expr->plan. Therefore, be wary of doing
* additional things contingent on expr->plan being NULL. That is, given
* code like
*
* if (query->plan == NULL)
* {
* // okay to put setup code here
* exec_prepare_plan(estate, query, ...);
* // NOT okay to put more logic here
* }
*
* extra steps at the end are unsafe because they will not be executed when
* re-executing the calling statement, if exec_prepare_plan failed the first
* time. This is annoyingly error-prone, but the alternatives are worse.
* ----------
*/
static void
exec_prepare_plan(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr, int cursorOptions)
{
SPIPlanPtr plan;
SPIPrepareOptions options;
/*
* The grammar can't conveniently set expr->func while building the parse
* tree, so make sure it's set before parser hooks need it.
*/
expr->func = estate->func;
/*
* Generate and save the plan
*/
memset(&options, 0, sizeof(options));
options.parserSetup = (ParserSetupHook) plpgsql_parser_setup;
options.parserSetupArg = (void *) expr;
options.parseMode = expr->parseMode;
options.cursorOptions = cursorOptions;
plan = SPI_prepare_extended(expr->query, &options);
if (plan == NULL)
elog(ERROR, "SPI_prepare_extended failed for \"%s\": %s",
expr->query, SPI_result_code_string(SPI_result));
SPI_keepplan(plan);
expr->plan = plan;
/* Check to see if it's a simple expression */
exec_simple_check_plan(estate, expr);
}
/* ----------
* exec_stmt_execsql Execute an SQL statement (possibly with INTO).
*
* Note: some callers rely on this not touching stmt_mcontext. If it ever
* needs to use that, fix those callers to push/pop stmt_mcontext.
* ----------
*/
static int
exec_stmt_execsql(PLpgSQL_execstate *estate,
PLpgSQL_stmt_execsql *stmt)
{
ParamListInfo paramLI;
long tcount;
int rc;
PLpgSQL_expr *expr = stmt->sqlstmt;
int too_many_rows_level = 0;
if (plpgsql_extra_errors & PLPGSQL_XCHECK_TOOMANYROWS)
too_many_rows_level = ERROR;
else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_TOOMANYROWS)
too_many_rows_level = WARNING;
/*
* On the first call for this statement generate the plan, and detect
* whether the statement is INSERT/UPDATE/DELETE
*/
if (expr->plan == NULL)
exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK);
if (!stmt->mod_stmt_set)
{
ListCell *l;
stmt->mod_stmt = false;
foreach(l, SPI_plan_get_plan_sources(expr->plan))
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(l);
/*
* We could look at the raw_parse_tree, but it seems simpler to
* check the command tag. Note we should *not* look at the Query
* tree(s), since those are the result of rewriting and could have
* been transmogrified into something else entirely.
*/
if (plansource->commandTag == CMDTAG_INSERT ||
plansource->commandTag == CMDTAG_UPDATE ||
plansource->commandTag == CMDTAG_DELETE)
{
stmt->mod_stmt = true;
break;
}
}
stmt->mod_stmt_set = true;
}
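/*
 * As an example of why the raw command tag, not the rewritten Query
 * tree, must drive mod_stmt (hypothetical session):
 *
 *     CREATE RULE r AS ON UPDATE TO v DO INSTEAD SELECT ...;
 *
 * An UPDATE on v is rewritten into a SELECT (yielding SPI_OK_REWRITTEN
 * below), but its commandTag remains CMDTAG_UPDATE, so mod_stmt stays
 * true whether or not the rule exists, keeping the answer stable even
 * if rules are added or dropped later in the session.
 */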
/*
* Set up ParamListInfo to pass to executor
*/
paramLI = setup_param_list(estate, expr);
/*
* If we have INTO, then we only need one row back ... but if we have INTO
* STRICT or extra check too_many_rows, ask for two rows, so that we can
* verify the statement returns only one. INSERT/UPDATE/DELETE are always
* treated strictly. Without INTO, just run the statement to completion
* (tcount = 0).
*
* We could just ask for two rows always when using INTO, but there are
* some cases where demanding the extra row costs significant time, eg by
* forcing completion of a sequential scan. So don't do it unless we need
* to enforce strictness.
*/
if (stmt->into)
{
if (stmt->strict || stmt->mod_stmt || too_many_rows_level)
tcount = 2;
else
tcount = 1;
}
else
tcount = 0;
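/*
 * Worked examples of the tcount choice above:
 *
 *     SELECT ... INTO v                  tcount = 1 (first row suffices)
 *     SELECT ... INTO STRICT v           tcount = 2 (to detect extra rows)
 *     UPDATE ... RETURNING ... INTO v    tcount = 2 (mod_stmt is strict)
 *     UPDATE ... (no INTO)               tcount = 0 (run to completion)
 */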
/*
* Execute the plan
*/
rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI,
estate->readonly_func, tcount);
/*
* Check for error, and set FOUND if appropriate (for historical reasons
* we set FOUND only for certain query types). Also Assert that we
* identified the statement type the same as SPI did.
*/
switch (rc)
{
case SPI_OK_SELECT:
Assert(!stmt->mod_stmt);
exec_set_found(estate, (SPI_processed != 0));
break;
case SPI_OK_INSERT:
case SPI_OK_UPDATE:
case SPI_OK_DELETE:
case SPI_OK_INSERT_RETURNING:
case SPI_OK_UPDATE_RETURNING:
case SPI_OK_DELETE_RETURNING:
Assert(stmt->mod_stmt);
exec_set_found(estate, (SPI_processed != 0));
break;
case SPI_OK_SELINTO:
case SPI_OK_UTILITY:
Assert(!stmt->mod_stmt);
break;
case SPI_OK_REWRITTEN:
/*
* The command was rewritten into another kind of command. It's
* not clear what FOUND would mean in that case (and SPI doesn't
* return the row count either), so just set it to false. Note
* that we can't assert anything about mod_stmt here.
*/
exec_set_found(estate, false);
break;
/* Some SPI errors deserve specific error messages */
case SPI_ERROR_COPY:
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot COPY to/from client in PL/pgSQL")));
break;
case SPI_ERROR_TRANSACTION:
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("unsupported transaction command in PL/pgSQL")));
break;
default:
elog(ERROR, "SPI_execute_plan_with_paramlist failed executing query \"%s\": %s",
expr->query, SPI_result_code_string(rc));
break;
}
/* All variants should save result info for GET DIAGNOSTICS */
estate->eval_processed = SPI_processed;
/* Process INTO if present */
if (stmt->into)
{
SPITupleTable *tuptab = SPI_tuptable;
uint64 n = SPI_processed;
PLpgSQL_variable *target;
/* If the statement did not return a tuple table, complain */
if (tuptab == NULL)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("INTO used with a command that cannot return data")));
/* Fetch target's datum entry */
target = (PLpgSQL_variable *) estate->datums[stmt->target->dno];
/*
* If SELECT ... INTO specified STRICT, and the query didn't find
* exactly one row, throw an error. If STRICT was not specified, then
* allow the query to find any number of rows.
*/
if (n == 0)
{
if (stmt->strict)
{
char *errdetail;
if (estate->func->print_strict_params)
errdetail = format_expr_params(estate, expr);
else
errdetail = NULL;
ereport(ERROR,
(errcode(ERRCODE_NO_DATA_FOUND),
errmsg("query returned no rows"),
errdetail ? errdetail_internal("parameters: %s", errdetail) : 0));
}
/* set the target to NULL(s) */
exec_move_row(estate, target, NULL, tuptab->tupdesc);
}
else
{
if (n > 1 && (stmt->strict || stmt->mod_stmt || too_many_rows_level))
{
char *errdetail;
int errlevel;
if (estate->func->print_strict_params)
errdetail = format_expr_params(estate, expr);
else
errdetail = NULL;
errlevel = (stmt->strict || stmt->mod_stmt) ? ERROR : too_many_rows_level;
ereport(errlevel,
(errcode(ERRCODE_TOO_MANY_ROWS),
errmsg("query returned more than one row"),
errdetail ? errdetail_internal("parameters: %s", errdetail) : 0,
errhint("Make sure the query returns a single row, or use LIMIT 1.")));
}
/* Put the first result row into the target */
exec_move_row(estate, target, tuptab->vals[0], tuptab->tupdesc);
}
/* Clean up */
exec_eval_cleanup(estate);
SPI_freetuptable(SPI_tuptable);
}
else
{
/* If the statement returned a tuple table, complain */
if (SPI_tuptable != NULL)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("query has no destination for result data"),
(rc == SPI_OK_SELECT) ? errhint("If you want to discard the results of a SELECT, use PERFORM instead.") : 0));
}
return PLPGSQL_RC_OK;
}
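/*
 * At the plpgsql level, the rules enforced above mean, for example:
 *
 *     SELECT c INTO v FROM t;          -- first row if any; else v is NULL
 *     SELECT c INTO STRICT v FROM t;   -- errors unless exactly one row
 *     UPDATE t SET ... RETURNING c INTO v;
 *                                      -- errors if more than one row
 */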
/* ----------
* exec_stmt_dynexecute Execute a dynamic SQL query
* (possibly with INTO).
* ----------
*/
static int
exec_stmt_dynexecute(PLpgSQL_execstate *estate,
PLpgSQL_stmt_dynexecute *stmt)
{
Datum query;
bool isnull;
Oid restype;
int32 restypmod;
char *querystr;
int exec_res;
ParamListInfo paramLI;
SPIExecuteOptions options;
MemoryContext stmt_mcontext = get_stmt_mcontext(estate);
/*
* First we evaluate the string expression after the EXECUTE keyword. Its
* result is the querystring we have to execute.
*/
query = exec_eval_expr(estate, stmt->query, &isnull, &restype, &restypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("query string argument of EXECUTE is null")));
/* Get the C-String representation */
querystr = convert_value_to_string(estate, query, restype);
/* copy it into the stmt_mcontext before we clean up */
querystr = MemoryContextStrdup(stmt_mcontext, querystr);
exec_eval_cleanup(estate);
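/*
 * For example (hypothetical plpgsql source), a statement such as
 *
 *     EXECUTE 'SELECT c FROM ' || quote_ident(tabname)
 *         || ' WHERE id = $1' INTO v USING some_id;
 *
 * arrives here with stmt->query holding the string-valued expression
 * and stmt->params holding the USING argument list.
 */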
/*
* Execute the query without preparing a saved plan.
*/
paramLI = exec_eval_using_params(estate, stmt->params);
memset(&options, 0, sizeof(options));
options.params = paramLI;
options.read_only = estate->readonly_func;
exec_res = SPI_execute_extended(querystr, &options);
switch (exec_res)
{
case SPI_OK_SELECT:
case SPI_OK_INSERT:
case SPI_OK_UPDATE:
case SPI_OK_DELETE:
case SPI_OK_INSERT_RETURNING:
case SPI_OK_UPDATE_RETURNING:
case SPI_OK_DELETE_RETURNING:
case SPI_OK_UTILITY:
case SPI_OK_REWRITTEN:
break;
case 0:
/*
* Also allow a zero return, which implies the querystring
* contained no commands.
*/
break;
case SPI_OK_SELINTO:
/*
* We want to disallow SELECT INTO for now, because its behavior
* is not consistent with SELECT INTO in a normal plpgsql context.
* (We need to reimplement EXECUTE to parse the string as a
* plpgsql command, not just feed it to SPI_execute.) This is not
* a functional limitation because CREATE TABLE AS is allowed.
*/
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("EXECUTE of SELECT ... INTO is not implemented"),
errhint("You might want to use EXECUTE ... INTO or EXECUTE CREATE TABLE ... AS instead.")));
break;
/* Some SPI errors deserve specific error messages */
case SPI_ERROR_COPY:
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot COPY to/from client in PL/pgSQL")));
break;
case SPI_ERROR_TRANSACTION:
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("EXECUTE of transaction commands is not implemented")));
break;
default:
elog(ERROR, "SPI_execute_extended failed executing query \"%s\": %s",
querystr, SPI_result_code_string(exec_res));
break;
}
/* Save result info for GET DIAGNOSTICS */
estate->eval_processed = SPI_processed;
/* Process INTO if present */
if (stmt->into)
{
SPITupleTable *tuptab = SPI_tuptable;
uint64 n = SPI_processed;
PLpgSQL_variable *target;
/* If the statement did not return a tuple table, complain */
if (tuptab == NULL)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("INTO used with a command that cannot return data")));
/* Fetch target's datum entry */
target = (PLpgSQL_variable *) estate->datums[stmt->target->dno];
/*
* If SELECT ... INTO specified STRICT, and the query didn't find
* exactly one row, throw an error. If STRICT was not specified, then
* allow the query to find any number of rows.
*/
if (n == 0)
{
if (stmt->strict)
{
char *errdetail;
if (estate->func->print_strict_params)
errdetail = format_preparedparamsdata(estate, paramLI);
else
errdetail = NULL;
ereport(ERROR,
(errcode(ERRCODE_NO_DATA_FOUND),
errmsg("query returned no rows"),
errdetail ? errdetail_internal("parameters: %s", errdetail) : 0));
}
/* set the target to NULL(s) */
exec_move_row(estate, target, NULL, tuptab->tupdesc);
}
else
{
if (n > 1 && stmt->strict)
{
char *errdetail;
if (estate->func->print_strict_params)
errdetail = format_preparedparamsdata(estate, paramLI);
else
errdetail = NULL;
ereport(ERROR,
(errcode(ERRCODE_TOO_MANY_ROWS),
errmsg("query returned more than one row"),
errdetail ? errdetail_internal("parameters: %s", errdetail) : 0));
}
/* Put the first result row into the target */
exec_move_row(estate, target, tuptab->vals[0], tuptab->tupdesc);
}
/* clean up after exec_move_row() */
exec_eval_cleanup(estate);
}
else
{
/*
* It might be a good idea to raise an error if the query returned
* tuples that are being ignored, but historically we have not done
* that.
*/
}
/* Release any result from SPI_execute, as well as transient data */
SPI_freetuptable(SPI_tuptable);
MemoryContextReset(stmt_mcontext);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_dynfors Execute a dynamic query, assign each
* tuple to a record or row and
* execute a group of statements
* for it.
* ----------
*/
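/*
 * Illustrative usage at the PL/pgSQL level (an assumed example, not
 * taken from this file):
 *     FOR r IN EXECUTE 'SELECT * FROM ' || quote_ident(tabname) LOOP
 *         ...
 *     END LOOP;
 * Because the query text is computed at run time, it is planned afresh
 * on each execution rather than using a saved plan.
 */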
static int
exec_stmt_dynfors(PLpgSQL_execstate *estate, PLpgSQL_stmt_dynfors *stmt)
{
Portal portal;
int rc;
portal = exec_dynquery_with_params(estate, stmt->query, stmt->params,
NULL, CURSOR_OPT_NO_SCROLL);
/*
* Execute the loop
*/
rc = exec_for_query(estate, (PLpgSQL_stmt_forq *) stmt, portal, true);
/*
* Close the implicit cursor
*/
SPI_cursor_close(portal);
return rc;
}
/* ----------
* exec_stmt_open Execute an OPEN cursor statement
* ----------
*/
static int
exec_stmt_open(PLpgSQL_execstate *estate, PLpgSQL_stmt_open *stmt)
{
PLpgSQL_var *curvar;
MemoryContext stmt_mcontext = NULL;
char *curname = NULL;
PLpgSQL_expr *query;
Portal portal;
ParamListInfo paramLI;
/* ----------
* Get the cursor variable and if it has an assigned name, check
* that it's not in use currently.
* ----------
*/
curvar = (PLpgSQL_var *) (estate->datums[stmt->curvar]);
if (!curvar->isnull)
{
MemoryContext oldcontext;
/* We only need stmt_mcontext to hold the cursor name string */
stmt_mcontext = get_stmt_mcontext(estate);
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
curname = TextDatumGetCString(curvar->value);
MemoryContextSwitchTo(oldcontext);
if (SPI_cursor_find(curname) != NULL)
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_CURSOR),
errmsg("cursor \"%s\" already in use", curname)));
}
/* ----------
* Process the OPEN according to its type.
* ----------
*/
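/*
 * For orientation, the three branches below correspond to these OPEN
 * forms (assumed examples, not from this file):
 *     OPEN refcur FOR SELECT ...;             -- stmt->query
 *     OPEN refcur FOR EXECUTE 'SELECT ...';   -- stmt->dynquery
 *     OPEN boundcur(42);                      -- bound cursor's own query
 */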
if (stmt->query != NULL)
{
/* ----------
* This is an OPEN refcursor FOR SELECT ...
*
* We just make sure the query is planned. The real work is
* done downstairs.
* ----------
*/
query = stmt->query;
if (query->plan == NULL)
exec_prepare_plan(estate, query, stmt->cursor_options);
}
else if (stmt->dynquery != NULL)
{
/* ----------
* This is an OPEN refcursor FOR EXECUTE ...
* ----------
*/
portal = exec_dynquery_with_params(estate,
stmt->dynquery,
stmt->params,
curname,
stmt->cursor_options);
/*
* If cursor variable was NULL, store the generated portal name in it.
* Note: exec_dynquery_with_params already reset the stmt_mcontext, so
* curname is a dangling pointer here; but testing it for nullness is
* OK.
*/
if (curname == NULL)
assign_text_var(estate, curvar, portal->name);
return PLPGSQL_RC_OK;
}
else
{
/* ----------
* This is an OPEN cursor
*
* Note: parser should already have checked that statement supplies
* args iff cursor needs them, but we check again to be safe.
* ----------
*/
if (stmt->argquery != NULL)
{
/* ----------
* OPEN CURSOR with args. We fake a SELECT ... INTO ...
* statement to evaluate the args and put 'em into the
* internal row.
* ----------
*/
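/*
 * For example (assumed illustration): given a declaration like
 *     c CURSOR (key int) FOR SELECT * FROM t WHERE id = key;
 * an OPEN c(42) reaches here with argquery representing "SELECT 42";
 * the faked statement below moves that single result row into the
 * cursor's internal argument row.
 */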
PLpgSQL_stmt_execsql set_args;
if (curvar->cursor_explicit_argrow < 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("arguments given for cursor without arguments")));
memset(&set_args, 0, sizeof(set_args));
set_args.cmd_type = PLPGSQL_STMT_EXECSQL;
set_args.lineno = stmt->lineno;
set_args.sqlstmt = stmt->argquery;
set_args.into = true;
/* XXX historically this has not been STRICT */
set_args.target = (PLpgSQL_variable *)
(estate->datums[curvar->cursor_explicit_argrow]);
if (exec_stmt_execsql(estate, &set_args) != PLPGSQL_RC_OK)
elog(ERROR, "open cursor failed during argument processing");
}
else
{
if (curvar->cursor_explicit_argrow >= 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("arguments required for cursor")));
}
query = curvar->cursor_explicit_expr;
if (query->plan == NULL)
exec_prepare_plan(estate, query, curvar->cursor_options);
}
/*
* Set up ParamListInfo for this query
*/
paramLI = setup_param_list(estate, query);
/*
* Open the cursor (the paramlist will get copied into the portal)
*/
portal = SPI_cursor_open_with_paramlist(curname, query->plan,
paramLI,
estate->readonly_func);
if (portal == NULL)
elog(ERROR, "could not open cursor: %s",
SPI_result_code_string(SPI_result));
/*
* If cursor variable was NULL, store the generated portal name in it
*/
if (curname == NULL)
assign_text_var(estate, curvar, portal->name);
/* If we had any transient data, clean it up */
exec_eval_cleanup(estate);
if (stmt_mcontext)
MemoryContextReset(stmt_mcontext);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_fetch Fetch from a cursor into a target, or just
* move the current position of the cursor
* ----------
*/
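/*
 * Illustrative PL/pgSQL (assumed examples, not from this file):
 *     FETCH NEXT FROM cur INTO rec;    -- fetch branch below
 *     MOVE ABSOLUTE 5 IN cur;          -- is_move branch below
 * Either way, FOUND and ROW_COUNT are updated at the bottom of this
 * function.
 */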
static int
exec_stmt_fetch(PLpgSQL_execstate *estate, PLpgSQL_stmt_fetch *stmt)
{
PLpgSQL_var *curvar;
long how_many = stmt->how_many;
SPITupleTable *tuptab;
Portal portal;
char *curname;
uint64 n;
MemoryContext oldcontext;
/* ----------
* Get the portal of the cursor by name
* ----------
*/
curvar = (PLpgSQL_var *) (estate->datums[stmt->curvar]);
if (curvar->isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("cursor variable \"%s\" is null", curvar->refname)));
/* Use eval_mcontext for short-lived string */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
curname = TextDatumGetCString(curvar->value);
MemoryContextSwitchTo(oldcontext);
portal = SPI_cursor_find(curname);
if (portal == NULL)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_CURSOR),
errmsg("cursor \"%s\" does not exist", curname)));
/* Calculate position for FETCH_RELATIVE or FETCH_ABSOLUTE */
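/*
 * E.g. (assumed example) "FETCH ABSOLUTE n + 1 FROM cur;" evaluates
 * the expression n + 1 here to obtain the target position.
 */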
if (stmt->expr)
{
bool isnull;
/* XXX should be doing this in LONG not INT width */
how_many = exec_eval_integer(estate, stmt->expr, &isnull);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("relative or absolute cursor position is null")));
exec_eval_cleanup(estate);
}
if (!stmt->is_move)
{
PLpgSQL_variable *target;
/* ----------
* Fetch 1 tuple from the cursor
* ----------
*/
SPI_scroll_cursor_fetch(portal, stmt->direction, how_many);
tuptab = SPI_tuptable;
n = SPI_processed;
/* ----------
* Set the target appropriately.
* ----------
*/
target = (PLpgSQL_variable *) estate->datums[stmt->target->dno];
if (n == 0)
exec_move_row(estate, target, NULL, tuptab->tupdesc);
else
exec_move_row(estate, target, tuptab->vals[0], tuptab->tupdesc);
exec_eval_cleanup(estate);
SPI_freetuptable(tuptab);
}
else
{
/* Move the cursor */
SPI_scroll_cursor_move(portal, stmt->direction, how_many);
n = SPI_processed;
}
/* Set the ROW_COUNT and the global FOUND variable appropriately. */
estate->eval_processed = n;
exec_set_found(estate, n != 0);
return PLPGSQL_RC_OK;
}
/* ----------
* exec_stmt_close Close a cursor
* ----------
*/
static int
exec_stmt_close(PLpgSQL_execstate *estate, PLpgSQL_stmt_close *stmt)
{
PLpgSQL_var *curvar;
Portal portal;
char *curname;
MemoryContext oldcontext;
/* ----------
* Get the portal of the cursor by name
* ----------
*/
curvar = (PLpgSQL_var *) (estate->datums[stmt->curvar]);
if (curvar->isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("cursor variable \"%s\" is null", curvar->refname)));
/* Use eval_mcontext for short-lived string */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
curname = TextDatumGetCString(curvar->value);
MemoryContextSwitchTo(oldcontext);
portal = SPI_cursor_find(curname);
if (portal == NULL)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_CURSOR),
errmsg("cursor \"%s\" does not exist", curname)));
/* ----------
* And close it.
* ----------
*/
SPI_cursor_close(portal);
return PLPGSQL_RC_OK;
}
/*
 * exec_stmt_commit
 *
 * Commit the transaction.
 */
static int
exec_stmt_commit(PLpgSQL_execstate *estate, PLpgSQL_stmt_commit *stmt)
{
	if (stmt->chain)
		SPI_commit_and_chain();
	else
	{
		SPI_commit();
		SPI_start_transaction();
	}

	/*
	 * We need to build new simple-expression infrastructure, since the old
	 * data structures are gone.
	 */
	estate->simple_eval_estate = NULL;
	estate->simple_eval_resowner = NULL;
	plpgsql_create_econtext(estate);

	return PLPGSQL_RC_OK;
}
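
/*
 * Illustrative usage (a sketch based on the statement semantics, not part
 * of the original file): in a procedure running in a nonatomic context,
 *
 *		CREATE PROCEDURE p() LANGUAGE plpgsql AS $$
 *		BEGIN
 *			INSERT INTO t VALUES (1);
 *			COMMIT;				-- handled by exec_stmt_commit
 *			INSERT INTO t VALUES (2);
 *			COMMIT AND CHAIN;	-- takes the SPI_commit_and_chain() path
 *		END $$;
 *
 * "p" and "t" are hypothetical names used only for illustration.
 */
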
/*
 * exec_stmt_rollback
 *
 * Abort the transaction.
 */
static int
exec_stmt_rollback(PLpgSQL_execstate *estate, PLpgSQL_stmt_rollback *stmt)
{
	if (stmt->chain)
		SPI_rollback_and_chain();
	else
	{
		SPI_rollback();
		SPI_start_transaction();
	}

	/*
	 * We need to build new simple-expression infrastructure, since the old
	 * data structures are gone.
	 */
	estate->simple_eval_estate = NULL;
	estate->simple_eval_resowner = NULL;
	plpgsql_create_econtext(estate);

	return PLPGSQL_RC_OK;
}
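
/*
 * Illustrative usage (a sketch, not part of the original file): ROLLBACK
 * mirrors COMMIT above,
 *
 *		ROLLBACK;			-- abort, then start a fresh transaction
 *		ROLLBACK AND CHAIN;	-- abort, starting a new transaction with the
 *							-- same transaction characteristics
 *
 * and either way the simple-expression infrastructure is rebuilt, since
 * the old transaction's data structures are gone.
 */
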
/* ----------
 * exec_assign_expr			Put an expression's result into a variable.
 * ----------
 */
static void
exec_assign_expr(PLpgSQL_execstate *estate, PLpgSQL_datum *target,
				 PLpgSQL_expr *expr)
{
	Datum		value;
	bool		isnull;
	Oid			valtype;
	int32		valtypmod;

Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
	/*
	 * If first time through, create a plan for this expression.
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
	 */
	if (expr->plan == NULL)
	{
		/*
		 * Mark the expression as being an assignment source, if target is a
		 * simple variable.  (This is a bit messy, but it seems cleaner than
		 * modifying the API of exec_prepare_plan for the purpose.  We need to
		 * stash the target dno into the expr anyway, so that it will be
		 * available if we have to replan.)
		 */
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
		if (target->dtype == PLPGSQL_DTYPE_VAR)
			expr->target_param = target->dno;
		else
			expr->target_param = -1;	/* should be that already */

		exec_prepare_plan(estate, expr, 0);
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
	}

	value = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod);
	exec_assign_value(estate, target, value, isnull, valtype, valtypmod);
	exec_eval_cleanup(estate);
}
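
/*
 * Illustrative flow (a sketch, not part of the original file): a PL/pgSQL
 * assignment such as
 *
 *		x := y + 1;
 *
 * arrives here with "x" as target and "y + 1" as expr ("x" and "y" being
 * hypothetical variables).  The first execution plans the expression;
 * every execution evaluates it, stores the result into the target, and
 * then releases short-lived evaluation memory.
 */
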
/* ----------
 * exec_assign_c_string		Put a C string into a text variable.
 *
 * We take a NULL pointer as signifying empty string, not SQL null.
 *
 * As with the underlying exec_assign_value, caller is expected to do
 * exec_eval_cleanup later.
 * ----------
 */
static void
exec_assign_c_string(PLpgSQL_execstate *estate, PLpgSQL_datum *target,
					 const char *str)
{
	text	   *value;
	MemoryContext oldcontext;

	/* Use eval_mcontext for short-lived text value */
	oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
	if (str != NULL)
		value = cstring_to_text(str);
	else
		value = cstring_to_text("");

	MemoryContextSwitchTo(oldcontext);

	exec_assign_value(estate, target, PointerGetDatum(value), false,
					  TEXTOID, -1);
}
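
/*
 * Illustrative call (a sketch, not from the original file): a caller
 * holding an ordinary C string, say a diagnostic message, can store it
 * into a text-typed datum with
 *
 *		exec_assign_c_string(estate, target, "message text");
 *
 * leaving cleanup of the short-lived text copy to a later
 * exec_eval_cleanup().
 */
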
/* ----------
 * exec_assign_value			Put a value into a target datum
 *
 * Note: in some code paths, this will leak memory in the eval_mcontext;
 * we assume that will be cleaned up later by exec_eval_cleanup.  We cannot
 * call exec_eval_cleanup here for fear of destroying the input Datum value.
 * ----------
 */
static void
exec_assign_value(PLpgSQL_execstate *estate,
				  PLpgSQL_datum *target,
				  Datum value, bool isNull,
				  Oid valtype, int32 valtypmod)
{
	switch (target->dtype)
	{
		case PLPGSQL_DTYPE_VAR:
		case PLPGSQL_DTYPE_PROMISE:
			{
				/*
				 * Target is a variable
				 */
				PLpgSQL_var *var = (PLpgSQL_var *) target;
				Datum		newvalue;

				newvalue = exec_cast_value(estate,
										   value,
										   &isNull,
										   valtype,
										   valtypmod,
										   var->datatype->typoid,
										   var->datatype->atttypmod);

				if (isNull && var->notnull)
					ereport(ERROR,
							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("null value cannot be assigned to variable \"%s\" declared NOT NULL",
var->refname)));
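				/*
				 * For example (illustrative, not from the original file):
				 * a variable declared "n int NOT NULL := 0" reaches this
				 * error if the function later executes "n := NULL".
				 */
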
				/*
				 * If type is by-reference, copy the new value (which is
				 * probably in the eval_mcontext) into the procedure's main
				 * memory context.  But if it's a read/write reference to an
				 * expanded object, no physical copy needs to happen; at most
				 * we need to reparent the object's memory context.
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
				 *
				 * If it's an array, we force the value to be stored in R/W
				 * expanded form.  This wins if the function later does, say,
				 * a lot of array subscripting operations on the variable, and
				 * otherwise might lose.  We might need to use a different
				 * heuristic, but it's too soon to tell.  Also, are there
				 * cases where it'd be useful to force non-array values into
				 * expanded form?
				 */
				if (!var->datatype->typbyval && !isNull)
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
				{
					if (var->datatype->typisarray &&
						!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue)))
					{
						/* array and not already R/W, so apply expand_array */
						newvalue = expand_array(newvalue,
												estate->datum_context,
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
												NULL);
					}
					else
					{
						/* else transfer value if R/W, else just datumCopy */
						newvalue = datumTransfer(newvalue,
												 false,
												 var->datatype->typlen);
					}
				}

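				/*
				 * Illustrative payoff (a sketch, not from the original
				 * file): with the value kept in R/W expanded form, a loop
				 * such as
				 *
				 *		FOR i IN 1..n LOOP
				 *			arr[i] := i;	-- "arr": hypothetical array var
				 *		END LOOP;
				 *
				 * can update elements in place rather than flattening and
				 * copying the whole array on each assignment.
				 */
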
				/*
				 * Now free the old value, if any, and assign the new one. But
				 * skip the assignment if old and new values are the same.
				 * Note that for expanded objects, this test is necessary and
				 * cannot reliably be made any earlier; we have to be looking
				 * at the object's standard R/W pointer to be sure pointer
				 * equality is meaningful.
				 *
				 * Also, if it's a promise variable, we should disarm the
				 * promise in any case --- otherwise, assigning null to an
				 * armed promise variable would fail to disarm the promise.
				 */
if (var->value != newvalue || var->isnull || isNull)
assign_simple_var(estate, var, newvalue, isNull,
(!var->datatype->typbyval && !isNull));
else
var->promise = PLPGSQL_PROMISE_NONE;
break;
}
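/*
 * Illustrative example (hypothetical PL/pgSQL, not exhaustive): a plain
 * scalar assignment such as
 *     n := n + 1;
 * reaches this DTYPE_VAR path, as does an array-element update like
 *     a[i] := x;
 * where the array value is normally held as a read-write expanded
 * object, so repeated element updates need not recopy the whole array.
 */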
case PLPGSQL_DTYPE_ROW:
{
/*
* Target is a row variable
*/
PLpgSQL_row *row = (PLpgSQL_row *) target;
if (isNull)
{
/* If source is null, just assign nulls to the row */
exec_move_row(estate, (PLpgSQL_variable *) row,
NULL, NULL);
}
else
{
/* Source must be of RECORD or composite type */
if (!type_is_rowtype(valtype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("cannot assign non-composite value to a row variable")));
exec_move_row_from_datum(estate, (PLpgSQL_variable *) row,
value);
}
break;
}
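/*
 * Illustrative note: DTYPE_ROW targets are fixed-at-compile-time lists
 * of variables, for instance the implicit row representing a function's
 * multiple OUT parameters. Assigning a composite value to such a target
 * lands here and is spread across the member variables by
 * exec_move_row_from_datum().
 */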
case PLPGSQL_DTYPE_REC:
{
/*
* Target is a record variable
*/
PLpgSQL_rec *rec = (PLpgSQL_rec *) target;
if (isNull)
{
if (rec->notnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("null value cannot be assigned to variable \"%s\" declared NOT NULL",
rec->refname)));
/* Set variable to a simple NULL */
exec_move_row(estate, (PLpgSQL_variable *) rec,
NULL, NULL);
}
else
{
/* Source must be of RECORD or composite type */
if (!type_is_rowtype(valtype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("cannot assign non-composite value to a record variable")));
exec_move_row_from_datum(estate, (PLpgSQL_variable *) rec,
value);
}
break;
}
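/*
 * Illustrative example (hypothetical PL/pgSQL): a whole-record
 * assignment such as
 *     r := ROW(1, 'x');
 * takes this DTYPE_REC path, while
 *     r := NULL;
 * exercises the NOT NULL check and the set-to-NULL branch above.
 */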
case PLPGSQL_DTYPE_RECFIELD:
{
/*
* Target is a field of a record
*/
PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) target;
PLpgSQL_rec *rec;
ExpandedRecordHeader *erh;
rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
erh = rec->erh;
/*
* If record variable is NULL, instantiate it if it has a
* named composite type, else complain. (This won't change
* the logical state of the record, but if we successfully
* assign below, the unassigned fields will all become NULLs.)
*/
if (erh == NULL)
{
instantiate_empty_record_variable(estate, rec);
erh = rec->erh;
}
/*
* Look up the field's properties if we have not already, or
* if the tuple descriptor ID changed since last time.
*/
if (unlikely(recfield->rectupledescid != erh->er_tupdesc_id))
{
if (!expanded_record_lookup_field(erh,
recfield->fieldname,
&recfield->finfo))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("record \"%s\" has no field \"%s\"",
rec->refname, recfield->fieldname)));
recfield->rectupledescid = erh->er_tupdesc_id;
}
/* We don't support assignments to system columns. */
if (recfield->finfo.fnumber <= 0)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot assign to system column \"%s\"",
recfield->fieldname)));
/* Cast the new value to the right type, if needed. */
value = exec_cast_value(estate,
value,
&isNull,
valtype,
valtypmod,
recfield->finfo.ftypeid,
recfield->finfo.ftypmod);
/* And assign it. */
expanded_record_set_field(erh, recfield->finfo.fnumber,
value, isNull, !estate->atomic);
break;
}
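/*
 * Illustrative example (hypothetical PL/pgSQL): a field assignment
 * such as
 *     r.f := expr;
 * arrives here; the value is cast to the field's declared type by
 * exec_cast_value() and stored directly into the expanded record,
 * without flattening the record into a heap tuple.
 */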
default:
elog(ERROR, "unrecognized dtype: %d", target->dtype);
}
}
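#ifdef NOT_USED
/*
 * Minimal sketch (not part of pl_exec.c) of the expand-vs-transfer
 * decision used in the DTYPE_VAR case of exec_assign_value() above.
 * It assumes the declarations from utils/array.h and utils/datum.h are
 * in scope; "coerce_to_var_storage" is a hypothetical helper name.
 */
static Datum
coerce_to_var_storage(Datum newvalue, bool isNull, bool typbyval,
					  int16 typlen, bool typisarray, MemoryContext cxt)
{
	if (typbyval || isNull)
		return newvalue;		/* pass-by-value or NULL: nothing to copy */

	if (typisarray &&
		!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue)))
		return expand_array(newvalue, cxt, NULL);	/* force expanded form */

	/* already a R/W expanded object, or not an array: transfer or copy */
	return datumTransfer(newvalue, false, typlen);
}
#endif							/* NOT_USED */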
/*
* exec_eval_datum			Get current value of a PLpgSQL_datum
*
* The type oid, typmod, value in Datum format, and null flag are returned.
*
* At present this doesn't handle PLpgSQL_expr datums; that's not needed
* because we never pass references to such datums to SPI.
*
* NOTE: the returned Datum points right at the stored value in the case of
* pass-by-reference datatypes. Generally callers should take care not to
* modify the stored value. Some callers intentionally manipulate variables
* referenced by R/W expanded pointers, though; it is those callers'
* responsibility that the results are semantically OK.
*
* In some cases we have to palloc a return value, and in such cases we put
* it into the estate's eval_mcontext.
*/
static void
exec_eval_datum(PLpgSQL_execstate *estate,
PLpgSQL_datum *datum,
Oid *typeid,
int32 *typetypmod,
Datum *value,
bool *isnull)
{
MemoryContext oldcontext;
switch (datum->dtype)
{
case PLPGSQL_DTYPE_PROMISE:
/* fulfill promise if needed, then handle like regular var */
plpgsql_fulfill_promise(estate, (PLpgSQL_var *) datum);
/* FALL THRU */
case PLPGSQL_DTYPE_VAR:
{
PLpgSQL_var *var = (PLpgSQL_var *) datum;
*typeid = var->datatype->typoid;
*typetypmod = var->datatype->atttypmod;
*value = var->value;
*isnull = var->isnull;
break;
}
case PLPGSQL_DTYPE_ROW:
{
PLpgSQL_row *row = (PLpgSQL_row *) datum;
HeapTuple tup;
/* We get here if there are multiple OUT parameters */
if (!row->rowtupdesc) /* should not happen */
elog(ERROR, "row variable has no tupdesc");
/* Make sure we have a valid type/typmod setting */
BlessTupleDesc(row->rowtupdesc);
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
tup = make_tuple_from_row(estate, row, row->rowtupdesc);
if (tup == NULL) /* should not happen */
elog(ERROR, "row not compatible with its own tupdesc");
*typeid = row->rowtupdesc->tdtypeid;
*typetypmod = row->rowtupdesc->tdtypmod;
*value = HeapTupleGetDatum(tup);
*isnull = false;
MemoryContextSwitchTo(oldcontext);
break;
}
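/*
 * Illustrative note: the tuple built here is palloc'd in the
 * eval_mcontext (see the function's header comment), so callers need
 * not free it; it goes away when that evaluation context is reset.
 */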
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) datum;
if (rec->erh == NULL)
{
/* Treat uninstantiated record as a simple NULL */
*value = (Datum) 0;
*isnull = true;
/* Report variable's declared type */
*typeid = rec->rectypeid;
*typetypmod = -1;
}
else
{
if (ExpandedRecordIsEmpty(rec->erh))
{
/* Empty record is also a NULL */
*value = (Datum) 0;
*isnull = true;
}
else
{
*value = ExpandedRecordGetDatum(rec->erh);
*isnull = false;
}
if (rec->rectypeid != RECORDOID)
{
/* Report variable's declared type, if not RECORD */
*typeid = rec->rectypeid;
*typetypmod = -1;
}
else
{
/* Report record's actual type if declared RECORD */
*typeid = rec->erh->er_typeid;
*typetypmod = rec->erh->er_typmod;
}
}
break;
}
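/*
 * Illustrative example (hypothetical PL/pgSQL): referencing a record
 * variable as a whole, e.g.
 *     IF r IS NULL THEN ...
 * comes through here. A variable declared RECORD reports the actual
 * type of its current value, while one of a named composite type
 * reports its declared type.
 */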
case PLPGSQL_DTYPE_RECFIELD:
{
PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
PLpgSQL_rec *rec;
ExpandedRecordHeader *erh;
rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
erh = rec->erh;
/*
* If record variable is NULL, instantiate it if it has a
* named composite type, else complain. (This won't change
* the logical state of the record: it's still NULL.)
*/
if (erh == NULL)
{
instantiate_empty_record_variable(estate, rec);
erh = rec->erh;
}
/*
* Look up the field's properties if we have not already, or
* if the tuple descriptor ID changed since last time.
*/
if (unlikely(recfield->rectupledescid != erh->er_tupdesc_id))
{
if (!expanded_record_lookup_field(erh,
recfield->fieldname,
&recfield->finfo))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("record \"%s\" has no field \"%s\"",
rec->refname, recfield->fieldname)));
recfield->rectupledescid = erh->er_tupdesc_id;
}
/* Report type data. */
*typeid = recfield->finfo.ftypeid;
*typetypmod = recfield->finfo.ftypmod;
/* And fetch the field value. */
*value = expanded_record_get_field(erh,
recfield->finfo.fnumber,
isnull);
break;
}
default:
elog(ERROR, "unrecognized dtype: %d", datum->dtype);
}
}
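/*
 * Illustration (comment only, not compiled): the er_tupdesc_id test used
 * in the RECFIELD case above is the general idiom for caching state that
 * depends on a record's tuple descriptor.  A sketch of the pattern, with
 * "cache" as a hypothetical stand-in for the recfield's cached fields:
 *
 *		if (unlikely(cache->tupdesc_id != erh->er_tupdesc_id))
 *		{
 *			... recompute any tupdesc-dependent state ...
 *			cache->tupdesc_id = erh->er_tupdesc_id;
 *		}
 *
 * Since DDL on a composite type gives it a fresh tuple descriptor
 * identifier, stale cached state is detected by one integer comparison.
 */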
/*
* plpgsql_exec_get_datum_type Get datatype of a PLpgSQL_datum
*
* This is the same logic as in exec_eval_datum, but we skip acquiring
* the actual value of the variable. Also, needn't support DTYPE_ROW.
*/
Oid
plpgsql_exec_get_datum_type(PLpgSQL_execstate *estate,
PLpgSQL_datum *datum)
{
Oid typeid;
switch (datum->dtype)
{
case PLPGSQL_DTYPE_VAR:
case PLPGSQL_DTYPE_PROMISE:
{
PLpgSQL_var *var = (PLpgSQL_var *) datum;
typeid = var->datatype->typoid;
break;
}
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) datum;
if (rec->erh == NULL || rec->rectypeid != RECORDOID)
{
/* Report variable's declared type */
typeid = rec->rectypeid;
}
else
{
/* Report record's actual type if declared RECORD */
typeid = rec->erh->er_typeid;
}
break;
}
case PLPGSQL_DTYPE_RECFIELD:
{
PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
PLpgSQL_rec *rec;
rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
/*
* If record variable is NULL, instantiate it if it has a
* named composite type, else complain. (This won't change
* the logical state of the record: it's still NULL.)
*/
if (rec->erh == NULL)
instantiate_empty_record_variable(estate, rec);
/*
* Look up the field's properties if we have not already, or
* if the tuple descriptor ID changed since last time.
*/
if (unlikely(recfield->rectupledescid != rec->erh->er_tupdesc_id))
{
if (!expanded_record_lookup_field(rec->erh,
recfield->fieldname,
&recfield->finfo))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("record \"%s\" has no field \"%s\"",
rec->refname, recfield->fieldname)));
recfield->rectupledescid = rec->erh->er_tupdesc_id;
}
typeid = recfield->finfo.ftypeid;
break;
}
default:
elog(ERROR, "unrecognized dtype: %d", datum->dtype);
typeid = InvalidOid; /* keep compiler quiet */
break;
}
return typeid;
}
/*
* plpgsql_exec_get_datum_type_info Get datatype etc of a PLpgSQL_datum
*
* An extended version of plpgsql_exec_get_datum_type, which also retrieves the
* typmod and collation of the datum. Note however that we don't report the
* possibly-mutable typmod of RECORD values, but say -1 always.
*/
void
plpgsql_exec_get_datum_type_info(PLpgSQL_execstate *estate,
PLpgSQL_datum *datum,
Oid *typeId, int32 *typMod, Oid *collation)
{
switch (datum->dtype)
{
case PLPGSQL_DTYPE_VAR:
case PLPGSQL_DTYPE_PROMISE:
{
PLpgSQL_var *var = (PLpgSQL_var *) datum;
*typeId = var->datatype->typoid;
*typMod = var->datatype->atttypmod;
*collation = var->datatype->collation;
break;
}
case PLPGSQL_DTYPE_REC:
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) datum;
if (rec->erh == NULL || rec->rectypeid != RECORDOID)
{
/* Report variable's declared type */
*typeId = rec->rectypeid;
*typMod = -1;
}
else
{
/* Report record's actual type if declared RECORD */
*typeId = rec->erh->er_typeid;
/* do NOT return the mutable typmod of a RECORD variable */
*typMod = -1;
}
/* composite types are never collatable */
*collation = InvalidOid;
break;
}
case PLPGSQL_DTYPE_RECFIELD:
{
PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
PLpgSQL_rec *rec;
rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
/*
* If record variable is NULL, instantiate it if it has a
* named composite type, else complain. (This won't change
* the logical state of the record: it's still NULL.)
*/
if (rec->erh == NULL)
instantiate_empty_record_variable(estate, rec);
/*
* Look up the field's properties if we have not already, or
* if the tuple descriptor ID changed since last time.
*/
if (unlikely(recfield->rectupledescid != rec->erh->er_tupdesc_id))
{
if (!expanded_record_lookup_field(rec->erh,
recfield->fieldname,
&recfield->finfo))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("record \"%s\" has no field \"%s\"",
rec->refname, recfield->fieldname)));
recfield->rectupledescid = rec->erh->er_tupdesc_id;
}
*typeId = recfield->finfo.ftypeid;
*typMod = recfield->finfo.ftypmod;
*collation = recfield->finfo.fcollation;
break;
}
default:
elog(ERROR, "unrecognized dtype: %d", datum->dtype);
*typeId = InvalidOid; /* keep compiler quiet */
*typMod = -1;
*collation = InvalidOid;
break;
}
}
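/*
 * A minimal caller sketch for the two lookup routines above (illustration
 * only; describe_datum is a hypothetical helper, not part of plpgsql):
 *
 *		static void
 *		describe_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum)
 *		{
 *			Oid			typeId;
 *			int32		typMod;
 *			Oid			collation;
 *
 *			plpgsql_exec_get_datum_type_info(estate, datum,
 *											 &typeId, &typMod, &collation);
 *			elog(DEBUG1, "type %u, typmod %d, collation %u",
 *				 typeId, typMod, collation);
 *		}
 *
 * Remember that a RECORD variable always reports typmod -1 here, even if
 * its current value carries a registered record typmod.
 */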
/* ----------
* exec_eval_integer Evaluate an expression, coerce result to int4
*
* Note we do not do exec_eval_cleanup here; the caller must do it at
* some later point. (We do this because the caller may be holding the
* results of other, pass-by-reference, expression evaluations, such as
* an array value to be subscripted.)
* ----------
*/
static int
exec_eval_integer(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull)
{
Datum exprdatum;
Oid exprtypeid;
int32 exprtypmod;
exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid, &exprtypmod);
exprdatum = exec_cast_value(estate, exprdatum, isNull,
exprtypeid, exprtypmod,
INT4OID, -1);
return DatumGetInt32(exprdatum);
}
/* ----------
* exec_eval_boolean Evaluate an expression, coerce result to bool
*
* Note we do not do exec_eval_cleanup here; the caller must do it at
* some later point.
* ----------
*/
static bool
exec_eval_boolean(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull)
{
Datum exprdatum;
Oid exprtypeid;
int32 exprtypmod;
exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid, &exprtypmod);
exprdatum = exec_cast_value(estate, exprdatum, isNull,
exprtypeid, exprtypmod,
BOOLOID, -1);
return DatumGetBool(exprdatum);
}
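/*
 * Typical call pattern for the two helpers above (sketch only; "cond"
 * stands for a PLpgSQL_expr owned by the caller):
 *
 *		bool		isnull;
 *		bool		value;
 *
 *		value = exec_eval_boolean(estate, cond, &isnull);
 *		... use value/isnull while the result storage is still live ...
 *		exec_eval_cleanup(estate);
 *
 * Deferring exec_eval_cleanup is what allows a caller to hold several
 * pass-by-reference evaluation results at once, per the comments above.
 */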
/* ----------
* exec_eval_expr Evaluate an expression and return
* the result Datum, along with data type/typmod.
*
* NOTE: caller must do exec_eval_cleanup when done with the Datum.
* ----------
*/
static Datum
exec_eval_expr(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
bool *isNull,
Oid *rettype,
int32 *rettypmod)
{
Datum result = 0;
int rc;
Form_pg_attribute attr;
/*
* If first time through, create a plan for this expression.
*/
if (expr->plan == NULL)
exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK);
/*
* If this is a simple expression, bypass SPI and use the executor
* directly
*/
if (exec_eval_simple_expr(estate, expr,
&result, isNull, rettype, rettypmod))
return result;
/*
* Else do it the hard way via exec_run_select
*/
rc = exec_run_select(estate, expr, 2, NULL);
if (rc != SPI_OK_SELECT)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("query did not return data"),
errcontext("query: %s", expr->query)));
/*
* Check that the expression returns exactly one column...
*/
if (estate->eval_tuptable->tupdesc->natts != 1)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg_plural("query returned %d column",
"query returned %d columns",
estate->eval_tuptable->tupdesc->natts,
estate->eval_tuptable->tupdesc->natts),
errcontext("query: %s", expr->query)));
/*
* ... and get the column's datatype.
*/
attr = TupleDescAttr(estate->eval_tuptable->tupdesc, 0);
*rettype = attr->atttypid;
*rettypmod = attr->atttypmod;
/*
* If there are no rows selected, the result is a NULL of that type.
*/
if (estate->eval_processed == 0)
{
*isNull = true;
return (Datum) 0;
}
/*
* Check that the expression returned no more than one row.
*/
if (estate->eval_processed != 1)
ereport(ERROR,
(errcode(ERRCODE_CARDINALITY_VIOLATION),
errmsg("query returned more than one row"),
errcontext("query: %s", expr->query)));
/*
* Return the single result Datum.
*/
return SPI_getbinval(estate->eval_tuptable->vals[0],
estate->eval_tuptable->tupdesc, 1, isNull);
}
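/*
 * Note on the flow above (illustration): an expression like "n + 1"
 * normally satisfies exec_eval_simple_expr and never reaches
 * exec_run_select, whereas something like "(SELECT count(*) FROM t)"
 * takes the SPI path and is then checked to produce exactly one column
 * and at most one row.
 */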
/* ----------
* exec_run_select Execute a select query
* ----------
*/
static int
exec_run_select(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr, long maxtuples, Portal *portalP)
{
ParamListInfo paramLI;
int rc;
/*
* On the first call for this expression generate the plan.
*
* If we don't need to return a portal, then we're just going to execute
* the query immediately, which means it's OK to use a parallel plan, even
* if the number of rows being fetched is limited. If we do need to
* return a portal (i.e., this is for a FOR loop), the user's code might
* invoke additional operations inside the FOR loop, making parallel query
* unsafe. In any case, we don't expect any cursor operations to be done,
* so specify NO_SCROLL for efficiency and semantic safety.
*/
if (expr->plan == NULL)
{
int cursorOptions = CURSOR_OPT_NO_SCROLL;
if (portalP == NULL)
cursorOptions |= CURSOR_OPT_PARALLEL_OK;
exec_prepare_plan(estate, expr, cursorOptions);
}
/*
* Set up ParamListInfo to pass to executor
*/
paramLI = setup_param_list(estate, expr);
/*
* If a portal was requested, put the query and paramlist into the portal
*/
if (portalP != NULL)
{
*portalP = SPI_cursor_open_with_paramlist(NULL, expr->plan,
paramLI,
estate->readonly_func);
if (*portalP == NULL)
elog(ERROR, "could not open implicit cursor for query \"%s\": %s",
expr->query, SPI_result_code_string(SPI_result));
exec_eval_cleanup(estate);
return SPI_OK_CURSOR;
}
/*
* Execute the query
*/
rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI,
estate->readonly_func, maxtuples);
if (rc != SPI_OK_SELECT)
{
/*
* SELECT INTO deserves a special error message, because "query is not
* a SELECT" is not very helpful in that case.
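		 * (E.g., "PERFORM col INTO v FROM tab" reaches here as a SQL
		 * SELECT INTO, where the user presumably wanted PL/pgSQL's
		 * INTO clause instead.)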
*/
if (rc == SPI_OK_SELINTO)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("query is SELECT INTO, but it should be plain SELECT"),
errcontext("query: %s", expr->query)));
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("query is not a SELECT"),
errcontext("query: %s", expr->query)));
}
/* Save query results for eventual cleanup */
Assert(estate->eval_tuptable == NULL);
estate->eval_tuptable = SPI_tuptable;
estate->eval_processed = SPI_processed;
return rc;
}
/*
* exec_for_query --- execute body of FOR loop for each row from a portal
*
* Used by exec_stmt_fors, exec_stmt_forc and exec_stmt_dynfors
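 *
 * For example, each of
 *     FOR r IN SELECT * FROM tab LOOP ... END LOOP;    -- exec_stmt_fors
 *     FOR r IN curs LOOP ... END LOOP;                 -- exec_stmt_forc
 *     FOR r IN EXECUTE sqltext LOOP ... END LOOP;      -- exec_stmt_dynfors
 * funnels into this function once its portal has been opened.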
*/
static int
exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
Portal portal, bool prefetch_ok)
{
PLpgSQL_variable *var;
SPITupleTable *tuptab;
bool found = false;
int rc = PLPGSQL_RC_OK;
uint64 previous_id = INVALID_TUPLEDESC_IDENTIFIER;
bool tupdescs_match = true;
uint64 n;
/* Fetch loop variable's datum entry */
var = (PLpgSQL_variable *) estate->datums[stmt->var->dno];
/*
* Make sure the portal doesn't get closed by the user statements we
* execute.
*/
PinPortal(portal);
/*
* In a non-atomic context, we dare not prefetch, even if it would
* otherwise be safe. Aside from any semantic hazards that that might
* create, if we prefetch toasted data and then the user commits the
* transaction, the toast references could turn into dangling pointers.
* (Rows we haven't yet fetched from the cursor are safe, because the
* PersistHoldablePortal mechanism handles this scenario.)
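	 *
	 * E.g., a procedure looping
	 *     FOR r IN SELECT big_col FROM tab LOOP ... COMMIT; ... END LOOP;
	 * must not be left holding prefetched toast pointers across the COMMIT.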
*/
if (!estate->atomic)
prefetch_ok = false;
/*
* Fetch the initial tuple(s). If prefetching is allowed then we grab a
* few more rows to avoid multiple trips through executor startup
* overhead.
*/
SPI_cursor_fetch(portal, true, prefetch_ok ? 10 : 1);
tuptab = SPI_tuptable;
n = SPI_processed;
/*
* If the query didn't return any rows, set the target to NULL and fall
* through with found = false.
*/
if (n == 0)
{
exec_move_row(estate, var, NULL, tuptab->tupdesc);
exec_eval_cleanup(estate);
}
else
found = true; /* processed at least one tuple */
/*
* Now do the loop
*/
while (n > 0)
{
uint64 i;
for (i = 0; i < n; i++)
{
/*
* Assign the tuple to the target. Here, because we know that all
* loop iterations should be assigning the same tupdesc, we can
* optimize away repeated creations of expanded records with
* identical tupdescs. Testing for changes of er_tupdesc_id is
* reliable even if the loop body contains assignments that
* replace the target's value entirely, because it's assigned from
* a process-global counter. The case where the tupdescs don't
* match could possibly be handled more efficiently than this
* coding does, but it's not clear extra effort is worthwhile.
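			 *
			 * E.g., in FOR r IN SELECT a, b FROM tab LOOP, every row shares
			 * one tupdesc, so after the first iteration we can normally just
			 * swap each new tuple into the existing expanded record.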
*/
if (var->dtype == PLPGSQL_DTYPE_REC)
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) var;
if (rec->erh &&
rec->erh->er_tupdesc_id == previous_id &&
tupdescs_match)
{
/* Only need to assign a new tuple value */
expanded_record_set_tuple(rec->erh, tuptab->vals[i],
true, !estate->atomic);
}
else
{
/*
* First time through, or var's tupdesc changed in loop,
* or we have to do it the hard way because type coercion
* is needed.
*/
exec_move_row(estate, var,
tuptab->vals[i], tuptab->tupdesc);
/*
* Check to see if physical assignment is OK next time.
* Once the tupdesc comparison has failed once, we don't
* bother rechecking in subsequent loop iterations.
*/
if (tupdescs_match)
{
tupdescs_match =
(rec->rectypeid == RECORDOID ||
rec->rectypeid == tuptab->tupdesc->tdtypeid ||
compatible_tupdescs(tuptab->tupdesc,
expanded_record_get_tupdesc(rec->erh)));
}
previous_id = rec->erh->er_tupdesc_id;
}
}
else
exec_move_row(estate, var, tuptab->vals[i], tuptab->tupdesc);
exec_eval_cleanup(estate);
/*
* Execute the statements
*/
rc = exec_stmts(estate, stmt->body);
LOOP_RC_PROCESSING(stmt->label, goto loop_exit);
}
SPI_freetuptable(tuptab);
/*
* Fetch more tuples. If prefetching is allowed, grab 50 at a time.
*/
SPI_cursor_fetch(portal, true, prefetch_ok ? 50 : 1);
tuptab = SPI_tuptable;
n = SPI_processed;
}
loop_exit:
/*
* Release last group of tuples (if any)
*/
SPI_freetuptable(tuptab);
UnpinPortal(portal);
/*
* Set the FOUND variable to indicate the result of executing the loop
* (namely, whether we looped one or more times). This must be set last so
* that it does not interfere with the value of the FOUND variable inside
* the loop processing itself.
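	 *
	 * For example, PL/pgSQL code such as
	 *     FOR r IN SELECT * FROM tab LOOP ... END LOOP;
	 *     IF NOT FOUND THEN RAISE NOTICE 'no rows'; END IF;
	 * relies on FOUND reflecting whether the loop iterated at least once.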
*/
exec_set_found(estate, found);
return rc;
}
/* ----------
* exec_eval_simple_expr - Evaluate a simple expression returning
* a Datum by directly calling ExecEvalExpr().
*
* If successful, store results into *result, *isNull, *rettype, *rettypmod
* and return true. If the expression cannot be handled by simple evaluation,
* return false.
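 *
 * For instance, expressions such as "x + 1" or "x > 0" in assignments and
 * IF conditions are typically "simple" and take this fast path, while any
 * query involving a FROM clause is not simple and never comes through here.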
*
* Because we only store one execution tree for a simple expression, we
* can't handle recursion cases. So, if we see the tree is already busy
* with an evaluation in the current xact, we just return false and let the
* caller run the expression the hard way. (Other alternatives such as
* creating a new tree for a recursive call either introduce memory leaks,
* or add enough bookkeeping to be doubtful wins anyway.) Another case that
* is covered by the expr_simple_in_use test is where a previous execution
* of the tree was aborted by an error: the tree may contain bogus state
* so we dare not re-use it.
*
* It is possible that we'd need to replan a simple expression; for example,
* someone might redefine a SQL function that had been inlined into the simple
* expression. That cannot cause a simple expression to become non-simple (or
* vice versa), but we do have to handle replacing the expression tree.
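 * (Concretely: CREATE OR REPLACE of a SQL function that had been inlined
 * into an expression like "x + f()" invalidates the cached plan, and we
 * must rebuild the expression tree here, though it remains simple.)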
*
 *
* Note: if pass-by-reference, the result is in the eval_mcontext.
* It will be freed when exec_eval_cleanup is done.
* ----------
*/
static bool
exec_eval_simple_expr(PLpgSQL_execstate *estate,
PLpgSQL_expr *expr,
Datum *result,
bool *isNull,
Oid *rettype,
int32 *rettypmod)
{
ExprContext *econtext = estate->eval_econtext;
LocalTransactionId curlxid = MyProc->lxid;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
ParamListInfo paramLI;
void *save_setup_arg;
bool need_snapshot;
MemoryContext oldcontext;
/*
* Forget it if expression wasn't simple before.
*/
if (expr->expr_simple_expr == NULL)
return false;
/*
* If expression is in use in current xact, don't touch it.
*/
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
if (unlikely(expr->expr_simple_in_use) &&
expr->expr_simple_lxid == curlxid)
return false;
/*
* Ensure that there's a portal-level snapshot, in case this simple
* expression is the first thing evaluated after a COMMIT or ROLLBACK.
* We'd have to do this anyway before executing the expression, so we
* might as well do it now to ensure that any possible replanning doesn't
* need to take a new snapshot.
*/
EnsurePortalSnapshotExists();
/*
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* Check to see if the cached plan has been invalidated. If not, and this
* is the first use in the current transaction, save a plan refcount in
* the simple-expression resowner.
*/
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
if (likely(CachedPlanIsSimplyValid(expr->expr_simple_plansource,
expr->expr_simple_plan,
(expr->expr_simple_plan_lxid != curlxid ?
estate->simple_eval_resowner : NULL))))
{
/*
* It's still good, so just remember that we have a refcount on the
* plan in the current transaction. (If we already had one, this
* assignment is a no-op.)
*/
expr->expr_simple_plan_lxid = curlxid;
}
else
{
/* Need to replan */
CachedPlan *cplan;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/*
* If we have a valid refcount on some previous version of the plan,
* release it, so we don't leak plans intra-transaction.
*/
if (expr->expr_simple_plan_lxid == curlxid)
{
ReleaseCachedPlan(expr->expr_simple_plan,
estate->simple_eval_resowner);
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
expr->expr_simple_plan = NULL;
expr->expr_simple_plan_lxid = InvalidLocalTransactionId;
}
/* Do the replanning work in the eval_mcontext */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
cplan = SPI_plan_get_cached_plan(expr->plan);
MemoryContextSwitchTo(oldcontext);
/*
* We can't get a failure here, because the number of
* CachedPlanSources in the SPI plan can't change from what
* exec_simple_check_plan saw; it's a property of the raw parsetree
* generated from the query text.
*/
Assert(cplan != NULL);
/*
* This test probably can't fail either, but if it does, cope by
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* declaring the plan to be non-simple. On success, we'll acquire a
* refcount on the new plan, stored in simple_eval_resowner.
*/
if (CachedPlanAllowsSimpleValidityCheck(expr->expr_simple_plansource,
cplan,
estate->simple_eval_resowner))
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
{
/* Remember that we have the refcount */
expr->expr_simple_plan = cplan;
expr->expr_simple_plan_lxid = curlxid;
}
else
{
/* Release SPI_plan_get_cached_plan's refcount */
ReleaseCachedPlan(cplan, CurrentResourceOwner);
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/* Mark expression as non-simple, and fail */
expr->expr_simple_expr = NULL;
expr->expr_rw_param = NULL;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
return false;
}
/*
* SPI_plan_get_cached_plan acquired a plan refcount stored in the
* active resowner. We don't need that anymore, so release it.
*/
ReleaseCachedPlan(cplan, CurrentResourceOwner);
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/* Extract desired scalar expression from cached plan */
exec_save_simple_expr(expr, cplan);
}
/*
* Pass back previously-determined result type.
*/
*rettype = expr->expr_simple_type;
*rettypmod = expr->expr_simple_typmod;
/*
* Set up ParamListInfo to pass to executor. For safety, save and restore
* estate->paramLI->parserSetupArg around our use of the param list.
*/
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
paramLI = estate->paramLI;
save_setup_arg = paramLI->parserSetupArg;
/*
* We can skip using setup_param_list() in favor of just doing this
* unconditionally, because there's no need for the optimization of
* possibly setting ecxt_param_list_info to NULL; we've already forced use
* of a generic plan.
*/
paramLI->parserSetupArg = (void *) expr;
econtext->ecxt_param_list_info = paramLI;
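/*
 * (Note added for clarity: parserSetupArg is how the parser hooks and
 * plpgsql_param_fetch below discover which expression's paramnos to
 * consult; saving save_setup_arg above protects any evaluation that may
 * already be in progress at an outer call level.)
 */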
/*
* Prepare the expression for execution, if it's not been done already in
* the current transaction. (This will be forced to happen if we called
* exec_save_simple_expr above.)
*/
if (unlikely(expr->expr_simple_lxid != curlxid))
{
oldcontext = MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
expr->expr_simple_state =
ExecInitExprWithParams(expr->expr_simple_expr,
econtext->ecxt_param_list_info);
expr->expr_simple_in_use = false;
expr->expr_simple_lxid = curlxid;
MemoryContextSwitchTo(oldcontext);
}
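/*
 * (Note added for clarity, assuming plpgsql's usual management of
 * simple_eval_estate: that EState is discarded at transaction end, so the
 * lxid test above rebuilds the ExprState at most once per transaction
 * rather than once per evaluation.)
 */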
/*
* We have to do some of the things SPI_execute_plan would do, in
* particular push a new snapshot so that stable functions within the
* expression can see updates made so far by our own function. However,
* we can skip doing that (and just invoke the expression with the same
* snapshot passed to our function) in some cases, which is useful because
* it's quite expensive relative to the cost of a simple expression. We
* can skip it if the expression contains no stable or volatile functions;
* immutable functions shouldn't need to see our updates. Also, if this
* is a read-only function, we haven't made any updates so again it's okay
* to skip.
*/
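/*
 * (Illustrative examples, added: "y * 2" involves only immutable
 * operators, so expr_simple_mutable is false and the caller's snapshot is
 * reused; "clock_timestamp() > t" contains a volatile function, so a
 * fresh snapshot is pushed below.)
 */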
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
need_snapshot = (expr->expr_simple_mutable && !estate->readonly_func);
if (need_snapshot)
{
CommandCounterIncrement();
PushActiveSnapshot(GetTransactionSnapshot());
}
/*
* Mark expression as busy for the duration of the ExecEvalExpr call.
*/
expr->expr_simple_in_use = true;
/*
* Finally we can call the executor to evaluate the expression.
*/
*result = ExecEvalExpr(expr->expr_simple_state,
econtext,
isNull);
/* Assorted cleanup */
expr->expr_simple_in_use = false;
econtext->ecxt_param_list_info = NULL;
paramLI->parserSetupArg = save_setup_arg;
if (need_snapshot)
PopActiveSnapshot();
MemoryContextSwitchTo(oldcontext);
/*
* That's it.
*/
return true;
}
/*
* Create a ParamListInfo to pass to SPI
*
* We use a single ParamListInfo struct for all SPI calls made to evaluate
* PLpgSQL_exprs in this estate. It contains no per-param data, just hook
* functions, so it's effectively read-only for SPI.
*
* The one exception to this read-only status is that the parserSetupArg points
* to the specific PLpgSQL_expr being evaluated. This is not an issue for
* statement-level callers, but lower-level callers must save and restore
* estate->paramLI->parserSetupArg just in case there's an active evaluation
* at an outer call level. (A plausible alternative design would be to
* create a ParamListInfo struct for each PLpgSQL_expr, but for the moment
* that seems like a waste of memory.)
*/
static ParamListInfo
setup_param_list(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
{
ParamListInfo paramLI;
/*
* We must have created the SPIPlan already (hence, query text has been
* parsed/analyzed at least once); else we cannot rely on expr->paramnos.
*/
Assert(expr->plan != NULL);
/*
* We only need a ParamListInfo if the expression has parameters. In
* principle we should test with bms_is_empty(), but we use a not-null
* test because it's faster. In current usage bits are never removed from
* expr->paramnos, only added, so this test is correct anyway.
*/
if (expr->paramnos)
{
/* Use the common ParamListInfo */
paramLI = estate->paramLI;
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
/*
* Set up link to active expr where the hook functions can find it.
* Callers must save and restore parserSetupArg if there is any chance
* that they are interrupting an active use of parameters.
*/
paramLI->parserSetupArg = (void *) expr;
/*
* Also make sure this is set before parser hooks need it. There is
* no need to save and restore, since the value is always correct once
* set. (Should be set already, but let's be sure.)
*/
expr->func = estate->func;
}
else
{
/*
* Expression requires no parameters. Be sure we represent this case
* as a NULL ParamListInfo, so that plancache.c knows there is no
* point in a custom plan.
*/
paramLI = NULL;
}
return paramLI;
}
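/*
 * A minimal usage sketch (added; it mirrors the call pattern used for
 * statement execution elsewhere in this file):
 *
 *		paramLI = setup_param_list(estate, expr);
 *		rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI,
 *											 estate->readonly_func, 0);
 *
 * SPI_execute_plan_with_paramlist() accepts a NULL ParamListInfo, which is
 * why the no-parameters case above can simply return NULL.
 */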
/*
* plpgsql_param_fetch paramFetch callback for dynamic parameter fetch
*
* We always use the caller's workspace to construct the returned struct.
*
* Note: this is no longer used during query execution. It is used during
* planning (with speculative == true) and when the ParamListInfo we supply
* to the executor is copied into a cursor portal or transferred to a
* parallel child process.
*/
static ParamExternData *
plpgsql_param_fetch(ParamListInfo params,
int paramid, bool speculative,
ParamExternData *prm)
{
int dno;
PLpgSQL_execstate *estate;
PLpgSQL_expr *expr;
PLpgSQL_datum *datum;
bool ok = true;
int32 prmtypmod;
/* paramids are 1-based, but dnos are 0-based */
dno = paramid - 1;
Assert(dno >= 0 && dno < params->numParams);
/* fetch back the hook data */
estate = (PLpgSQL_execstate *) params->paramFetchArg;
expr = (PLpgSQL_expr *) params->parserSetupArg;
Assert(params->numParams == estate->ndatums);
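/*
 * (Note added: per the Assert just above, the estate's shared
 * ParamListInfo carries one virtual parameter slot per plpgsql datum, so
 * the 1-based paramid maps directly onto the 0-based dno used to index
 * estate->datums[].)
 */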
/* now we can access the target datum */
datum = estate->datums[dno];
/*
* Since copyParamList() or SerializeParamList() will try to materialize
* every single parameter slot, it's important to return a dummy param
* when asked for a datum that's not supposed to be used by this SQL
* expression. Otherwise we risk failures in exec_eval_datum(), or
* copying a lot more data than necessary.
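* (A sketch of the dummy-param fallback appears after this if/else chain.)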
*/
if (!bms_is_member(dno, expr->paramnos))
ok = false;
/*
* If the access is speculative, we prefer to return no data rather than
* to fail in exec_eval_datum(). Check the likely failure cases.
*/
else if (speculative)
{
switch (datum->dtype)
{
case PLPGSQL_DTYPE_VAR:
case PLPGSQL_DTYPE_PROMISE:
/* always safe */
break;
case PLPGSQL_DTYPE_ROW:
/* should be safe in all interesting cases */
break;
case PLPGSQL_DTYPE_REC:
/* always safe (might return NULL, that's fine) */
break;
case PLPGSQL_DTYPE_RECFIELD:
{
PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
PLpgSQL_rec *rec;

rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
/*
* If record variable is NULL, don't risk anything.
*/
if (rec->erh == NULL)
ok = false;
/*
* Look up the field's properties if we have not already,
* or if the tuple descriptor ID changed since last time.
*/
else if (unlikely(recfield->rectupledescid != rec->erh->er_tupdesc_id))
{
if (expanded_record_lookup_field(rec->erh,
recfield->fieldname,
&recfield->finfo))
recfield->rectupledescid = rec->erh->er_tupdesc_id;
else
ok = false;
}
break;
}
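
/*
* Note: rectupledescid is effectively a cache key for the field lookup
* above.  Per the expanded-record design, er_tupdesc_id changes whenever
* the record acquires a different tuple descriptor (for example after
* DDL on the underlying composite type), so a matching ID means the
* finfo saved by the previous lookup is still valid and the by-name
* field search can be skipped.
*/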
default:
                ok = false;
                break;
        }
    }

    /* Return "no such parameter" if not ok */
    if (!ok)
    {
        prm->value = (Datum) 0;
        prm->isnull = true;
        prm->pflags = 0;
        prm->ptype = InvalidOid;
        return prm;
    }

    /* OK, evaluate the value and store into the return struct */
    exec_eval_datum(estate, datum,
                    &prm->ptype, &prmtypmod,
                    &prm->value, &prm->isnull);

    /* We can always mark params as "const" for executor's purposes */
    prm->pflags = PARAM_FLAG_CONST;

    /*
     * If it's a read/write expanded datum, convert reference to read-only.
     * (There's little point in trying to optimize read/write parameters,
     * given the cases in which this function is used.)
     */
    if (datum->dtype == PLPGSQL_DTYPE_VAR)
        prm->value = MakeExpandedObjectReadOnly(prm->value,
                                                prm->isnull,
                                                ((PLpgSQL_var *) datum)->datatype->typlen);
    else if (datum->dtype == PLPGSQL_DTYPE_REC)
        prm->value = MakeExpandedObjectReadOnly(prm->value,
                                                prm->isnull,
                                                -1);

    return prm;
}
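
/*
 * Illustrative sketch (not part of the original file): with the shared
 * ParamListInfo, this paramFetch hook is reached mainly when a caller such
 * as copyParamList() materializes every parameter slot.  The local names
 * below are hypothetical, but the flow is approximately:
 *
 *      ParamExternData prmdata;
 *      ParamExternData *prm;
 *
 *      if (params->paramFetch != NULL)
 *          prm = params->paramFetch(params, paramid, false, &prmdata);
 *      else
 *          prm = &params->params[paramid - 1];
 *      if (!prm->isnull && OidIsValid(prm->ptype))
 *          ... datumCopy() the value into the destination slot ...
 *
 * A slot returned with ptype == InvalidOid (the "no such parameter" case
 * above) therefore never causes a datum copy.
 */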

/*
 * plpgsql_param_compile        paramCompile callback for plpgsql parameters
 */
static void
plpgsql_param_compile(ParamListInfo params, Param *param,
                      ExprState *state,
                      Datum *resv, bool *resnull)
{
    PLpgSQL_execstate *estate;
    PLpgSQL_expr *expr;
    int         dno;
    PLpgSQL_datum *datum;
    ExprEvalStep scratch;

    /* fetch back the hook data */
    estate = (PLpgSQL_execstate *) params->paramFetchArg;
    expr = (PLpgSQL_expr *) params->parserSetupArg;

    /* paramid's are 1-based, but dnos are 0-based */
    dno = param->paramid - 1;
    Assert(dno >= 0 && dno < estate->ndatums);

    /* now we can access the target datum */
    datum = estate->datums[dno];

    scratch.opcode = EEOP_PARAM_CALLBACK;
    scratch.resvalue = resv;
    scratch.resnull = resnull;

    /*
     * Select appropriate eval function. It seems worth special-casing
     * DTYPE_VAR and DTYPE_RECFIELD for performance. Also, we can determine
     * in advance whether MakeExpandedObjectReadOnly() will be required.
     * Currently, only VAR/PROMISE and REC datums could contain read/write
     * expanded objects.
     */
    if (datum->dtype == PLPGSQL_DTYPE_VAR)
    {
        if (param != expr->expr_rw_param &&
            ((PLpgSQL_var *) datum)->datatype->typlen == -1)
            scratch.d.cparam.paramfunc = plpgsql_param_eval_var_ro;
        else
            scratch.d.cparam.paramfunc = plpgsql_param_eval_var;
    }
    else if (datum->dtype == PLPGSQL_DTYPE_RECFIELD)
        scratch.d.cparam.paramfunc = plpgsql_param_eval_recfield;
    else if (datum->dtype == PLPGSQL_DTYPE_PROMISE)
    {
        if (param != expr->expr_rw_param &&
            ((PLpgSQL_var *) datum)->datatype->typlen == -1)
            scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro;
        else
            scratch.d.cparam.paramfunc = plpgsql_param_eval_generic;
    }
    else if (datum->dtype == PLPGSQL_DTYPE_REC &&
             param != expr->expr_rw_param)
        scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro;
    else
        scratch.d.cparam.paramfunc = plpgsql_param_eval_generic;

    /*
     * Note: it's tempting to use paramarg to store the estate pointer and
     * thereby save an indirection or two in the eval functions. But that
     * doesn't work because the compiled expression might be used with
     * different estates for the same PL/pgSQL function.
     */
    scratch.d.cparam.paramarg = NULL;
    scratch.d.cparam.paramid = param->paramid;
    scratch.d.cparam.paramtype = param->paramtype;

    ExprEvalPushStep(state, &scratch);
}
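
/*
 * Worked example (a sketch, not part of the original file): given
 *
 *      DECLARE n numeric;      -- DTYPE_VAR, typlen -1 (pass-by-reference)
 *      DECLARE i integer;      -- DTYPE_VAR, typlen 4  (fixed-length)
 *
 * an ordinary reference to "n" compiles to plpgsql_param_eval_var_ro,
 * because a varlena variable could hold a read/write expanded object that
 * must be presented read-only.  A reference to "i", or a reference to "n"
 * as the expression's own expr_rw_param (an assignment target allowed to
 * see its old value read/write), compiles to the cheaper
 * plpgsql_param_eval_var instead.
 */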

/*
 * plpgsql_param_eval_var       evaluation of EEOP_PARAM_CALLBACK step
 *
 * This is specialized to the case of DTYPE_VAR variables for which
 * we do not need to invoke MakeExpandedObjectReadOnly.
 */
static void
plpgsql_param_eval_var(ExprState *state, ExprEvalStep *op,
                       ExprContext *econtext)
{
    ParamListInfo params;
    PLpgSQL_execstate *estate;
    int         dno = op->d.cparam.paramid - 1;
    PLpgSQL_var *var;

    /* fetch back the hook data */
    params = econtext->ecxt_param_list_info;
    estate = (PLpgSQL_execstate *) params->paramFetchArg;
    Assert(dno >= 0 && dno < estate->ndatums);

    /* now we can access the target datum */
    var = (PLpgSQL_var *) estate->datums[dno];
    Assert(var->dtype == PLPGSQL_DTYPE_VAR);

    /* inlined version of exec_eval_datum() */
    *op->resvalue = var->value;
    *op->resnull = var->isnull;

    /* safety check -- an assertion should be sufficient */
    Assert(var->datatype->typoid == op->d.cparam.paramtype);
}
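
/*
 * For context (a sketch, not part of the original file): the expression
 * interpreter's EEOP_PARAM_CALLBACK case invokes these callbacks roughly as
 *
 *      op->d.cparam.paramfunc(state, op, econtext);
 *
 * which is why each callback re-derives the estate from
 * econtext->ecxt_param_list_info instead of carrying it in paramarg (see
 * the note in plpgsql_param_compile above).
 */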

/*
 * plpgsql_param_eval_var_ro    evaluation of EEOP_PARAM_CALLBACK step
 *
 * This is specialized to the case of DTYPE_VAR variables for which
 * we need to invoke MakeExpandedObjectReadOnly.
 */
static void
plpgsql_param_eval_var_ro(ExprState *state, ExprEvalStep *op,
                          ExprContext *econtext)
{
    ParamListInfo params;
    PLpgSQL_execstate *estate;
    int         dno = op->d.cparam.paramid - 1;
    PLpgSQL_var *var;

    /* fetch back the hook data */
    params = econtext->ecxt_param_list_info;
    estate = (PLpgSQL_execstate *) params->paramFetchArg;
    Assert(dno >= 0 && dno < estate->ndatums);

    /* now we can access the target datum */
    var = (PLpgSQL_var *) estate->datums[dno];
    Assert(var->dtype == PLPGSQL_DTYPE_VAR);

    /*
     * Inlined version of exec_eval_datum() ... and while we're at it, force
     * expanded datums to read-only.
     */
    *op->resvalue = MakeExpandedObjectReadOnly(var->value,
                                               var->isnull,
                                               -1);
    *op->resnull = var->isnull;

    /* safety check -- an assertion should be sufficient */
    Assert(var->datatype->typoid == op->d.cparam.paramtype);
}
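
/*
 * For reference (a sketch of the macro's effect, not a redefinition):
 * MakeExpandedObjectReadOnly(datum, isnull, typlen) returns the datum
 * unchanged unless typlen is -1, the value is not null, and the datum
 * points to a read/write expanded object; only then is it swapped for a
 * read-only pointer to the same object.  Hard-wiring -1 above is safe
 * because plpgsql_param_compile routes only pass-by-reference (typlen -1)
 * variables to this _ro variant.
 */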

/*
 * plpgsql_param_eval_recfield      evaluation of EEOP_PARAM_CALLBACK step
*
* This is specialized to the case of DTYPE_RECFIELD variables, for which
* we never need to invoke MakeExpandedObjectReadOnly.
*/
static void
plpgsql_param_eval_recfield(ExprState *state, ExprEvalStep *op,
ExprContext *econtext)
{
ParamListInfo params;
PLpgSQL_execstate *estate;
int dno = op->d.cparam.paramid - 1;
PLpgSQL_recfield *recfield;
PLpgSQL_rec *rec;
ExpandedRecordHeader *erh;
/* fetch back the hook data */
params = econtext->ecxt_param_list_info;
estate = (PLpgSQL_execstate *) params->paramFetchArg;
Assert(dno >= 0 && dno < estate->ndatums);
/* now we can access the target datum */
recfield = (PLpgSQL_recfield *) estate->datums[dno];
Assert(recfield->dtype == PLPGSQL_DTYPE_RECFIELD);
/* inline the relevant part of exec_eval_datum */
rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
erh = rec->erh;
/*
* If record variable is NULL, instantiate it if it has a named composite
* type, else complain. (This won't change the logical state of the
* record: it's still NULL.)
*/
if (erh == NULL)
{
instantiate_empty_record_variable(estate, rec);
erh = rec->erh;
}
/*
* Look up the field's properties if we have not already, or if the tuple
* descriptor ID changed since last time.
*/
if (unlikely(recfield->rectupledescid != erh->er_tupdesc_id))
{
if (!expanded_record_lookup_field(erh,
recfield->fieldname,
&recfield->finfo))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("record \"%s\" has no field \"%s\"",
rec->refname, recfield->fieldname)));
recfield->rectupledescid = erh->er_tupdesc_id;
}
/* OK to fetch the field value. */
*op->resvalue = expanded_record_get_field(erh,
recfield->finfo.fnumber,
op->resnull);
/* safety check -- needed for, eg, record fields */
if (unlikely(recfield->finfo.ftypeid != op->d.cparam.paramtype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("type of parameter %d (%s) does not match that when preparing the plan (%s)",
op->d.cparam.paramid,
format_type_be(recfield->finfo.ftypeid),
format_type_be(op->d.cparam.paramtype))));
}
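/*
 * A usage sketch (variable and table names here are purely illustrative):
 * in PL/pgSQL code such as
 *
 *		FOR r IN SELECT a, b FROM tab LOOP
 *			s := s + r.a;
 *		END LOOP;
 *
 * each reference to r.a compiles to an EEOP_PARAM_CALLBACK step that lands
 * in this function.  Once the first call has cached finfo and
 * rectupledescid, the er_tupdesc_id comparison above turns every later
 * fetch of the same field into little more than an array lookup.
 */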
/*
* plpgsql_param_eval_generic evaluation of EEOP_PARAM_CALLBACK step
*
* This handles all variable types, but assumes we do not need to invoke
* MakeExpandedObjectReadOnly.
*/
static void
plpgsql_param_eval_generic(ExprState *state, ExprEvalStep *op,
ExprContext *econtext)
{
ParamListInfo params;
PLpgSQL_execstate *estate;
int dno = op->d.cparam.paramid - 1;
PLpgSQL_datum *datum;
Oid datumtype;
int32 datumtypmod;
/* fetch back the hook data */
params = econtext->ecxt_param_list_info;
estate = (PLpgSQL_execstate *) params->paramFetchArg;
Assert(dno >= 0 && dno < estate->ndatums);
/* now we can access the target datum */
datum = estate->datums[dno];
/* fetch datum's value */
exec_eval_datum(estate, datum,
&datumtype, &datumtypmod,
op->resvalue, op->resnull);
/* safety check -- needed for, eg, record fields */
if (unlikely(datumtype != op->d.cparam.paramtype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("type of parameter %d (%s) does not match that when preparing the plan (%s)",
op->d.cparam.paramid,
format_type_be(datumtype),
format_type_be(op->d.cparam.paramtype))));
}
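/*
 * Unlike the specialized recfield path above, this generic variant pays for
 * a full exec_eval_datum dispatch on every evaluation; it serves as the
 * fallback for datum types that have no dedicated callback.
 */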
/*
* plpgsql_param_eval_generic_ro evaluation of EEOP_PARAM_CALLBACK step
*
* This handles all variable types, but assumes we need to invoke
* MakeExpandedObjectReadOnly (hence, variable must be of a varlena type).
*/
static void
plpgsql_param_eval_generic_ro(ExprState *state, ExprEvalStep *op,
ExprContext *econtext)
{
ParamListInfo params;
PLpgSQL_execstate *estate;
int dno = op->d.cparam.paramid - 1;
PLpgSQL_datum *datum;
Oid datumtype;
int32 datumtypmod;
/* fetch back the hook data */
params = econtext->ecxt_param_list_info;
estate = (PLpgSQL_execstate *) params->paramFetchArg;
Assert(dno >= 0 && dno < estate->ndatums);
/* now we can access the target datum */
datum = estate->datums[dno];
/* fetch datum's value */
exec_eval_datum(estate, datum,
&datumtype, &datumtypmod,
op->resvalue, op->resnull);
/* safety check -- needed for, eg, record fields */
if (unlikely(datumtype != op->d.cparam.paramtype))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("type of parameter %d (%s) does not match that when preparing the plan (%s)",
op->d.cparam.paramid,
format_type_be(datumtype),
format_type_be(op->d.cparam.paramtype))));
/* force the value to read-only */
*op->resvalue = MakeExpandedObjectReadOnly(*op->resvalue,
*op->resnull,
-1);
}
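/*
 * A sketch of what the read-only wrapper buys us, going by the definition
 * of MakeExpandedObjectReadOnly in expandeddatum.h: the macro returns the
 * Datum unchanged unless it is non-null, has varlena typlen (the -1 passed
 * above), and is a read-write expanded-object pointer, in which case a
 * read-only pointer to the same object is substituted.  That keeps
 * functions receiving the parameter from modifying the plpgsql variable's
 * value in place.
 */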
/*
* exec_move_row Move one tuple's values into a record or row
*
* tup and tupdesc may both be NULL if we're just assigning an indeterminate
* composite NULL to the target. Alternatively, can have tup be NULL and
* tupdesc not NULL, in which case we assign a row of NULLs to the target.
*
* Since this uses the mcontext for workspace, caller should eventually call
* exec_eval_cleanup to prevent long-term memory leaks.
*/
static void
exec_move_row(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
HeapTuple tup, TupleDesc tupdesc)
{
ExpandedRecordHeader *newerh = NULL;
/*
* If target is RECORD, we may be able to avoid field-by-field processing.
*/
if (target->dtype == PLPGSQL_DTYPE_REC)
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) target;
/*
* If we have no source tupdesc, just set the record variable to NULL.
* (If we have a source tupdesc but not a tuple, we'll set the
* variable to a row of nulls, instead. This is odd perhaps, but
* backwards compatible.)
*/
if (tupdesc == NULL)
{
if (rec->datatype &&
rec->datatype->typtype == TYPTYPE_DOMAIN)
{
/*
* If it's a composite domain, NULL might not be a legal
* value, so we instead need to make an empty expanded record
* and ensure that domain type checking gets done. If there
* is already an expanded record, piggyback on its lookups.
*/
newerh = make_expanded_record_for_rec(estate, rec,
NULL, rec->erh);
expanded_record_set_tuple(newerh, NULL, false, false);
assign_record_var(estate, rec, newerh);
}
else
{
/* Just clear it to NULL */
if (rec->erh)
DeleteExpandedObject(ExpandedRecordGetDatum(rec->erh));
rec->erh = NULL;
}
return;
}
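		/*
		 * For illustration (these caller behaviors are assumptions drawn
		 * from elsewhere in this file, not enforced here): an assignment
		 * like "r := NULL;" arrives with tup and tupdesc both NULL and
		 * takes the early return above, while a SELECT ... INTO that finds
		 * no row passes a valid tupdesc with a NULL tup and falls through
		 * to the all-nulls handling below.
		 */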
/*
* Build a new expanded record with appropriate tupdesc.
*/
newerh = make_expanded_record_for_rec(estate, rec, tupdesc, NULL);
/*
* If the rowtypes match, or if we have no tuple anyway, we can
* complete the assignment without field-by-field processing.
*
* The tests here are ordered more or less in order of cheapness. We
* can easily detect it will work if the target is declared RECORD or
* has the same typeid as the source. But when assigning from a query
* result, it's common to have a source tupdesc that's labeled RECORD
* but is actually physically compatible with a named-composite-type
* target, so it's worth spending extra cycles to check for that.
*/
if (rec->rectypeid == RECORDOID ||
rec->rectypeid == tupdesc->tdtypeid ||
!HeapTupleIsValid(tup) ||
compatible_tupdescs(tupdesc, expanded_record_get_tupdesc(newerh)))
{
if (!HeapTupleIsValid(tup))
{
/* No data, so force the record into all-nulls state */
deconstruct_expanded_record(newerh);
}
else
{
/* No coercion is needed, so just assign the row value */
expanded_record_set_tuple(newerh, tup, true, !estate->atomic);
}
/* Complete the assignment */
assign_record_var(estate, rec, newerh);
return;
}
}
/*
* Otherwise, deconstruct the tuple and do field-by-field assignment,
* using exec_move_row_from_fields.
*/
if (tupdesc && HeapTupleIsValid(tup))
{
int td_natts = tupdesc->natts;
Datum *values;
bool *nulls;
Datum values_local[64];
bool nulls_local[64];
/*
* Need workspace arrays. If td_natts is small enough, use local
* arrays to save doing a palloc. Even if it's not small, we can
* allocate both the Datum and isnull arrays in one palloc chunk.
*/
if (td_natts <= lengthof(values_local))
{
values = values_local;
nulls = nulls_local;
}
else
{
char *chunk;
chunk = eval_mcontext_alloc(estate,
td_natts * (sizeof(Datum) + sizeof(bool)));
values = (Datum *) chunk;
nulls = (bool *) (chunk + td_natts * sizeof(Datum));
}
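/*
 * Illustrative sketch (not part of the build): the one-chunk idiom above
 * in isolation.  Because Datum has the stricter alignment requirement,
 * laying the Datum array first and the bool array after it needs no
 * padding, so a single allocation serves both arrays:
 *
 *		char   *chunk = palloc(n * (sizeof(Datum) + sizeof(bool)));
 *		Datum  *values = (Datum *) chunk;
 *		bool   *nulls = (bool *) (chunk + n * sizeof(Datum));
 */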
heap_deform_tuple(tup, tupdesc, values, nulls);
exec_move_row_from_fields(estate, target, newerh,
values, nulls, tupdesc);
}
else
{
/*
* Assign all-nulls.
*/
exec_move_row_from_fields(estate, target, newerh,
NULL, NULL, NULL);
}
}
/*
* Verify that a PLpgSQL_rec's rectypeid is up-to-date.
*/
static void
revalidate_rectypeid(PLpgSQL_rec *rec)
{
PLpgSQL_type *typ = rec->datatype;
TypeCacheEntry *typentry;
if (rec->rectypeid == RECORDOID)
return; /* it's RECORD, so nothing to do */
Assert(typ != NULL);
if (typ->tcache &&
typ->tcache->tupDesc_identifier == typ->tupdesc_id)
{
/*
* Although *typ is known up-to-date, it's possible that rectypeid
* isn't, because *rec is cloned during each function startup from a
* copy that we don't have a good way to update. Hence, forcibly fix
* rectypeid before returning.
*/
rec->rectypeid = typ->typoid;
return;
}
/*
* typcache entry has suffered invalidation, so re-look-up the type name
* if possible, and then recheck the type OID. If we don't have a
* TypeName, then we just have to soldier on with the OID we've got.
*/
if (typ->origtypname != NULL)
{
/* this bit should match parse_datatype() in pl_gram.y */
typenameTypeIdAndMod(NULL, typ->origtypname,
&typ->typoid,
&typ->atttypmod);
}
/* this bit should match build_datatype() in pl_comp.c */
typentry = lookup_type_cache(typ->typoid,
TYPECACHE_TUPDESC |
TYPECACHE_DOMAIN_BASE_INFO);
if (typentry->typtype == TYPTYPE_DOMAIN)
typentry = lookup_type_cache(typentry->domainBaseType,
TYPECACHE_TUPDESC);
if (typentry->tupDesc == NULL)
{
/*
* If we get here, user tried to replace a composite type with a
* non-composite one. We're not gonna support that.
*/
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("type %s is not composite",
format_type_be(typ->typoid))));
}
/*
* Update tcache and tupdesc_id. Since we don't support changing to a
* non-composite type, none of the rest of *typ needs to change.
*/
typ->tcache = typentry;
typ->tupdesc_id = typentry->tupDesc_identifier;
/*
* Update *rec, too. (We'll deal with subsidiary RECFIELDs as needed.)
*/
rec->rectypeid = typ->typoid;
}
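/*
 * A minimal sketch of the staleness test revalidate_rectypeid relies on
 * (illustrative only; "refresh_cached_state" is a hypothetical helper):
 * a consumer caches the typcache entry plus the tupDesc_identifier it
 * last saw, and refreshes only when the identifier has moved, since any
 * DDL on the composite type assigns a new identifier.
 *
 *		if (typ->tcache == NULL ||
 *			typ->tcache->tupDesc_identifier != typ->tupdesc_id)
 *			refresh_cached_state(typ);
 */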
/*
* Build an expanded record object suitable for assignment to "rec".
*
* Caller must supply either a source tuple descriptor or a source expanded
* record (not both). If the record variable has declared type RECORD,
* it'll adopt the source's rowtype. Even if it doesn't, we may be able to
* piggyback on a source expanded record to save a typcache lookup.
*
* Caller must fill the object with data, then do assign_record_var().
*
* The new record is initially put into the mcontext, so it will be cleaned up
* if we fail before reaching assign_record_var().
*/
static ExpandedRecordHeader *
make_expanded_record_for_rec(PLpgSQL_execstate *estate,
PLpgSQL_rec *rec,
TupleDesc srctupdesc,
ExpandedRecordHeader *srcerh)
{
ExpandedRecordHeader *newerh;
MemoryContext mcontext = get_eval_mcontext(estate);
if (rec->rectypeid != RECORDOID)
{
/*
* Make sure rec->rectypeid is up-to-date before using it.
*/
revalidate_rectypeid(rec);
/*
* New record must be of desired type, but maybe srcerh has already
* done all the same lookups.
*/
if (srcerh && rec->rectypeid == srcerh->er_decltypeid)
newerh = make_expanded_record_from_exprecord(srcerh,
mcontext);
else
newerh = make_expanded_record_from_typeid(rec->rectypeid, -1,
mcontext);
}
else
{
/*
* We'll adopt the input tupdesc. We can still use
* make_expanded_record_from_exprecord, if srcerh isn't a composite
* domain. (If it is, we effectively adopt its base type.)
*/
if (srcerh && !ExpandedRecordIsDomain(srcerh))
newerh = make_expanded_record_from_exprecord(srcerh,
mcontext);
else
{
if (!srctupdesc)
srctupdesc = expanded_record_get_tupdesc(srcerh);
newerh = make_expanded_record_from_tupdesc(srctupdesc,
mcontext);
}
}
return newerh;
}
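/*
 * Illustrative sketch of the calling sequence this function expects
 * (mirroring the record fast path earlier in this file; not itself
 * compiled):
 *
 *		newerh = make_expanded_record_for_rec(estate, rec, tupdesc, NULL);
 *		expanded_record_set_tuple(newerh, tup, true, !estate->atomic);
 *		assign_record_var(estate, rec, newerh);
 *
 * Until assign_record_var() runs, newerh lives in the eval_mcontext, so
 * an error thrown while filling it cannot cause a long-lived leak.
 */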
/*
* exec_move_row_from_fields Move arrays of field values into a record or row
*
* When assigning to a record, the caller must have already created a suitable
* new expanded record object, newerh. Pass NULL when assigning to a row.
*
* tupdesc describes the input row, which might have different column
* types and/or different dropped-column positions than the target.
* values/nulls/tupdesc can all be NULL if we just want to assign nulls to
* all fields of the record or row.
*
* Since this uses the mcontext for workspace, caller should eventually call
* exec_eval_cleanup to prevent long-term memory leaks.
*/
static void
exec_move_row_from_fields(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
ExpandedRecordHeader *newerh,
Datum *values, bool *nulls,
TupleDesc tupdesc)
{
int td_natts = tupdesc ? tupdesc->natts : 0;
int fnum;
int anum;
int strict_multiassignment_level = 0;
/*
* The strict_multi_assignment extra check can be active only when an
* input tupdesc is specified.
*/
if (tupdesc != NULL)
{
if (plpgsql_extra_errors & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
strict_multiassignment_level = ERROR;
else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
strict_multiassignment_level = WARNING;
}
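/*
 * Usage sketch: the flag bits tested above are set from the
 * plpgsql.extra_warnings / plpgsql.extra_errors settings, so the ERROR
 * level corresponds to a session setting along the lines of
 *
 *		SET plpgsql.extra_errors = 'strict_multi_assignment';
 */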
/* Handle RECORD-target case */
if (target->dtype == PLPGSQL_DTYPE_REC)
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) target;
TupleDesc var_tupdesc;
Datum newvalues_local[64];
bool newnulls_local[64];
Assert(newerh != NULL); /* caller must have built new object */
var_tupdesc = expanded_record_get_tupdesc(newerh);
/*
* Coerce field values if needed. This might involve dealing with
* different sets of dropped columns and/or coercing individual column
* types. That's sort of a pain, but historically plpgsql has allowed
* it, so we preserve the behavior. However, it's worth a quick check
* to see if the tupdescs are identical. (Since expandedrecord.c
* prefers to use refcounted tupdescs from the typcache, expanded
* records with the same rowtype will have pointer-equal tupdescs.)
*/
if (var_tupdesc != tupdesc)
{
int vtd_natts = var_tupdesc->natts;
Datum *newvalues;
bool *newnulls;
/*
* Need workspace arrays. If vtd_natts is small enough, use local
* arrays to save doing a palloc. Even if it's not small, we can
* allocate both the Datum and isnull arrays in one palloc chunk.
*/
if (vtd_natts <= lengthof(newvalues_local))
{
newvalues = newvalues_local;
newnulls = newnulls_local;
}
else
{
char *chunk;
chunk = eval_mcontext_alloc(estate,
vtd_natts * (sizeof(Datum) + sizeof(bool)));
newvalues = (Datum *) chunk;
newnulls = (bool *) (chunk + vtd_natts * sizeof(Datum));
}
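/*
 * Worked example (a sketch, not from the original comments): if the
 * record's tupdesc is (a int, <dropped>, b text) and the source row is
 * (x int, y text), fnum visits a, the dropped slot, and b, while anum
 * advances only when a source column is consumed; dropped columns are
 * skipped on both sides, so a pairs with x and b with y.
 */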
/* Walk over destination columns */
anum = 0;
for (fnum = 0; fnum < vtd_natts; fnum++)
{
Form_pg_attribute attr = TupleDescAttr(var_tupdesc, fnum);
Datum value;
bool isnull;
Oid valtype;
int32 valtypmod;
if (attr->attisdropped)
{
/* expanded_record_set_fields should ignore this column */
continue; /* skip dropped column in record */
}
while (anum < td_natts &&
TupleDescAttr(tupdesc, anum)->attisdropped)
anum++; /* skip dropped column in tuple */
if (anum < td_natts)
{
value = values[anum];
isnull = nulls[anum];
valtype = TupleDescAttr(tupdesc, anum)->atttypid;
valtypmod = TupleDescAttr(tupdesc, anum)->atttypmod;
anum++;
}
else
{
/* no source for destination column */
value = (Datum) 0;
isnull = true;
valtype = UNKNOWNOID;
valtypmod = -1;
/* When source value is missing */
if (strict_multiassignment_level)
ereport(strict_multiassignment_level,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("number of source and target fields in assignment does not match"),
/* translator: %s represents a name of an extra check */
errdetail("%s check of %s is active.",
"strict_multi_assignment",
strict_multiassignment_level == ERROR ? "extra_errors" :
"extra_warnings"),
errhint("Make sure the query returns the exact list of columns.")));
}
/* Cast the new value to the right type, if needed. */
newvalues[fnum] = exec_cast_value(estate,
value,
&isnull,
valtype,
valtypmod,
attr->atttypid,
attr->atttypmod);
newnulls[fnum] = isnull;
}
/*
 * When the strict_multiassignment extra check is active, ensure there
 * are no unassigned source attributes.
 */
if (strict_multiassignment_level && anum < td_natts)
{
/* skip dropped columns in the source descriptor */
while (anum < td_natts &&
TupleDescAttr(tupdesc, anum)->attisdropped)
anum++;
if (anum < td_natts)
ereport(strict_multiassignment_level,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("number of source and target fields in assignment does not match"),
/* translator: %s represents a name of an extra check */
errdetail("%s check of %s is active.",
"strict_multi_assignment",
strict_multiassignment_level == ERROR ? "extra_errors" :
"extra_warnings"),
errhint("Make sure the query returns the exact list of columns.")));
}
values = newvalues;
nulls = newnulls;
}
/* Insert the coerced field values into the new expanded record */
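/*
 * Commentary on the last argument below (our reading of the code, not an
 * original comment): passing !estate->atomic as expand_external makes
 * expanded_record_set_fields fetch any out-of-line toasted source values
 * now when running in a non-atomic context, where the current
 * transaction might commit and leave toast pointers unreadable.
 */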
expanded_record_set_fields(newerh, values, nulls, !estate->atomic);
/* Complete the assignment */
assign_record_var(estate, rec, newerh);
return;
}
/* newerh should not have been passed in non-RECORD cases */
Assert(newerh == NULL);
/*
* For a row, we assign the individual field values to the variables the
* row points to.
*
* NOTE: both this code and the record code above silently ignore extra
* columns in the source and assume NULL for missing columns. This is
* pretty dubious but it's the historical behavior.
*
* If we have no input data at all, we'll assign NULL to all columns of
* the row variable.
*/
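/*
 * For instance (illustrative only), "SELECT 1, 2 INTO x, y" reaches this
 * branch with a compiler-built DTYPE_ROW target whose varnos list the
 * scalar variables x and y.
 */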
if (target->dtype == PLPGSQL_DTYPE_ROW)
{
PLpgSQL_row *row = (PLpgSQL_row *) target;
anum = 0;
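/*
 * Note (added commentary): unlike the record branch above, this loop has
 * no explicit exec_cast_value step; exec_assign_value performs any
 * needed coercion to each scalar target's type.
 */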
for (fnum = 0; fnum < row->nfields; fnum++)
{
PLpgSQL_var *var;
Datum value;
bool isnull;
Oid valtype;
int32 valtypmod;
var = (PLpgSQL_var *) (estate->datums[row->varnos[fnum]]);
while (anum < td_natts &&
TupleDescAttr(tupdesc, anum)->attisdropped)
anum++; /* skip dropped column in tuple */
if (anum < td_natts)
{
value = values[anum];
isnull = nulls[anum];
valtype = TupleDescAttr(tupdesc, anum)->atttypid;
valtypmod = TupleDescAttr(tupdesc, anum)->atttypmod;
anum++;
}
else
{
/* no source for destination column */
value = (Datum) 0;
isnull = true;
valtype = UNKNOWNOID;
valtypmod = -1;
if (strict_multiassignment_level)
ereport(strict_multiassignment_level,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("number of source and target fields in assignment does not match"),
/* translator: %s represents a name of an extra check */
errdetail("%s check of %s is active.",
"strict_multi_assignment",
strict_multiassignment_level == ERROR ? "extra_errors" :
"extra_warnings"),
errhint("Make sure the query returns the exact list of columns.")));
}
exec_assign_value(estate, (PLpgSQL_datum *) var,
value, isnull, valtype, valtypmod);
}
/*
* When strict_multiassignment extra check is active, ensure there are
* no unassigned source attributes.
*/
if (strict_multiassignment_level && anum < td_natts)
{
while (anum < td_natts &&
TupleDescAttr(tupdesc, anum)->attisdropped)
anum++; /* skip dropped column in tuple */
if (anum < td_natts)
ereport(strict_multiassignment_level,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("number of source and target fields in assignment does not match"),
/* translator: %s represents a name of an extra check */
errdetail("%s check of %s is active.",
"strict_multi_assignment",
strict_multiassignment_level == ERROR ? "extra_errors" :
"extra_warnings"),
errhint("Make sure the query returns the exact list of columns.")));
}
return;
}
elog(ERROR, "unsupported target type: %d", target->dtype);
}
/*
* compatible_tupdescs: detect whether two tupdescs are physically compatible
*
* TRUE indicates that a tuple satisfying src_tupdesc can be used directly as
* a value for a composite variable using dst_tupdesc.
*/
static bool
compatible_tupdescs(TupleDesc src_tupdesc, TupleDesc dst_tupdesc)
{
int i;
/* Possibly we could allow src_tupdesc to have extra columns? */
if (dst_tupdesc->natts != src_tupdesc->natts)
return false;
for (i = 0; i < dst_tupdesc->natts; i++)
{
Form_pg_attribute dattr = TupleDescAttr(dst_tupdesc, i);
Form_pg_attribute sattr = TupleDescAttr(src_tupdesc, i);
if (dattr->attisdropped != sattr->attisdropped)
return false;
if (!dattr->attisdropped)
{
/* Normal columns must match by type, and by typmod unless the destination's typmod is -1 (unconstrained) */
if (dattr->atttypid != sattr->atttypid ||
(dattr->atttypmod >= 0 &&
dattr->atttypmod != sattr->atttypmod))
return false;
}
else
{
/* Dropped columns are OK as long as length/alignment match */
if (dattr->attlen != sattr->attlen ||
dattr->attalign != sattr->attalign)
return false;
}
}
return true;
}
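/*
 * Usage sketch (hypothetical caller, for illustration only): this is a
 * cheap physical-compatibility test, so a caller can install a source
 * tuple directly when it returns true and fall back to field-by-field
 * coercion, as in exec_move_row_from_fields above, when it returns
 * false.
 */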
/* ----------
* make_tuple_from_row Make a tuple from the values of a row object
*
* A NULL return indicates rowtype mismatch; caller must raise suitable error
*
* The result tuple is freshly palloc'd in caller's context. Some junk
* may be left behind in eval_mcontext, too.
* ----------
*/
static HeapTuple
make_tuple_from_row(PLpgSQL_execstate *estate,
PLpgSQL_row *row,
TupleDesc tupdesc)
{
int natts = tupdesc->natts;
HeapTuple tuple;
Datum *dvalues;
bool *nulls;
int i;
if (natts != row->nfields)
return NULL;
dvalues = (Datum *) eval_mcontext_alloc0(estate, natts * sizeof(Datum));
nulls = (bool *) eval_mcontext_alloc(estate, natts * sizeof(bool));
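/*
 * (Added commentary:) dvalues must come from the zeroing allocator:
 * for dropped columns the loop below sets only nulls[i], leaving
 * dvalues[i] untouched.
 */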
for (i = 0; i < natts; i++)
{
Oid fieldtypeid;
int32 fieldtypmod;
if (TupleDescAttr(tupdesc, i)->attisdropped)
{
nulls[i] = true; /* leave the column as null */
continue;
}
exec_eval_datum(estate, estate->datums[row->varnos[i]],
&fieldtypeid, &fieldtypmod,
&dvalues[i], &nulls[i]);
if (fieldtypeid != TupleDescAttr(tupdesc, i)->atttypid)
return NULL;
/* XXX should we insist on typmod match, too? */
}
tuple = heap_form_tuple(tupdesc, dvalues, nulls);
return tuple;
}
/*
* deconstruct_composite_datum extract tuple+tupdesc from composite Datum
*
* The caller must supply a HeapTupleData variable, in which we set up a
* tuple header pointing to the composite datum's body. To make the tuple
* value outlive that variable, caller would need to apply heap_copytuple...
* but current callers only need a short-lived tuple value anyway.
*
* Returns a pointer to the TupleDesc of the datum's rowtype.
* Caller is responsible for calling ReleaseTupleDesc when done with it.
*
* Note: it's caller's responsibility to be sure value is of composite type.
* Also, best to call this in a short-lived context, as it might leak memory.
*/
static TupleDesc
deconstruct_composite_datum(Datum value, HeapTupleData *tmptup)
{
HeapTupleHeader td;
Oid tupType;
int32 tupTypmod;
/* Get tuple body (note this could involve detoasting) */
td = DatumGetHeapTupleHeader(value);
/* Build a temporary HeapTuple control structure */
tmptup->t_len = HeapTupleHeaderGetDatumLength(td);
ItemPointerSetInvalid(&(tmptup->t_self));
tmptup->t_tableOid = InvalidOid;
tmptup->t_data = td;
/* Extract rowtype info and find a tupdesc */
tupType = HeapTupleHeaderGetTypeId(td);
tupTypmod = HeapTupleHeaderGetTypMod(td);
return lookup_rowtype_tupdesc(tupType, tupTypmod);
}
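
/*
 * A minimal usage sketch for the function above.  This is illustrative
 * only; "value", "field1", etc. are invented names, not taken from any
 * real caller.  The pattern is: supply a stack HeapTupleData, fetch
 * fields while holding the tupdesc pin, then release the tupdesc.
 *
 *		HeapTupleData tmptup;
 *		TupleDesc	tupdesc;
 *		bool		isnull;
 *		Datum		field1;
 *
 *		tupdesc = deconstruct_composite_datum(value, &tmptup);
 *		field1 = heap_getattr(&tmptup, 1, tupdesc, &isnull);
 *		... use field1 while tmptup's data remains valid ...
 *		ReleaseTupleDesc(tupdesc);
 */
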
/*
* exec_move_row_from_datum Move a composite Datum into a record or row
*
* This is equivalent to deconstruct_composite_datum() followed by
* exec_move_row(), but we can optimize things if the Datum is an
* expanded-record reference.
*
* Note: it's caller's responsibility to be sure value is of composite type.
*/
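/*
 * For example (a hypothetical PL/pgSQL fragment, not from the sources):
 * with "DECLARE r record; v mytab%ROWTYPE;", an assignment such as
 * "r := v" typically arrives here with v's current value as the source
 * Datum, which may or may not be an expanded record.
 */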
static void
exec_move_row_from_datum(PLpgSQL_execstate *estate,
PLpgSQL_variable *target,
Datum value)
{
/* Check to see if source is an expanded record */
if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(value)))
{
ExpandedRecordHeader *erh = (ExpandedRecordHeader *) DatumGetEOHP(value);
ExpandedRecordHeader *newerh = NULL;
Assert(erh->er_magic == ER_MAGIC);
/* These cases apply if the target is record not row... */
if (target->dtype == PLPGSQL_DTYPE_REC)
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) target;
/*
* If it's the same record already stored in the variable, do
* nothing. This would happen only in silly cases like "r := r",
* but we need some check to avoid possibly freeing the variable's
* live value below. Note that this applies even if what we have
* is a R/O pointer.
*/
if (erh == rec->erh)
return;
/*
* Make sure rec->rectypeid is up-to-date before using it.
*/
revalidate_rectypeid(rec);
/*
* If we have a R/W pointer, we're allowed to just commandeer
* ownership of the expanded record. If it's of the right type to
* put into the record variable, do that. (Note we don't accept
* an expanded record of a composite-domain type as a RECORD
* value. We'll treat it as the base composite type instead;
* compare logic in make_expanded_record_for_rec.)
*/
if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value)) &&
(rec->rectypeid == erh->er_decltypeid ||
(rec->rectypeid == RECORDOID &&
!ExpandedRecordIsDomain(erh))))
{
assign_record_var(estate, rec, erh);
return;
}
/*
* If we already have an expanded record object in the target
* variable, and the source record contains a valid tuple
* representation with the right rowtype, then we can skip making
* a new expanded record and just assign the tuple with
* expanded_record_set_tuple. (We can't do the equivalent if we
* have to do field-by-field assignment, since that wouldn't be
* atomic if there's an error.) We consider that there's a
* rowtype match only if it's the same named composite type or
* same registered rowtype; checking for matches of anonymous
* rowtypes would be more expensive than this is worth.
*/
if (rec->erh &&
(erh->flags & ER_FLAG_FVALUE_VALID) &&
erh->er_typeid == rec->erh->er_typeid &&
(erh->er_typeid != RECORDOID ||
(erh->er_typmod == rec->erh->er_typmod &&
erh->er_typmod >= 0)))
{
expanded_record_set_tuple(rec->erh, erh->fvalue,
true, !estate->atomic);
return;
}
/*
* Otherwise we're gonna need a new expanded record object. Make
* it here in hopes of piggybacking on the source object's
* previous typcache lookup.
*/
newerh = make_expanded_record_for_rec(estate, rec, NULL, erh);
/*
* If the expanded record contains a valid tuple representation,
* and we don't need rowtype conversion, then just copying the
* tuple is probably faster than field-by-field processing. (This
* isn't duplicative of the previous check, since here we will
* catch the case where the record variable was previously empty.)
*/
if ((erh->flags & ER_FLAG_FVALUE_VALID) &&
(rec->rectypeid == RECORDOID ||
rec->rectypeid == erh->er_typeid))
{
expanded_record_set_tuple(newerh, erh->fvalue,
true, !estate->atomic);
assign_record_var(estate, rec, newerh);
return;
}
/*
* Need to special-case empty source record, else code below would
* leak newerh.
*/
if (ExpandedRecordIsEmpty(erh))
{
/* Set newerh to a row of NULLs */
deconstruct_expanded_record(newerh);
assign_record_var(estate, rec, newerh);
return;
}
} /* end of record-target-only cases */
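
		/*
		 * If we fall through to here, either the target is a DTYPE_ROW list
		 * of variables, or it is a record for which none of the fast paths
		 * above applied.  The code below handles an empty source record
		 * specially and otherwise assigns field by field.
		 */
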
/*
* If the source expanded record is empty, we should treat that like a
* NULL tuple value. (We're unlikely to see such a case, but we must
* check this; deconstruct_expanded_record would cause a change of
* logical state, which is not OK.)
*/
if (ExpandedRecordIsEmpty(erh))
{
exec_move_row(estate, target, NULL,
expanded_record_get_tupdesc(erh));
return;
}
/*
* Otherwise, ensure that the source record is deconstructed, and
* assign from its field values.
*/
deconstruct_expanded_record(erh);
exec_move_row_from_fields(estate, target, newerh,
erh->dvalues, erh->dnulls,
expanded_record_get_tupdesc(erh));
}
else
{
/*
* Nope, we've got a plain composite Datum. Deconstruct it; but we
* don't use deconstruct_composite_datum(), because we may be able to
* skip calling lookup_rowtype_tupdesc().
*/
HeapTupleHeader td;
HeapTupleData tmptup;
Oid tupType;
int32 tupTypmod;
TupleDesc tupdesc;
MemoryContext oldcontext;
/* Ensure that any detoasted data winds up in the eval_mcontext */
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
/* Get tuple body (note this could involve detoasting) */
td = DatumGetHeapTupleHeader(value);
MemoryContextSwitchTo(oldcontext);
/* Build a temporary HeapTuple control structure */
tmptup.t_len = HeapTupleHeaderGetDatumLength(td);
ItemPointerSetInvalid(&(tmptup.t_self));
tmptup.t_tableOid = InvalidOid;
tmptup.t_data = td;
/* Extract rowtype info */
tupType = HeapTupleHeaderGetTypeId(td);
tupTypmod = HeapTupleHeaderGetTypMod(td);
/* Now, if the target is record not row, maybe we can optimize ... */
if (target->dtype == PLPGSQL_DTYPE_REC)
{
PLpgSQL_rec *rec = (PLpgSQL_rec *) target;
/*
* If we already have an expanded record object in the target
* variable, and the source datum has a matching rowtype, then we
* can skip making a new expanded record and just assign the tuple
* with expanded_record_set_tuple. We consider that there's a
* rowtype match only if it's the same named composite type or
* same registered rowtype. (Checking to reject an anonymous
* rowtype here should be redundant, but let's be safe.)
*/
if (rec->erh &&
tupType == rec->erh->er_typeid &&
(tupType != RECORDOID ||
(tupTypmod == rec->erh->er_typmod &&
tupTypmod >= 0)))
{
expanded_record_set_tuple(rec->erh, &tmptup,
true, !estate->atomic);
return;
}
/*
* If the source datum has a rowtype compatible with the target
* variable, just build a new expanded record and assign the tuple
* into it. Using make_expanded_record_from_typeid() here saves
* one typcache lookup compared to the code below.
*/
if (rec->rectypeid == RECORDOID || rec->rectypeid == tupType)
{
ExpandedRecordHeader *newerh;
MemoryContext mcontext = get_eval_mcontext(estate);
newerh = make_expanded_record_from_typeid(tupType, tupTypmod,
mcontext);
expanded_record_set_tuple(newerh, &tmptup,
true, !estate->atomic);
assign_record_var(estate, rec, newerh);
return;
}
/*
* Otherwise, we're going to need conversion, so fall through to
* do it the hard way.
*/
}
/*
* ROW target, or unoptimizable RECORD target, so we have to expend a
* lookup to obtain the source datum's tupdesc.
*/
tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
/* Do the move */
exec_move_row(estate, target, &tmptup, tupdesc);
/* Release tupdesc usage count */
ReleaseTupleDesc(tupdesc);
}
}
/*
* If we have not created an expanded record to hold the record variable's
* value, do so. The expanded record will be "empty", so this does not
* change the logical state of the record variable: it's still NULL.
* However, now we'll have a tupdesc with which we can e.g. look up fields.
*/
static void
instantiate_empty_record_variable(PLpgSQL_execstate *estate, PLpgSQL_rec *rec)
{
Assert(rec->erh == NULL); /* else caller error */
/* If declared type is RECORD, we can't instantiate */
if (rec->rectypeid == RECORDOID)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("record \"%s\" is not assigned yet", rec->refname),
errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));
/* Make sure rec->rectypeid is up-to-date before using it */
revalidate_rectypeid(rec);
/* OK, do it */
rec->erh = make_expanded_record_from_typeid(rec->rectypeid, -1,
estate->datum_context);
}
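/*
 * Illustration (hypothetical PL/pgSQL declarations): given
 *		DECLARE r record; c complex;
 * referencing a field of "r" before any assignment reaches the error
 * above, whereas "c" can be instantiated lazily here because its
 * declared rowtype supplies a tupdesc.
 */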
/* ----------
* convert_value_to_string Convert a non-null Datum to C string
*
* Note: the result is in the estate's eval_mcontext, and will be cleared
* by the next exec_eval_cleanup() call. The invoked output function might
* leave additional cruft there as well, so just pfree'ing the result string
* would not be enough to avoid memory leaks if we did not do it like this.
* In most usages the Datum being passed in is also in that context (if
* pass-by-reference) and so an exec_eval_cleanup() call is needed anyway.
*
* Note: not caching the conversion function lookup is bad for performance.
* However, this function isn't currently used in any places where an extra
* catalog lookup or two seems like a big deal.
* ----------
*/
static char *
convert_value_to_string(PLpgSQL_execstate *estate, Datum value, Oid valtype)
{
char *result;
MemoryContext oldcontext;
Oid typoutput;
bool typIsVarlena;
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
getTypeOutputInfo(valtype, &typoutput, &typIsVarlena);
result = OidOutputFunctionCall(typoutput, value);
MemoryContextSwitchTo(oldcontext);
return result;
}
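/*
 * A typical call sequence (illustrative) is:
 *
 *		str = convert_value_to_string(estate, value, valtype);
 *		... use str ...
 *		exec_eval_cleanup(estate);
 *
 * Per the notes above, str is only valid until the next
 * exec_eval_cleanup() resets the eval_mcontext.
 */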
/* ----------
* exec_cast_value Cast a value if required
*
* Note that *isnull is an input and also an output parameter. While it's
* unlikely that a cast operation would produce null from non-null or vice
* versa, that could happen in principle.
*
* Note: the estate's eval_mcontext is used for temporary storage, and may
* also contain the result Datum if we have to do a conversion to a pass-
* by-reference data type. Be sure to do an exec_eval_cleanup() call when
* done with the result.
* ----------
*/
static inline Datum
exec_cast_value(PLpgSQL_execstate *estate,
Datum value, bool *isnull,
Oid valtype, int32 valtypmod,
Oid reqtype, int32 reqtypmod)
{
/*
* If the type of the given value isn't what's requested, convert it.
*/
if (valtype != reqtype ||
(valtypmod != reqtypmod && reqtypmod != -1))
{
/* We keep the slow path out-of-line. */
value = do_cast_value(estate, value, isnull, valtype, valtypmod,
reqtype, reqtypmod);
}
return value;
}
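/*
 * Illustrative behavior: if valtype == reqtype and reqtypmod is -1, the
 * input Datum is returned untouched regardless of valtypmod; whereas,
 * say, coercing a typmod-less numeric value to numeric(10,2) (a specific
 * reqtypmod) must take the out-of-line do_cast_value() path.
 */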
/* ----------
* do_cast_value Slow path for exec_cast_value.
* ----------
*/
static Datum
do_cast_value(PLpgSQL_execstate *estate,
Datum value, bool *isnull,
Oid valtype, int32 valtypmod,
Oid reqtype, int32 reqtypmod)
{
plpgsql_CastHashEntry *cast_entry;
cast_entry = get_cast_hashentry(estate,
valtype, valtypmod,
reqtype, reqtypmod);
if (cast_entry)
{
ExprContext *econtext = estate->eval_econtext;
MemoryContext oldcontext;
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
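/*
 * Feed the source value to the cast expression through its bottom
 * CaseTestExpr node, which (as in CASE evaluation) reads
 * econtext->caseValue_datum and caseValue_isNull.
 */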
econtext->caseValue_datum = value;
econtext->caseValue_isNull = *isnull;
cast_entry->cast_in_use = true;
value = ExecEvalExpr(cast_entry->cast_exprstate, econtext,
isnull);
cast_entry->cast_in_use = false;
MemoryContextSwitchTo(oldcontext);
}
return value;
}
/* ----------
* get_cast_hashentry Look up how to perform a type cast
*
* Returns a plpgsql_CastHashEntry if an expression has to be evaluated,
* or NULL if the cast is a mere no-op relabeling. If there's work to be
* done, the cast_exprstate field contains an expression evaluation tree
* based on a CaseTestExpr input, and the cast_in_use field should be set
* true while executing it.
* ----------
*/
static plpgsql_CastHashEntry *
get_cast_hashentry(PLpgSQL_execstate *estate,
Oid srctype, int32 srctypmod,
Oid dsttype, int32 dsttypmod)
{
plpgsql_CastHashKey cast_key;
plpgsql_CastHashEntry *cast_entry;
bool found;
LocalTransactionId curlxid;
MemoryContext oldcontext;
/* Look for existing entry */
cast_key.srctype = srctype;
cast_key.dsttype = dsttype;
cast_key.srctypmod = srctypmod;
cast_key.dsttypmod = dsttypmod;
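/*
 * Note that the typmods are part of the key, so that e.g. casts to
 * varchar(10) and varchar(20) are looked up and cached separately.
 */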
cast_entry = (plpgsql_CastHashEntry *) hash_search(estate->cast_hash,
(void *) &cast_key,
HASH_ENTER, &found);
if (!found) /* initialize if new entry */
cast_entry->cast_cexpr = NULL;
if (cast_entry->cast_cexpr == NULL ||
!cast_entry->cast_cexpr->is_valid)
{
/*
* We've not looked up this coercion before, or we have but the cached
* expression has been invalidated.
*/
Node *cast_expr;
CachedExpression *cast_cexpr;
CaseTestExpr *placeholder;
/*
* Drop old cached expression if there is one.
*/
if (cast_entry->cast_cexpr)
{
FreeCachedExpression(cast_entry->cast_cexpr);
cast_entry->cast_cexpr = NULL;
}
/*
* Since we could easily fail (no such coercion), construct a
* temporary coercion expression tree in the short-lived
* eval_mcontext, then if successful save it as a CachedExpression.
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
/*
* We use a CaseTestExpr as the base of the coercion tree, since it's
* very cheap to insert the source value for that.
*/
placeholder = makeNode(CaseTestExpr);
placeholder->typeId = srctype;
placeholder->typeMod = srctypmod;
placeholder->collation = get_typcollation(srctype);
/*
* Apply coercion. We use the special coercion context
* COERCION_PLPGSQL to match plpgsql's historical behavior, namely
* that any cast not available at ASSIGNMENT level will be implemented
* as an I/O coercion. (It's somewhat dubious that we prefer I/O
* coercion over cast pathways that exist at EXPLICIT level. Changing
* that would cause assorted minor behavioral differences though, and
* a user who wants the explicit-cast behavior can always write an
* explicit cast.)
*
* If source type is UNKNOWN, coerce_to_target_type will fail (it only
* expects to see that for Const input nodes), so don't call it; we'll
* apply CoerceViaIO instead. Likewise, it doesn't currently work for
* coercing RECORD to some other type, so skip for that too.
*/
if (srctype == UNKNOWNOID || srctype == RECORDOID)
cast_expr = NULL;
else
cast_expr = coerce_to_target_type(NULL,
(Node *) placeholder, srctype,
dsttype, dsttypmod,
COERCION_PLPGSQL,
COERCE_IMPLICIT_CAST,
-1);
/*
* If there's no cast path according to the parser, fall back to using
* an I/O coercion; this is semantically dubious but matches plpgsql's
* historical behavior. We would need something of the sort for
* UNKNOWN literals in any case. (This is probably now only reachable
* in the case where srctype is UNKNOWN/RECORD.)
*/
if (cast_expr == NULL)
{
CoerceViaIO *iocoerce = makeNode(CoerceViaIO);
iocoerce->arg = (Expr *) placeholder;
iocoerce->resulttype = dsttype;
iocoerce->resultcollid = InvalidOid;
iocoerce->coerceformat = COERCE_IMPLICIT_CAST;
iocoerce->location = -1;
cast_expr = (Node *) iocoerce;
if (dsttypmod != -1)
cast_expr = coerce_to_target_type(NULL,
cast_expr, dsttype,
dsttype, dsttypmod,
COERCION_ASSIGNMENT,
COERCE_IMPLICIT_CAST,
-1);
}
/* Note: we don't bother labeling the expression tree with collation */
/* Plan the expression and build a CachedExpression */
cast_cexpr = GetCachedExpression(cast_expr);
cast_expr = cast_cexpr->expr;
/* Detect whether we have a no-op (RelabelType) coercion */
if (IsA(cast_expr, RelabelType) &&
((RelabelType *) cast_expr)->arg == (Expr *) placeholder)
cast_expr = NULL;
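/*
 * (For instance, a binary-compatible coercion such as varchar to text
 * plans to a bare RelabelType atop the placeholder; recording it as
 * NULL lets callers skip expression evaluation entirely.)
 */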
/* Now we can fill in the hashtable entry. */
cast_entry->cast_cexpr = cast_cexpr;
cast_entry->cast_expr = (Expr *) cast_expr;
cast_entry->cast_exprstate = NULL;
cast_entry->cast_in_use = false;
cast_entry->cast_lxid = InvalidLocalTransactionId;
MemoryContextSwitchTo(oldcontext);
}
/* Done if we have determined that this is a no-op cast. */
if (cast_entry->cast_expr == NULL)
return NULL;
/*
* Prepare the expression for execution, if it's not been done already in
* the current transaction; also, if it's marked busy in the current
* transaction, abandon that expression tree and build a new one, so as to
* avoid potential problems with recursive cast expressions and failed
* executions. (We will leak some memory intra-transaction if that
* happens a lot, but we don't expect it to.) It's okay to update the
* hash table with the new tree because all plpgsql functions within a
* given transaction share the same simple_eval_estate. (Well, regular
* functions do; DO blocks have private simple_eval_estates, and private
* cast hash tables to go with them.)
*/
curlxid = MyProc->lxid;
if (cast_entry->cast_lxid != curlxid || cast_entry->cast_in_use)
{
oldcontext = MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
cast_entry->cast_exprstate = ExecInitExpr(cast_entry->cast_expr, NULL);
cast_entry->cast_in_use = false;
cast_entry->cast_lxid = curlxid;
MemoryContextSwitchTo(oldcontext);
}
return cast_entry;
}
/* ----------
* exec_simple_check_plan - Check if a plan is simple enough to
* be evaluated by ExecEvalExpr() instead
* of SPI.
*
* Note: the refcount manipulations in this function assume that expr->plan
* is a "saved" SPI plan. That's a bit annoying from the caller's standpoint,
* but it's otherwise difficult to avoid leaking the plan on failure.
* ----------
*/
static void
exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
{
List *plansources;
CachedPlanSource *plansource;
Query *query;
CachedPlan *cplan;
MemoryContext oldcontext;
/*
* Initialize to "not simple".
*/
expr->expr_simple_expr = NULL;
expr->expr_rw_param = NULL;
/*
* Check the analyzed-and-rewritten form of the query to see if we will be
* able to treat it as a simple expression. Since this function is only
* called immediately after creating the CachedPlanSource, we need not
* worry about the query being stale.
*/
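/*
 * Illustrative example (hypothetical plpgsql source, not something this
 * function inspects directly): an expression such as "n + 1" in the
 * assignment "n := n + 1" is analyzed into a SELECT-style Query that
 * computes one column and reads no tables; such a Query passes all of
 * the tests below and can later bypass the full executor.
 */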
/*
* We can only test queries that resulted in exactly one CachedPlanSource
*/
plansources = SPI_plan_get_plan_sources(expr->plan);
if (list_length(plansources) != 1)
return;
plansource = (CachedPlanSource *) linitial(plansources);
/*
* 1. There must be exactly one querytree.
*/
if (list_length(plansource->query_list) != 1)
return;
query = (Query *) linitial(plansource->query_list);
/*
* 2. It must be a plain SELECT query without any input tables
*/
if (!IsA(query, Query))
return;
if (query->commandType != CMD_SELECT)
return;
if (query->rtable != NIL)
return;
/*
* 3. Can't have any subplans, aggregates, or qual clauses either. (These
* tests should generally match what inline_function() checks before
* inlining a SQL function; otherwise, inlining could change our
* conclusion about whether an expression is simple, which we don't want.)
*/
if (query->hasAggs ||
query->hasWindowFuncs ||
query->hasTargetSRFs ||
query->hasSubLinks ||
query->cteList ||
query->jointree->fromlist ||
query->jointree->quals ||
query->groupClause ||
query->groupingSets ||
query->havingQual ||
query->windowClause ||
query->distinctClause ||
query->sortClause ||
query->limitOffset ||
query->limitCount ||
query->setOperations)
return;
/*
* 4. The query must have a single attribute as its result
*/
if (list_length(query->targetList) != 1)
return;
/*
* OK, we can treat it as a simple plan.
*
* Get the generic plan for the query. If replanning is needed, do that
* work in the eval_mcontext. (Note that replanning could throw an error,
* in which case the expr is left marked "not simple", which is fine.)
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
cplan = SPI_plan_get_cached_plan(expr->plan);
MemoryContextSwitchTo(oldcontext);
/* Can't fail, because we checked for a single CachedPlanSource above */
Assert(cplan != NULL);
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/*
* Verify that plancache.c thinks the plan is simple enough to use
* CachedPlanIsSimplyValid. Given the restrictions above, it's unlikely
* that this could fail, but if it does, just treat the plan as not simple.
* On success, save a refcount on the plan in the simple-expression resowner.
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
*/
if (CachedPlanAllowsSimpleValidityCheck(plansource, cplan,
estate->simple_eval_resowner))
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
{
/* Remember that we have the refcount */
expr->expr_simple_plansource = plansource;
expr->expr_simple_plan = cplan;
expr->expr_simple_plan_lxid = MyProc->lxid;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/* Share the remaining work with the replan code path */
exec_save_simple_expr(expr, cplan);
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
}
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/*
* Release the plan refcount obtained by SPI_plan_get_cached_plan. (This
* refcount is held by the wrong resowner, so we can't just repurpose it.)
*/
ReleaseCachedPlan(cplan, CurrentResourceOwner);
}
/*
* exec_save_simple_expr --- extract simple expression from CachedPlan
*/
static void
exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
{
PlannedStmt *stmt;
Plan *plan;
Expr *tle_expr;
/*
* Given the checks that exec_simple_check_plan did, none of the Asserts
* here should ever fail.
*/
/* Extract the single PlannedStmt */
Assert(list_length(cplan->stmt_list) == 1);
stmt = linitial_node(PlannedStmt, cplan->stmt_list);
Assert(stmt->commandType == CMD_SELECT);
/*
* Ordinarily, the plan node should be a simple Result. However, if
* force_parallel_mode is on, the planner might've stuck a Gather node
* atop that. The simplest way to deal with this is to look through the
* Gather node. The Gather node's tlist would normally contain a Var
* referencing the child node's output, but it could also be a Param, or
* it could be a Const that setrefs.c copied as-is.
*/
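/*
 * Sketch of the two plan shapes the loop below accepts (the second
 * arises when the planner puts a Gather on top, e.g. under
 * force_parallel_mode):
 *
 *    Result                     Gather
 *                                 -> Result
 *
 * In the Gather case we normally look through to the child's tlist,
 * unless the Gather's own tlist entry is already a Const.
 */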
plan = stmt->planTree;
for (;;)
{
/* Extract the single tlist expression */
Assert(list_length(plan->targetlist) == 1);
tle_expr = linitial_node(TargetEntry, plan->targetlist)->expr;
if (IsA(plan, Result))
{
Assert(plan->lefttree == NULL &&
plan->righttree == NULL &&
plan->initPlan == NULL &&
plan->qual == NULL &&
((Result *) plan)->resconstantqual == NULL);
break;
}
else if (IsA(plan, Gather))
{
Assert(plan->lefttree != NULL &&
plan->righttree == NULL &&
plan->initPlan == NULL &&
plan->qual == NULL);
/* If setrefs.c copied up a Const, no need to look further */
if (IsA(tle_expr, Const))
break;
/* Otherwise, it had better be a Param or an outer Var */
Assert(IsA(tle_expr, Param) || (IsA(tle_expr, Var) &&
((Var *) tle_expr)->varno == OUTER_VAR));
/* Descend to the child node */
plan = plan->lefttree;
}
else
elog(ERROR, "unexpected plan node type: %d",
(int) nodeTag(plan));
}
/*
* Save the simple expression, and initialize state to "not valid in
* current transaction".
*/
expr->expr_simple_expr = tle_expr;
expr->expr_simple_state = NULL;
expr->expr_simple_in_use = false;
expr->expr_simple_lxid = InvalidLocalTransactionId;
/* Also stash away the expression result type */
expr->expr_simple_type = exprType((Node *) tle_expr);
expr->expr_simple_typmod = exprTypmod((Node *) tle_expr);
/* We also want to remember if it is immutable or not */
expr->expr_simple_mutable = contain_mutable_functions((Node *) tle_expr);
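/*
 * (Knowing whether the expression is immutable lets simple-expression
 * evaluation decide whether it must push a fresh snapshot first.)
 */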
/*
* Lastly, check to see if there's a possibility of optimizing a
* read/write parameter.
*/
exec_check_rw_parameter(expr);
}
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
/*
* exec_check_rw_parameter --- can we pass expanded object as read/write param?
*
* If we have an assignment like "x := array_append(x, foo)" in which the
* top-level function is trusted not to corrupt its argument in case of an
* error, then when x has an expanded object as value, it is safe to pass the
* value as a read/write pointer and let the function modify the value
* in-place.
*
* This function checks for a safe expression, and sets expr->expr_rw_param
* to the address of any Param within the expression that can be passed as
* read/write (there can be only one); or to NULL when there is no safe Param.
*
* Note that this mechanism intentionally applies the safety labeling to just
* one Param; the expression could contain other Params referencing the target
* variable, but those must still be treated as read-only.
*
* Also note that we only apply this optimization within simple expressions.
* There's no point in it for non-simple expressions, because the
* exec_run_select code path will flatten any expanded result anyway.
* Also, it's safe to assume that an expr_simple_expr tree won't get copied
* somewhere before it gets compiled, so that looking for pointer equality
* to expr_rw_param will work for matching the target Param. That'd be much
* shakier in the general case.
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
*/
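/*
 * Illustrative cases (hypothetical plpgsql assignments, assuming x is a
 * plain array variable):
 *
 *    x := array_append(x, 42);    -- x's Param can be passed read/write
 *    x := array_append(y, x[1]);  -- no: x is not a direct argument here
 *    x := x[i];                   -- handled by the SubscriptingRef case
 */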
static void
exec_check_rw_parameter(PLpgSQL_expr *expr)
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
{
int target_dno;
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
Oid funcid;
List *fargs;
ListCell *lc;
/* Assume unsafe */
expr->expr_rw_param = NULL;
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
/* Done if expression isn't an assignment source */
target_dno = expr->target_param;
if (target_dno < 0)
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
return;
/*
* If the target variable isn't referenced by the expression, there's no
* need to look further.
*/
if (!bms_is_member(target_dno, expr->paramnos))
return;
/* We shouldn't get here for a non-simple expression */
Assert(expr->expr_simple_expr != NULL);
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
/*
* Top level of expression must be a simple FuncExpr, OpExpr, or
* SubscriptingRef, else we can't optimize.
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
*/
if (IsA(expr->expr_simple_expr, FuncExpr))
{
FuncExpr *fexpr = (FuncExpr *) expr->expr_simple_expr;
funcid = fexpr->funcid;
fargs = fexpr->args;
}
else if (IsA(expr->expr_simple_expr, OpExpr))
{
OpExpr *opexpr = (OpExpr *) expr->expr_simple_expr;
funcid = opexpr->opfuncid;
fargs = opexpr->args;
}
else if (IsA(expr->expr_simple_expr, SubscriptingRef))
{
SubscriptingRef *sbsref = (SubscriptingRef *) expr->expr_simple_expr;
/* We only trust standard varlena arrays to be safe */
if (get_typsubscript(sbsref->refcontainertype, NULL) !=
F_ARRAY_SUBSCRIPT_HANDLER)
return;
/* We can optimize the refexpr if it's the target, otherwise not */
if (sbsref->refexpr && IsA(sbsref->refexpr, Param))
{
Param *param = (Param *) sbsref->refexpr;
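/*
 * plpgsql datum numbers (dnos) are zero-based, while PARAM_EXTERN
 * paramids are one-based; hence the "+ 1" in the test below.
 */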
if (param->paramkind == PARAM_EXTERN &&
param->paramid == target_dno + 1)
{
/* Found the Param we want to pass as read/write */
expr->expr_rw_param = param;
return;
}
}
return;
}
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
else
return;
/*
* The top-level function must be one that we trust to be "safe".
* Currently we hard-wire the list, but it would be very desirable to
* allow extensions to mark their functions as safe ...
*/
if (!(funcid == F_ARRAY_APPEND ||
funcid == F_ARRAY_PREPEND))
return;
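/*
 * (F_ARRAY_APPEND and F_ARRAY_PREPEND are the pg_proc OIDs of
 * array_append() and array_prepend(), as generated in fmgroids.h.)
 */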
/*
* The target variable (in the form of a Param) must appear as a direct
* argument of the top-level function. References further down in the
* tree can't be optimized; but on the other hand, they don't invalidate
* optimizing the top-level call, since that will be executed last.
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
*/
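/*
 * Illustrative sketch (not from the original source): for a PL/pgSQL
 * assignment such as
 *
 *     a := array_append(a, a[1]);
 *
 * the first reference to "a" is a direct argument of the trusted
 * top-level function and so can be passed as read/write, while the
 * subscripted reference further down the tree is simply evaluated
 * read-only before the top-level call runs.
 */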
foreach(lc, fargs)
{
Node *arg = (Node *) lfirst(lc);
if (arg && IsA(arg, Param))
{
Param *param = (Param *) arg;
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
if (param->paramkind == PARAM_EXTERN &&
param->paramid == target_dno + 1)
{
/* Found the Param we want to pass as read/write */
expr->expr_rw_param = param;
return;
}
}
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
}
}
/* ----------
* exec_set_found Set the global found variable to true/false
* ----------
*/
static void
exec_set_found(PLpgSQL_execstate *estate, bool state)
{
PLpgSQL_var *var;
var = (PLpgSQL_var *) (estate->datums[estate->found_varno]);
assign_simple_var(estate, var, BoolGetDatum(state), false, false);
}
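/*
 * Illustrative use (a sketch, not part of the original file): statement
 * executors maintain PL/pgSQL's FOUND variable with calls like
 *
 *     exec_set_found(estate, (SPI_processed != 0));
 *
 * so that a subsequent "IF FOUND THEN ..." in the function body reflects
 * whether the last query processed any rows.
 */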
/*
* plpgsql_create_econtext --- create an eval_econtext for the current function
*
* We may need to create a new shared_simple_eval_estate too, if there's not
* one already for the current transaction. The EState will be cleaned up at
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* transaction end. Ditto for shared_simple_eval_resowner.
*/
static void
plpgsql_create_econtext(PLpgSQL_execstate *estate)
{
SimpleEcontextStackEntry *entry;
/*
* Create an EState for evaluation of simple expressions, if there's not
* one already in the current transaction. The EState is made a child of
* TopTransactionContext so it will have the right lifespan.
*
* Note that this path is never taken when beginning a DO block; the
* required EState was already made by plpgsql_inline_handler. However,
* if the DO block executes COMMIT or ROLLBACK, then we'll come here and
* make a shared EState to use for the rest of the DO block. That's OK;
* see the comments for shared_simple_eval_estate. (Note also that a DO
* block will continue to use its private cast hash table for the rest of
* the block. That's okay for now, but it might cause problems someday.)
*/
if (estate->simple_eval_estate == NULL)
{
MemoryContext oldcontext;
if (shared_simple_eval_estate == NULL)
{
oldcontext = MemoryContextSwitchTo(TopTransactionContext);
shared_simple_eval_estate = CreateExecutorState();
MemoryContextSwitchTo(oldcontext);
}
estate->simple_eval_estate = shared_simple_eval_estate;
}
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
/*
* Likewise for the simple-expression resource owner.
*/
if (estate->simple_eval_resowner == NULL)
{
if (shared_simple_eval_resowner == NULL)
shared_simple_eval_resowner =
ResourceOwnerCreate(TopTransactionResourceOwner,
"PL/pgSQL simple expressions");
estate->simple_eval_resowner = shared_simple_eval_resowner;
}
/*
* Create a child econtext for the current function.
*/
estate->eval_econtext = CreateExprContext(estate->simple_eval_estate);
/*
* Make a stack entry so we can clean up the econtext at subxact end.
* Stack entries are kept in TopTransactionContext for simplicity.
*/
entry = (SimpleEcontextStackEntry *)
MemoryContextAlloc(TopTransactionContext,
sizeof(SimpleEcontextStackEntry));
entry->stack_econtext = estate->eval_econtext;
entry->xact_subxid = GetCurrentSubTransactionId();
entry->next = simple_econtext_stack;
simple_econtext_stack = entry;
}
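/*
 * Lifecycle sketch (illustrative, following the comments above): within a
 * single transaction,
 *
 *     function A entered  -> shared EState created, econtext #1 pushed
 *     function B entered  -> shared EState reused,  econtext #2 pushed
 *     function B exits    -> plpgsql_destroy_econtext pops #2
 *     function A exits    -> plpgsql_destroy_econtext pops #1
 *     transaction ends    -> plpgsql_xact_cb frees the shared EState
 *                            and resource owner
 */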
/*
* plpgsql_destroy_econtext --- destroy function's econtext
*
* We check that it matches the top stack entry, and destroy the stack
* entry along with the context.
*/
static void
plpgsql_destroy_econtext(PLpgSQL_execstate *estate)
{
SimpleEcontextStackEntry *next;
Assert(simple_econtext_stack != NULL);
Assert(simple_econtext_stack->stack_econtext == estate->eval_econtext);
next = simple_econtext_stack->next;
pfree(simple_econtext_stack);
simple_econtext_stack = next;
FreeExprContext(estate->eval_econtext, true);
estate->eval_econtext = NULL;
}
/*
* plpgsql_xact_cb --- post-transaction-commit-or-abort cleanup
*
* If a simple-expression EState was created in the current transaction,
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* it has to be cleaned up. The same for the simple-expression resowner.
*/
void
plpgsql_xact_cb(XactEvent event, void *arg)
{
/*
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
* If we are doing a clean transaction shutdown, free the EState and tell
* the resowner to release whatever plancache references it has, so that
* all remaining resources will be released correctly. (We don't need to
* actually delete the resowner here; deletion of the
* TopTransactionResourceOwner will take care of that.)
*
* In an abort, we expect the regular abort recovery procedures to release
* everything of interest, so just clear our pointers.
*/
if (event == XACT_EVENT_COMMIT ||
event == XACT_EVENT_PARALLEL_COMMIT ||
event == XACT_EVENT_PREPARE)
{
simple_econtext_stack = NULL;
if (shared_simple_eval_estate)
FreeExecutorState(shared_simple_eval_estate);
shared_simple_eval_estate = NULL;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
if (shared_simple_eval_resowner)
ResourceOwnerReleaseAllPlanCacheRefs(shared_simple_eval_resowner);
shared_simple_eval_resowner = NULL;
}
else if (event == XACT_EVENT_ABORT ||
event == XACT_EVENT_PARALLEL_ABORT)
{
simple_econtext_stack = NULL;
shared_simple_eval_estate = NULL;
Improve performance of "simple expressions" in PL/pgSQL. For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 23:58:57 +01:00
shared_simple_eval_resowner = NULL;
}
}
/*
* plpgsql_subxact_cb --- post-subtransaction-commit-or-abort cleanup
*
* Make sure any simple-expression econtexts created in the current
* subtransaction get cleaned up. We have to do this explicitly because
* no other code knows which econtexts belong to which level of subxact.
*/
void
plpgsql_subxact_cb(SubXactEvent event, SubTransactionId mySubid,
SubTransactionId parentSubid, void *arg)
{
if (event == SUBXACT_EVENT_COMMIT_SUB || event == SUBXACT_EVENT_ABORT_SUB)
{
while (simple_econtext_stack != NULL &&
simple_econtext_stack->xact_subxid == mySubid)
{
SimpleEcontextStackEntry *next;
FreeExprContext(simple_econtext_stack->stack_econtext,
(event == SUBXACT_EVENT_COMMIT_SUB));
next = simple_econtext_stack->next;
pfree(simple_econtext_stack);
simple_econtext_stack = next;
}
}
}
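/*
 * Illustrative note (a sketch): PL/pgSQL's BEGIN ... EXCEPTION blocks run
 * inside subtransactions, so when an error escapes such a block, any
 * econtexts whose owning functions never got to clean up after themselves
 * are released here.
 */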
/*
* assign_simple_var --- assign a new value to any VAR datum.
*
* This should be the only mechanism for assignment to simple variables,
* lest we do the release of the old value incorrectly (not to mention
* the detoasting business).
*/
static void
assign_simple_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
Datum newvalue, bool isnull, bool freeable)
{
Assert(var->dtype == PLPGSQL_DTYPE_VAR ||
var->dtype == PLPGSQL_DTYPE_PROMISE);
/*
* In non-atomic contexts, we do not want to store TOAST pointers in
* variables, because such pointers might become stale after a commit.
* Forcibly detoast in such cases. We don't want to detoast (flatten)
* expanded objects, however; those should be OK across a transaction
* boundary since they're just memory-resident objects. (Elsewhere in
* this module, operations on expanded records likewise need to request
* detoasting of record fields when !estate->atomic. Expanded arrays are
* not a problem since all array entries are always detoasted.)
*/
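/*
 * A sketch of the hazard (illustrative only): in a procedure running
 * non-atomically,
 *
 *     SELECT bigval INTO v FROM t ...;  -- v could be a TOAST pointer
 *     COMMIT;                           -- the pointer may now be stale
 *     RAISE NOTICE '%', v;              -- unsafe without detoasting
 *
 * hence the forced detoasting below when !estate->atomic.
 */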
if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
{
MemoryContext oldcxt;
Datum detoasted;
/*
* Do the detoasting in the eval_mcontext to avoid long-term leakage
* of whatever memory toast fetching might leak. Then we have to copy
* the detoasted datum to the function's main context, which is a
* pain, but there's little choice.
*/
oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
detoasted = PointerGetDatum(detoast_external_attr((struct varlena *) DatumGetPointer(newvalue)));
MemoryContextSwitchTo(oldcxt);
/* Free the input value now, if it's freeable, so we don't leak it */
if (freeable)
pfree(DatumGetPointer(newvalue));
/* Once we copy the value, it's definitely freeable */
newvalue = datumCopy(detoasted, false, -1);
freeable = true;
/* Can't clean up eval_mcontext here, but it'll happen before long */
}
/* Free the old value if needed */
if (var->freeval)
{
Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
if (DatumIsReadWriteExpandedObject(var->value,
var->isnull,
var->datatype->typlen))
DeleteExpandedObject(var->value);
else
pfree(DatumGetPointer(var->value));
}
/* Assign new value to datum */
var->value = newvalue;
var->isnull = isnull;
var->freeval = freeable;
/*
* If it's a promise variable, then either we just assigned the promised
* value, or the user explicitly assigned an overriding value. Either
* way, cancel the promise.
*/
var->promise = PLPGSQL_PROMISE_NONE;
}
/*
* free old value of a text variable and assign new value from C string
*/
static void
assign_text_var(PLpgSQL_execstate *estate, PLpgSQL_var *var, const char *str)
{
assign_simple_var(estate, var, CStringGetTextDatum(str), false, true);
}
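/*
 * For instance (a sketch, not quoted from the original file), exception
 * handling installs the error state into the special text variables with
 * calls along the lines of
 *
 *     assign_text_var(estate, state_var, unpack_sql_state(edata->sqlerrcode));
 *     assign_text_var(estate, errm_var, edata->message);
 */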
/*
* assign_record_var --- assign a new value to any REC datum.
*/
static void
assign_record_var(PLpgSQL_execstate *estate, PLpgSQL_rec *rec,
ExpandedRecordHeader *erh)
{
Assert(rec->dtype == PLPGSQL_DTYPE_REC);
/* Transfer new record object into datum_context */
TransferExpandedRecord(erh, estate->datum_context);
/* Free the old value ... */
if (rec->erh)
DeleteExpandedObject(ExpandedRecordGetDatum(rec->erh));
/* ... and install the new */
rec->erh = erh;
}
/*
* exec_eval_using_params --- evaluate params of USING clause
*
* The result data structure is created in the stmt_mcontext, and should
* be freed by resetting that context.
*/
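/*
 * Illustrative context (not from the original code): this supports
 * PL/pgSQL statements such as
 *
 *     EXECUTE 'SELECT $1 + $2' INTO v USING x, y;
 *
 * where each USING expression becomes one entry in the returned
 * ParamListInfo.
 */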
static ParamListInfo
exec_eval_using_params(PLpgSQL_execstate *estate, List *params)
{
ParamListInfo paramLI;
int nargs;
MemoryContext stmt_mcontext;
MemoryContext oldcontext;
int i;
ListCell *lc;
/* Fast path for no parameters: we can just return NULL */
if (params == NIL)
return NULL;
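	/*
	 * Note: the ParamListInfo, and any parameter values copied below, are
	 * allocated in stmt_mcontext so that they survive the per-expression
	 * exec_eval_cleanup() calls and remain valid for the caller.
	 */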
nargs = list_length(params);
stmt_mcontext = get_stmt_mcontext(estate);
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
paramLI = makeParamList(nargs);
	MemoryContextSwitchTo(oldcontext);

	i = 0;
foreach(lc, params)
{
PLpgSQL_expr *param = (PLpgSQL_expr *) lfirst(lc);
ParamExternData *prm = &paramLI->params[i];
int32 ppdtypmod;
/*
* Always mark params as const, since we only use the result with
* one-shot plans.
*/
prm->pflags = PARAM_FLAG_CONST;
prm->value = exec_eval_expr(estate, param,
&prm->isnull,
&prm->ptype,
&ppdtypmod);
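		/* Coercion/copying work for this value must have statement lifetime */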
oldcontext = MemoryContextSwitchTo(stmt_mcontext);
if (prm->ptype == UNKNOWNOID)
{
/*
* Treat 'unknown' parameters as text, since that's what most
* people would expect. The SPI functions can coerce unknown
* constants in a more intelligent way, but not unknown Params.
* This code also takes care of copying into the right context.
			 * Note we assume 'unknown' values are represented as C strings.
*/
prm->ptype = TEXTOID;
if (!prm->isnull)
prm->value = CStringGetTextDatum(DatumGetCString(prm->value));
}
		/* pass-by-ref non-null values must be copied into stmt_mcontext */
else if (!prm->isnull)
{
int16 typLen;
bool typByVal;
get_typlenbyval(prm->ptype, &typLen, &typByVal);
if (!typByVal)
prm->value = datumCopy(prm->value, typByVal, typLen);
}
MemoryContextSwitchTo(oldcontext);
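
		/* Safe to discard evaluation data now; by-ref values were copied */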
exec_eval_cleanup(estate);
i++;
}
return paramLI;
}

/*
* Open portal for dynamic query
*
* Caution: this resets the stmt_mcontext at exit. We might eventually need
* to move that responsibility to the callers, but currently no caller needs
* to have statement-lifetime temp data that survives past this, so it's
* simpler to do it here.
*/
static Portal
exec_dynquery_with_params(PLpgSQL_execstate *estate,
PLpgSQL_expr *dynquery,
List *params,
const char *portalname,
int cursorOptions)
{
Portal portal;
Datum query;
bool isnull;
Oid restype;
int32 restypmod;
char *querystr;
SPIParseOpenOptions options;
	MemoryContext stmt_mcontext = get_stmt_mcontext(estate);

/*
* Evaluate the string expression after the EXECUTE keyword. Its result is
* the querystring we have to execute.
*/
query = exec_eval_expr(estate, dynquery, &isnull, &restype, &restypmod);
if (isnull)
ereport(ERROR,
(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
errmsg("query string argument of EXECUTE is null")));
/* Get the C-String representation */
querystr = convert_value_to_string(estate, query, restype);
/* copy it into the stmt_mcontext before we clean up */
querystr = MemoryContextStrdup(stmt_mcontext, querystr);

	exec_eval_cleanup(estate);

/*
* Open an implicit cursor for the query. We use SPI_cursor_parse_open
* even when there are no params, because this avoids making and freeing
* one copy of the plan.
*/
memset(&options, 0, sizeof(options));
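	/* exec_eval_using_params() returns NULL for an empty list; SPI allows that */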
options.params = exec_eval_using_params(estate, params);
options.cursorOptions = cursorOptions;
options.read_only = estate->readonly_func;
portal = SPI_cursor_parse_open(portalname, querystr, &options);
if (portal == NULL)
elog(ERROR, "could not open implicit cursor for query \"%s\": %s",
querystr, SPI_result_code_string(SPI_result));
/* Release transient data */
MemoryContextReset(stmt_mcontext);
return portal;
}

/*
* Return a formatted string with information about an expression's parameters,
* or NULL if the expression does not take any parameters.
 *
* The result is in the eval_mcontext.
*/
static char *
format_expr_params(PLpgSQL_execstate *estate,
const PLpgSQL_expr *expr)
{
int paramno;
int dno;
StringInfoData paramstr;
MemoryContext oldcontext;
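
	/* Nothing to report if the expression references no plpgsql variables */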
if (!expr->paramnos)
return NULL;
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
initStringInfo(&paramstr);
paramno = 0;
dno = -1;
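	/* Walk the dnos of the plpgsql variables referenced by the expression */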
while ((dno = bms_next_member(expr->paramnos, dno)) >= 0)
{
Datum paramdatum;
Oid paramtypeid;
bool paramisnull;
int32 paramtypmod;
PLpgSQL_var *curvar;
curvar = (PLpgSQL_var *) estate->datums[dno];
exec_eval_datum(estate, (PLpgSQL_datum *) curvar,
&paramtypeid, &paramtypmod,
&paramdatum, &paramisnull);
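
		/* Emit "name = 'value'" (or "name = NULL"), comma-separated */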
appendStringInfo(&paramstr, "%s%s = ",
paramno > 0 ? ", " : "",
curvar->refname);
if (paramisnull)
appendStringInfoString(&paramstr, "NULL");
else
appendStringInfoStringQuoted(&paramstr,
convert_value_to_string(estate,
paramdatum,
paramtypeid),
-1);
paramno++;
}

	MemoryContextSwitchTo(oldcontext);
	return paramstr.data;
}
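
/*
 * Illustration only (not part of pl_exec.c): a minimal, self-contained
 * sketch of the "name = value, name = value" layout that the loop above
 * builds for an error context line, using plain snprintf instead of
 * StringInfo.  The demo_names/demo_values arrays and the
 * PL_EXEC_FORMAT_DEMO guard are hypothetical.
 */
#ifdef PL_EXEC_FORMAT_DEMO
#include <stdio.h>

static void
demo_format_params(void)
{
	const char *demo_names[] = {"x", "y"};
	const char *demo_values[] = {"'42'", "NULL"};	/* NULLs print unquoted */
	char		buf[128];
	int			len = 0;

	/* same leading-separator trick as above: ", " before all but the first */
	for (int i = 0; i < 2; i++)
		len += snprintf(buf + len, sizeof(buf) - len, "%s%s = %s",
						i > 0 ? ", " : "", demo_names[i], demo_values[i]);
	printf("%s\n", buf);		/* prints: x = '42', y = NULL */
}
#endif							/* PL_EXEC_FORMAT_DEMO */
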
/*
 * Return a formatted string with information about the parameter values,
 * or NULL if there are no parameters.
 * The result is in the eval_mcontext.
 */
static char *
format_preparedparamsdata(PLpgSQL_execstate *estate,
						  ParamListInfo paramLI)
{
	int			paramno;
	StringInfoData paramstr;
	MemoryContext oldcontext;

	if (!paramLI)
		return NULL;

	oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));

	initStringInfo(&paramstr);
	for (paramno = 0; paramno < paramLI->numParams; paramno++)
	{
		ParamExternData *prm = &paramLI->params[paramno];

		/*
		 * Note: for now, this is only used on ParamListInfos produced by
		 * exec_eval_using_params(), so we don't worry about invoking the
		 * paramFetch hook or skipping unused parameters.
		 */
		appendStringInfo(&paramstr, "%s$%d = ",
						 paramno > 0 ? ", " : "",
						 paramno + 1);
		if (prm->isnull)
			appendStringInfoString(&paramstr, "NULL");
		else
			appendStringInfoStringQuoted(&paramstr,
										 convert_value_to_string(estate,
																 prm->value,
																 prm->ptype),
										 -1);
	}

	MemoryContextSwitchTo(oldcontext);
	return paramstr.data;
}
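
/*
 * Hypothetical usage sketch (not part of the real file): the string built
 * by format_preparedparamsdata() is intended for error reporting, so a
 * caller shaped like an error context callback might attach it as below.
 * The callback name demo_params_error_callback and the choice of
 * estate->paramLI as the parameter list are assumptions for illustration;
 * errcontext() is the standard elog.h facility for context callbacks.
 */
#ifdef PL_EXEC_USAGE_SKETCH
static void
demo_params_error_callback(void *arg)
{
	PLpgSQL_execstate *estate = (PLpgSQL_execstate *) arg;
	char	   *paramstr = format_preparedparamsdata(estate,
													 estate->paramLI);

	/*
	 * NULL means the statement had no parameters, so emit nothing.  The
	 * result lives in the eval_mcontext, so no explicit pfree is needed.
	 */
	if (paramstr)
		errcontext("with parameters: %s", paramstr);
}
#endif							/* PL_EXEC_USAGE_SKETCH */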