Commit Graph

924 Commits

Author SHA1 Message Date
Tom Lane dfcc46fe30 Remove extra increment of plpgsql's statement counter for FOR loops.
This left gaps in the internal statement numbering, which is not
terribly harmful (else we'd have noticed sooner), but it's not
great either.

Oversight in bbd5c207b; backpatch to v12 where that came in.

Pavel Stehule

Discussion: https://postgr.es/m/CAFj8pRDXyQaJmpotNTQVc-t-WxdWZC35V2PnmwOaV1-taidFWA@mail.gmail.com
2021-02-02 14:35:12 -05:00
Tom Lane d5a83d79c9 Rethink recently-added SPI interfaces.
SPI_execute_with_receiver and SPI_cursor_parse_open_with_paramlist are
new in v14 (cf. commit 2f48ede08).  Before they can get out the door,
let's change their APIs to follow the practice recently established by
SPI_prepare_extended etc: shove all optional arguments into a struct
that callers are supposed to pre-zero.  The hope is to allow future
addition of more options without either API breakage or a continuing
proliferation of new SPI entry points.  With that in mind, choose
slightly more generic names for them: SPI_execute_extended and
SPI_cursor_parse_open respectively.

Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-26 16:37:12 -05:00
Tom Lane 7292fd8f1c Suppress compiler warnings from commit ee895a655.
For obscure reasons, some buildfarm members are now generating
complaints about plpgsql_call_handler's "retval" variable possibly
being used uninitialized.  It seems no less safe than it was before
that commit, but these complaints are (mostly?) new.  I trust that
initializing the variable where it's declared will be enough to
shut that up.

I also notice that some compilers are warning about setjmp clobber
of the same variable, which is maybe a bit more defensible.  Mark
it volatile to silence that.

Also, rearrange the logic to give procedure_resowner a single
point of initialization, in hopes of silencing some setjmp-clobber
warnings about that.  (Marking it volatile would serve too, but
its sibling variables are depending on single assignment, so let's
stick with that method.)

Discussion: https://postgr.es/m/E1l4F1z-0000cN-Lx@gemulon.postgresql.org
2021-01-26 13:58:18 -05:00
Tom Lane ee895a655c Improve performance of repeated CALLs within plpgsql procedures.
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4.  That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context.  That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did.  While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.

This patch therefore makes the following API changes:

* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.

* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan.  This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).

* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot".  It's better to treat that as an option of
SPI_prepare_extended.

Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands.  (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.

Pavel Stehule, with some further hacking by me

Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-25 22:28:29 -05:00
Tom Lane 39d4a15310 Fix plpgsql tests for debug_invalidate_system_caches_always.
Commit c9d529848 resulted in having a couple more places where
the error context stack for a failure varies depending on
debug_invalidate_system_caches_always (nee CLOBBER_CACHE_ALWAYS).
This is not very surprising, since we have to re-parse cached
plans if the plan cache is clobbered.  Stabilize the expected
test output by hiding the context stack in these places,
as we've done elsewhere in this test script.

(Another idea worth considering, now that we have
debug_invalidate_system_caches_always, is to force it to zero for
these test cases.  That seems like it'd risk reducing the coverage
of cache-clobber testing, which might or might not be worth being
able to verify that we get the expected error output in normal
cases.  For the moment I just stuck with the existing technique.)

In passing, update comments that referred to CLOBBER_CACHE_ALWAYS.

Per buildfarm member hyrax.
2021-01-08 18:12:07 -05:00
Tom Lane 1c1cbe279b Rethink the "read/write parameter" mechanism in pl/pgsql.
Performance issues with the preceding patch to re-implement array
element assignment within pl/pgsql led me to realize that the read/write
parameter mechanism is misdesigned.  Instead of requiring the assignment
source expression to be such that *all* its references to the target
variable could be passed as R/W, we really want to identify *one*
reference to the target variable to be passed as R/W, allowing any other
ones to be passed read/only as they would be by default.  As long as the
R/W reference is a direct argument to the top-level (hence last to be
executed) function in the expression, there is no harm in R/O references
being passed to other lower parts of the expression.  Nor is there any
use-case for more than one argument of the top-level function being R/W.

Hence, rewrite that logic to identify one single Param that references
the target variable, and make only that Param pass a read/write
reference, not any other Params referencing the target variable.

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 12:39:27 -05:00
Tom Lane 1788828d33 Remove PLPGSQL_DTYPE_ARRAYELEM datum type within pl/pgsql.
In the wake of the previous commit, we don't really need this anymore,
since array assignment is primarily handled by the core code.

The only way that that code could still be reached is that a GET
DIAGNOSTICS target variable could be an array element.  But that
doesn't seem like a particularly essential feature.  I'd added it
in commit 55caaaeba, but just because it was easy not because
anyone had actually asked for it.  Hence, revert that patch and
then remove the now-unreachable stuff.  (If we really had to,
we could probably reimplement GET DIAGNOSTICS using the new
assignment machinery; but the cost/benefit ratio looks very poor,
and it'd likely be a bit slower.)

Note that PLPGSQL_DTYPE_RECFIELD remains.  It's possible that we
could get rid of that too, but maintaining the existing behaviors
for RECORD-type variables seems like it might be difficult.  Since
there's not any functional limitation in those code paths as there
was in the ARRAYELEM code, I've not pursued the idea.

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 12:14:37 -05:00
Tom Lane c9d5298485 Re-implement pl/pgsql's expression and assignment parsing.
Invent new RawParseModes that allow the core grammar to handle
pl/pgsql expressions and assignments directly, and thereby get rid
of a lot of hackery in pl/pgsql's parser.  This moves a good deal
of knowledge about pl/pgsql into the core code: notably, we have to
invent a CoercionContext that matches pl/pgsql's (rather dubious)
historical behavior for assignment coercions.  That's getting away
from the original idea of pl/pgsql as an arm's-length extension of
the core, but really we crossed that bridge a long time ago.

The main advantage of doing this is that we can now use the core
parser to generate FieldStore and/or SubscriptingRef nodes to handle
assignments to pl/pgsql variables that are records or arrays.  That
fixes a number of cases that had never been implemented in pl/pgsql
assignment, such as nested records and array slicing, and it allows
pl/pgsql assignment to support the datatype-specific subscripting
behaviors introduced in commit c7aba7c14.

There are cosmetic benefits too: when a syntax error occurs in a
pl/pgsql expression, the error report no longer includes the confusing
"SELECT" keyword that used to get prefixed to the expression text.
Also, there seem to be some small speed gains.

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 11:52:00 -05:00
Tom Lane 844fe9f159 Add the ability for the core grammar to have more than one parse target.
This patch essentially allows gram.y to implement a family of related
syntax trees, rather than necessarily always parsing a list of SQL
statements.  raw_parser() gains a new argument, enum RawParseMode,
to say what to do.  As proof of concept, add a mode that just parses
a TypeName without any other decoration, and use that to greatly
simplify typeStringToTypeName().

In addition, invent a new SPI entry point SPI_prepare_extended() to
allow SPI users (particularly plpgsql) to get at this new functionality.
In hopes of making this the last variant of SPI_prepare(), set up its
additional arguments as a struct rather than direct arguments, and
promise that future additions to the struct can default to zero.
SPI_prepare_cursor() and SPI_prepare_params() can perhaps go away at
some point.

Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
2021-01-04 11:03:22 -05:00
Bruce Momjian ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Tom Lane 5f2e09bccc Further fix thinko in plpgsql memory leak fix.
There's a second call of get_eval_mcontext() that should also be
get_stmt_mcontext().  This is actually dead code, since no
interesting allocations happen before switching back to the
original context, but we should keep it in sync with the other
call to forestall possible future bugs.

Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com
2020-12-28 11:55:23 -05:00
Tom Lane ea80d8d943 Fix thinko in plpgsql memory leak fix.
Commit a6b1f5365 intended to place the transient "target" list of
a CALL statement in the function's statement-lifespan context,
but I fat-fingered that and used get_eval_mcontext() instead of
get_stmt_mcontext().  The eval_mcontext belongs to the "simple
expression" infrastructure, which is destroyed at transaction end.
The net effect is that a CALL in a procedure to another procedure
that has OUT or INOUT parameters would fail if the called procedure
did a COMMIT.

Per report from Peter Eisentraut.  Back-patch to v11, like the
prior patch.

Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com
2020-12-28 11:41:25 -05:00
Alexander Korotkov 6df7a9698b Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.

Since v14, each range type automatically gets a corresponding multirange
datatype.  There are both manual and automatic mechanisms for naming multirange
types.  Once can specify multirange type name using multirange_type_name
attribute in CREATE TYPE.  Otherwise, a multirange type name is generated
automatically.  If the range type name contains "range" then we change that to
"multirange".  Otherwise, we add "_multirange" to the end.

Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids.  Altogether this format allows fetching a particular range by its index
in O(n).

Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.

Catversion is bumped.

Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 07:20:33 +03:00
Tom Lane b3817f5f77 Improve hash_create()'s API for some added robustness.
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).

Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes.  This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)

Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize.  Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.

Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one.  This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not.  We might as well save a few
cycles by standardizing on "not".

Also improve the documentation for hash_create().

In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.

Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-15 11:38:53 -05:00
Tom Lane c7aba7c14e Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means.  Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers.  (This patch provides no such new
features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler.  On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines.  (Thus, essentially
no new run-time overhead should be caused by this patch.  Indeed,
there is room to remove some overhead by supplying specialized
execution routines.  This patch does a little bit in that line,
but more could be done.)

Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER.  For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.

This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09 12:40:37 -05:00
Tom Lane f90149e628 Don't use custom OID symbols in pg_type.dat, either.
On the same reasoning as in commit 36b931214, forbid using custom
oid_symbol macros in pg_type as well as pg_proc, so that we always
rely on the predictable macro names generated by genbki.pl.

We do continue to grant grandfather status to the names CASHOID and
LSNOID, although those are now considered deprecated aliases for the
preferred names MONEYOID and PG_LSNOID.  This is because there's
likely to be client-side code using the old names, and this bout of
neatnik-ism doesn't quite seem worth breaking client code.

There might be a case for grandfathering EVTTRIGGEROID, too, since
externally-maintained PLs may reference that symbol.  But renaming
such references to EVENT_TRIGGEROID doesn't seem like a particularly
heavy lift --- we make far more significant backend API changes in
every major release.  For now I didn't add that, but we could
reconsider if there's pushback.

The other names changed here seem pretty unlikely to have any outside
uses.  Again, we could add alias macros if there are complaints, but
for now I didn't.

As before, no need for a catversion bump.

John Naylor

Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
2020-10-29 13:33:38 -04:00
Peter Eisentraut 2453ea1422 Support for OUT parameters in procedures
Unlike for functions, OUT parameters for procedures are part of the
signature.  Therefore, they have to be listed in pg_proc.proargtypes
as well as mentioned in ALTER PROCEDURE and DROP PROCEDURE.

Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/2b8490fe-51af-e671-c504-47359dc453c5@2ndquadrant.com
2020-10-05 09:21:43 +02:00
Tom Lane a6b1f5365d Fix memory leak in plpgsql's CALL processing.
When executing a CALL or DO in a non-atomic context (i.e., not inside
a function or query), plpgsql creates a new plan each time through,
as a rather hacky solution to some resource management issues.  But
it failed to free this plan until exit of the current procedure or DO
block, resulting in serious memory bloat in procedures that called
other procedures many times.  Fix by remembering to free the plan,
and by being more honest about restoring the previous state (otherwise,
recursive procedure calls have a problem).

There was also a smaller leak associated with recalculation of the
"target" list of output variables.  Fix that by using the statement-
lifespan context to hold non-permanent values.

Back-patch to v11 where procedures were introduced.

Pavel Stehule and Tom Lane

Discussion: https://postgr.es/m/CAFj8pRDiiU1dqym+_P4_GuTWm76knJu7z9opWayBJTC0nQGUUA@mail.gmail.com
2020-09-29 11:18:30 -04:00
Tom Lane f859c2ffa0 Fix a few more generator scripts to produce pgindent-clean output.
This completes the project of making all our derived files be
pgindent-clean (or else explicitly excluded from indentation),
so that no surprises result when running pgindent in a built-out
development tree.

Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-21 13:58:26 -04:00
Peter Eisentraut 80fc96eceb Standardize order of use strict and use warnings in Perl code
The standard order in PostgreSQL and other code is use strict first,
but some code was uselessly inconsistent about this.
2020-09-21 17:04:36 +02:00
Peter Eisentraut fd5e3b2914 Remove unused parameter
unused since 39bd3fd1db

Discussion: https://www.postgresql.org/message-id/flat/511bb100-f829-ba21-2f10-9f952ec06ead%402ndquadrant.com
2020-09-02 15:17:33 +02:00
Tom Lane f3faf35f37 Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites.  But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).

So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them.  This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.

Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value.  Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.

Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.

Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 15:43:22 -04:00
Tom Lane fe2e206cdb Inline the fast path of plpgsql's exec_cast_value().
In the common case where this function isn't actually asked to perform
any type conversion, there's nothing it has to do beyond comparing the
arguments.  Arrange for that part to be inlined into callers, with the
slower path remaining out-of-line.  This seems to be good for several
percent speedup on simple cases, with only minimal code bloat.

Amit Khandekar

Discussion: https://postgr.es/m/CAJ3gD9eBNrmUD7WBBLG8ohaZ485H9y+4eihQTgr+K8Lhka3vcQ@mail.gmail.com
2020-07-05 13:12:31 -04:00
Tom Lane 1f902d499e Inline plpgsql's exec_stmt() into exec_stmts().
This saves one level of C function call per plpgsql statement executed,
and permits a tiny additional optimization of not saving and restoring
estate->err_stmt for each statement in a block.  The net effect seems
nearly un-measurable on x86_64, but there's a clear win on aarch64,
amounting to two or three percent in a loop over a few simple plpgsql
statements.

To do this, we have to get rid of the other existing call sites for
exec_stmt().  Replace them with exec_toplevel_block(), which is just
defined to do what exec_stmts() does, but for a single
PLpgSQL_stmt_block statement.  Hard-wiring the expectation of which
statement type applies here allows us to skip the dispatch switch,
making this not much uglier than the previous factorization.

Amit Khandekar, tweaked a bit by me

Discussion: https://postgr.es/m/CAJ3gD9eBNrmUD7WBBLG8ohaZ485H9y+4eihQTgr+K8Lhka3vcQ@mail.gmail.com
2020-07-03 15:42:10 -04:00
Tom Lane 2f48ede080 Avoid using a cursor in plpgsql's RETURN QUERY statement.
plpgsql has always executed the query given in a RETURN QUERY command
by opening it as a cursor and then fetching a few rows at a time,
which it turns around and dumps into the function's result tuplestore.
The point of this was to keep from blowing out memory with an oversized
SPITupleTable result (note that while a tuplestore can spill tuples
to disk, SPITupleTable cannot).  However, it's rather inefficient, both
because of extra data copying and because of executor entry/exit
overhead.  In recent versions, a new performance problem has emerged:
use of a cursor prevents use of a parallel plan for the executed query.

We can improve matters by skipping use of a cursor and having the
executor push result tuples directly into the function's result
tuplestore.  However, a moderate amount of new infrastructure is needed
to make that idea work:

* We can use the existing tstoreReceiver.c DestReceiver code to funnel
executor output to the tuplestore, but it has to be extended to support
plpgsql's requirement for possibly applying a tuple conversion map.

* SPI needs to be extended to allow use of a caller-supplied
DestReceiver instead of its usual receiver that puts tuples into
a SPITupleTable.  Two new API calls are needed to handle both the
RETURN QUERY and RETURN QUERY EXECUTE cases.

I also felt that I didn't want these new API calls to use the legacy
method of specifying query parameter values with "char" null flags
(the old ' '/'n' convention); rather they should accept ParamListInfo
objects containing the parameter type and value info.  This required
a bit of additional new infrastructure since we didn't yet have any
parse analysis callback that would interpret $N parameter symbols
according to type data supplied in a ParamListInfo.  There seems to be
no harm in letting makeParamList install that callback by default,
rather than leaving a new ParamListInfo's parserSetup hook as NULL.
(Indeed, as of HEAD, I couldn't find anyplace that was using the
parserSetup field at all; plpgsql was using parserSetupArg for its
own purposes, but parserSetup seemed to be write-only.)

We can actually get plpgsql out of the business of using legacy null
flags altogether, and using ParamListInfo instead of its ad-hoc
PreparedParamsData structure; but this requires inventing one more
SPI API call that can replace SPI_cursor_open_with_args.  That seems
worth doing, though.

SPI_execute_with_args and SPI_cursor_open_with_args are now unused
anywhere in the core PG distribution.  Perhaps someday we could
deprecate/remove them.  But cleaning up the crufty bits of the SPI
API is a task for a different patch.

Per bug #16040 from Jeremy Smith.  This is unfortunately too invasive to
consider back-patching.  Patch by me; thanks to Hamid Akhtar for review.

Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
2020-06-12 12:14:32 -04:00
Peter Eisentraut 350f47786c Spelling adjustments
similar to 0fd2a79a63
2020-06-09 10:41:41 +02:00
Peter Eisentraut ac449d8801 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 031ca65d7825c3e539a3e62ea9d6630af12e6b6b
2020-05-18 12:49:30 +02:00
Tom Lane fa27dd40d5 Run pgindent with new pg_bsd_indent version 2.1.1.
Thomas Munro fixed a longstanding annoyance in pg_bsd_indent, that
it would misformat lines containing IsA() macros on the assumption
that the IsA() call should be treated like a cast.  This improves
some other cases involving field/variable names that match typedefs,
too.  The only places that get worse are a couple of uses of the
OpenSSL macro STACK_OF(); we'll gladly take that trade-off.

Discussion: https://postgr.es/m/20200114221814.GA19630@alvherre.pgsql
2020-05-16 11:54:51 -04:00
Alvaro Herrera 17cc133f01
Dial back -Wimplicit-fallthrough to level 3
The additional pain from level 4 is excessive for the gain.

Also revert all the source annotation changes to their original
wordings, to avoid back-patching pain.

Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us
2020-05-13 15:31:14 -04:00
Alvaro Herrera 3e9744465d
Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS
Use it at level 4, a bit more restrictive than the default level, and
tweak our commanding comments to FALLTHROUGH.

(However, leave zic.c alone, since it's external code; to avoid the
warnings that would appear there, change CFLAGS for that file in the
Makefile.)

Author: Julien Rouhaud <rjuju123@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol
Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org
2020-05-12 16:07:30 -04:00
Peter Eisentraut 7a9c9ce641 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 80d8f54b3c5533ec036404bd3c3b24ff4825d037
2020-05-11 13:14:32 +02:00
Tom Lane 0b34e7d307 Improve user control over truncation of logged bind-parameter values.
This patch replaces the boolean GUC log_parameters_on_error introduced
by commit ba79cb5dc with an integer log_parameter_max_length_on_error,
adding the ability to specify how many bytes to trim each logged
parameter value to.  (The previous coding hard-wired that choice at
64 bytes.)

In addition, add a new parameter log_parameter_max_length that provides
similar control over truncation of query parameters that are logged in
response to statement-logging options, as opposed to errors.  Previous
releases always logged such parameters in full, possibly causing log
bloat.

For backwards compatibility with prior releases,
log_parameter_max_length defaults to -1 (log in full), while
log_parameter_max_length_on_error defaults to 0 (no logging).

Per discussion, log_parameter_max_length is SUSET since the DBA should
control routine logging behavior, but log_parameter_max_length_on_error
is USERSET because it also affects errcontext data sent back to the
client.

Alexey Bashtanov, editorialized a little by me

Discussion: https://postgr.es/m/b10493cc-a399-a03a-67c7-068f2791ee50@imap.cc
2020-04-02 15:04:51 -04:00
Tom Lane fbc7a71608 Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time.  We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill.  (We can assume that the plan is valid, it's just
possibly a bit stale.  Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.)  Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.

Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.

Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 14:47:34 -04:00
Tom Lane 8f59f6b9c0 Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.

First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).

The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session.  This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.

Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation.  The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.

Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.

Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here.  Also thanks to Andres Freund for review.

Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 18:58:57 -04:00
Tom Lane 86e5badd22 Ensure that plpgsql cleans up cleanly during parallel-worker exit.
plpgsql_xact_cb ought to treat events XACT_EVENT_PARALLEL_COMMIT and
XACT_EVENT_PARALLEL_ABORT like XACT_EVENT_COMMIT and XACT_EVENT_ABORT
respectively, since its goal is to do process-local cleanup.  This
oversight caused plpgsql's end-of-transaction cleanup to not get done
in parallel workers.  Since a parallel worker will exit just after the
transaction cleanup, the effects of this are limited.  I couldn't find
any case in the core code with user-visible effects, but perhaps there
are some in extensions.  In any case it's wrong, so let's fix it before
it bites us not after.

In passing, add some comments around the handling of expression
evaluation resources in DO blocks.  There's no live bug there, but it's
quite unobvious what's happening; at least I thought so.  This isn't
related to the other issue, except that I found both things while poking
at expression-evaluation performance.

Back-patch the plpgsql_xact_cb fix to 9.5 where those event types
were introduced, and the DO-block commentary to v11 where DO blocks
gained the ability to issue COMMIT/ROLLBACK.

Discussion: https://postgr.es/m/10353.1585247879@sss.pgh.pa.us
2020-03-26 18:06:55 -04:00
Tom Lane bda6dedbea Go back to returning int from ereport auxiliary functions.
This reverts the parts of commit 17a28b0364
that changed ereport's auxiliary functions from returning dummy integer
values to returning void.  It turns out that a minority of compilers
complain (not entirely unreasonably) about constructs such as

	(condition) ? errdetail(...) : 0

if errdetail() returns void rather than int.  We could update those
call sites to say "(void) 0" perhaps, but the expectation for this
patch set was that ereport callers would not have to change anything.
And this aspect of the patch set was already the most invasive and
least compelling part of it, so let's just drop it.

Per buildfarm.

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-25 11:57:36 -04:00
Tom Lane 17a28b0364 Improve the internal implementation of ereport().
Change all the auxiliary error-reporting routines to return void,
now that we no longer need to pretend they are passing something
useful to errfinish().  While this probably doesn't save anything
significant at the machine-code level, it allows detection of some
additional types of mistakes.

Pass the error location details (__FILE__, __LINE__, PG_FUNCNAME_MACRO)
to errfinish not errstart.  This shaves a few cycles off the case where
errstart decides we're not going to emit anything.

Re-implement elog() as a trivial wrapper around ereport(), removing
the separate support infrastructure it used to have.  Aside from
getting rid of some now-surplus code, this means that elog() now
really does have exactly the same semantics as ereport(), in particular
that it can skip evaluation work if the message is not to be emitted.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-24 12:08:48 -04:00
Tom Lane 24e2885ee3 Introduce "anycompatible" family of polymorphic types.
This patch adds the pseudo-types anycompatible, anycompatiblearray,
anycompatiblenonarray, and anycompatiblerange.  They work much like
anyelement, anyarray, anynonarray, and anyrange respectively, except
that the actual input values need not match precisely in type.
Instead, if we can find a common supertype (using the same rules
as for UNION/CASE type resolution), then the parser automatically
promotes the input values to that type.  For example,
"myfunc(anycompatible, anycompatible)" can match a call with one
integer and one bigint argument, with the integer automatically
promoted to bigint.  With anyelement in the definition, the user
would have had to cast the integer explicitly.

The new types also provide a second, independent set of type variables
for function matching; thus with "myfunc(anyelement, anyelement,
anycompatible) returns anycompatible" the first two arguments are
constrained to be the same type, but the third can be some other
type, and the result has the type of the third argument.  The need
for more than one set of type variables was foreseen back when we
first invented the polymorphic types, but we never did anything
about it.

Pavel Stehule, revised a bit by me

Discussion: https://postgr.es/m/CAFj8pRDna7VqNi8gR+Tt2Ktmz0cq5G93guc3Sbn_NVPLdXAkqA@mail.gmail.com
2020-03-19 11:43:11 -04:00
Tom Lane bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
Tom Lane 3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
Alvaro Herrera 2f9661311b
Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
Tom Lane 70a7732007 Remove support for upgrading extensions from "unpackaged" state.
Andres Freund pointed out that allowing non-superusers to run
"CREATE EXTENSION ... FROM unpackaged" has security risks, since
the unpackaged-to-1.0 scripts don't try to verify that the existing
objects they're modifying are what they expect.  Just attaching such
objects to an extension doesn't seem too dangerous, but some of them
do more than that.

We could have resolved this, perhaps, by still requiring superuser
privilege to use the FROM option.  However, it's fair to ask just what
we're accomplishing by continuing to lug the unpackaged-to-1.0 scripts
forward.  None of them have received any real testing since 9.1 days,
so they may not even work anymore (even assuming that one could still
load the previous "loose" object definitions into a v13 database).
And an installation that's trying to go from pre-9.1 to v13 or later
in one jump is going to have worse compatibility problems than whether
there's a trivial way to convert their contrib modules into extension
style.

Hence, let's just drop both those scripts and the core-code support
for "CREATE EXTENSION ... FROM".

Discussion: https://postgr.es/m/20200213233015.r6rnubcvl4egdh5r@alap3.anarazel.de
2020-02-19 16:59:14 -05:00
Tom Lane 761a5688b1 Fix confusion about event trigger vs. plain function in plpgsql.
The function hash table keys made by compute_function_hashkey() failed
to distinguish event-trigger call context from regular call context.
This meant that once we'd successfully made a hash entry for an event
trigger (either by validation, or by normal use as an event trigger),
an attempt to call the trigger function as a plain function would
find this hash entry and thereby bypass the you-can't-do-that check in
do_compile().  Thus we'd attempt to execute the function, leading to
strange errors or even crashes, depending on function contents and
server version.

To fix, add an isEventTrigger field to PLpgSQL_func_hashkey,
paralleling the longstanding infrastructure for regular triggers.
This fits into what had been pad space, so there's no risk of an ABI
break, even assuming that any third-party code is looking at these
hash keys.  (I considered replacing isTrigger with a PLpgSQL_trigtype
enum field, but felt that that carried some API/ABI risk.  Maybe we
should change it in HEAD though.)

Per bug #16266 from Alexander Lakhin.  This has been broken since
event triggers were invented, so back-patch to all supported branches.

Discussion: https://postgr.es/m/16266-fcd7f838e97ba5d4@postgresql.org
2020-02-19 14:45:17 -05:00
Tom Lane 50fc694e43 Invent "trusted" extensions, and remove the pg_pltemplate catalog.
This patch creates a new extension property, "trusted".  An extension
that's marked that way in its control file can be installed by a
non-superuser who has the CREATE privilege on the current database,
even if the extension contains objects that normally would have to be
created by a superuser.  The objects within the extension will (by
default) be owned by the bootstrap superuser, but the extension itself
will be owned by the calling user.  This allows replicating the old
behavior around trusted procedural languages, without all the
special-case logic in CREATE LANGUAGE.  We have, however, chosen to
loosen the rules slightly: formerly, only a database owner could take
advantage of the special case that allowed installation of a trusted
language, but now anyone who has CREATE privilege can do so.

Having done that, we can delete the pg_pltemplate catalog, moving the
knowledge it contained into the extension script files for the various
PLs.  This ends up being no change at all for the in-core PLs, but it is
a large step forward for external PLs: they can now have the same ease
of installation as core PLs do.  The old "trusted PL" behavior was only
available to PLs that had entries in pg_pltemplate, but now any
extension can be marked trusted if appropriate.

This also removes one of the stumbling blocks for our Python 2 -> 3
migration, since the association of "plpythonu" with Python 2 is no
longer hard-wired into pg_pltemplate's initial contents.  Exactly where
we go from here on that front remains to be settled, but one problem
is fixed.

Patch by me, reviewed by Peter Eisentraut, Stephen Frost, and others.

Discussion: https://postgr.es/m/5889.1566415762@sss.pgh.pa.us
2020-01-29 18:42:43 -05:00
Tom Lane 166ab9c8d3 Teach plpgsql's "make clean" to remove generated test files.
Copy the rules that src/test/regress/GNUmakefile uses for this purpose.
Since these files are .gitignore'd, the mistake wasn't obvious unless
you happened to look at "git status --ignored" in an allegedly clean
tree.

Oversight in commit 1858b105b.  No need for back-patch since that's
not in the back branches.
2020-01-29 11:07:04 -05:00
Tom Lane 7f380c59f8 Reduce size of backend scanner's tables.
Previously, the core scanner's yy_transition[] array had 37045 elements.
Since that number is larger than INT16_MAX, Flex generated the array to
contain 32-bit integers.  By reimplementing some of the bulkier scanner
rules, this patch reduces the array to 20495 elements.  The much smaller
total length, combined with the consequent use of 16-bit integers for
the array elements reduces the binary size by over 200kB.  This was
accomplished in two ways:

1. Consolidate handling of quote continuations into a new start condition,
rather than duplicating that logic for five different string types.

2. Treat Unicode strings and identifiers followed by a UESCAPE sequence
as three separate tokens, rather than one.  The logic to de-escape
Unicode strings is moved to the filter code in parser.c, which already
had the ability to provide special processing for token sequences.
While we could have implemented the conversion in the grammar, that
approach was rejected for performance and maintainability reasons.

Performance in microbenchmarks of raw parsing seems equal or slightly
faster in most cases, and it's reasonable to expect that in real-world
usage (with more competition for the CPU cache) there will be a larger
win.  The exception is UESCAPE sequences; lexing those is about 10%
slower, primarily because the scanner now has to be called three times
rather than one.  This seems acceptable since that feature is very
rarely used.

The psql and epcg lexers are likewise modified, primarily because we
want to keep them all in sync.  Since those lexers don't use the
space-hogging -CF option, the space savings is much less, but it's
still good for perhaps 10kB apiece.

While at it, merge the ecpg lexer's handling of C-style comments used
in SQL and in C.  Those have different rules regarding nested comments,
but since we already have the ability to keep track of the previous
start condition, we can use that to handle both cases within a single
start condition.  This matches the core scanner more closely.

John Naylor

Discussion: https://postgr.es/m/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw@mail.gmail.com
2020-01-13 15:04:31 -05:00
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Tom Lane 4ba4bfaf25 Fix possible loss of sync between rectypeid and underlying PLpgSQL_type.
When revalidate_rectypeid() acts to update a stale record type OID
in plpgsql's data structures, it fixes the active PLpgSQL_rec struct
as well as the PLpgSQL_type struct it references.  However, the latter
is shared across function executions while the former is not.  In a
later function execution, the PLpgSQL_rec struct would be reinitialized
by copy_plpgsql_datums and would then contain a stale type OID,
typically leading to "could not open relation with OID NNNN" errors.
revalidate_rectypeid() can easily fix this, fortunately, just by
treating typ->typoid as authoritative.

Per report and diagnosis from Ashutosh Sharma, though this is not his
suggested fix.  Back-patch to v11 where this code came in.

Discussion: https://postgr.es/m/CAE9k0Pkd4dZwt9J5pS9xhJFWpUtqs05C9xk_GEwPzYdV=GxwWg@mail.gmail.com
2019-12-26 15:19:39 -05:00
Alvaro Herrera 6cafde1bd4 Add backend-only appendStringInfoStringQuoted
This provides a mechanism to emit literal values in informative
messages, such as query parameters.  The new code is more complex than
what it replaces, primarily because it wants to be more efficient.
It also has the (currently unused) additional optional capability of
specifying a maximum size to print.

The new function lives out of common/stringinfo.c so that frontend users
of that file need not pull in unnecessary multibyte-encoding support
code.

Author: Álvaro Herrera and Alexey Bashtanov, after a suggestion from Andres Freund
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20190920203905.xkv5udsd5dxfs6tr@alap3.anarazel.de
2019-12-10 17:12:56 -03:00
Alvaro Herrera 3974c4a724 Remove useless "return;" lines
Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql
2019-11-28 16:48:37 -03:00