>
> > The second issue is where plperl returns a large result set.
I have attached the following seven patches to address this problem:
1. Trivial. Replaces some errant spaces with tabs.
2. Trivial. Fixes the spelling of Jan's name, and gets rid of many
inane, useless, annoying, and often misleading comments. Here's
a sample: "plperl_init_all() - Initialize all".
(I have tried to add some useful comments here and there, and will
continue to do so now and again.)
3. Trivial. Splits up some long lines.
4. Converts SRFs in PL/Perl to use a Tuplestore and SFRM_Materialize
to return the result set, based on the PL/PgSQL model.
There are two major consequences: result sets will spill to disk when
they can no longer fit in work_mem; and "select foo_srf()" no longer
works. (I didn't lose sleep over the latter, since that form is not
valid in PL/PgSQL, and it's not documented in PL/Perl.)
5. Trivial, but important. Fixes use of "undef" instead of undef. This
would cause empty functions to fail in bizarre ways. I suspect that
there's still another (old) bug here. I'll investigate further.
6. Moves the majority of (4) out into a new plperl_return_next()
function, to make it possible to expose the functionality to
Perl; cleans up some of the code besides.
7. Add an spi_return_next function for use in Perl code.
If you want to apply the patches and try them out, 8-composite.diff is
what you should use. (Note: my patches depend upon Andrew's use-strict
and %_SHARED patches being applied.)
Here's something to try:
create or replace function foo() returns setof record as $$
$i = 0;
for ("World", "PostgreSQL", "PL/Perl") {
spi_return_next({f1=>++$i, f2=>'Hello', f3=>$_});
}
return;
$$ language plperl;
select * from foo() as (f1 integer, f2 text, f3 text);
(Many thanks to Andrews Dunstan and Supernews for their help.)
Abhijit Menon-Sen
spotted by Qingqing Zhou. The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places. If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL. But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
for testing PLs and contrib_regression for testing contrib, instead of
overwriting the core system's regression database as formerly done.
Andrew Dunstan
which is neither needed by nor related to that header. Remove the bogus
inclusion and instead include the header in those C files that actually
need it. Also fix unnecessary inclusions and bad inclusion order in
tsearch2 files.
to produce when running the executor. This is consistent with the internal
executor APIs (such as ExecutorRun), which also use a long for this purpose.
It also allows FETCH_ALL to be passed -- since FETCH_ALL is defined as
LONG_MAX, this wouldn't have worked on platforms where int and long are of
different sizes. Per report from Tzahi Fadida.
only one argument. (Per recent discussion, the option to accept multiple
arguments is pretty useless for user-defined types, and would be a likely
source of security holes if it was used.) Simplify call sites of
output/send functions to not bother passing more than one argument.
indexes. Replace all heap_openr and index_openr calls by heap_open
and index_open. Remove runtime lookups of catalog OID numbers in
various places. Remove relcache's support for looking up system
catalogs by name. Bulky but mostly very boring patch ...
output parameters or VOID or a set. There seems no particular reason to
insist on a RETURN in these cases, since the function return value is
determined by other elements anyway. Per recent discussion.
OPENed on non-SELECT commands such as EXPLAIN or SHOW (anything that
returns tuples is allowed). This flexibility already existed for
bound cursors, but OPEN was artificially restricting what it would
take. Per a gripe some months back.
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not. I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass. However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
Document use of macros for pg_printf functions.
Bump major versions of all interfaces to handle movement of get_progname
from libpq to libpgport in 8.0, and probably other libpgport changes in 8.1.
and parsing work in PL/PgSQL:
- memory management is now done via palloc(). The compiled representation
of each function now has its own memory context. Therefore, the storage
consumed by a function can be reclaimed via MemoryContextDelete().
During compilation, the CurrentMemoryContext is the function's memory
context. This means that a palloc() is sufficient to allocate memory
that will have the same lifetime as the function itself. As a result,
code invoked during compilation should be careful to pfree() temporary
allocations to avoid leaking memory. Since a lot of the code in the
backend is not careful about releasing palloc'ed memory, that means
we should switch into a temporary memory context before invoking
backend functions. A temporary context appropriate for such allocations
is `compile_tmp_cxt'.
- The ability to use palloc() allows us to simply a lot of the code in
the parser. Rather than representing lists of elements via ad hoc
linked lists or arrays, we can use the List type. Rather than doing
malloc followed by memset(0), we can just use palloc0().
- We now check that the user has supplied the right number of parameters
to a RAISE statement. Supplying either too few or too many results in
an error (at runtime).
- PL/PgSQL's parser needs to accept arbitrary SQL statements. Since we
do not want to duplicate the SQL grammar in the PL/PgSQL grammar, this
means we need to be quite lax in what the PL/PgSQL grammar considers
a "SQL statement". This can lead to misleading behavior if there is a
syntax error in the function definition, since we assume a malformed
PL/PgSQL construct is a SQL statement. Furthermore, these errors were
only detected at runtime (when we tried to execute the alleged "SQL
statement" via SPI).
To rectify this, the patch changes the parser to invoke the main SQL
parser when it sees a string it believes to be a SQL expression. This
means that synctically-invalid SQL will be rejected during the
compilation of the PL/PgSQL function. This is only done when compiling
for "validation" purposes (i.e. at CREATE FUNCTION time), so it should
not impose a runtime overhead.
- Fixes for the various buffer overruns I've patched in stable branches
in the past few weeks. I've rewritten code where I thought it was
warranted (unlike the patches applied to older branches, which were
minimally invasive).
- Various other minor changes and cleanups.
- Updates to the regression tests.
initially NULL. For 8.0 we changed the main executor to have this
behavior in an UPDATE of an array column, but plpgsql's equivalent case
was overlooked. Per report from Sven Willenberger.
MemoryContextAllocZero back to MemoryContextAlloc, same as it was in 7.4.
The zeroing is unnecessary since all the meaningful fields are filled in
just below. I had made it do that out of neatnik-ism, but some testing
with an example provided by Pavel Stehule showed that the zeroing was
accounting for about 5% of the runtime in a compute-intensive plpgsql
function. That seems a bit high of a price for neatnik-ism...
advancing ActiveSnapshot when we are inside a volatile function.
Per example from Gaetano Mendola. Add a regression test to catch
similar problems in future.
several reports of users being confused when they attempt to use ELSEIF
and run into trouble due to PL/PgSQL's lax parser. The parser will be
improved for 8.1, but we can fix most of the problem by allowing ELSEIF
for now.
of an inheritance child table is binary-compatible with the rowtype of
its parent, invent an expression node type that does the conversion
correctly. Fixes the new bug exhibited by Kris Shannon as well as a
lot of old bugs that would only show up when using multiple inheritance
or after altering the parent table.
data returned from Perl. Consolidate multiple bits of code to convert
a Perl hash to a tuple, and drive the conversion off the keys present
in the hash rather than the tuple column names, so we detect error if
the hash contains keys it shouldn't. (This means keys not in the hash
will silently default to NULL, which seems ok to me.) Fix a bunch of
reference-count leaks too.
subtransactions quite right either: the ReleaseCurrentSubTransaction
call should occur inside the PG_TRY, so that the proper path is taken
if an error occurs during subtransaction commit. This assumes that
AbortSubTransaction can cope with the state left behind if
CommitSubTransaction fails partway through, but we were already
requiring that.
operations are now run as subtransactions, so that errors in them
can be reported as ordinary Perl or Tcl errors and caught by the
normal error handling convention of those languages. Also do some
minor code cleanup in pltcl.c: extract a large chunk of duplicated
code in pltcl_SPI_execute and pltcl_SPI_execute_plan into a shared
subroutine.
rather than longjmp'ing clear out of Perl and thereby leaving Perl in
a broken state. Also some minor prettification of error messages.
Still need to do something with spi_exec_query() error handling.
for the languages even when not installed in a standard directory.
pltcl may need this treatment as well, but we don't have the right path
conveniently available, so I'll leave it alone as long as there aren't
actual reports of trouble.
may expand the Perl stack, therefore we must SPAGAIN to reload the local
stack pointer after calling it. Also a couple other marginal readability
improvements.
some of the bugs exposed thereby. The remaining 'might be used uninitialized'
warnings look like live bugs, but I am not familiar enough with Perl/C hacking
to tell how to fix them.
We don't really want to start a new SPI connection, just keep using the old
one; otherwise we have memory management problems as illustrated by
John Kennedy's bug report of today. This requires a bit of a hack to
ensure the SPI stack state is properly restored, but then again what we
were doing before was a hack too, strictly speaking. Add a regression
test to cover this case.
1. Two minor cleanups:
- We don't need to call hv_exists+hv_fetch; we should just check the
return value of hv_fetch.
- newSVpv("undef",0) is the string "undef", not a real undef.
2. This should fix the bug Andrew Dunstan described in a recent -hackers
post. It replaces three bogus "eval_pv(key, 0)" calls with newSVpv,
and eliminates another redundant hv_exists+hv_fetch pair.
3. plperl_build_tuple_argument builds up a string of Perl code to create
a hash representing the tuple. This patch creates the hash directly.
4. Another minor cleanup: replace a couple of av_store()s with av_push.
5. Analogous to #3 for plperl_trigger_build_args. This patch removes the
static sv_add_tuple_value function, which does much the same as two
other utility functions defined later, and merges the functionality
into plperl_hash_from_tuple.
I have tested the patches to the best of my limited ability, but I would
appreciate it very much if someone else could review and test them too.
(Thanks to Andrew and David Fetter for their help with some testing.)
Abhijit Menon-Sen
argument, leading to label matching failures at run-time. Per report from
Patrick Fiche. Also, fix it so that an unrecognized label argument draws
a more useful error message than 'syntax error'.
-L spec rather than assuming libpython is in the standard search path
(this returns to the way 7.4 did it). But check the distutils output
to see if it looks like Python has built a shared library, and if so
link with that instead of the probably-not-shared library found in
configdir.
- replace some function signatures of the form "some_type foo()" with
"some_type foo(void)"
- replace a few instances of a literal 0 being used as a NULL pointer;
there are more instances of this in the code, but I just fixed a few
- in src/backend/utils/mb/wstrncmp.c, replace K&R style function
declarations with ANSI style, remove use of 'register' keyword
- remove an "extern" modifier that was applied to a function definition
(rather than a declaration)
to unreserved keyword, use ereport not elog, assign a separate error code
for 'could not obtain lock' so that applications will be able to detect
that case cleanly.
as per recent discussions. Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it. This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that. This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions). Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases. This avoids holding many
unique locks after a long series of subtransactions. The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot. Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query. (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action. Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
executed. Previously, the DECLARE would succeed but subsequent FETCHes
would fail since the parameter values supplied to DECLARE were not
propagated to the portal created for the cursor.
In support of this, add type Oids to ParamListInfo entries, which seems
like a good idea anyway since code that extracts a value can double-check
that it got the type of value it was expecting.
Oliver Jowett, with minor editorialization by Tom Lane.
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end. (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.) Add basic reference documentation for
SAVEPOINT and related commands. Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
more nearly Oracle-equivalent. Allow matching by category as well as
specific error code. Document the set of available condition names
(or more accurately, synchronize it with the existing documentation). In
passing, update errcodes.sgml to include codes added during 7.5 development.
Create a shared function to convert a SPI error code into a string
(replacing near-duplicate code in several PLs), and use it anywhere
that a SPI function call error is reported.
There are still some things that need refinement; in particular I fear
that the recognized set of error condition names probably has little in
common with what Oracle recognizes. But it's a start.
possible to trap an error inside a function rather than letting it
propagate out to PostgresMain. You still have to use AbortCurrentTransaction
to clean up, but at least the error handling itself will cooperate.
* perl_useshrplib gets set to "yes" and not to "true". I assume it's set
to "true" on unix, so I left both.
* Need to translate backslashes into slashes
* The linker config coming out of perl was for MSVC and not for mingw
Magnus Hagander
currently unapplied regarding spi_internal.c, makes some additional
fixes relating to return types, and also contains the fix for
preventing the use of insecure versions of Safe.pm.
There is one remaing return case that does not appear to work, namely
return of a composite directly in a select, i.e. if foo returns some
composite type, 'select * from foo()' works but 'select foo()' doesn't.
We will either fix that or document it as a limitation.
The function plperl_func_handler is a mess - I will try to get it
cleaned up (and split up) in a subsequent patch, time permitting.
Also, reiterating previous advice - this changes slightly the API for
spi_exec_query - the returned object has either 2 or 3 members: 'status'
(string) and 'proceesed' (int,- number of rows) and, if rows are
returned, 'rows' (array of tuple hashes).
Andrew Dunstan
FOR loops are giving weird syntax errors. Restructure parsing of FOR
loops so that the integer-loop-vs-query-loop decision is driven off
the presence of '..' between IN and LOOP, rather than the presence
of a matching record/row variable name. Hopefully this will make the
behavior a bit more transparent.
plperlNG. Review and minor cleanup/improvements by Joe Conway.
Summary of new functionality:
- Shared data space and namespace. There is a new global variable %_SHARED
that functions can use to store and save data between invocations of a
function, or between different functions. Also, all trusted plperl function
now share a common Safe container (this is an optimization, also), which
they can use for storing non-lexical variables, functions, etc.
- Triggers are now supported
- Records can now be returned (as a hash reference)
- Sets of records can now be returned (as a reference to an array of hash
references).
- New function spi_exec_query() provided for performing db functions or
getting data from db.
- Optimization for counting hash keys (Abhijit Menon-Sen)
- Allow return of 'record' and 'setof record'
As a side effect, cause subscripts in INSERT targetlists to do something
more or less sensible; previously we evaluated such subscripts and then
effectively ignored them. Another side effect is that UPDATE-ing an
element or slice of an array value that is NULL now produces a non-null
result, namely an array containing just the assigned-to positions.
of a composite type to get that type's OID as their second parameter,
in place of typelem which is useless. The actual changes are mostly
centralized in getTypeInputInfo and siblings, but I had to fix a few
places that were fetching pg_type.typelem for themselves instead of
using the lsyscache.c routines. Also, I renamed all the related variables
from 'typelem' to 'typioparam' to discourage people from assuming that
they necessarily contain array element types.