Commit Graph

1633 Commits

Author SHA1 Message Date
Michael Paquier 8548ddc61b Fix inconsistencies and typos in the tree, take 9
This addresses more issues with code comments, variable names and
unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com
2019-08-05 12:14:58 +09:00
Michael Paquier eb43f3d193 Fix inconsistencies and typos in the tree
This is numbered take 8, and addresses again a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com
2019-07-29 12:28:30 +09:00
Michael Paquier 23bccc823d Fix inconsistencies and typos in the tree
This is numbered take 7, and addresses a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com
2019-07-22 10:01:50 +09:00
Tom Lane 1cff1b95ab Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link.  We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells.  That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.

In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links.  Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header.  (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)

Of course this is not without downsides.  The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.

For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state.  We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value.  (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.

The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext().  The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell().  However, we no longer
need a previous-cell pointer to delete a list cell efficiently.  Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)

There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents.  To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen.  This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).

There are two notable API differences from the previous code:

* lnext() now requires the List's header pointer in addition to the
current cell's address.

* list_delete_cell() no longer requires a previous-cell argument.

These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.

Programmers should be aware of these significant performance changes:

* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.

* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.)  Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long.  Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.

* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index.  These have the
data-movement penalty explained above, but there's no search penalty.

* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists.  The second argument is
now declared "const List *" to reflect that it isn't changed.

This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests.  As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.

Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit.  Code using those should have been gone a dozen years ago.

Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.

Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 13:41:58 -04:00
Michael Paquier 6b8548964b Fix inconsistencies in the code
This addresses a couple of issues in the code:
- Typos and inconsistencies in comments and function declarations.
- Removal of unreferenced function declarations.
- Removal of unnecessary compile flags.
- A cleanup error in regressplans.sh.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com
2019-07-08 13:15:09 +09:00
Tom Lane 5683b34956 Ensure plpgsql result tuples have the right composite type marking.
A function that is declared to return a named composite type must
return tuple datums that are physically marked as having that type.
The plpgsql code path that allowed directly returning an expanded-record
datum forgot to check that, so that an expanded record marked as type
RECORDOID could be returned if it had a physically-compatible tupdesc.
This'd be harmless, I think, if the record value never escaped the
current session --- but it's possible for it to get stored into a table,
and then subsequent sessions can't interpret the anonymous record type.

Fix by flattening the record into a tuple datum and overwriting its
type/typmod fields, if its declared type doesn't match the function's
declared type.  (In principle it might be possible to just change the
expanded record's stored type ID info, but there are enough tricky
consequences that I didn't want to mess with that, especially not in
a back-patched bug fix.)

Per bug report from Steve Rogerson.  Back-patch to v11 where the bug
was introduced.

Discussion: https://postgr.es/m/cbaecae6-7b87-584e-45f6-4d047b92ca2a@yewtc.demon.co.uk
2019-07-03 18:08:53 -04:00
Peter Eisentraut 91acff7a53 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 1a710c413ce4c4cd081843e563cde256bb95f490
2019-06-17 15:30:20 +02:00
Michael Paquier 3412030205 Fix more typos and inconsistencies in the tree
Author: Alexander Lakhin
Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
2019-06-17 16:13:16 +09:00
Tom Lane 032627ee78 Clean up PL/Perl's handling of the _() macro.
Perl likes to redefine the _() macro:

#ifdef CAN_PROTOTYPE
#define	_(args) args
#else ...

There was lots not to like about the way we dealt with this before:

1. Instead of taking care of the conflict centrally in plperl.h, we
expected every one of its ever-growing number of includers to do so.
This is duplicative and error-prone in itself, plus it means that
plperl.h fails to meet the expectation of being compilable standalone,
resulting in macro-redefinition warnings in cpluspluscheck.

2. We left _() with its Perl definition, meaning that if someone tried
to use it in any Perl-related extension, it would silently fail to
provide run-time translation.  I don't see any live bugs of this ilk,
but it's clearly a hard-to-notice bug waiting to happen.

So fix that by centralizing the cleanup logic, making it match what
we're already doing for other macro conflicts with Perl.  Since we only
expect plperl.h to be included by extensions not core code, we should
redefine _() as dgettext() not gettext().
2019-06-02 12:23:39 -04:00
Tom Lane 3f61b3205f Fix C++ incompatibilities in plpgsql's header files.
Rename some exposed parameters so that they don't conflict with
C++ reserved words.

Back-patch to all supported versions.

George Tarasov

Discussion: https://postgr.es/m/b517ec3918d645eb950505eac8dd434e@gaz-is.ru
2019-05-31 12:34:54 -04:00
Tom Lane 7640f93123 Fix assorted header files that failed to compile standalone.
We have a longstanding project convention that all .h files should
be includable with no prerequisites other than postgres.h.  This is
tested/relied-on by cpluspluscheck.  However, cpluspluscheck has not
historically been applied to most headers outside the src/include
tree, with the predictable consequence that some of them don't work.
Fix that, usually by adding missing #include dependencies.

The change in printf_hack.h might require some explanation: without
it, my C++ compiler whines that the function is unused.  There's
not so many call sites that "inline" is going to cost much, and
besides all the callers are in test code that we really don't care
about the size of.

There's no actual bugs being fixed here, so I see no need to back-patch.

Discussion: https://postgr.es/m/b517ec3918d645eb950505eac8dd434e@gaz-is.ru
2019-05-31 11:45:33 -04:00
Tom Lane 8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
Peter Eisentraut 3c439a58df Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: a20bf6b8a5b4e32450967055eb5b07cee4704edd
2019-05-20 16:00:53 +02:00
Peter Eisentraut 02daece4ab Fix grammar in error message 2019-05-09 09:16:59 +02:00
Tom Lane 4d5840cea9 Fix problems with auto-held portals.
HoldPinnedPortals() did things in the wrong order: it must not mark
a portal autoHeld until it's been successfully held.  Otherwise,
a failure while persisting the portal results in a server crash
because we think the portal is in a good state when it's not.

Also add a check that portal->status is READY before attempting to
hold a pinned portal.  We have such a check before the only other
use of HoldPortal(), so it seems unwise not to check it here.

Lastly, rethink the responsibility for where to call HoldPinnedPortals.
The comment for it imagined that it was optional for any individual PL
to call it or not, but that cannot be the case: if some outer level of
procedure has a pinned portal, failing to persist it when an inner
procedure commits is going to be trouble.  Let's have SPI do it instead
of the individual PLs.  That's not a complete solution, since in theory
a PL might not be using SPI to perform commit/rollback, but such a PL
is going to have to be aware of lots of related requirements anyway.
(This change doesn't cause an API break for any external PLs that might
be calling HoldPinnedPortals per the old regime, because calling it
twice during a commit or rollback sequence won't hurt.)

Per bug #15703 from Julian Schauder.  Back-patch to v11 where this code
came in.

Discussion: https://postgr.es/m/15703-c12c5bc0ea34ba26@postgresql.org
2019-04-19 11:20:37 -04:00
Tom Lane 6726d8d476 Move plpgsql error-trapping tests to a new module-specific test file.
The test for statement timeout has a 2-second timeout, which was only
moderately annoying when it was written, but nowadays it contributes
a pretty significant chunk of the elapsed time needed to run the core
regression tests on a fast machine.  We can improve this situation by
pushing the test into a plpgsql-specific test file instead of having
it in a core regression test.  That's a clean win when considering
just the core tests.  Even when considering check-world or a buildfarm
test run, we should come out ahead because the core tests get run
more times in those sequences.

Furthermore, since the plpgsql tests aren't currently parallelized,
it seems likely that the timing problems reflected in commit f1e671a0b
(which increased that timeout from 1 sec to 2) will be much less severe
in this context.  Hence, let's try cutting the timeout back to 1 second
in hopes of a further win for check-world.  We can undo that if
buildfarm experience proves it to be a bad idea.

To give the new test file some modicum of intellectual coherency,
I moved the surrounding tests related to error-trapping along with
the statement timeout test proper.  Those other tests don't run long
enough to have any particular bearing on test-runtime considerations.
The tests are the same as before, except with minor adjustments to
not depend on an externally-created table.

Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us
2019-04-11 15:09:28 -04:00
Peter Eisentraut fc22b6623b Generated columns
This is an SQL-standard feature that allows creating columns that are
computed from expressions rather than assigned, similar to a view or
materialized view but on a column basis.

This implements one kind of generated column: stored (computed on
write).  Another kind, virtual (computed on read), is planned for the
future, and some room is left for it.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
2019-03-30 08:15:57 +01:00
Peter Eisentraut 280a408b48 Transaction chaining
Add command variants COMMIT AND CHAIN and ROLLBACK AND CHAIN, which
start new transactions with the same transaction characteristics as the
just finished one, per SQL standard.

Support for transaction chaining in PL/pgSQL is also added.  This
functionality is especially useful when running COMMIT in a loop in
PL/pgSQL.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/flat/28536681-324b-10dc-ade8-ab46f7645a5a@2ndquadrant.com
2019-03-24 11:33:02 +01:00
Peter Eisentraut aefcc2bba2 PL/Tcl: Improve trigger tests organization
The trigger tests for PL/Tcl were spread aroud pltcl_setup.sql and
pltcl_queries.sql, mixed with other tests, which makes them hard to
follow and edit.  Move all the trigger-related pieces to a new file
pltcl_trigger.sql.  This also makes the test setup more similar to
plperl and plpython.
2019-03-15 12:42:07 +01:00
Peter Eisentraut c6ff0b892c Refactor ParamListInfo initialization
There were six copies of identical nontrivial code.  Put it into a
function.
2019-03-14 13:30:09 +01:00
Peter Eisentraut bc09d5e4cc Remove unnecessary use of PROCEDURAL
Remove some unnecessary, legacy-looking use of the PROCEDURAL keyword
before LANGUAGE.  We mostly don't use this anymore, so some of these
look a bit old.

There is still some use in pg_dump, which is harder to remove because
it's baked into the archive format, so I'm not touching that.

Discussion: https://www.postgresql.org/message-id/2330919b-62d9-29ac-8de3-58c024fdcb96@2ndquadrant.com
2019-02-25 08:38:59 +01:00
Peter Eisentraut 37d9916020 More unconstify use
Replace casts whose only purpose is to cast away const with the
unconstify() macro.

Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
2019-02-13 11:50:16 +01:00
Alvaro Herrera 558d77f20e Renaming for new subscripting mechanism
Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to
introduce a more generic subscription mechanism, which allows
subscripting not only arrays but also other object types such as JSONB.
That functionality is introduced in a largish invasive patch, out of
which this internal renaming patch was extracted.

Author: Dmitry Dolgov
Reviewed-by: Tom Lane, Arthur Zakirov
Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
2019-02-01 12:50:32 -03:00
Tom Lane f09346a9c6 Refactor planner's header files.
Create a new header optimizer/optimizer.h, which exposes just the
planner functions that can be used "at arm's length", without need
to access Paths or the other planner-internal data structures defined
in nodes/relation.h.  This is intended to provide the whole planner
API seen by most of the rest of the system; although FDWs still need
to use additional stuff, and more thought is also needed about just
what selfuncs.c should rely on.

The main point of doing this now is to limit the amount of new
#include baggage that will be needed by "planner support functions",
which I expect to introduce later, and which will be in relevant
datatype modules rather than anywhere near the planner.

This commit just moves relevant declarations into optimizer.h from
other header files (a couple of which go away because everything
got moved), and adjusts #include lists to match.  There's further
cleanup that could be done if we want to decide that some stuff
being exposed by optimizer.h doesn't belong in the planner at all,
but I'll leave that for another day.

Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
2019-01-29 15:48:51 -05:00
Andres Freund a9c35cf85c Change function call information to be variable length.
Before this change FunctionCallInfoData, the struct arguments etc for
V1 function calls are stored in, always had space for
FUNC_MAX_ARGS/100 arguments, storing datums and their nullness in two
arrays.  For nearly every function call 100 arguments is far more than
needed, therefore wasting memory. Arg and argnull being two separate
arrays also guarantees that to access a single argument, two
cachelines have to be touched.

Change the layout so there's a single variable-length array with pairs
of value / isnull. That drastically reduces memory consumption for
most function calls (on x86-64 a two argument function now uses
64bytes, previously 936 bytes), and makes it very likely that argument
value and its nullness are on the same cacheline.

Arguments are stored in a new NullableDatum struct, which, due to
padding, needs more memory per argument than before. But as usually
far fewer arguments are stored, and individual arguments are cheaper
to access, that's still a clear win.  It's likely that there's other
places where conversion to NullableDatum arrays would make sense,
e.g. TupleTableSlots, but that's for another commit.

Because the function call information is now variable-length
allocations have to take the number of arguments into account. For
heap allocations that can be done with SizeForFunctionCallInfoData(),
for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
that helps to allocate an appropriately sized and aligned variable.

Some places with stack allocation function call information don't know
the number of arguments at compile time, and currently variably sized
stack allocations aren't allowed in postgres. Therefore allow for
FUNC_MAX_ARGS space in these cases. They're not that common, so for
now that seems acceptable.

Because of the need to allocate FunctionCallInfo of the appropriate
size, older extensions may need to update their code. To avoid subtle
breakages, the FunctionCallInfoData struct has been renamed to
FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
so that shouldn't cause much collateral damage.

This change is also a prerequisite for more efficient expression JIT
compilation (by allocating the function call information on the stack,
allowing LLVM to optimize it away); previously the size of the call
information caused problems inside LLVM's optimizer.

Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
2019-01-26 14:17:52 -08:00
Peter Eisentraut bbd5c207b9 PL/pgSQL: Add statement ID to statement structures
This can be used by a profiler as the index for an array of
per-statement metrics.

Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRDRCjN6rpM9ZccU7Ta_afsNX7mg9=n34F+r445Nt9v2tA@mail.gmail.com/
2019-01-24 22:23:12 +01:00
Heikki Linnakangas 95931133a9 Fix misc typos in comments.
Spotted mostly by Fabien Coelho.

Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1901230947050.16643@lancre
2019-01-23 13:39:00 +02:00
Tom Lane c64d0cd5ce Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function.  Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea.  Hence, do that.

The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem.  The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.

Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.

Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
2019-01-09 19:47:46 -05:00
Tom Lane 59029b6fb7 Update docs & tests to reflect that unassigned OLD/NEW are now NULL.
For a long time, plpgsql has allowed trigger functions to parse
references to OLD and NEW even if the current trigger event type didn't
assign a value to one or the other variable; but actually executing such
a reference would fail.  The v11 changes to use "expanded records" for
DTYPE_REC variables changed the behavior so that the unassigned variable
now reads as a null composite value.  While this behavioral change was
more or less unintentional, it seems that leaving it like this is better
than adding code and complexity to be bug-compatible with the old way.
The change doesn't break any code that worked before, and it eliminates
a gotcha that often required extra code to work around.

Hence, update the docs to say that these variables are "null" not
"unassigned" when not relevant to the event type.  And add a regression
test covering the behavior, so that we'll notice if we ever break it
again.

Per report from Kristjan Tammekivi.

Discussion: https://postgr.es/m/CAABK7uL-uC9ZxKBXzo_68pKt7cECfNRv+c35CXZpjq6jCAzYYA@mail.gmail.com
2019-01-09 11:35:14 -05:00
Tom Lane afb0d0712f Replace the data structure used for keyword lookup.
Previously, ScanKeywordLookup was passed an array of string pointers.
This had some performance deficiencies: the strings themselves might
be scattered all over the place depending on the compiler (and some
quick checking shows that at least with gcc-on-Linux, they indeed
weren't reliably close together).  That led to very cache-unfriendly
behavior as the binary search touched strings in many different pages.
Also, depending on the platform, the string pointers might need to
be adjusted at program start, so that they couldn't be simple constant
data.  And the ScanKeyword struct had been designed with an eye to
32-bit machines originally; on 64-bit it requires 16 bytes per
keyword, making it even more cache-unfriendly.

Redesign so that the keyword strings themselves are allocated
consecutively (as part of one big char-string constant), thereby
eliminating the touch-lots-of-unrelated-pages syndrome.  And get
rid of the ScanKeyword array in favor of three separate arrays:
uint16 offsets into the keyword array, uint16 token codes, and
uint8 keyword categories.  That reduces the overhead per keyword
to 5 bytes instead of 16 (even less in programs that only need
one of the token codes and categories); moreover, the binary search
only touches the offsets array, further reducing its cache footprint.
This also lets us put the token codes somewhere else than the
keyword strings are, which avoids some unpleasant build dependencies.

While we're at it, wrap the data used by ScanKeywordLookup into
a struct that can be treated as an opaque type by most callers.
That doesn't change things much right now, but it will make it
less painful to switch to a hash-based lookup method, as is being
discussed in the mailing list thread.

Most of the change here is associated with adding a generator
script that can build the new data structure from the same
list-of-PG_KEYWORD header representation we used before.
The PG_KEYWORD lists that plpgsql and ecpg used to embed in
their scanner .c files have to be moved into headers, and the
Makefiles have to be taught to invoke the generator script.
This work is also necessary if we're to consider hash-based lookup,
since the generator script is what would be responsible for
constructing a hash table.

Aside from saving a few kilobytes in each program that includes
the keyword table, this seems to speed up raw parsing (flex+bison)
by a few percent.  So it's worth doing even as it stands, though
we think we can gain even more with a follow-on patch to switch
to hash-based lookup.

John Naylor, with further hacking by me

Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 17:02:57 -05:00
Tom Lane 4879a5172a Support plpgsql variable names that conflict with unreserved SQL keywords.
A variable name matching a statement-introducing keyword, such as
"comment" or "update", caused parse failures if one tried to write
a statement using that keyword.  Commit bb1b8f69 already addressed
this scenario for the case of variable names matching unreserved
plpgsql keywords, but we didn't think about unreserved core-grammar
keywords.  The same heuristic (viz, it can't be a variable name
unless the next token is assignment or '[') should work fine for
that case too, and as a bonus the code gets shorter and less
duplicative.

Per bug #15555 from Feike Steenbergen.  Since this hasn't been
complained of before, and is easily worked around anyway,
I won't risk a back-patch.

Discussion: https://postgr.es/m/15555-149bbd70ddc7b4b6@postgresql.org
2019-01-04 12:16:19 -05:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Michael Paquier e0ef136d52 Trigger stmt_beg and stmt_end for top-level statement blocks of PL/pgSQL
PL/pgSQL provides a set of callbacks which can be used for extra
instrumentation of functions written in this language called at function
setup, begin and end, as well as statement begin and end.  When calling
a routine, a trigger, or an event trigger, statement callbacks are not
getting called for the top-level statement block leading to an
inconsistent handling compared to the other statements.  This
inconsistency can potentially complicate extensions doing
instrumentation work on top of PL/pgSQL, so this commit makes sure that
all statement blocks, including the top-level one, go through the
correct corresponding callbacks.

Author: Pavel Stehule
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAFj8pRArEANsaUjo5in9_iQt0vKf9ecwDAmsdN_EBwL13ps12A@mail.gmail.com
2018-12-30 14:35:15 +09:00
Tom Lane 586b98fdf1 Make type "name" collation-aware.
The "name" comparison operators now all support collations, making them
functionally equivalent to "text" comparisons, except for the different
physical representation of the datatype.  They do, in fact, mostly share
the varstr_cmp and varstr_sortsupport infrastructure, which has been
slightly enlarged to handle the case.

To avoid changes in the default behavior of the datatype, set name's
typcollation to C_COLLATION_OID not DEFAULT_COLLATION_OID, so that
by default comparisons to a name value will continue to use strcmp
semantics.  (This would have been the case for system catalog columns
anyway, because of commit 6b0faf723, but doing this makes it true for
user-created name columns as well.  In particular, this avoids
locale-dependent changes in our regression test results.)

In consequence, tweak a couple of places that made assumptions about
collatable base types always having typcollation DEFAULT_COLLATION_OID.
I have not, however, attempted to relax the restriction that user-
defined collatable types must have that.  Hence, "name" doesn't
behave quite like a user-defined type; it acts more like a domain
with COLLATE "C".  (Conceivably, if we ever get rid of the need for
catalog name columns to be fixed-length, "name" could actually become
such a domain over text.  But that'd be a pretty massive undertaking,
and I'm not volunteering.)

Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
2018-12-19 17:46:25 -05:00
Tom Lane 04fe805a17 Drop no-op CoerceToDomain nodes from expressions at planning time.
If a domain has no constraints, then CoerceToDomain doesn't really do
anything and can be simplified to a RelabelType.  This not only
eliminates cycles at execution, but allows the planner to optimize better
(for instance, match the coerced expression to an index on the underlying
column).  However, we do have to support invalidating the plan later if
a constraint gets added to the domain.  That's comparable to the case of
a change to a SQL function that had been inlined into a plan, so all the
necessary logic already exists for plans depending on functions.  We
need only duplicate or share that logic for domains.

ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval
messages for the domain's pg_type entry, since those operations don't
update that row.  (ALTER DOMAIN SET/DROP NOT NULL do update that row,
so no code change is needed for them.)

Testing this revealed what's really a pre-existing bug in plpgsql:
it caches the SQL-expression-tree expansion of type coercions and
had no provision for invalidating entries in that cache.  Up to now
that was only a problem if such an expression had inlined a SQL
function that got changed, which is unlikely though not impossible.
But failing to track changes of domain constraints breaks an existing
regression test case and would likely cause practical problems too.

We could fix that locally in plpgsql, but what seems like a better
idea is to build some generic infrastructure in plancache.c to store
standalone expressions and track invalidation events for them.
(It's tempting to wonder whether plpgsql's "simple expression" stuff
could use this code with lower overhead than its current use of the
heavyweight plancache APIs.  But I've left that idea for later.)

Other stuff fixed in passing:

* Allow estimate_expression_value() to drop CoerceToDomain
unconditionally, effectively assuming that the coercion will succeed.
This will improve planner selectivity estimates for cases involving
estimatable expressions that are coerced to domains.  We could have
done this independently of everything else here, but there wasn't
previously any need for eval_const_expressions_mutator to know about
CoerceToDomain at all.

* Use a dlist for plancache.c's list of cached plans, rather than a
manually threaded singly-linked list.  That eliminates a potential
performance problem in DropCachedPlan.

* Fix a couple of inconsistencies in typecmds.c about whether
operations on domains drop RowExclusiveLock on pg_type.  Our common
practice is that DDL operations do drop catalog locks, so standardize
on that choice.

Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2018-12-13 13:24:43 -05:00
Michael Paquier 730422afcd Fix some errhint and errdetail strings missing a period
As per the error message style guide of the documentation, those should
be full sentences.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier, Álvaro Herrera
Discussion: https://1E8D49B4-16BC-4420-B4ED-58501D9E076B@yesql.se
2018-12-07 07:47:42 +09:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
Andrew Dunstan 79376e0712 fix typo 2018-11-18 12:43:03 -05:00
Andrew Dunstan d5d7f7f3b7 Silence MSVC warnings about redefinition of isnan
Some versions of perl.h define isnan when the compiler is MSVC. To avoid
warnings about this, undefine the symbol before including perl.h and
re-add the definition afterwards if it wasn't recreated.

Discussion: https://postgr.es/m/caf0568e-3c1f-07fd-6914-d903f22560f2@2ndQuadrant.com
2018-11-18 12:36:31 -05:00
Tom Lane f26c06a404 Fix error-cleanup mistakes in exec_stmt_call().
Commit 15c729347 was a couple bricks shy of a load: we need to
ensure that expr->plan gets reset to NULL on any error exit,
if it's not supposed to be saved.  Also ensure that the
stmt->target calculation gets redone if needed.

The easy way to exhibit a problem is to set up code that
violates the writable-argument restriction and then execute
it twice.  But error exits out of, eg, setup_param_list()
could also break it.  Make the existing PG_TRY block cover
all of that code to be sure.

Per report from Pavel Stehule.

Discussion: https://postgr.es/m/CAFj8pRAeXNTO43W2Y0Cn0YOVFPv1WpYyOqQrrzUiN6s=dn7gCg@mail.gmail.com
2018-11-09 22:04:14 -05:00
Tom Lane 15c7293477 Fix bugs in plpgsql's handling of CALL argument lists.
exec_stmt_call() tried to extract information out of a CALL statement's
argument list without using expand_function_arguments(), apparently in
the hope of saving a few nanoseconds by not processing defaulted
arguments.  It got that quite wrong though, leading to crashes with
named arguments, as well as failure to enforce writability of the
argument for a defaulted INOUT parameter.  Fix and simplify the logic
by using expand_function_arguments() before examining the list.

Also, move the argument-examination to just after producing the CALL
command's plan, before invoking the called procedure.  This ensures
that we'll track possible changes in the procedure's argument list
correctly, and avoids a hazard of the plan cache being flushed while
the procedure executes.

Also fix assorted falsehoods and omissions in associated documentation.

Per bug #15477 from Alexey Stepanov.

Patch by me, with some help from Pavel Stehule.  Back-patch to v11.

Discussion: https://postgr.es/m/15477-86075b1d1d319e0a@postgresql.org
Discussion: https://postgr.es/m/CAFj8pRA6UsujpTs9Sdwmk-R6yQykPx46wgjj+YZ7zxm4onrDyw@mail.gmail.com
2018-11-04 13:25:39 -05:00
Tom Lane e74dd00f53 Still further rethinking of build changes for macOS Mojave.
To avoid the sorts of problems complained of by Jakob Egger, it'd be
best if configure didn't emit any references to the sysroot path at all.
In the case of PL/Tcl, we can do that just by keeping our hands off the
TCL_INCLUDE_SPEC string altogether.  In the case of PL/Perl, we need to
substitute -iwithsysroot for -I in the compile commands, which is easily
handled if we change to using a configure output variable that includes
the switch not only the directory name.  Since PL/Tcl and PL/Python
already do it like that, this seems like good consistency cleanup anyway.

Hence, this replaces the advice given to Perl-related extensions in commit
5e2217131; instead of writing "-I$(perl_archlibexp)/CORE", they should
just write "$(perl_includespec)".  (The old way continues to work, but not
on recent macOS.)

It's still the case that configure needs to be aware of the sysroot
path internally, but that's cleaner than what we had before.

As before, back-patch to all supported versions.

Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-10-18 14:55:23 -04:00
Tom Lane 7767aadd94 Fix omissions in snprintf.c's coverage of standard *printf functions.
A warning on a NetBSD box revealed to me that pg_waldump/compat.c
is using vprintf(), which snprintf.c did not provide coverage for.
This is not good if we want to have uniform *printf behavior, and
it's pretty silly to omit when it's a one-line function.

I also noted that snprintf.c has pg_vsprintf() but for some reason
it was not exposed to the outside world, creating another way in
which code might accidentally invoke the platform *printf family.

Let's just make sure that we replace all eight of the POSIX-standard
printf family.

Also, upgrade plperl.h and plpython.h to make sure that they do
their undefine/redefine rain dance for all eight, not some random
maybe-sufficient subset thereof.
2018-10-08 19:15:55 -04:00
Tom Lane 82ff0cc91d Advance transaction timestamp for intra-procedure transactions.
Per discussion, this behavior seems less astonishing than not doing so.

Peter Eisentraut and Tom Lane

Discussion: https://postgr.es/m/20180920234040.GC29981@momjian.us
2018-10-08 16:16:36 -04:00
Tom Lane 113a659914 Ensure that PLPGSQL_DTYPE_ROW variables have valid refname fields.
Without this, the syntax-tree-dumping functions in pl_funcs.c crash,
and there are other places that might be at risk too.  Per report
from Pavel Stehule.

Looks like I broke this in commit f9263006d, so back-patch to v11.

Discussion: https://postgr.es/m/CAFj8pRA+3f5n4642q2g8BXCKjbTd7yU9JMYAgDyHgozk6cQ-VA@mail.gmail.com
2018-10-05 12:45:37 -04:00
Andres Freund cc2905e963 Use slots more widely in tuple mapping code and make naming more consistent.
It's inefficient to use a single slot for mapping between tuple
descriptors for multiple tuples, as previously done when using
ConvertPartitionTupleSlot(), as that means the slot's tuple descriptors
change for every tuple.

Previously we also, via ConvertPartitionTupleSlot(), built new tuples
after the mapping even in cases where we, immediately afterwards,
access individual columns again.

Refactor the code so one slot, on demand, is used for each
partition. That avoids having to change the descriptor (and allows to
use the more efficient "fixed" tuple slots). Then use slot->slot
mapping, to avoid unnecessarily forming a tuple.

As the naming between the tuple and slot mapping functions wasn't
consistent, rename them to execute_attr_map_{tuple,slot}.  It's likely
that we'll also rename convert_tuples_by_* to denote that these
functions "only" build a map, but that's left for later.

Author: Amit Khandekar and Amit Langote, editorialized by me
Reviewed-By: Amit Langote, Amit Khandekar, Andres Freund
Discussion:
    https://postgr.es/m/CAJ3gD9fR0wRNeAE8VqffNTyONS_UfFPRpqxhnD9Q42vZB+Jvpg@mail.gmail.com
    https://postgr.es/m/e4f9d743-cd4b-efb0-7574-da21d86a7f36%40lab.ntt.co.jp
Backpatch: -
2018-10-02 11:14:26 -07:00
Tom Lane 8b91d25884 Clean up *printf macros to avoid conflict with format archetypes.
We must define the macro "printf" with arguments, else it can mess
up format archetype attributes in builds where PG_PRINTF_ATTRIBUTE
is just "printf".  Fortunately, that's easy to do now that we're
requiring C99; we can use __VA_ARGS__.

On the other hand, it's better not to use __VA_ARGS__ for the rest
of the *printf crew, so that one can take the addresses of those
functions without surprises.

I'd proposed doing this some time ago, but forgot to make it happen;
buildfarm failures subsequent to 96bf88d52 reminded me.

Discussion: https://postgr.es/m/22709.1535135640@sss.pgh.pa.us
Discussion: https://postgr.es/m/20180926190934.ea4xvzhkayuw7gkx@alap3.anarazel.de
2018-09-26 17:35:01 -04:00
Tom Lane d6c55de1f9 Implement %m in src/port/snprintf.c, and teach elog.c to rely on that.
I started out with the idea that we needed to detect use of %m format specs
in contexts other than elog/ereport calls, because we couldn't rely on that
working in *printf calls.  But a better answer is to fix things so that it
does work.  Now that we're using snprintf.c all the time, we can implement
%m in that and we've fixed the problem.

This requires also adjusting our various printf-wrapping functions so that
they ensure "errno" is preserved when they call snprintf.c.

Remove elog.c's handmade implementation of %m, and let it rely on
snprintf to support the feature.  That should provide some performance
gain, though I've not attempted to measure it.

There are a lot of places where we could now simplify 'printf("%s",
strerror(errno))' into 'printf("%m")', but I'm not in any big hurry
to make that happen.

Patch by me, reviewed by Michael Paquier

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-09-26 13:31:56 -04:00
Tom Lane 96bf88d527 Always use our own versions of *printf().
We've spent an awful lot of effort over the years in coping with
platform-specific vagaries of the *printf family of functions.  Let's just
forget all that mess and standardize on always using src/port/snprintf.c.
This gets rid of a lot of configure logic, and it will allow a saner
approach to dealing with %m (though actually changing that is left for
a follow-on patch).

Preliminary performance testing suggests that as it stands, snprintf.c is
faster than the native printf functions for some tasks on some platforms,
and slower for other cases.  A pending patch will improve that, though
cases with floating-point conversions will doubtless remain slower unless
we want to put a *lot* of effort into that.  Still, we've not observed
that *printf is really a performance bottleneck for most workloads, so
I doubt this matters much.

Patch by me, reviewed by Michael Paquier

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-09-26 13:13:57 -04:00