This has been requested a few times, but the use-case for it was never
entirely clear. The reason for adding it now is that transmission of
error reports from parallel workers fails when NLS is active, because
pq_parse_errornotice() wrongly assumes that the existing severity field
is nonlocalized. There are other ways we could have fixed that, but the
other options were basically kluges, whereas this way provides something
that's at least arguably a useful feature along with the bug fix.
Per report from Jakob Egger. Back-patch into 9.6, because otherwise
parallel query is essentially unusable in non-English locales. The
problem exists in 9.5 as well, but we don't want to risk changing
on-the-wire behavior in 9.5 (even though the possibility of new error
fields is specifically called out in the protocol document). It may
be sufficient to leave the issue unfixed in 9.5, given the very limited
usefulness of pq_parse_errornotice in that version.
Discussion: <A88E0006-13CB-49C6-95CC-1A77D717213C@eggerapps.at>
HandleParallelMessages leaked memory into the caller's context. Since it's
called from ProcessInterrupts, there is basically zero certainty as to what
CurrentMemoryContext is, which means we could be leaking into long-lived
contexts. Over the processing of many worker messages that would grow to
be a problem. Things could be even worse than just a leak, if we happened
to service the interrupt while ErrorContext is current: elog.c thinks it
can reset that on its own whim, possibly yanking storage out from under
HandleParallelMessages.
Give HandleParallelMessages its own dedicated context instead, which we can
reset during each call to ensure there's no accumulation of wasted memory.
Discussion: <16610.1472222135@sss.pgh.pa.us>
The guiding principle for the last few patches in this area apparently
involved throwing darts.
Cosmetic only, but back-patch to 9.6 because there is no reason for
9.6 and HEAD to diverge yet in this file.
Copy the palloc'd strings into the correct context, ie ErrorContext
not wherever the source ErrorData is. This would be a large bug,
except that it appears that all catchers of thrown errors do either
EmitErrorReport or CopyErrorData before doing anything that would
cause transient memory contexts to be cleaned up. Still, it's wrong
and it will bite somebody someday.
Fix failure to copy cursorpos and internalpos.
Utter the appropriate incantations involving recursion_depth, so that
we'll behave sanely if we get an error inside pstrdup. (In general,
the body of this function ought to act like, eg, errdetail().)
Per code reading induced by Jakob Egger's report.
The previous coding here was capable of adding a "parallel worker" context
line to errors that were not, in fact, returned from a parallel worker.
Instead of using an errcontext callback to add that annotation, just paste
it onto the message by hand; this looks uglier but is more reliable.
Discussion: <19757.1472151987@sss.pgh.pa.us>
When there is an identifiable REPLICA IDENTITY index on the target table,
heap_update leaks the id_attrs bitmapset. That's not many bytes, but it
adds up over enough rows, since the code typically runs in a query-lifespan
context. Bug introduced in commit e55704d8b, which did a rather poor job
of cloning the existing use-pattern for RelationGetIndexAttrBitmap().
Per bug #14293 from Zhou Digoal. Back-patch to 9.4 where the bug was
introduced.
Report: <20160824114320.15676.45171@wrigleys.postgresql.org>
ExecReScanAgg's check for whether it could re-use a previously calculated
hashtable neglected the possibility that the Agg node might reference
PARAM_EXEC Params that are not referenced by its input plan node. That's
okay if the Params are in upper tlist or qual expressions; but if one
appears in aggregate input expressions, then the hashtable contents need
to be recomputed when the Param's value changes.
To avoid unnecessary performance degradation in the case of a Param that
isn't within an aggregate input, add logic to the planner to determine
which Params are within aggregate inputs. This requires a new field in
struct Agg, but fortunately we never write plans to disk, so this isn't
an initdb-forcing change.
Per report from Jeevan Chalke. This has been broken since forever,
so back-patch to all supported branches.
Andrew Gierth, with minor adjustments by me
Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
With Asserts off, these variables are set but never used, resulting
in warnings from pickier compilers. Fix that with our standard solution.
Per report from Jeff Janes.
AF_INET is apparently defined in something that's pulled in automatically
on Linux, but the buildfarm says that's not true everywhere. Comparing
to network_gist.c suggests that including <sys/socket.h> ought to fix it,
and the POSIX standard concurs.
This seems to offer significantly better search performance than the
existing GiST opclass for inet/cidr, at least on data with a wide mix
of network mask lengths. (That may suggest that the data splitting
heuristics in the GiST opclass could be improved.)
Emre Hasegeli, with mostly-cosmetic adjustments by me
Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
If you have previously pinned a segment and decide that you don't
actually want to keep it around until shutdown, this new API lets you
remove the pin. This is pretty trivial except on Windows, where it
requires closing the duplicate handle that was used to implement the
pin.
Thomas Munro and Amit Kapila, reviewed by Amit Kapila and by me.
Previously, the spgSplitTuple action could only create a new upper tuple
containing a single labeled node. This made it useless for opclasses
that prefer to work with fixed sets of nodes (labeled or otherwise),
which meant that restrictive prefixes could not be used with such
node definitions. Change the output field set for the choose() method
to allow it to specify any valid node set for the new upper tuple,
and to specify which of these nodes to place the modified lower tuple in.
In addition to its primary use for fixed node sets, this feature could
allow existing opclasses that use variable node sets to skip a separate
spgAddNode action when splitting a tuple, by setting up the node needed
for the incoming value as part of the spgSplitTuple action. However, care
would have to be taken to add the extra node only when it would not make
the tuple bigger than before. (spgAddNode can enlarge the tuple,
spgSplitTuple can't.)
This is a prerequisite for an upcoming SP-GiST inet opclass, but is
being committed separately to increase the visibility of the API change.
In passing, improve the documentation about the traverse-values feature
that was added by commit ccd6eb49a.
Emre Hasegeli, with cosmetic adjustments and documentation rework by me
Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
Add a variant of txid_current() that returns NULL if no transaction ID
is assigned. This version can be used even on a standby server,
although it will always return NULL since no transaction IDs can be
assigned during recovery.
Craig Ringer, per suggestion from Jim Nasby. Reviewed by Petr Jelinek
and by me.
Merge several copies of "copy an inet value and adjust the mask length"
code to create a single, conveniently C-callable function. This function
is exported for future use by inet SPGiST support, but it's good cleanup
anyway since we had three slightly-different-for-no-good-reason copies.
(Extracted from a larger patch, to separate new code from refactoring
of old code)
Emre Hasegeli
Due to an error in the abbreviated key abort logic, the most recently
processed SortTuple could be incorrectly marked NULL, resulting in an
incorrect final sort order.
In the worst case, this could result in a corrupt btree index, which
would need to be rebuild using REINDEX. However, abbrevation doesn't
abort very often, not all data types use it, and only one tuple would
end up in the wrong place, so the practical impact of this mistake may
be somewhat limited.
Report and patch by Peter Geoghegan.
In particular, left join to pg_authid so that rows in pg_stat_activity
don't disappear if the session's owning user has been dropped.
Also convert a few joins to pg_database to left joins, in the same spirit,
though that case might be harder to hit. We were doing this in other
views already, so it was a bit inconsistent that these views didn't.
Oskari Saarenmaa, with some further tweaking by me
Discussion: <56E87CD8.60007@ohmu.fi>
Obvious brain fade in set_rel_consider_parallel(). Noticed it while
adjusting the adjacent RTE_FUNCTION case.
In 9.6, also make the code look more like what I just did in HEAD
by removing the unnecessary function_rte_parallel_ok subroutine
(it does nothing that expression_tree_walker wouldn't do).
We need to scan the whole parse tree for parallel-unsafe functions.
If there are none, we'll later need to determine whether particular
subtrees contain any parallel-restricted functions. The previous coding
retained no knowledge from the first scan, even though this is very
wasteful in the common case where the query contains only parallel-safe
functions. We can bypass all of the later scans by remembering that fact.
This provides a small but measurable speed improvement when the case
applies, and shouldn't cost anything when it doesn't.
Patch by me, reviewed by Robert Haas
Discussion: <3740.1471538387@sss.pgh.pa.us>
Clobbering errno during cleanup after an error is an oft-repeated, easy
to make mistake. Deal with it here as everywhere else, by saving it
aside and restoring after cleanup, before ereport'ing.
In passing, add a missing errcode declaration in another ereport() call
in the same file, which I noticed while skimming the file looking for
similar problems.
Backpatch to 9.4, where this code was introduced.
pcolor was used to represent function arguments that are nominally of
type color, but when using a pre-ANSI C compiler would be passed as the
promoted integer type. We really don't need that anymore.
The regex library used to have a notion of a "collating element" that was
distinct from a "character", but Henry Spencer never actually implemented
his planned support for multi-character collating elements, and the Tcl
crew ripped out most of the stubs for that years ago. The only thing left
that distinguished the "celt" typedef from the "chr" typedef was that
"celt" was supposed to also be able to hold the not-a-character "NOCELT"
value. However, NOCELT was not used anywhere after the MCCE stub removal
changes, which means there's no need for celt to be different from chr.
Removing the separate typedef simplifies matters and also removes a trap
for the unwary, in that celt is signed while chr may not be, so comparisons
could mean different things. There's no bug there today because we
restrict CHR_MAX to be less than INT_MAX, but I think there may have been
such bugs before we did that, and there could be again if anyone ever
decides to fool with the range of chr.
This patch also removes assorted unnecessary casts to "chr" of values
that are already chrs. Many of these seem to be leftover from days when
the code was compatible with pre-ANSI C.
All we need is 4 bytes at the moment, for MD5 authentication. But in
upcomint patches for SCRAM authentication, SCRAM will need a salt of
different length. It's less scary for the caller to pass the buffer
length anyway, than assume a certain-sized output buffer.
Author: Michael Paquier
Discussion: <CAB7nPqQvO4sxLFeS9D+NM3wpy08ieZdAj_6e117MQHZAfxBFsg@mail.gmail.com>
This way sendAuthRequest doesn't need to know the details of all the
different authentication methods. This is in preparation for adding SCRAM
authentication, which will add yet another authentication request message
type, with different payload.
Reviewed-By: Michael Paquier
Discussion: <CAB7nPqQvO4sxLFeS9D+NM3wpy08ieZdAj_6e117MQHZAfxBFsg@mail.gmail.com>
INSERT .. ON CONFLICT runs a pre-check of the possible conflicting
constraints before performing the actual speculative insertion. In case
the inserted tuple included TOASTed columns the ON CONFLICT condition
would be handled correctly in case the conflict was caught by the
pre-check, but if two transactions entered the speculative insertion
phase at the same time, one would have to re-try, and the code for
aborting a speculative insertion did not handle deleting the
speculatively inserted TOAST datums correctly.
TOAST deletion would fail with "ERROR: attempted to delete invisible
tuple" as we attempted to remove the TOAST tuples using
simple_heap_delete which reasoned that the given tuples should not be
visible to the command that wrote them.
This commit updates the heap_abort_speculative() function which aborts
the conflicting tuple to use itself, via toast_delete, for deleting
associated TOAST datums. Like before, the inserted toast rows are not
marked as being speculative.
This commit also adds a isolationtester spec test, exercising the
relevant code path. Unfortunately 9.5 cannot handle two waiting
sessions, and thus cannot execute this test.
Reported-By: Viren Negi, Oskari Saarenmaa
Author: Oskari Saarenmaa, edited a bit by me
Bug: #14150
Discussion: <20160519123338.12513.20271@wrigleys.postgresql.org>
Backpatch: 9.5, where ON CONFLICT was introduced
regexp_match() is like regexp_matches(), but it disallows the 'g' flag
and in consequence does not need to return a set. Instead, it returns
a simple text array value, or NULL if there's no match. Previously people
usually got that behavior with a sub-select, but this way is considerably
more efficient.
Documentation adjusted so that regexp_match() is presented first and then
regexp_matches() is introduced as a more complicated version. This is
a bit historically revisionist but seems pedagogically better.
Still TODO: extend contrib/citext to support this function.
Emre Hasegeli, reviewed by David Johnston
Discussion: <CAE2gYzy42sna2ME_e3y1KLQ-4UBrB-eVF0SWn8QG39sQSeVhEw@mail.gmail.com>
Slot creation did not clear all fields upon creation. After start the
memory is zeroed, but when a physical replication slot was created in
the shared memory of a previously existing logical slot, catalog_xmin
would not be cleared. That in turn would prevent vacuum from doing its
duties.
To fix initialize all the fields. To make similar future bugs less
likely, zero all of ReplicationSlotPersistentData, and re-order the
rest of the initialization to be in struct member order.
Analysis: Andrew Gierth
Reported-By: md@chewy.com
Author: Michael Paquier
Discussion: <20160705173502.1398.70934@wrigleys.postgresql.org>
Backpatch: 9.4, where replication slots were introduced
The performance overhead of this can be significant on Windows, and most
people don't have the tools to view it anyway as Windows does not have
native support for process titles.
Discussion: <0A3221C70F24FB45833433255569204D1F5BE3E8@G01JPEXMBYT05>
Takayuki Tsunakawa
We implement a dozen or so parameterless functions that the SQL standard
defines special syntax for. Up to now, that was done by converting them
into more or less ad-hoc constructs such as "'now'::text::date". That's
messy for multiple reasons: it exposes what should be implementation
details to users, and performance is worse than it needs to be in several
cases. To improve matters, invent a new expression node type
SQLValueFunction that can represent any of these parameterless functions.
Bump catversion because this changes stored parsetrees for rules.
Discussion: <30058.1463091294@sss.pgh.pa.us>
I'm not sure which bozo thought it's a problem to use strtol() only
for its endptr result, but silence the warning using same method
used elsewhere.
Report: <f845d3a6-5328-3e2a-924f-f8e91aa2b6d2@2ndquadrant.com>
This is somewhat cosmetic, since as long as you know what you are looking
at, "10.0" is a serviceable substitute for "10". But there is a potential
for confusion between version numbers with minor numbers and those without
--- we don't want people asking "why is psql saying 10.0 when my server is
10.2". Therefore, back-patch as far as practical, which turns out to be
9.3. I could have redone the patch to use fprintf(stderr) in place of
psql_error(), but it seems more work than is warranted for branches that
will be EOL or nearly so by the time v10 comes out.
Although only psql seems to contain any code that needs this, I chose
to put the support function into fe_utils, since it seems likely we'll
need it in other client programs in future. (In 9.3-9.5, use dumputils.c,
the predecessor of fe_utils/string_utils.c.)
In HEAD, also fix the backend code that whines about loadable-library
version mismatch. I don't see much need to back-patch that.
Commit af33039317 aimed to reduce
leakage from tqueue.c, which is good. Unfortunately, by changing the
memory context in which all of gather_readnext() executes, it also
changed the context in which ExecShutdownGatherWorkers executes, which
is not good, because that function eventually causes a call to
ExecParallelRetrieveInstrumentation, which proceeds to allocate
planstate->worker_instrument in a short-lived context, causing a
crash.
Rushabh Lathia, reviewed by Amit Kapila and by me.
Prior to commit 7882c3b0b9, it was
possible to use LWLocks within DSM segments, but that commit broke
this use case by switching from a doubly linked list to a circular
linked list. Switch back, using a new bit of general infrastructure
for maintaining lists of PGPROCs.
Thomas Munro, reviewed by me.
This is a good bit more complicated than the average new-version stamping
commit, because it includes various adjustments in pursuit of changing
from three-part to two-part version numbers. It's likely some further
work will be needed around that change; but this is enough to get through
the regression tests, at least in Unix builds.
Peter Eisentraut and Tom Lane
NUMERIC_MAX_PRECISION is a purely arbitrary constraint on the precision
and scale you can write in a numeric typmod. It might once have had
something to do with the allowed range of a typmod-less numeric value,
but at least since 9.1 we've allowed, and documented that we allowed,
any value that would physically fit in the numeric storage format;
which is something over 100000 decimal digits, not 1000.
Hence, get rid of numeric_in()'s use of NUMERIC_MAX_PRECISION as a limit
on the allowed range of the exponent in scientific-format input. That was
especially silly in view of the fact that you can enter larger numbers as
long as you don't use 'e' to do it. Just constrain the value enough to
avoid localized overflow, and let make_result be the final arbiter of what
is too large. Likewise adjust ecpg's equivalent of this code.
Also get rid of numeric_recv()'s use of NUMERIC_MAX_PRECISION to limit the
number of base-NBASE digits it would accept. That created a dump/restore
hazard for binary COPY without doing anything useful; the wire-format
limit on number of digits (65535) is about as tight as we would want.
In HEAD, also get rid of pg_size_bytes()'s unnecessary intimacy with what
the numeric range limit is. That code doesn't exist in the back branches.
Per gripe from Aravind Kumar. Back-patch to all supported branches,
since they all contain the documentation claim about allowed range of
NUMERIC (cf commit cabf5d84b).
Discussion: <2895.1471195721@sss.pgh.pa.us>
Per discussion, we should provide such functions to replace the lost
ability to discover AM properties by inspecting pg_am (cf commit
65c5fcd35). The added functionality is also meant to displace any code
that was looking directly at pg_index.indoption, since we'd rather not
believe that the bit meanings in that field are part of any client API
contract.
As future-proofing, define the SQL API to not assume that properties that
are currently AM-wide or index-wide will remain so unless they logically
must be; instead, expose them only when inquiring about a specific index
or even specific index column. Also provide the ability for an index
AM to override the behavior.
In passing, document pg_am.amtype, overlooked in commit 473b93287.
Andrew Gierth, with kibitzing by me and others
Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>
Apparently that's not obvious to everybody, so let's belabor the point.
In passing, document that DROP POLICY has CASCADE/RESTRICT options (which
it does, per gram.y) but they do nothing (I assume, anyway). Also update
some long-obsolete commentary in gram.y.
Discussion: <20160805104837.1412.84915@wrigleys.postgresql.org>
EXPLAIN (ANALYZE, TIMING OFF) would print an elapsed time of zero for
a trigger function, because no measurement has been taken but it printed
the field anyway. This isn't what EXPLAIN does elsewhere, so suppress it.
In the same vein, EXPLAIN (ANALYZE, BUFFERS) with non-text output format
would print buffer I/O timing numbers even when no measurement has been
taken because track_io_timing is off. That seems not per policy, either,
so change it.
Back-patch to 9.2 where these features were introduced.
Maksim Milyutin
Discussion: <081c0540-ecaa-bd29-3fd2-6358f3b359a9@postgrespro.ru>
Commit 14e8803f1 removed LWLocks when accessing MyProc->syncRepState
but didn't clean up the surrounding code and comments.
Cleanup and backpatch to 9.5, to keep code similar.
Julien Rouhaud, improved by suggestion from Michael Paquier,
implemented trivially by myself.
Commit 874fe3aea changed the command tag returned for CREATE MATVIEW/CREATE
TABLE AS ... WITH NO DATA, but missed that there was code in spi.c that
expected the command tag to always be "SELECT". Fortunately, the
consequence was only an Assert failure, so this oversight should have no
impact in production builds.
Since this code path was evidently un-exercised, add a regression test.
Per report from Shivam Saxena. Back-patch to 9.3, like the previous commit.
Michael Paquier
Report: <97218716-480B-4527-B5CD-D08D798A0C7B@dresources.com>
Several places in NUM_numpart_from_char(), which is called from the SQL
function to_number(text, text), could accidentally read one byte past
the end of the input buffer (which comes from the input text datum and
is not null-terminated).
1. One leading space character would be skipped, but there was no check
that the input was at least one byte long. This does not happen in
practice, but for defensiveness, add a check anyway.
2. Commit 4a3a1e2cf apparently accidentally doubled that code that skips
one space character (so that two spaces might be skipped), but there
was no overflow check before skipping the second byte. Fix by
removing that duplicate code.
3. A logic error would allow a one-byte over-read when looking for a
trailing sign (S) placeholder.
In each case, the extra byte cannot be read out directly, but looking at
it might cause a crash.
The third item was discovered by Piotr Stefaniak, the first two were
found and analyzed by Tom Lane and Peter Eisentraut.
ExecEvalCase() tried to save a cycle or two by passing
&econtext->caseValue_isNull as the isNull argument to its sub-evaluation of
the CASE value expression. If that subexpression itself contained a CASE,
then *isNull was an alias for econtext->caseValue_isNull within the
recursive call of ExecEvalCase(), leading to confusion about whether the
inner call's caseValue was null or not. In the worst case this could lead
to a core dump due to dereferencing a null pointer. Fix by not assigning
to the global variable until control comes back from the subexpression.
Also, avoid using the passed-in isNull pointer transiently for evaluation
of WHEN expressions. (Either one of these changes would have been
sufficient to fix the known misbehavior, but it's clear now that each of
these choices was in itself dangerous coding practice and best avoided.
There do not seem to be any similar hazards elsewhere in execQual.c.)
Also, it was possible for inlining of a SQL function that implements the
equality operator used for a CASE comparison to result in one CASE
expression's CaseTestExpr node being inserted inside another CASE
expression. This would certainly result in wrong answers since the
improperly nested CaseTestExpr would be caused to return the inner CASE's
comparison value not the outer's. If the CASE values were of different
data types, a crash might result; moreover such situations could be abused
to allow disclosure of portions of server memory. To fix, teach
inline_function to check for "bare" CaseTestExpr nodes in the arguments of
a function to be inlined, and avoid inlining if there are any.
Heikki Linnakangas, Michael Paquier, Tom Lane
Report: https://github.com/greenplum-db/gpdb/pull/327
Report: <4DDCEEB8.50602@enterprisedb.com>
Security: CVE-2016-5423
Although the standard has routines.result_cast_character_set_name, given
the naming of the surrounding columns, we concluded that this must have
been a mistake and that result_cast_char_set_name was intended, so
change the implementation. The documentation was already using the new
name.
found by Clément Prévost <prevostclement@gmail.com>