We should throw an error for indeterminate collation, but bpcharne()
was missing that logic, resulting in a much less user-friendly error
(either an assertion failure or "cache lookup failed for collation 0").
Per report from Manuel Rigger. Back-patch to v12 where the mistake
came in, evidently in commit 5e1963fb7. (Before non-deterministic
collations, this function wasn't collation sensitive.)
Discussion: https://postgr.es/m/CA+u7OA4HOjtymxAbuGNh4-X_2R0Lw5n01tzvP8E5-i-2gQXYWA@mail.gmail.com
This gets rid of our former behavior of forcibly downcasing
the postmaster's hostname list and truncating the elements to
NAMEDATALEN. In principle, DNS hostnames are case-insensitive
so the first behavior should be harmless, and server hostnames
are seldom long enough for the second behavior to be an issue.
But it's still dubious, and an easy fix is available: just use
SplitGUCList instead.
AFAICT, all other SplitIdentifierString calls in the backend are
OK: either the items actually are SQL identifiers, or they are
keywords that are short and case-insensitive.
Per thinking about bug #16106. While this has been wrong for
a very long time, the lack of field complaints means there's
little reason to back-patch.
Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org
Commit 6b76f1bb5 changed all the RADIUS auth parameters to be lists
rather than single values. But its use of SplitIdentifierString
to parse the list format was not very carefully thought through,
because that function thinks it's parsing SQL identifiers, which
means it will (a) downcase the strings and (b) truncate them to
be shorter than NAMEDATALEN. While downcasing should be harmless
for the server names and ports, it's just wrong for the shared
secrets, and probably for the NAS Identifier strings as well.
The truncation aspect is at least potentially a problem too,
though typical values for these parameters would fit in 63 bytes.
Fortunately, we now have a function SplitGUCList that is exactly
the same except for not doing the two unwanted things, so fixing
this is a trivial matter of calling that function instead.
While here, improve the documentation to show how to double-quote
the parameter values. I failed to resist the temptation to do
some copy-editing as well.
Report and patch from Marcos David (bug #16106); doc changes by me.
Back-patch to v10 where the aforesaid commit came in, since this is
arguably a regression from our previous behavior with RADIUS auth.
Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org
The TableFunc node (i.e., XMLTABLE) includes type and collation OIDs
that might not be referenced anywhere else in the expression tree,
so they need to be accounted for when extracting dependencies.
Fortunately, the practical effects of this are limited, since
(a) it's somewhat unlikely that people would be extracting
columns of non-builtin types from an XML document, and (b)
in many scenarios, the query would contain other references
to such types, or functions depending on them. However, it's
not hard to construct examples wherein the existing code lets
one drop a type used in XMLTABLE and thereby break a view.
This is evidently an original oversight in the XMLTABLE patch,
so back-patch to v10 where that came in.
Discussion: https://postgr.es/m/18427.1573508501@sss.pgh.pa.us
This commit changes xact_desc() so that it reports the detail information about
PREPARE TRANSACTION record, like GID (global transaction identifier),
timestamp at prepare transaction, delete-on-abort/commit relations,
XID of subtransactions, and invalidation messages. These are helpful
when diagnosing 2PC-related troubles.
Author: Fujii Masao
Reviewed-by: Michael Paquier, Andrey Lepikhov, Kyotaro Horiguchi, Julien Rouhaud, Alvaro Herrera
Discussion: https://postgr.es/m/CAHGQGwEvhASad4JJnCv=0dW2TJypZgW_Vpb-oZik2a3utCqcrA@mail.gmail.com
This new option terminates the other sessions connected to the target
database and then drop it. To terminate other sessions, the current user
must have desired permissions (same as pg_terminate_backend()). We don't
allow to terminate the sessions if prepared transactions, active logical
replication slots or subscriptions are present in the target database.
Author: Pavel Stehule with changes by me
Reviewed-by: Dilip Kumar, Vignesh C, Ibrar Ahmed, Anthony Nowocien,
Ryan Lambert and Amit Kapila
Discussion: https://postgr.es/m/CAP_rwwmLJJbn70vLOZFpxGw3XD7nLB_7+NKz46H5EOO2k5H7OQ@mail.gmail.com
It is pointless to show in those views auxiliary processes that don't
open network connections.
A small incompatibility is that anybody joining pg_stat_activity and
pg_stat_ssl/pg_stat_gssapi will have to use a left join if they want to
see such auxiliary processes.
Author: Euler Taveira
Discussion: https://postgr.es/m/20190904151535.GA29108@alvherre.pgsql
An upcoming patch that adds deduplication to the nbtree AM will rely on
_bt_keep_natts_fast() understanding that differences in TOAST input
state can never affect its answer. In particular, two opclass-equal
datums (with opclasses deemed safe for deduplication) should never be
treated as unequal by _bt_keep_natts_fast() due to TOAST input
differences.
This also seems like a good idea on general principle. nbtsplitloc.c
will now occasionally make better decisions about where to split a leaf
page. The behavior of _bt_keep_natts_fast() is now somewhat closer to
the behavior of _bt_keep_natts().
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
Bring datum_image_eq() in line with datumIsEqual() by adding support for
comparing cstring datums.
An upcoming patch that adds deduplication to the nbtree AM will use
datum_image_eq(). datum_image_eq() will need to work with all datatypes
that can be used as the storage type of a B-Tree index column, including
cstring. (cstring is used as the storage type for columns of type
"name" as a space-saving optimization.)
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
This completes the task begun in commit 1408d5d86, to synchronize
ECPG's exported definitions with the definition of bool used by
c.h (and, therefore, the one actually in use in the ECPG library).
On practically all modern platforms, ecpglib.h will now just
include <stdbool.h>, which should surprise nobody anymore.
That removes a header-inclusion-order hazard for ECPG clients,
who previously might get build failures or unexpected behavior
depending on whether they'd included <stdbool.h> themselves,
and if so, whether before or after ecpglib.h.
On platforms where sizeof(_Bool) is not 1 (only old PPC-based
Mac systems, as far as I know), things are still messy, as
inclusion of <stdbool.h> could still break ECPG client code.
There doesn't seem to be any clean fix for that, and given the
probably-negligible population of users who would care anymore,
it's not clear we should go far out of our way to cope with it.
This change at least fixes some header-inclusion-order hazards
for our own code, since c.h and ecpglib.h previously disagreed
on whether bool should be char or unsigned char.
To implement this with minimal invasion of ECPG client namespace,
move the choice of whether to rely on <stdbool.h> into configure,
and have it export a configuration symbol PG_USE_STDBOOL.
ecpglib.h no longer exports definitions for TRUE and FALSE,
only their lowercase brethren. We could undo that if we get
push-back about it.
Ideally we'd back-patch this as far as v11, which is where c.h
started to rely on <stdbool.h>. But the odds of creating problems
for formerly-working ECPG client code seem about as large as the
odds of fixing any non-working cases, so we'll just do this in HEAD.
Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
PredicateLockTuple() has a fast exit if tuple was written by the current
transaction, as in that case it already has a lock. This check can be
performed using TransactionIdIsCurrentTransactionId() instead of
SubTransGetTopmostTransaction(), to avoid any chance of having to hit the
disk.
Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
If the passed in xid is the current top transaction, we can do a fast
check and exit early. This should work well for the current heap but
also works very well for proposed AMs that don't use a separate xid
for subtransactions.
Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
This happens when we add a replica identity column on a subscriber
that does not yet exist on the publisher, according to the mapping
maintained by the subscriber. Code that checks whether the target
relation on the subscriber is updatable would check the replica
identity attribute bitmap with a column number -1, which would result
in an error. To fix, skip such columns in the bitmap lookup and
consider the relation not updatable. The result is consistent with
the rule that the replica identity columns on the subscriber must be a
subset of those on the publisher, since if the column doesn't exist on
the publisher, the column set on the subscriber can't be a subset.
Reported-by: Tim Clarke <tim.clarke@minerva.info>
Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
Add some support for automatically showing backtraces in certain error
situations in the server. Backtraces are shown on assertion failure;
also, a new setting backtrace_functions can be set to a list of C
function names, and all ereport()s and elog()s from the mentioned
functions will have backtraces generated. Finally, the function
errbacktrace() can be manually added to an ereport() call to generate a
backtrace for that call.
Authors: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m//5f48cb47-bf1e-05b6-7aae-3bf2cd01586d@2ndquadrant.com
Discussion: https://postgr.es/m/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com
nbtree index builds once stashed the "minimum key" for a page, which was
used as the basis of the pivot tuple that gets placed in the next level
up (i.e. the tuple that stores the downlink to the page in question).
It doesn't quite work that way anymore, so the "minimum key" terminology
now seems misleading (these days the minimum key is actually a straight
copy of the high key from the left sibling, which is a distinct thing in
subtle but important ways). Rename this concept to "low key". This
name is a lot clearer given that there is now a sharp distinction
between pivot and non-pivot tuples. Also remove comments that describe
obsolete details about how the minimum key concept used to work.
Rather than generating the minus infinity item for the leftmost page on
a level by copying the new item and truncating that copy, simply
allocate a small buffer. The old approach confusingly created the
impression that the new item had some kind of significance. This was
another artifact of how things used to work before commits 8224de4f and
dd299df8.
SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a
sanity check that ensures that the affected constraints have triggers.
On partitioned tables, the triggers are in the leaf partitions, not in
the partitioned relations themselves, so the sanity check fails.
Removing the sanity check solves the problem, because the code needed to
support the case is already there.
Backpatch to 11.
Note: deferred unique constraints are not affected by this bug, because
they do have triggers in the parent partitioned table. I did not add a
test for this scenario.
Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql
This patch adopts the overflow check logic introduced by commit cbdb8b4c0
into two more places. interval_mul() failed to notice if it computed a
new microseconds value that was one more than INT64_MAX, and pgbench's
double-to-int64 logic had the same sorts of edge-case problems that
cbdb8b4c0 fixed in the core code.
To make this easier to get right in future, put the guts of the checks
into new macros in c.h, and add commentary about how to use the macros
correctly.
Back-patch to all supported branches, as we did with the previous fix.
Yuya Watari
Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
If there is the WAL page that the continuation WAL record just fits within
(i.e., the continuation record ends just at the end of the page) and
the LSN in such page is specified with -s option, previously pg_waldump
caused an assertion failure. The cause of this assertion failure was that
XLogFindNextRecord() that pg_waldump -s calls mistakenly handled
such special WAL page.
This commit changes XLogFindNextRecord() so that it can handle
such WAL page correctly.
Back-patch to all supported versions.
Author: Andrey Lepikhov
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
SPI gets used to build a list of relation OIDs for XML object
generation, and one code path building a list uses SPI_execute() without
looking at errors it produces. So fix that.
Author: Mark Dilger
Reviewed-by: Michael Paquier, Pavel Stehule
Discussion: https://postgr.es/m/17d30445-4862-7917-170f-84328dcd292d@gmail.com
This allows logging a sample of statements, without incurring excessive
log traffic (which may impact performance). This can be useful when
analyzing workloads with lots of short queries.
The sampling is configured using two new GUC parameters:
* log_min_duration_sample - minimum required statement duration
* log_statement_sample_rate - sample rate (0.0 - 1.0)
Only statements with duration exceeding log_min_duration_sample are
considered for sampling. To enable sampling, both those GUCs have to
be set correctly.
The existing log_min_duration_statement GUC has a higher priority, i.e.
statements with duration exceeding log_min_duration_statement will be
always logged, irrespectedly of how the sampling is configured. This
means only configurations
log_min_duration_sample < log_min_duration_statement
do actually sample the statements, instead of logging everything.
Author: Adrien Nayrat
Reviewed-by: David Rowley, Vik Fearing, Tomas Vondra
Discussion: https://postgr.es/m/bbe0a1a8-a8f7-3be2-155a-888e661cc06c@anayrat.info
Avoid creating transiently-inconsistent slot states where possible,
by not setting TTS_FLAG_SHOULDFREE until after the slot actually has
a free'able tuple pointer, and by making sure that we reset tts_nvalid
and related derived state before we replace the tuple contents. This
would only matter if something were to examine the slot after we'd
suffered some kind of error (e.g. out of memory) while manipulating
the slot. We typically don't do that, so these changes might just be
cosmetic --- but even if so, it seems like good future-proofing.
Also remove some redundant Asserts, and add a couple for consistency.
Back-patch to v12 where all this code was rewritten.
Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org
Since commit d26a810eb, we've defined bool as being either _Bool from
<stdbool.h>, or "unsigned char"; but that commit overlooked the fact
that probes.d has "#define bool char". For consistency, make it say
"unsigned char" instead. This should be strictly a cosmetic change,
but it seems best to be in sync.
Formally, in the now-normal case where we're using <stdbool.h>, it'd
be better to write "#define bool _Bool". However, then we'd need
some build infrastructure to inject that configuration choice into
probes.d, and it doesn't seem worth the trouble. We only use
<stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that
bool parameters are "unsigned char" should be close enough.
Back-patch to v12 where d26a810eb came in.
Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
When sending data for logical decoding using the streaming replication
protocol via a WAL sender, the timestamp of the sent write message is
allocated at the beginning of the message when preparing for the write,
and actually computed when the write message is ready to be sent.
The timestamp was getting computed after sending the message. This
impacts anything using logical decoding, causing for example logical
replication to report mostly NULL for last_msg_send_time in
pg_stat_subscription.
This commit makes sure that the timestamp is computed before sending the
message. This is wrong since 5a991ef, so backpatch down to 9.4.
Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com
Backpatch-through: 9.4
WindowAgg will potentially store large numbers of input rows into
tuplestores to allow access to other rows in the frame. If the input
is coming via an explicit Sort node, then unneeded columns will
already have been discarded (since Sort requests a small tlist); but
there are idioms like COUNT(*) OVER () that result in the input not
being sorted at all, and cases where the input is being sorted by some
means other than a Sort; if we don't request a small tlist, then
WindowAgg's storage requirement is inflated by the unneeded columns.
Backpatch back to 9.6, where the current tlist handling was added.
(Prior to that, WindowAgg would always use a small tlist.)
Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
Previously ALTER MATERIALIZED VIEW / FOREIGN TABLE ... RENAME COLUMN ...
returned "ALTER TABLE" as a command tag. This commit fixes them so that
they return "ALTER MATERIALIZED VIEW" and "ALTER FOREIGN TABLE" as
command tags, respectively.
This issue exists in all supported versions, but we don't back-patch this
because it's not enough of a bug to justify taking any compatibility risks for.
Otherwise, the back-patch would cause minor version update to break,
for example, the existing event trigger functions using TG_TAG.
Author: Fujii Masao
Reviewed-by: Ibrar Ahmed
Discussion: https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==P25WvuBJjg@mail.gmail.com
There's plenty places in frontend code that could benefit from a
string buffer implementation. Some because it yields simpler and
faster code, and some others because of the desire to share code
between backend and frontend.
While there is a string buffer implementation available to frontend
code, libpq's PQExpBuffer, it is clunkier than stringinfo, it
introduces a libpq dependency, doesn't allow for sharing between
frontend and backend code, and has a higher API/ABI stability
requirement due to being exposed via libpq.
Therefore it seems best to just making StringInfo being usable by
frontend code. There's not much to do for that, except for rewriting
two subsequent elog/ereport calls into others types of error
reporting, and deciding on a maximum string length.
For the maximum string size I decided to privately define MaxAllocSize
to the same value as used in the backend. It seems likely that we'll
want to reconsider this for both backend and frontend code in the not
too far away future.
For now I've left stringinfo.h in lib/, rather than common/, to reduce
the likelihood of unnecessary breakage. We could alternatively decide
to provide a redirecting stringinfo.h in lib/, or just not provide
compatibility.
Author: Andres Freund
Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson
Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.
By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.
Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
For a long time (since commit aed378e8d) we have had a policy to log
nothing about a connection if the client disconnects when challenged
for a password. This is because libpq-using clients will typically
do that, and then come back for a new connection attempt once they've
collected a password from their user, so that logging the abandoned
connection attempt will just result in log spam. However, this did
not work well for PAM authentication: the bottom-level function
pam_passwd_conv_proc() was on board with it, but we logged messages
at higher levels anyway, for lack of any reporting mechanism.
Add a flag and tweak the logic so that the case is silent, as it is
for other password-using auth mechanisms.
Per complaint from Yoann La Cancellera. It's been like this for awhile,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
get_relkind_objtype, and hence get_object_type, failed when applied to a
toast table. This is not a good thing, because it prevents reporting of
perfectly legitimate permissions errors. (At present, these functions
are in fact *only* used to determine the ObjectType argument for
acl_error() calls.) It seems best to have them fall back to returning
OBJECT_TABLE in every case where they can't determine an object type
for a pg_class entry, so do that.
In passing, make some edits to alter.c to make it more obvious that
those calls of get_object_type() are used only for error reporting.
This might save a few cycles in the non-error code path, too.
Back-patch to v11 where this issue originated.
John Hsu, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/C652D3DF-2B0C-4128-9420-FB5379F6B1E4@amazon.com
Commit d25ea0127 got rid of what I thought were entirely unnecessary
derived child expressions in EquivalenceClasses for EC members that
mention multiple baserels. But it turns out that some of the child
expressions that code created are necessary for partitionwise joins,
else we fail to find matching pathkeys for Sort nodes. (This happens
only for certain shapes of the resulting plan; it may be that
partitionwise aggregation is also necessary to show the failure,
though I'm not sure of that.)
Reverting that commit entirely would be quite painful performance-wise
for large partition sets. So instead, add code that explicitly
generates child expressions that match only partitionwise child join
rels we have actually generated.
Per report from Justin Pryzby. (Amit Langote noticed the problem
earlier, though it's not clear if he recognized then that it could
result in a planner error, not merely failure to exploit partitionwise
join, in the code as-committed.) Back-patch to v12 where commit
d25ea0127 came in.
Amit Langote, with lots of kibitzing from me
Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com
Discussion: https://postgr.es/m/20191011143703.GN10470@telsasoft.com
Historically, the code to build relation options has been shaped the
same way in multiple code paths by using a set of datums in input with
the options parsed with a static table which is then filled with the
option values. This introduces a new common routine in reloptions.c to
do most of the legwork for the in-core code paths.
Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA+HiwqGsoSn_uTPPYT19WrtR7oYpYtv4CdS0xuedTKiHHWuk_g@mail.gmail.com
As the code stands, nEntries counts the number of ginEntryInsert()
calls, so that's what you end up with at the end of a GIN index build.
However, ginvacuumcleanup() recomputes nEntries as the number of
surviving leaf tuples, and that's generally consistent with the way that
gincostestimate() uses the value. So let's clearly define nEntries
as the number of leaf tuples, and therefore adjust ginEntryInsert() to
increment it only when we make a new one, not when we add TIDs into an
existing tuple or posting tree.
In practice this inconsistency probably has little impact, so I don't
feel a need to back-patch.
Insung Moon and Keisuke Kuroda
Discussion: https://postgr.es/m/CAEMmqBuH_O-oXL+3_ArQ6F5cJ7kXVow2SGQB3HRacku_T+xkmA@mail.gmail.com
Rearrange the logic in record_image_cmp() and datum_image_eq() to
error out on unexpected typlens (either not supported there or
completely invalid due to corruption). Barring corruption, this is
not possible today but it seems more future-proof and robust to fix
this.
Reported-by: Peter Geoghegan <pg@bowt.ie>
Commit 8af1624e3 introduced a warning about possibly returning
without a value, on compilers that don't realize that ereport(ERROR)
doesn't return. Tweak the code to avoid that.
Per buildfarm. Back-patch to 9.6, like the aforesaid commit.