Currently, there are no built-in functions that are SECURITY DEFINER.
But we just found an instance where one was mistakenly marked that way,
so it seems prudent to add a test about it. If we ever grow some
functions that are intentionally SECURITY DEFINER, we can alter the
expected output of this test, or adjust the query to filter out functions
for which it's okay.
Per suggestion from Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoYXg7McY33+jbWmG=rS-HNUur0S6W8Q8kVNFf7epFimVA@mail.gmail.com
When commit 0f33a719fd removed the
instructions to start/stop the new cluster before running rsync, it was
now possible for pg_resetwal/pg_resetxlog to leave the final WAL record
at wal_level=minimum, preventing upgraded standby servers from
reconnecting.
This patch fixes that by having pg_upgrade unconditionally start/stop
the new cluster after pg_resetwal/pg_resetxlog has run.
Backpatch through 9.2 since, though the instructions were added in PG
9.5, they worked all the way back to 9.2.
Discussion: https://postgr.es/m/20170620171844.GC24975@momjian.us
Backpatch-through: 9.2
load_libraries(), which processes the various xxx_preload_libraries GUCs,
was parsing them using SplitIdentifierString() which isn't really
appropriate for values that could be path names: it downcases unquoted
text, and it doesn't allow embedded whitespace unless quoted.
Use SplitDirectoriesString() instead. That also allows us to simplify
load_libraries() a bit, since canonicalize_path() is now done for it.
While this definitely seems like a bug fix, it has the potential to
break configuration settings that accidentally worked before because
of the downcasing behavior. Also, there's an easy workaround for the
bug, namely to double-quote troublesome text. Hence, no back-patch.
QL Zhuo, tweaked a bit by me
Discussion: https://postgr.es/m/CAB-oJtxHVDc3H+Km3CjB9mY1VDzuyaVH_ZYSz7iXcRqCtb93Ew@mail.gmail.com
Viewing a table with \d in psql also shows the publications at table is
in. If a publication is concurrently dropped, this shows an error,
because the view pg_publication_tables internally uses
pg_get_publication_tables(), which uses a catalog snapshot. This can be
particularly annoying if a for-all-tables publication is concurrently
dropped.
To avoid that, write the query in psql differently. Expose the function
pg_relation_is_publishable() to SQL and write the query using that.
That still has a risk of being affected by concurrent catalog changes,
but in this case it would be a table drop that causes problems, and then
the psql \d command wouldn't be interesting anymore anyway.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
When materialized views were added, psql's \d commands were made to
treat them as a separate object category ... but not everyplace in the
documentation or comments got the memo.
Noted by David Johnston. Back-patch to 9.3 where matviews came in.
Discussion: https://postgr.es/m/CAKFQuwb27M3VXRhHErjCpkWwN9eKThbqWb1=trtoXi9_ejqPXQ@mail.gmail.com
David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.
One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join. But we should not
really need to special-case those here. eqjoinsel() never has had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead. Let's just estimate such cases like inner joins and rely on that
later adjustment. (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.) Add a
regression test exercising such a case to show that this is sane in
at least some cases.
The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS. In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity. To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation. (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)
Report and patch by David Rowley, simplified a bit by me. Back-patch
to 9.6 where this code was added.
Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
The combination of -Z -Fp and output to stdout resulted in corrupted
output data, because we left stdout in text mode, resulting in newline
conversion being done on the compressed stream. Switch stdout to binary
mode for this case, at the same place where we do it for non-text output
formats.
Report and patch by Kuntal Ghosh, tested by Ashutosh Sharma and Neha
Sharma. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAGz5QCJPvbBjXAmJuGx1B_41yVCetAJhp7rtaDf7XQGWuB1GSw@mail.gmail.com
When, during logical decoding, a transaction gets too big, it's
contents get spilled to disk. Not just the top-transaction gets
spilled, but *also* all of its subtransactions, even if they're not
that large themselves. Unfortunately we didn't clean up
such small spilled subtransactions from disk.
Fix that, by keeping better track of whether a transaction has been
spilled to disk.
Author: Andres Freund
Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello
Discussion:
https://postgr.es/m/1457621358.355011041@f382.i.mail.ruhttps://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com
Backpatch: 9.4-, where logical decoding was introduced
This should normally be determined by a configure check, but until
someone figures out how to do that on Windows, it's better that the code
uses the new function by default.
If a .c or .h file corresponds to a .y or .l file, skip indenting it.
There's no point in reindenting derived files, and these files tend to
confuse pgindent. (Which probably indicates a bug in BSD indent, but
I can't get excited about trying to fix it.)
For the same reasons, add src/backend/utils/fmgrtab.c to the set of
files excluded by src/tools/pgindent/exclude_file_patterns.
The point of doing this is that it makes it safe to run pgindent over
the tree without doing "make maintainer-clean" first. While these are
not the only derived .c/.h files in the tree, they are the only ones
pgindent fails on. Removing that prerequisite step results in one less
way to mess up a pgindent run, and it's necessary if we ever hope to get
to the ease of running pgindent via "make indent".
Windows uses a separate code path for libc locales. The code previously
ended up there also if an ICU collation should be used, leading to a
crash.
Reported-by: Ashutosh Sharma <ashu.coek88@gmail.com>
When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Fixes bug #14706, reported by Karen Huddleston.
Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org
The _equalTableFunc() omission of coltypmods has semantic significance,
but I did not track down resulting user-visible bugs, if any. The other
changes are cosmetic only, affecting order. catversion bump due to
readfuncs.c field order change.
We had three occurrences of essentially the same coding pattern
wherein we tried to retrieve a query result from a libpq connection
without blocking. In the case where PQconsumeInput failed (typically
indicating a lost connection), all three loops simply gave up and
returned, forgetting to clear any previously-collected PGresult
object. Since those are malloc'd not palloc'd, the oversight results
in a process-lifespan memory leak.
One instance, in libpqwalreceiver, is of little significance because
the walreceiver process would just quit anyway if its connection fails.
But we might as well fix it.
The other two instances, in postgres_fdw, are somewhat more worrisome
because at least in principle the scenario could be repeated, allowing
the amount of memory leaked to build up to something worth worrying
about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS
calls, as well as other calls that could potentially elog(ERROR),
providing another way to exit without having cleared the PGresult.
Here we need to add PG_TRY logic similar to what exists in quite a
few other places in postgres_fdw.
Coverity noted the libpqwalreceiver bug; I found the other two cases
by checking all calls of PQconsumeInput.
Back-patch to all supported versions as appropriate (9.2 lacks
postgres_fdw, so this is really quite unexciting for that branch).
Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us
Show "All tables" property in \dRp and \dRp+. Don't list tables for
such publications in \dRp+, since it's redundant and the list could be
very long.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Jeff Janes <jeff.janes@gmail.com>
Previously we required every exported transaction to have an xid
assigned. That was used to check that the exporting transaction is
still running, which in turn is needed to guarantee that that
necessary rows haven't been removed in between exporting and importing
the snapshot.
The exported xid caused unnecessary problems with logical decoding,
because slot creation has to wait for all concurrent xid to finish,
which in turn serializes concurrent slot creation. It also
prohibited snapshots to be exported on hot-standby replicas.
Instead export the virtual transactionid, which avoids the unnecessary
serialization and the inability to export snapshots on standbys. This
changes the file name of the exported snapshot, but since we never
documented what that one means, that seems ok.
Author: Petr Jelinek, slightly editorialized by me
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f598b4b8-8cd7-0d54-0939-adda763d8c34@2ndquadrant.com
In a CHECK clause, a null result means true, whereas in a WHERE clause
it means false. predtest.c provided different functions depending on
which set of semantics applied to the predicate being proved, but had
no option to control what a null meant in the clauses provided as
axioms. Add one.
Use that in the partitioning code when figuring out whether the
validation scan on a new partition can be skipped. Rip out the
old logic that attempted (not very successfully) to compensate
for the absence of the necessary support in predtest.c.
Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and
incorporating feedback from Tom Lane.
Discussion: http://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE=P0iELycdq5oupi=xSQTOw@mail.gmail.com
The optimized code in 728bd991c3 contains a few invalid locking
sequences. To wit, the original code would try to acquire an lwlock
that it already holds. Avoid this by moving lock acquisitions to
higher-level code, and install appropriate assertions in low-level that
the correct mode is held.
Authors: Michael Paquier, Álvaro Herrera
Reported-By: chuanting wang
Bug: #14680
Discussion: https://postgr.es/m/20170531033228.1487.10124@wrigleys.postgresql.org
expression_returns_set() used to short-circuit its recursion upon
seeing certain node types, such as DistinctExpr, that it knew the
executor did not support set-valued arguments for. That was never
inherent, though, just a reflection of laziness in execQual.c.
With the new implementation of SRFs there is no reason to think
that any scalar-valued expression node could not have a set-valued
subexpression, except for AggRefs and WindowFuncs where we know there
is a parser check rejecting it. And indeed, the shortcut causes
unexpected failures for cases such as a SRF underneath DistinctExpr,
because the planner stops looking for SRFs too soon.
Discussion: https://postgr.es/m/5259.1497044025@sss.pgh.pa.us
In commits 2f5c9d9c9 and ab0289651 we invented an abstraction layer
to insulate catalog manipulations from direct heap update calls.
But evidently some patches that hadn't landed in-tree at that point
didn't get the memo completely. Fix a couple of direct calls to
simple_heap_delete to use CatalogTupleDelete instead; these appear
to have been added in commits 7c4f52409 and 7b504eb28. This change is
purely cosmetic ATM, but there's no point in having an abstraction layer
if we allow random code to break it.
Masahiko Sawada and Tom Lane
Discussion: https://postgr.es/m/CAD21AoDOPRSVcwbnCN3Y1n_68ATyTspsU6=ygtHz_uY0VcdZ8A@mail.gmail.com
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update a couple of places in
PL/pgSQL to handle it. Specifically plpgsql_parse_cwordtype() and
build_row_from_class() needed updating in order to make table%ROWTYPE
and table.col%TYPE work for partitioned tables.
Dean Rasheed, reviewed by Amit Langote.
Discussion: https://postgr.es/m/CAEZATCUnNOKN8sLML9jUzxecALWpEXK3a3W7y0PgFR4%2Buhgc%3Dg%40mail.gmail.com
When we reimplemented SRFs in commit 69f4b9c85, our initial choice was
to allow the behavior to vary from historical practice in cases where a
SRF call appeared within a conditional-execution construct (currently,
only CASE or COALESCE). But that was controversial to begin with, and
subsequent discussion has resulted in a consensus that it's better to
throw an error instead of executing the query differently from before,
so long as we can provide a reasonably clear error message and a way to
rewrite the query.
Hence, add a parser mechanism to allow detection of such cases during
parse analysis. The mechanism just requires storing, in the ParseState,
a pointer to the set-returning FuncExpr or OpExpr most recently emitted
by parse analysis. Then the parsing functions for CASE and COALESCE can
detect the presence of a SRF in their arguments by noting whether this
pointer changes while analyzing their arguments. Furthermore, if it does,
it provides a suitable error cursor location for the complaint. (This
means that if there's more than one SRF in the arguments, the error will
point at the last one to be analyzed not the first. While connoisseurs of
parsing behavior might find that odd, it's unlikely the average user would
ever notice.)
While at it, we can also provide more specific error messages than before
about some pre-existing restrictions, such as no-SRFs-within-aggregates.
Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM
construct would need to return a set. We've never supported that, but the
restriction is depended on in more subtle ways now, so it seems wise to
detect it at the start.
Also, provide some documentation about how to rewrite a SRF-within-CASE
query using a custom wrapper SRF.
It turns out that the information_schema.user_mapping_options view
contained an instance of exactly the behavior we're now forbidding; but
rewriting it makes it more clear and safer too.
initdb forced because of user_mapping_options change.
Patch by me, with error message suggestions from Alvaro Herrera and
Andres Freund, pursuant to a complaint from Regina Obe.
Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update relation_is_updatable() to
handle it. Specifically, partitioned tables and simple views built on
top of them are updatable.
This affects the SQL-callable functions pg_relation_is_updatable() and
pg_column_is_updatable(), and the views information_schema.views and
information_schema.columns.
Dean Rasheed, reviewed by Ashutosh Bapat.
Discussion: https://postgr.es/m/CAEZATCXnbiFkMXgF4Ez1pmM2c-tS1z33bSq7OGbw7QQhHov%2B6Q%40mail.gmail.com
Previously, you could write _null_ in a BKI DATA line for a column that's
supposed to be NOT NULL and initdb would let it pass, probably breaking
subsequent accesses to the row. No doubt the original coding overlooked
this simple sanity check because in the beginning we didn't have any way
to mark catalog columns NOT NULL at initdb time.
This will not work on restore, but it will allow dumping out pg_catalog
for research and documentation.
Reported-by: Neil Anderson <neil.t.anderson@gmail.com>
Bug: #14701
ExecInitModifyTable() thought there was a plan per partition, but no,
there's only one. The problem had escaped detection so far because there
would only be visible misbehavior if there were a SubPlan (not an InitPlan)
in the quals being duplicated for each partition. However, valgrind
detected a bogus memory access in test cases added by commit 4f7a95be2,
and investigation of that led to discovery of the bug. The additional
test case added here crashes without the patch.
Patch by Amit Langote, test case by me.
Discussion: https://postgr.es/m/10974.1497227727@sss.pgh.pa.us
During pg_upgrade's restore run, all relfilenode choices should be
overridden by commands in the dump script. If we ever find ourselves
choosing a relfilenode in the ordinary way, someone blew it. Likewise for
pg_type OIDs. Since pg_upgrade might well succeed anyway, if there happens
not to be a conflict during the regression test run, we need assertions
here to keep us on the straight and narrow.
We might someday be able to remove the assertion in GetNewRelFileNode,
if pg_upgrade is rewritten to remove its assumption that old and new
relfilenodes always match. But it's hard to see how to get rid of the
pg_type OID constraint, since those OIDs are embedded in user tables
in some cases.
Back-patch as far as 9.5, because of the risk of back-patches breaking
something here even if it works in HEAD. I'd prefer to go back further,
but 9.4 fails both assertions due to get_rel_infos()'s use of a temporary
table. We can't use the later-branch solution of a CTE for compatibility
reasons (cf commit 5d16332e9), and it doesn't seem worth inventing some
other way to do the query. (I did check, by dint of changing the Asserts
to elog(WARNING), that there are no other cases of unwanted OID assignments
during 9.4's regression test run.)
Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us
It's not necessary for it to do that, since OWNED BY requires only ordinary
catalog updates and doesn't affect future sequence values. And pg_upgrade
needs to use OWNED BY without having it change the sequence's relfilenode.
Commit 3d79013b9 broke this by making all forms of ALTER SEQUENCE change
the relfilenode; that seems to be the explanation for the hard-to-reproduce
buildfarm failures we've been seeing since then.
Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us
The original code only added ICU_CFLAGS to the backend build. But it is
also needed for building external modules that include pg_locale.h. So
add it to the global CPPFLAGS. (This is only relevant if ICU is not in
a compiler default path, so it apparently hasn't bitten many.)
When a table sync worker is in waiting state and the subscription table
entry is removed because of a concurrent subscription refresh, the
worker could be left orphaned. To avoid that, explicitly stop the
worker when the pg_subscription_rel entry is removed.
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
generateSerialExtraStmts() was sloppy about handling the case where
SEQUENCE NAME is given with a not-schema-qualified name. It was generating
a CreateSeqStmt with an unqualified sequence name, and an AlterSeqStmt
whose "owned_by" DefElem contained a T_String Value with a null string
pointer in the schema-name position. The generated nextval() argument was
also underqualified. This accidentally failed to fail at runtime, but only
so long as the current default creation namespace at runtime is the right
namespace. That's bogus; the parse-time transformation is supposed to be
inserting the right schema name in all cases, so as to avoid any possible
skew in that selection. I'm not sure this could fail in pg_dump's usage,
but it's still wrong; we have had real bugs in this area before adopting
the policy that parse_utilcmd.c should generate only fully-qualified
auxiliary commands. A slightly lesser problem, which is what led me to
notice this in the first place, is that pprint() dumped core on the
AlterSeqStmt because of the bogus T_String.
Noted while poking into the open problem with ALTER SEQUENCE breaking
pg_upgrade.
The new partitioned table capability added a new relkind, namely
RELKIND_PARTITIONED_TABLE. Update fireRIRrules() to apply RLS
policies on RELKIND_PARTITIONED_TABLE as it does RELKIND_RELATION.
In addition, add RLS regression test coverage for partitioned tables.
Issue raised by Fakhroutdinov Evgenievich and patch by Mike Palmiotto.
Regression test editorializing by me.
Discussion: https://postgr.es/m/flat/20170601065959.1486.69906@wrigleys.postgresql.org
When a table is removed from a subscription before the tablesync worker
could start, this would previously result in an error when reading
pg_subscription_rel. Now we just ignore this.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
If you accidentally pass a host name in the hostaddr option, e.g.
hostaddr=localhost, you get an error like:
psql: could not translate host name "localhost" to address: Name or service not known
That's a bit confusing, because it implies that we tried to look up
"localhost" in DNS, but it failed. To make it more clear that we tried to
parse "localhost" as a numeric network address, change the message to:
psql: could not parse network address "localhost": Name or service not known
Discussion: https://www.postgresql.org/message-id/10badbc6-4d5a-a769-623a-f7ada43e14dd@iki.fi
Previously the exit handling was only able to exit from within the
main loop, and not from within the backend code it calls. Fix that by
using the standard die() SIGTERM handler, and adding the necessary
CHECK_FOR_INTERRUPTS() call.
This requires adding yet another process-type-specific branch to
ProcessInterrupts(), which hints that we probably should generalize
that handling. But that's work for another day.
Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/fe072153-babd-3b5d-8052-73527a6eb657@2ndquadrant.com
Since 7c4f52409a (merged in v10), a shutdown master is reported as
FATAL: unexpected result after CommandComplete: server closed the connection unexpectedly
by walsender. It used to be
LOG: replication terminated by primary server
FATAL: could not send end-of-streaming message to primary: no COPY in progress
while the old message clearly is not perfect, it's definitely better
than what's reported now.
The change comes from the attempt to handle finished COPYs without
erroring out, needed for the new logical replication, which wasn't
needed before.
There's probably better ways to handle this, but for now just
explicitly check for a closed connection.
Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f7c7dd08-855c-e4ed-41f4-d064a6c0665a@2ndquadrant.com
Backpatch: -
Doing a cross-version upgrade test with test.sh evidently hasn't been
tested since circa 9.2, because the script lacked case branches for
old-version servers newer than 9.1. Future-proof that a bit, and
clean up breakage induced by our recent drop of V0 function call
protocol (namely that oldstyle_length() isn't in the regression
suite anymore).
(This isn't enough to make the test work perfectly cleanly across
versions, but at least it finishes and provides dump files that
you can diff manually. One issue I didn't touch is that we might
want to execute the "reindex_hash.sql" file in the new DB before
dumping it, so that the hash indexes don't vanish from the dump.)
Improve the TESTING doc file: put the tl;dr version at the top not
the bottom, and bring its explanation of how to run a cross-version
test up to speed, since the installcheck target isn't there and won't
be resurrected. Improve the comment in the Makefile about why not.
In passing, teach .gitignore and "make clean" about a couple more
junk output files.
Discussion: https://postgr.es/m/14058.1496892482@sss.pgh.pa.us
Most of the improvements were in the new SCRAM code:
* In SCRAM protocol violation messages, use errdetail to provide the
details.
* If pg_backend_random() fails, throw an ERROR rather than just LOG. We
shouldn't continue authentication if we can't generate a random nonce.
* Use ereport() rather than elog() for the "invalid SCRAM verifier"
messages. They shouldn't happen, if everything works, but it's not
inconceivable that someone would have invalid scram verifiers in
pg_authid, e.g. if a broken client application was used to generate the
verifier.
But this change applied to old code:
* Use ERROR rather than COMMERROR for protocol violation errors. There's
no reason to not tell the client what they did wrong. The client might be
confused already, so that it cannot read and display the error correctly,
but let's at least try. In the "invalid password packet size" case, we
used to actually continue with authentication anyway, but that is now a
hard error.
Patch by Michael Paquier and me. Thanks to Daniel Varrazzo for spotting
the typo in one of the messages that spurred the discussion and these
larger changes.
Discussion: https://www.postgresql.org/message-id/CA%2Bmi_8aZYLhuyQi1Jo0hO19opNZ2OEATEOM5fKApH7P6zTOZGg%40mail.gmail.com
A logical replication worker should not insert new rows into
pg_subscription_rel, only update existing rows, so that there are no
races if a concurrent refresh removes rows. Adjust the API to be able
to choose that behavior.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Since tuple-routing implicitly checks the partitioning constraints
at least for the levels of the partitioning hierarchy it traverses,
there's normally no need to revalidate the partitioning constraint
after performing tuple routing. However, if there's a BEFORE trigger
on the target partition, it could modify the tuple, causing the
partitioning constraint to be violated. Catch that case.
Also, instead of checking the root table's partition constraint after
tuple-routing, check it beforehand. Otherwise, the rules for when
the partitioning constraint gets checked get too complicated, because
you sometimes have to check part of the constraint but not all of it.
This effectively reverts commit 39162b2030
in favor of a different approach altogether.
Report by me. Initial debugging by Jeevan Ladhe. Patch by Amit
Langote, reviewed by me.
Discussion: http://postgr.es/m/CA+Tgmoa9DTgeVOqopieV8d1QRpddmP65aCdxyjdYDoEO5pS5KA@mail.gmail.com
If authentication over an SSL connection fails, with sslmode=prefer,
libpq will reconnect without SSL and retry. However, we did not clear
the variables related to GSS, SSPI, and SASL authentication state, when
reconnecting. Because of that, the second authentication attempt would
always fail with a "duplicate GSS/SASL authentication request" error.
pg_SSPI_startup did not check for duplicate authentication requests like
the corresponding GSS and SASL functions, so with SSPI, you would leak
some memory instead.
Another way this could manifest itself, on version 10, is if you list
multiple hostnames in the "host" parameter. If the first server requests
Kerberos or SCRAM authentication, but it fails, the attempts to connect to
the other servers will also fail with "duplicate authentication request"
errors.
To fix, move the clearing of authentication state from closePGconn to
pgDropConnection, so that it is cleared also when re-connecting.
Patch by Michael Paquier, with some kibitzing by me.
Backpatch down to 9.3. 9.2 has the same bug, but the code around closing
the connection is somewhat different, so that this patch doesn't apply.
To fix this in 9.2, I think we would need to back-port commit 210eb9b743
first, and then apply this patch. However, given that we only bumped into
this in our own testing, we haven't heard any reports from users about
this, and that 9.2 will be end-of-lifed in a couple of months anyway, it
doesn't seem worth the risk and trouble.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRuOUm0MyJaUy9L3eXYJU3AKCZ-0-03=-aDTZJGV4GyWw@mail.gmail.com
The logic to free the buffer after the gss_init_sec_context() call was
always a bit wonky. Because gss_init_sec_context() sets the GSS context
variable, conn->gctx, we would in fact always attempt to free the buffer.
That only works, because previously conn->ginbuf.value was initialized to
NULL, and free(NULL) is a no-op. Commit 61bf96cab0 refactored things so
that the GSS input token buffer is allocated locally in pg_GSS_continue,
and not held in the PGconn object. After that, the now-local ginbuf.value
variable isn't initialized when it's not used, so we pass a bogus pointer
to free().
To fix, only try to free the input buffer if we allocated it. That was the
intention, certainly after the refactoring, and probably even before that.
But because there's no live bug before the refactoring, I refrained from
backpatching this.
The bug was also independently reported by Graham Dutton, as bug #14690.
Patch reviewed by Michael Paquier.
Discussion: https://www.postgresql.org/message-id/6288d80e-a0bf-d4d3-4e12-7b79c77f1771%40iki.fi
Discussion: https://www.postgresql.org/message-id/20170605130954.1438.90535%40wrigleys.postgresql.org
The logical replication apply worker uses the subscription name as
application name, except for table sync. This was incorrectly set to
use the replication slot name, which might be different, in one case.
Also add a comment why the other case is different.
The larger part of this patch replaces usages of MyProc->procLatch
with MyLatch. The latter works even early during backend startup,
where MyProc->procLatch doesn't yet. While the affected code
shouldn't run in cases where it's not initialized, it might get copied
into places where it might. Using MyLatch is simpler and a bit faster
to boot, so there's little point to stick with the previous coding.
While doing so I noticed some weaknesses around newly introduced uses
of latches that could lead to missed events, and an omitted
CHECK_FOR_INTERRUPTS() call in worker_spi.
As all the actual bugs are in v10 code, there doesn't seem to be
sufficient reason to backpatch this.
Author: Andres Freund
Discussion:
https://postgr.es/m/20170606195321.sjmenrfgl2nu6j63@alap3.anarazel.dehttps://postgr.es/m/20170606210405.sim3yl6vpudhmufo@alap3.anarazel.de
Backpatch: -
Make apply busy wait check the catalog instead of shmem state to ensure
that next transaction will see the expected table synchronization state.
Also make the handover always go through same set of steps to make the
overall process easier to understand and debug.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
Tested-by: Erik Rijkers <er@xs4all.nl>
Consistent with what we do for indexes, we shouldn't try to record
dependencies on collation OID 0 or the default collation OID (which
is pinned). Also, the fact that indcollation and partcollation can
contain zero OIDs when the data type is not collatable should be
documented.
Amit Langote, per a complaint from me.
Discussion: http://postgr.es/m/CA+Tgmoba5mtPgM3NKfG06vv8na5gGbVOj0h4zvivXQwLw8wXXQ@mail.gmail.com
This allows to cancel commands run over replication connections. While
it might have some use before v10, it has become important now that
normal SQL commands are allowed in database connected walsender
connections.
Author: Petr Jelinek
Reviewed-By: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/7966f454-7cd7-2b0c-8b70-cdca9d5a8c97@2ndquadrant.com
Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.
A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.
Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0
When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and
throws a PANIC if so. At that point, only walsenders are still
active, so one might think this could not happen, but walsenders can
also generate WAL, for instance in BASE_BACKUP and logical decoding
related commands (e.g. via hint bits). So they can trigger this panic
if such a command is run while the shutdown checkpoint is being
written.
To fix this, divide the walsender shutdown into two phases. First,
checkpointer, itself triggered by postmaster, sends a
PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend
is idle or runs an SQL query this causes the backend to shutdown, if
logical replication is in progress all existing WAL records are
processed followed by a shutdown. Otherwise this causes the walsender
to switch to the "stopping" state. In this state, the walsender will
reject any further replication commands. The checkpointer begins the
shutdown checkpoint once all walsenders are confirmed as
stopping. When the shutdown checkpoint finishes, the postmaster sends
us SIGUSR2. This instructs walsender to send any outstanding WAL,
including the shutdown checkpoint record, wait for it to be replicated
to the standby, and then exit.
Author: Andres Freund, based on an earlier patch by Michael Paquier
Reported-By: Fujii Masao, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
Backpatch: 9.4, where logical decoding was introduced
The non-participation in procsignal was a problem for both changes in
master, e.g. parallelism not working for normal statements run in
walsender backends, and older branches, e.g. recovery conflicts and
catchup interrupts not working for logical decoding walsenders.
This commit thus replaces the previous WalSndXLogSendHandler with
procsignal_sigusr1_handler. In branches since db0f6cad48 that can
lead to additional SetLatch calls, but that only rarely seems to make
a difference.
Author: Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nckrow@alap3.anarazel.de
Backpatch: 9.4, earlier commits don't seem to benefit sufficiently
This reverts commit 086221cf6b, which
was made to master only.
The approach implemented in the above commit has some issues. While
those could easily be fixed incrementally, doing so would make
backpatching considerably harder, so instead first revert this patch.
Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
There was a grammar ambiguity between SET PUBLICATION name REFRESH and
SET PUBLICATION SKIP REFRESH, because SKIP is not a reserved word. To
resolve that, fold the refresh choice into the WITH options. Refreshing
is the default now.
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Otherwise code that uses this will abort with an assertion failure,
because postmaster_alive_fds are not initialized.
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
The current method of computing the record length (excluding the
lenght of full-page images) has been wrong since the WAL format has
been revamped in 2c03216d83. Only the
main record's length was counted, but that can be significantly too
little if there's data associated with further blocks.
Fix by computing the record length as total_lenght - fpi_length.
Reported-By: Chen Huajun
Bug: #14687
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/20170603165939.1436.58887@wrigleys.postgresql.org
Backpatch: 9.5-
Declare the toc_nentry field as uint32 not Size. Since shm_toc_lookup()
reads the field without any lock, it has to be atomically readable, and
we do not assume that for fields wider than 32 bits. Performance would
be impossibly bad for entry counts approaching 2^32 anyway, so there is
no need to try to preserve maximum width here.
This is probably an academic issue, because even if reading int64 isn't
atomic, the high order half would never change in practice. Still, it's
a coding rule violation, so let's fix it.
Adjust some other not-terribly-well-chosen data types too, and copy-edit
some comments. Make shm_toc_attach's Asserts consistent with
shm_toc_create's.
None of this looks to be a live bug, so no need for back-patch.
Discussion: https://postgr.es/m/16984.1496679541@sss.pgh.pa.us
Some openssl builds put their lib files in a VC subdirectory, others do
not. Cater for both cases.
Backpatch to all live branches.
From an offline discussion with Leonardo Cecchi.
Given the possibility of race conditions and so on, it seems entirely
unsafe to just assume that shm_toc_lookup() always finds the key it's
looking for --- but that was exactly what all but one call site were
doing. To fix, add a "bool noError" argument, similarly to what we
have in many other functions, and throw an error on an unexpected
lookup failure. Remove now-redundant Asserts that a rather random
subset of call sites had.
I doubt this will throw any light on buildfarm member lorikeet's
recent failures, because if an unnoticed lookup failure were involved,
you'd kind of expect a null-pointer-dereference crash rather than the
observed symptom. But you never know ... and this is better coding
practice even if it never catches anything.
Discussion: https://postgr.es/m/9697.1496675981@sss.pgh.pa.us
get_partition_parent felt that it could simply Assert that systable_getnext
found a tuple. This is unlike any other caller of that function, and it's
unsafe IMO --- in fact, the reason I noticed it was that the Assert failed.
(OK, I was working with known-inconsistent catalog contents, but I wasn't
expecting the DB to fall over quite that violently. The behavior in a
non-assert-enabled build wouldn't be very nice, either.) Fix it to do what
other callers do, namely an actual runtime-test-and-elog.
Also, standardize the wording of elog messages that are complaining about
unexpected failure of systable_getnext. 90% of them say "could not find
tuple for <object>", so make the remainder do likewise. Many of the
holdouts were using the phrasing "cache lookup failed", which is outright
misleading since no catcache search is involved.
I'd always assumed that backend/optimizer/geqo/'s remarkably poor
showing on code coverage metrics was because we weren't exercising
it much in the regression tests. But it turns out that a good chunk
of the problem is that there's a bunch of code that is physically
unreachable (because the calls to it are #ifdef'd out in geqo_main.c)
but is being built anyway. Making the called code have #if guards
similar to the calling code saves a couple of kilobytes of executable
size and should make the coverage numbers more reflective of reality.
It's arguable that we should just delete all the unused recombination
mechanisms altogether, but I didn't feel a need to go that far today.
If we allow this, whatever outer command has the table open will not know
about the new index and may fail to update it as needed, as shown in a
report from Laurenz Albe. We already had such a prohibition in place for
ALTER TABLE, but the CREATE INDEX syntax missed the check.
Fixing it requires an API change for DefineIndex(), which conceivably
would break third-party extensions if we were to back-patch it. Given
how long this problem has existed without being noticed, fixing it in
the back branches doesn't seem worth that risk.
Discussion: https://postgr.es/m/A737B7A37273E048B164557ADEF4A58B53A4DC9A@ntex2010i.host.magwien.gv.at
\if and related commands were overlooked here, as were \dRp and \dRs
from the logical-replication patch, as was \?.
While here, reformat the list to put each new first command letter on
a separate line; perhaps that will limit the need to reflow the whole
list when we add more commands in future.
Masahiko Sawada (reformatting by me)
Discussion: https://postgr.es/m/CAD21AoDW1QHtBsM33hV+Fg2mYEs+FWj4qtoCU72AwHAXQ3U6ZQ@mail.gmail.com
These estimators returned 1 minus the corresponding equality/match
estimate, which is incorrect: we need to subtract off the fraction
of nulls in the column, since those are neither equal nor not equal
to the comparison value. The error only becomes obvious if the
nullfrac is large, but it could be very bad in a mostly-nulls
column, as reported in bug #14676 from Marko Tiikkaja.
To fix the <> case, refactor eqsel() and neqsel() to call a common
support routine, which can be made to account for nullfrac correctly.
The pattern-match cases were already factored that way, and it was
simply an oversight that patternsel() wasn't subtracting off nullfrac.
neqjoinsel() has a similar problem, but since we're elsewhere discussing
changing its behavior entirely, I left it alone for now.
This is a very longstanding bug, but I'm hesitant to back-patch a fix for
it. Given the lack of prior complaints, such cases must not come up often,
so it's probably not worth the risk of destabilizing plans in stable
branches.
Discussion: https://postgr.es/m/20170529153847.4275.95416@wrigleys.postgresql.org
When costing a nestloop with stop-at-first-inner-match semantics, and a
non-indexscan inner path, final_cost_nestloop() wants to charge the full
scan cost of the inner rel at least once, with additional scans charged
at inner_rescan_run_cost which might be less. However the logic for
doing this effectively assumed that outer_matched_rows is at least 1.
If it's zero, which is not unlikely for a small outer rel, we ended up
charging inner_run_cost plus N times inner_rescan_run_cost, as much as
double the correct charge for an outer rel with only one row that
we're betting won't be matched. (Unless the inner rel is materialized,
in which case it has very small inner_rescan_run_cost and the cost
is not so far off what it should have been.)
The upshot of this was that the planner had a tendency to select plans
that failed to make effective use of the stop-at-first-inner-match
semantics, and that might have Materialize nodes in them even when the
predicted number of executions of the Materialize subplan was only 1.
This was not so obvious before commit 9c7f5229a, because the case only
arose in connection with semi/anti joins where there's not freedom to
reverse the join order. But with the addition of unique-inner joins,
it could result in some fairly bad planning choices, as reported by
Teodor Sigaev. Indeed, some of the test cases added by that commit
have plans that look dubious on closer inspection, and are changed
by this patch.
Fix the logic to ensure that we don't charge for too many inner scans.
I chose to adjust it so that the full-freight scan cost is associated
with an unmatched outer row if possible, not a matched one, since that
seems like a better model of what would happen at runtime.
This is a longstanding bug, but given the lesser impact in back branches,
and the lack of field complaints, I won't risk a back-patch.
Discussion: https://postgr.es/m/CAKJS1f-LzkUsFxdJ_-Luy38orQ+AdEXM5o+vANR+-pHAWPSecg@mail.gmail.com
We didn't accept any invalidation messages until the whole sync process
had finished (because it flattens all the remote transactions in the
single one). So the sync worker didn't learn about subscription
changes/drop until it has finished. This could lead to "orphaned" sync
workers.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
This avoids "orphaned" sync workers.
This was caused by a thinko in wait_for_sync_status_change.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
Previously this was not allowed, as copy.c didn't set the
CURSOR_OPT_PARALLEL_OK flag when planning the query. Set it.
While the lack of parallel query for COPY isn't strictly speaking a
bug, it does prevent parallelism from being used in a facility
commonly used to run long running queries. Thus backpatch to 9.6.
Author: Andres Freund
Discussion: https://postgr.es/m/20170531231958.ihanapplorptykzm@alap3.anarazel.de
Backpatch: 9.6, where parallelism was introduced.
When trying to access a replication slot that is supposed to already
exist, we don't need to check the naming rules again. If the slot
does not exist, we will then get a "does not exist" error message, which
is generally more useful from the perspective of an end user.
The logical replication worker processes now use the normal die()
handler for SIGTERM and CHECK_FOR_INTERRUPTS() instead of custom code.
One problem before was that the apply worker would not exit promptly
when a subscription was dropped, which could lead to deadlocks.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
Move the walrcv_disconnect() calls into the before_shmem_exit handler.
This makes sure the call is always made even during exit by signal, it
saves some duplicate code, and it makes the logic more similar to
walreceiver.c.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
If a FOR ALL TABLES publication was present, \d of a table would claim
for each table that it was part of the publication, even for tables that
are ignored for this purpose, such as system tables and unlogged tables.
Fix the query by using the function pg_get_publication_tables(), which
was intended for this purpose.
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Previously the changes to the "data" part of the sequence, i.e. the
one containing the current value, were not transactional, whereas the
definition, including minimum and maximum value were. That leads to
odd behaviour if a schema change is rolled back, with the potential
that out-of-bound sequence values can be returned.
To avoid the issue create a new relfilenode fork whenever ALTER
SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY
already is already handled.
This commit also makes ALTER SEQUENCE RESTART transactional, as it
seems to be too confusing to have some forms of ALTER SEQUENCE behave
transactionally, some forms not. This way setval() and nextval() are
not transactional, but DDL is, which seems to make sense.
This commit also rolls back parts of the changes made in 3d092fe540
and f8dc1985f as they're now not needed anymore.
Author: Andres Freund
Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de
Backpatch: Bug is in master/v10 only
On some platforms, -fpic fails for sufficiently large shared libraries.
We've mostly not hit that boundary yet, but there are some extensions
such as Citus and pglogical where it's becoming a problem. A bit of
research suggests that the penalty for -fPIC is small, in the
single-digit-percentage range --- and there's none at all on popular
platforms such as x86_64. So let's just default to -fPIC everywhere
and provide one less thing for extension developers to worry about.
Per complaint from Christoph Berg. Back-patch to all supported branches.
(I did not bother to touch the recently-removed Makefiles for sco and
unixware in the back branches, though. We'd have no way to test that
it doesn't break anything on those platforms.)
Discussion: https://postgr.es/m/20170529155850.qojdfrwkkqnjb3ap@msg.df7cb.de
Using the client pid can easily be non-unique when used on different
hosts. Using the backend pid should be guaranteed unique, since the
temporary slot gets removed when the client disconnects so it will be
gone even if the pid is renewed.
Reported by Ludovic Vaugeois-Pepin
The GET/SET_n_BYTES macros are meant to be infrastructure for the
DatumGetFoo/FooGetDatum macros, which include a cast to the intended
target type. Using them directly without a cast, as DatumGetFloat4
and friends previously did, can yield warnings when -Wconversion is on.
This is of little significance when building Postgres proper, because
there are such a huge number of such warnings in the server that nobody
would think -Wconversion is of any use. But some extensions build with
-Wconversion due to outside constraints. Commit 14cca1bf8 did a disservice
to those extensions by moving DatumGetFloat4 et al into postgres.h,
where they can now cause warnings in extension builds.
To fix, use DatumGetInt32 and friends in place of the low-level macros.
This is arguably a bit cleaner anyway.
Chapman Flack
Discussion: https://postgr.es/m/592E4D04.1070609@anastigmatix.net
Remove some gratuituous message differences by making the AM name
previously embedded in each message be a %s instead. While at it, get
rid of terminology that's unclear and unnecessary in one message.
Discussion: https://postgr.es/m/20170523001557.bq2hbq7hxyvyw62q@alvherre.pgsql
We could have limped along without this for v10, which was my intention
when I annotated the bug in commit 76a3df6e5. But consensus is that it's
better to fix it now and take the cost of a post-beta1 initdb (which is
needed because these node types are stored in pg_class.relpartbound).
Since we're forcing initdb anyway, take the opportunity to make the node
type identification strings match the node struct names, instead of being
randomly different from them.
Discussion: https://postgr.es/m/E1dFBEX-0004wt-8t@gemulon.postgresql.org
Per our message style guidelines, error messages incorporating the
results of format_type_be() and its siblings should not add quotes
around those results, because those functions already add quotes
at need. Fix a few places that hadn't gotten that memo.
json_populate_record throws an error if asked to convert a JSON scalar
or array into a composite type. jsonb_populate_record was returning
a record full of NULL fields instead. It seems better to make it
throw an error for this case as well.
Nikita Glukhov
Discussion: https://postgr.es/m/fbd1d566-bba0-a3de-d6d0-d3b1d7c24ff2@postgrespro.ru
The macro gave the wrong answers for a JsObject with is_json == 0:
it would return 1 if jsonb_cont == NULL, or if that wasn't NULL,
it would return 1 for any non-zero size.
We could fix that, but the only use of this macro at present is in the
JsObjectIsEmpty() macro, so it seems simpler and clearer to get rid of
JsObjectSize() and put corrected logic into JsObjectIsEmpty().
Thinko in commit cf35346e8, so no need for back-patch.
Nikita Glukhov
Discussion: https://postgr.es/m/fbd1d566-bba0-a3de-d6d0-d3b1d7c24ff2@postgrespro.ru
pg_resetwal (formerly pg_resetxlog) doesn't insist on finding a matching
version number in pg_control, and that seems like an important thing to
preserve since recovering from corrupt pg_control is a prime reason to
need to run it. However, that means you can try to run it against a
data directory of a different major version, which is at best useless
and at worst disastrous. So as to provide some protection against that
type of pilot error, inspect PG_VERSION at startup and refuse to do
anything if it doesn't match. PG_VERSION is read-only after initdb,
so it's unlikely to get corrupted, and even if it were corrupted it would
be easy to fix by hand.
This hazard has been there all along, so back-patch to all supported
branches.
Michael Paquier, with some kibitzing by me
Discussion: https://postgr.es/m/f4b8eb91-b934-8a0d-b3cc-68f06e2279d1@enterprisedb.com
The NumericOnly grammar production accepted ICONST, + ICONST, - ICONST,
FCONST, and - FCONST, but for some reason not + FCONST. This led to
strange inconsistencies like
regression=# set random_page_cost = +4;
SET
regression=# set random_page_cost = 4000000000;
SET
regression=# set random_page_cost = +4000000000;
ERROR: syntax error at or near "4000000000"
(because 4000000000 is too large to be an ICONST). While there's
no actual functional reason to need to write a "+", if we allow
it for integers it seems like we should allow it for numerics too.
It's been like that forever, so back-patch to all supported branches.
Discussion: https://postgr.es/m/30908.1496006184@sss.pgh.pa.us
Avoid trashing the input PartitionBoundSpec; while that might be safe for
current callers, it's certainly trouble waiting to happen. In the same
vein, make sure that all of the result data structure is freshly palloc'd,
rather than some of it being pointers into the input data structures
(which we don't know the lifespans of).
Simplify the logic for tacking on IS NULL or IS NOT NULL conditions some
more; commit 85c2b9a15 left a lot on the table there. And rearrange the
construction of the nodes into (what seems to me) a more logical order.
In passing, make sure that get_qual_for_range() also returns a freshly
palloc'd structure, since there's no value in having that guarantee for
only one kind of partitioning. And improve some comments there.
Jeevan Ladhe, with further tweaking by me
Discussion: https://postgr.es/m/CAOgcT0MAcYoMs93W80iTUf_dP36=1mZQzeUk+nnwY_-qWDrCfw@mail.gmail.com
Fix failure to check that we got a plain Const from const-simplification of
a coercion request. This is the cause of bug #14666 from Tian Bing: there
is an int4 to money cast, but it's only stable not immutable (because of
dependence on lc_monetary), resulting in a FuncExpr that the code was
miserably unequipped to deal with, or indeed even to notice that it was
failing to deal with. Add test cases around this coercion behavior.
In view of the above, sprinkle the code liberally with castNode() macros,
in hope of catching the next such bug a bit sooner. Also, change some
functions that were randomly declared to take Node* to take more specific
pointer types. And change some struct fields that were declared Node*
but could be given more specific types, allowing removal of assorted
explicit casts.
Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting.
Likewise check only-one-key-for-list-partitioning restriction in a less
random place.
Avoid not-per-project-style usages like !strcmp(...).
Fix assorted failures to avoid scribbling on the input of parse
transformation. I'm not sure how necessary this is, but it's entirely
silly for these functions to be expending cycles to avoid that and not
getting it right.
Add guards against partitioning on system columns.
Put backend/nodes/ support code into an order that matches handling
of these node types elsewhere.
Annotate the fact that somebody added location fields to PartitionBoundSpec
and PartitionRangeDatum but forgot to handle them in
outfuncs.c/readfuncs.c. This is fairly harmless for production purposes
(since readfuncs.c would just substitute -1 anyway) but it's still bogus.
It's not worth forcing a post-beta1 initdb just to fix this, but if we
have another reason to force initdb before 10.0, we should go back and
clean this up.
Contrariwise, somebody added location fields to PartitionElem and
PartitionSpec but forgot to teach exprLocation() about them.
Consolidate duplicative code in transformPartitionBound().
Improve a couple of error messages.
Improve assorted commentary.
Re-pgindent the files touched by this patch; this affects a few comment
blocks that must have been added quite recently.
Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org
Use 'COLLATE "C"' to force locale-independent sorting of the iexit
view results in select_views.sql. We aren't particularly interested
in the exact sorting behavior here, and this doesn't change the shape
of the generated plan, so it seems like a wash as far as the goals
of this test go.
This is in response to bug #14637 from Tomasz Kontusz. It doesn't
fully resolve his problem, because he also saw some diffs in the
create_index test. But other people have had issues with select_views
too, and this fix lets us drop the select_views_1.out variant expected
file altogether, which is a nice win from a maintenance standpoint.
Emre Hasegeli
Discussion: https://postgr.es/m/20170501000609.24360.24248@wrigleys.postgresql.org
Dunno what 'p' was supposed to mean, but since neither the code below
here nor pg_collation.h think it's valid, it must be a mistake.
Per report from Thomas Kellerer.
Discussion: https://postgr.es/m/og9q8f%24oes%241%40blaine.gmane.org
Commit 9aa3c782c added code to allow CREATE TABLE/CREATE TYPE to not fail
when the desired type name conflicts with an autogenerated array type, by
dint of renaming the array type out of the way. But I (tgl) overlooked
that the same case arises in ALTER TABLE/TYPE RENAME. Fix that too.
Back-patch to all supported branches.
Report and patch by Vik Fearing, modified a bit by me
Discussion: https://postgr.es/m/0f4ade49-4f0b-a9a3-c120-7589f01d1eb8@2ndquadrant.com
If an operator class has no operators or functions, and doesn't need
a STORAGE clause, we emitted "CREATE OPERATOR CLASS ... AS ;" which
is syntactically invalid. Fix by forcing a STORAGE clause to be
emitted anyway in this case.
(At some point we might consider changing the grammar to allow CREATE
OPERATOR CLASS without an opclass_item_list. But probably we'd want to
omit the AS in that case, so that wouldn't fix this pg_dump issue anyway.)
It's been like this all along, so back-patch to all supported branches.
Daniel Gustafsson, tweaked by me to avoid a dangling-pointer bug
Discussion: https://postgr.es/m/D9E5FC64-7A37-4F3D-B946-7E4FB468F88A@yesql.se
Logical replication supports replicating between tables with different
column order. But this failed for the initial table sync because of a
logic error in how the column list for the internal COPY command was
composed. Fix that and also add a test.
Also fix a minor omission in the column name mapping cache. When
creating the mapping list, it would not skip locally dropped columns.
So if a remote column had the same name as a locally dropped
column (...pg.dropped...), then the expected error would not occur.
Reduce some redundant messages to DEBUG1. Be clearer about the
distinction between apply workers and table synchronization workers.
Add subscription and table name where possible.
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
We need not consider the case where both nulltest1 and nulltest2 are
NULL; the partition either accepts nulls or it does not.
Jeevan Ladhe. I added an assertion.
This patch replaces isspace() calls with scanner_isspace() in functions
that are likely to be presented with non-ASCII input. isspace() has
the small advantage that it will correctly recognize no-break space
in single-byte encodings (such as LATIN1); but it cannot work successfully
for any multibyte character, and depending on platform it might return
false positive results for some fragments of multibyte characters. That's
disastrous for functions that are trying to discard whitespace between
valid strings, as noted in bug #14662 from Justin Muise. Even treating
no-break space as whitespace is pretty questionable for the usages touched
here, because the core scanner would think it is an identifier character.
Affected functions are parse_ident(), parseNameAndArgTypes (underlying
regprocedurein() and siblings), SplitIdentifierString (used for parsing
GUCs and options that are qualified names or lists of names), and
SplitDirectoriesString (used for parsing GUCs that are lists of
directories).
All the functions adjusted here are parsing SQL identifiers and similar
constructs, so it's reasonable to insist that their definition of
whitespace match the core scanner. So we can hope that this won't cause
many backwards-compatibility problems. I've left alone isspace() calls
in places that aren't really expecting any non-ASCII input characters,
such as float8in().
Back-patch to all supported branches.
Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
The nonce consists of client and server nonces concatenated together. The
client checks the nonce contained the client nonce, but it would get fooled
if the server sent a truncated or even empty nonce.
Reported by Steven Fackler to security@postgresql.org. Neither me or Steven
are sure what harm a malicious server could do with this, but let's fix it.
The cash_div_intX functions applied rint() to the result of the division.
That's not merely useless (because the result is already an integer) but
it causes precision loss for values larger than 2^52 or so, because of
the forced conversion to float8.
On the other hand, the cash_mul_fltX functions neglected to apply rint() to
their multiplication results, thus possibly causing off-by-one outputs.
Per C standard, arithmetic between any integral value and a float value is
performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced
answers good to only about six digits, even when the float value is exact.
We can improve matters noticeably by widening the float inputs to double.
(It's tempting to consider using "long double" arithmetic if available,
but that's probably too much of a stretch for a back-patched fix.)
Also, document that cash_div_intX operators truncate rather than round.
Per bug #14663 from Richard Pistole. Back-patch to all supported branches.
Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us
Using flex's -i switch to achieve case-insensitivity is not a very safe
practice, because the scanner's behavior may then depend on the locale
that flex was invoked in. In the particular example at hand, that's
not academic: the possible matches for "FIRST" will be different in a
Turkish locale than elsewhere. Do it the hard way instead, as our
other scanners do.
Also, drop use of -b -CF -p, because this scanner is only used when
parsing the contents of a GUC variable. That's not done often, and
the amount of text to be parsed can be expected to be trivial, so
prioritizing scanner speed over code size seems like quite the wrong
tradeoff. Using flex's default optimization options reduces the
size of syncrep_gram.o by more than 50%.
The case-insensitivity problem is new in HEAD (cf commit 3901fd70c).
The poor choice of optimization flags exists also in 9.6, but it doesn't
seem important enough to back-patch.
Discussion: https://postgr.es/m/24403.1495225931@sss.pgh.pa.us
Mark any old hash indexes as invalid so that they don't get used, and
create a script to run REINDEX on all of them. Without this, we'd
still try to use any upgraded hash indexes, but it would fail.
Amit Kapila, reviewed by me. Per a suggestion from Tom Lane.
Discussion: http://postgr.es/m/CAA4eK1Jidtagm7Q81q-WoegOVgkotv0OxvHOjFxcvFRP4X=mSw@mail.gmail.com
If one host in a multi-host connection string times out, move on to
the next specified host instead of giving up entirely.
Takayuki Tsunakawa, reviewed by Michael Paquier. I added
a minor adjustment to the documentation.
Discussion: http://postgr.es/m/0A3221C70F24FB45833433255569204D1F6F42F5@G01JPEXMBYT05
Otherwise, set_plan_refs() can get applied to the same list
multiple times through different references, leading to chaos.
Amit Langote, Dilip Kumar, and Robert Haas, reviewed by Ashutosh
Bapat. Original report by Sveinn Sveinsson.
Discussion: http://postgr.es/m/20170517141151.1435.79890@wrigleys.postgresql.org
Since commit e7b3349a8a, MergeAttributes
destructively modifies the input List, to which the caller's
CreateStmt still points. One may wonder whether this was already a
bug, but commit f0e44751d7 made things
noticeably worse by adding additional destructive modifications so
that the caller's List might, in the case of creation a partitioned
table, no longer even be structurally valid. Restore the status quo
ante by assigning the return value of MergeAttributes back to
stmt->tableElts in the caller.
In most of the places where DefineRelation is called, it doesn't
matter what stmt->tableElts points to here or whether it's valid or
not, because the caller doesn't use the statement for anything after
DefineRelation returns anyway. However, ProcessUtilitySlow passes it
to EventTriggerCollectSimpleCommand, and that function tries to invoke
copyObject on it. If any of the CreateStmt's substructure is invalid
at that point, undefined behavior will result.
One might wonder whether this whole area needs further revision -
perhaps DefineRelation() ought not to be destructively modifying the
caller-provided CreateStmt at all. However, that would be a behavior
change for any event triggers using C code to inspect the CreateStmt,
so for now, just fix the crash.
Report by Amit Langote, who provided a somewhat different patch for it.
Discussion: http://postgr.es/m/bf6a39a7-100a-74bd-1156-3c16a1429d88@lab.ntt.co.jp
This seemed like a good idea originally because there's no way to mark
a range partition as accepting NULL, but that now seems more like a
current limitation than something we want to lock down for all time.
For example, there's a proposal to add the notion of a default
partition which accepts all rows not otherwise routed, which directly
conflicts with the idea that a range-partitioned table should never
allow nulls anywhere. So let's change this while we still can, by
putting the NOT NULL test into the partition constraint instead of
changing the column properties.
Amit Langote and Robert Haas, reviewed by Amit Kapila
Discussion: http://postgr.es/m/8e2dd63d-c6fb-bb74-3c2b-ed6d63629c9d@lab.ntt.co.jp
When creating a subscription with slot_name = NONE, we failed to check
that also create_slot = false and enabled = false were set. This
created an invalid subscription and could later lead to a crash if a
NULL slot name was accessed. Add more checks around that for
robustness.
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Partially revert commit c079673dcb.
There were complaints that splitting switch descriptions would
complicate translation efforts. There are probably ways to resolve
the formatting problem without doing that, but undo it while we're
discussing.
Add some tests for parsing different option combinations. Fix some of
the resulting error messages for recent changes in option naming.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
When stdin is a terminal, it's possible to end a COPY FROM STDIN with
a keyboard EOF signal (typically control-D), and then keep on issuing
SQL commands. One would expect another COPY FROM STDIN to work as well,
but on some platforms it did not. This turns out to be because we were
not resetting the stream's feof() flag, and BSD-ish versions of fread()
and fgets() won't attempt to read more data if that's set.
The misbehavior is observed on BSDen (including macOS), but not Linux,
Windows, or SysV-ish Unixen, which makes this a portability bug not
just a missing feature.
Add a clearerr() call to fix the behavior, and improve the prompt that's
issued when copying from a TTY to mention that EOF signals work.
It's been like this forever, so back-patch to all supported branches.
Thomas Munro
Discussion: https://postgr.es/m/CAEepm=0MCGfYf=JAMiYhO6JPtv9-3ZfBo8fcGeCZ8oMzaw+Z+Q@mail.gmail.com
We used to only check for a supported relkind on the subscriber during
replication, which is needed to ensure that the setup is valid and we
don't crash. But it's also useful to tell the user immediately when
CREATE or ALTER SUBSCRIPTION is executed that the relation being added
to the subscription is not of a supported relkind.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Reformat various places in which pgindent will make a mess, and
fix a few small violations of coding style that I happened to notice
while perusing the diffs from a pgindent dry run.
There is one actual bug fix here: the need-to-enlarge-the-buffer code
path in icu_convert_case was obviously broken. Perhaps it's unreachable
in our usage? Or maybe this is just sadly undertested.
Commit 827d6f977 contained the same misunderstanding of hash_create's API
as commit 090010f2e. As in 5d00b764c, remove the unnecessary layer of
memory context. (This bug is less significant than the other one, since
the extra context would be under a relatively short-lived context, but
it's still a bug.)
Since commit 4e37b3e15, buildfarm member frogmouth has been failing
occasionally with symptoms indicating that some expected stats data is
getting dropped. The reason that that commit changed the behavior seems
probably to be that more data is getting shoved at the collector in a short
span of time. In current sources, the stats test's first session sends
about 9KB of data while exiting, which is probably the same as what was
sent just before wait_for_stats() in the previous test design. But now,
the test's second session is starting up concurrently, and it sends another
2KB (presumably reflecting its initial catalog accesses). Since frogmouth
is running on Windows XP, which reputedly has a default socket receive
buffer size of only 8KB, it is not very surprising if this has put us over
the threshold where the receive buffer can overflow and drop messages.
The same mechanism could very easily explain the intermittent stats test
failures we've been seeing for years, since background processes such
as the bgwriter will sometimes send data concurrently with all this, and
could thus cause occasional buffer overflows.
Hence, insert some code into pgstat_init() to increase the stats socket's
receive buffer size to 100KB if it's less than that. (On failure, emit a
LOG message, but keep going.) Modern systems seem to have default sizes
in the range of 100KB-250KB, but older platforms don't. I couldn't find
any platforms that wouldn't accept 100KB, so in theory this won't cause
any portability problems.
If this is successful at reducing the buildfarm failure rate in HEAD,
we should back-patch it, because it's certain that similar buffer overflows
happen in the field on platforms with small buffer sizes. Going forward,
there might be an argument for trying to increase the buffer size even
more, but let's take a baby step first.
Discussion: https://postgr.es/m/22173.1494788088@sss.pgh.pa.us
Modifying the permissions of a persistent file isn't really much nicer
than modifying its contents, even if git doesn't currently notice it.
Adjust the test script to make a copy and set the permissions of that
instead.
Michael Paquier, per a gripe from me. Back-patch to 9.5 where these
tests were introduced.
Discussion: https://postgr.es/m/14836.1494885946@sss.pgh.pa.us
The CommentStmt made by RebuildConstraintComment() has to pstrdup the
relation name, else it will contain a dangling pointer after that
relcache entry is flushed. (I'm less sure that pstrdup'ing conname
is necessary, but let's be safe.) Failure to do this leads to weird
errors or crashes, as reported by Marko Elezovic.
Bug introduced by commit e42375fc8, so back-patch to 9.5 as that was.
Fix by David Rowley, regression test by Michael Paquier
Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com
In 1753b1b027, the pg_sequence system
catalog was introduced. This made sequence metadata changes
transactional, while the actual sequence values are still behaving
nontransactionally. This requires some refinement in how ALTER
SEQUENCE, which operates on both, locks the sequence and the catalog.
The main problems were:
- Concurrent ALTER SEQUENCE causes "tuple concurrently updated" error,
caused by updates to pg_sequence catalog.
- Sequence WAL writes and catalog updates are not protected by same
lock, which could lead to inconsistent recovery order.
- nextval() disregarding uncommitted ALTER SEQUENCE changes.
To fix, nextval() and friends now lock the sequence using
RowExclusiveLock instead of AccessShareLock. ALTER SEQUENCE locks the
sequence using ShareRowExclusiveLock. This means that nextval() and
ALTER SEQUENCE block each other, and ALTER SEQUENCE on the same sequence
blocks itself. (This was already the case previously for the OWNER TO,
RENAME, and SET SCHEMA variants.) Also, rearrange some code so that the
entire AlterSequence is protected by the lock on the sequence.
As an exception, use reduced locking for ALTER SEQUENCE ... RESTART.
Since that is basically a setval(), it does not require the full locking
of other ALTER SEQUENCE actions. So check whether we are only running a
RESTART and run with less locking if so.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Jason Petersen <jason@citusdata.com>
Reported-by: Andres Freund <andres@anarazel.de>
pg_stat_get_snapshot_timestamp() returns the timestamp seen in the "global"
stats file. Because pgstat_write_statsfiles() writes per-DB stats files
before the global file (or at least before renaming it into place), there
is a window where the test backend can see all the stats updates that
wait_for_stats() was checking for (all of which come from the per-DB file)
but also see the same global stats file it had seen at the start of the
test script. This results in a failure in only the "snapshot_newer" query,
as reported by a couple of buildfarm members recently.
I suspect that this ought to be back-patched. Commit 4e37b3e15 has
evidently increased the probability of this window getting hit, but
it's not apparent why it could not have been hit before. I'll refrain
for the moment though.
Code review for commit 090010f2e.
Fix cases where an elog(ERROR) partway through a function would leave the
persistent data structures in a corrupt state. pgstat_report_stat got this
wrong by invalidating PgStat_TableEntry structs before removing hashtable
entries pointing to them, and get_tabstat_entry got it wrong by ignoring
the possibility of palloc failure after it had already created a hashtable
entry.
Also, avoid leaking a memory context per transaction, which the previous
code did through misunderstanding hash_create's API. We do not need to
create a context to hold the hash table; hash_create will do that.
(The leak wasn't that large, amounting to only a memory context header
per iteration, but it's still surprising that nobody noticed it yet.)
Commit 60690a6fe attempted to fix the wait_for_stats() function in this
test so that it would wait properly if the tenk2 scans were done in
parallel workers instead of the main session (typically as a consequence of
force_parallel_mode being turned on). However, we made it test for whether
the main session's actions had been reported by looking for inserts on
'trunc_stats_test'. This is the Wrong Thing, because those aren't the last
updates we expect the main session to do. As shown by recent failures on
buildfarm member frogmouth, it's entirely likely that the trunc_stats_test
updates will be reported in a separate message from later updates, which
means there can be a window in which wait_for_stats() will exit but not all
the updates we are expecting to see will have arrived. We should test for
the last updates we're expecting, namely those on 'trunc_stats_test4'.
Unfortunately, I doubt that this explains frogmouth's failures, because
there's no reason to believe that it's running the tenk2 queries in
parallel. Still, the test is wrong on its own terms, so fix and back-patch
to 9.6 where parallel query came in.
Per a report from Tom Lane, newer versions of gcc apparently think
that partexprs_item_saved can be used uninitialized. Try to convince
them otherwise.
Remove default cases from assorted switches over ObjectClass and some
related enum types, so that we'll get compiler warnings when someone
adds a new enum value without accounting for it in all these places.
In passing, re-order some switch cases as needed to match the declaration
of enum ObjectClass. OK, that's just neatnik-ism, but I dislike code
that looks like it was assembled with the help of a dartboard.
Discussion: https://postgr.es/m/20170512221010.nglatgt5azzdxjlj@alvherre.pgsql
ALTER COLUMN TYPE on a column used by a statistics object fails since
commit 928c4de30, because the relevant switch in ATExecAlterColumnType
is unprepared for columns to have dependencies from OCLASS_STATISTIC_EXT
objects.
Although the existing types of extended statistics don't actually need us
to do any work for a column type change, it seems completely indefensible
that that assumption is hidden behind the failure of an unrelated module
to contain any code for the case. Hence, create and call an API function
in statscmds.c where the assumption can be explained, and where we could
add code to deal with the problem when it inevitably becomes real.
Also, the reason this wasn't handled before, neither for extended stats
nor for the last half-dozen new OCLASS kinds :-(, is that the default:
in that switch suppresses compiler warnings, allowing people to miss the
need to consider it when adding an OCLASS. We don't really need a default
because surely getObjectClass should only return valid values of the enum;
so remove it, and add the missed OCLASS entries where they should be.
Discussion: https://postgr.es/m/20170512221010.nglatgt5azzdxjlj@alvherre.pgsql
Consistently refer to such an entry as a "statistics object", not just
"statistics" or "extended statistics". Previously we had a mismash of
terms, accompanied by utter confusion as to whether the term was
singular or plural. That's not only grating (at least to the ear of
a native English speaker) but could be outright misleading, eg in error
messages that seemed to be referring to multiple objects where only one
could be meant.
This commit fixes the code and a lot of comments (though I may have
missed a few). I also renamed two new SQL functions,
pg_get_statisticsextdef -> pg_get_statisticsobjdef
pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible
to conform better with this terminology.
I have not touched the SGML docs other than fixing those function
names; the docs certainly need work but it seems like a separable task.
Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
Before 955a684e04 logical decoding snapshot maintenance needed to
cope with transactions it might not have seen in their entirety. For
such transactions we'd to assume they modified the catalog (could have
happened before we were watching), and thus a new snapshot had to be
built, and distributed to concurrently running transactions.
That's problematic because building a new snapshot isn't that cheap ,
especially as the the array of committed transactions needs to be
sorted. When creating a slot on a server with a lot of transactions,
this could make logical slot creation infeasibly expensive.
After 955a684e04 there's no need to deal with transaction that
aren't guaranteed to be fully observable. That allows to avoid
building snapshots for transactions that haven't modified catalog,
even before reaching consistency.
While this isn't necessarily a bugfix, slot creation being impossible
in some production workloads, is severe enough to warrant
backpatching.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
The snapshot assembly during the creation of logical slots relied
waiting for transactions in xl_running_xacts to end, by checking for
their commit/abort records. Unfortunately, despite locking, it is
possible to see an xl_running_xact record listing transactions as
ready, that have already WAL-logged an commit/abort record, as the
locking just prevents the ProcArray to be adjusted, and the commit
record has to be logged first.
That lead to either delayed or hanging snapshot creation, because
snapbuild.c would wait "forever" to see commit/abort records for some
transactions. That hang resolved only if a xl_running_xacts record
without any running transactions happened to be logged, far from
certain on a busy server.
It's impractical to prevent that via more heavyweight locking, the
likelihood of deadlocks and significantly increased contention would
be too big.
Instead change the initial snapshot creation to be solely based on
tracking the oldest running transaction via
xl_running_xacts->oldestRunningXid - that actually ends up
significantly simplifying the code. That has two disadvantages:
1) Because we cannot fully "trust" the contents of xl_running_xacts,
we cannot use it to build the initial snapshot. Instead we have to
wait twice for all running transactions to finish.
2) Previously a slot, unless the race occurred, could be created when
the all transaction perceived as running based on commit/abort
records, now we have to wait for the next xl_running_xacts record.
To address that, trigger logging new xl_running_xacts record from
within snapbuild.c exactly when necessary.
Unfortunately snabuild.c's SnapBuild is stored on disk, one of the
stupider ideas of a certain Mr Freund, so we can't change it in a
minor release. As this is going to be backpatched, we have to hack
around a bit to keep on-disk compatibility. A later commit will
rejigger that on master.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
The mess cleaned up in commit da0759600 is clear evidence that it's a
bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
to provide the correct type OID for the array elements in the slot.
Moreover, we weren't even getting any performance benefit from that,
since get_attstatsslot() was extracting the real type OID from the array
anyway. So we ought to get rid of that requirement; indeed, it would
make more sense for get_attstatsslot() to pass back the type OID it found,
in case the caller isn't sure what to expect, which is likely in binary-
compatible-operator cases.
Another problem with the current implementation is that if the stats array
element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
for each element. That seemed acceptable when the code was written because
we were targeting O(10) array sizes --- but these days, stats arrays are
almost always bigger than that, sometimes much bigger. We can save a
significant number of cycles by doing one palloc/memcpy/pfree of the whole
array. Indeed, in the now-probably-common case where the array is toasted,
that happens anyway so this method is basically free. (Note: although the
catcache code will inline any out-of-line toasted values, it doesn't
decompress them. At the other end of the size range, it doesn't expand
short-header datums either. In either case, DatumGetArrayTypeP would have
to make a copy. We do end up using an extra array copy step if the element
type is pass-by-value and the array length is neither small enough for a
short header nor large enough to have suffered compression. But that
seems like a very acceptable price for winning in pass-by-ref cases.)
Hence, redesign to take these insights into account. While at it,
convert to an API in which we fill a struct rather than passing a bunch
of pointers to individual output arguments. That will make it less
painful if we ever want further expansion of what get_attstatsslot can
pass back.
It's certainly arguable that this is new development and not something to
push post-feature-freeze. However, I view it as primarily bug-proofing
and therefore something that's better to have sooner not later. Since
we aren't quite at beta phase yet, let's put it in.
Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
On faster machines, the overall runtime for running the core regression
tests is under twenty seconds these days, of which the hard-wired delays
in the stats test are a significant fraction. But on closer inspection,
it seems like we shouldn't need those.
The initial 2-second delay is there only to reduce the risk of the test's
stats messages not getting sent due to contention. But analysis of the
last ten years' worth of buildfarm runs shows no evidence that such
failures actually occur. (We do see failures that look like stats
messages not getting sent, particularly on Windows; but there is little
reason to believe that the initial delay reduces their frequency.)
The later 1-second delay is there to ensure that our session's stats
will have gotten sent. But we could also do that by starting a fresh
session, which takes well under 1 second even on very slow machines.
Hence, let's remove both delays and see what happens. The first delay
was the only test of pg_sleep_for() in the regression tests, but we can
move that responsibility into wait_for_stats().
Discussion: https://postgr.es/m/17795.1493869423@sss.pgh.pa.us
Tab-completing DROP STATISTICS would only work if you started writing
the schema name containing the statistics object, because the visibility
clause was missing. To add it, we need to add SQL-callable support for
testing visibility of a statistics object, like all other object types
already have.
Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
We have now grown enough registerable syscache-invalidation callback
functions that the original assumption that there would be few of them
is causing performance problems. In particular, let's fix things so that
CallSyscacheCallbacks doesn't have to search the whole array to find
which callback(s) to invoke for a given cache ID. Preserve the original
behavior that callbacks are called in order of registration, just in
case there's someplace that depends on that (which I doubt).
In support of this, export the number of syscaches from syscache.h.
People could have found that out anyway from the enum, but adding a
#define makes that much safer.
This provides a useful additional speedup in Mathieu Fenniak's
logical-decoding test case, although we're reaching the point of
diminishing returns there. I think any further improvement will have
to come from reducing the number of cache invalidations that are
triggered in the first place. Still, we can hope that this change
gives some incremental benefit for all invalidation scenarios.
Back-patch to 9.4 where logical decoding was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
A test case provided by Mathieu Fenniak shows that hash_seq_search'ing
this hashtable can consume a very significant amount of overhead during
logical decoding, which triggers frequent cache invalidation. Testing
suggests that the actual population of the hashtable is often no more
than a few dozen entries, so we can cut the overhead just by dropping
the initial number of buckets down from 1024 --- I chose to cut it to 64.
(In situations where we do have a significant number of entries, we
shouldn't get any real penalty from doing this, as the dynahash.c code
will resize the hashtable automatically.)
This gives a further factor-of-two savings in Mathieu's test case.
That may be overly optimistic for real-world benefit, as real cases
may have larger average table populations, but it's hard to see it
turning into a net negative for any workload.
Back-patch to 9.4 where relfilenodemap.c was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
This was missed in 7b504eb282.
Remove the "default:" clause in the switch, to avoid this problem in the
future. Other switches involving the same enum should probably be
changed in the same way, but are not touched by this patch.
Discussion: https://postgr.es/m/20170512204800.iqt2uwyx3c32j45r@alvherre.pgsql
A test case provided by Mathieu Fenniak shows that the initial search for
the target catcache in CatalogCacheIdInvalidate consumes a very significant
amount of overhead in cases where cache invalidation is triggered but has
little useful work to do. There is no good reason for that search to exist
at all, as the index array maintained by syscache.c allows direct lookup of
the catcache from its ID. We just need a frontend function in syscache.c,
matching the division of labor for most other cache-accessing operations.
While there's more that can be done in this area, this patch alone reduces
the runtime of Mathieu's example by 2X. We can hope that it offers some
useful benefit in other cases too, although usually cache invalidation
overhead is not such a striking fraction of the total runtime.
Back-patch to 9.4 where logical decoding was introduced. It might be
worth going further back, but presently the only case we know of where
cache invalidation is really a significant burden is in logical decoding.
Also, older branches have fewer catcaches, reducing the possible benefit.
(Note: although this nominally changes catcache's API, we have always
documented CatalogCacheIdInvalidate as a private function, so I would
have little sympathy for an external module calling it directly. So
backpatching should be fine.)
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
A stats object ought to have a dependency on each individual column
it reads, not the entire table. Doing this honestly lets us get rid
of the hard-wired logic in RemoveStatisticsExt, which seems to have
been misguidedly modeled on RemoveStatistics; and it will be far easier
to extend to multiple tables later.
Also, add overlooked dependency on owner, and make the dependency on
schema be NORMAL like every other such dependency.
There remains some unfinished work here, which is to allow statistics
objects to be extension members. That takes more effort than just
adding the dependency call, though, so I left it out for now.
initdb forced because this changes the set of pg_depend records that
should exist for a statistics object.
Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
Previously, we had the WITH clause in the middle of the command, where
you'd specify both generic options as well as statistic types. Few
people liked this, so this commit changes it to remove the WITH keyword
from that clause and makes it accept statistic types only. (We
currently don't have any generic options, but if we invent in the
future, we will gain a new WITH clause, probably at the end of the
command).
Also, the column list is now specified without parens, which makes the
whole command look more similar to a SELECT command. This change will
let us expand the command to supporting expressions (not just columns
names) as well as multiple tables and their join conditions.
Tom added lots of code comments and fixed some parts of the CREATE
STATISTICS reference page, too; more changes in this area are
forthcoming. He also fixed a potential problem in the alter_generic
regression test, reducing verbosity on a cascaded drop to avoid
dependency on message ordering, as we do in other tests.
Tom also closed a security bug: we documented that table ownership was
required in order to create a statistics object on it, but didn't
actually implement it.
Implement tab-completion for statistics objects. This can stand some
more improvement.
Authors: Alvaro Herrera, with lots of cleanup by Tom Lane
Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
On MSVC builds and on back branches that means removing the hardcoded
--verbose setting. On master for Unix that means removing the empty
setting in the global Makefile so that the value can be acquired from
the environment as well as from the make arguments.
Backpatch to 9.4 where we introduced TAP tests
On Unix this path is detected via the use of xml2-config, but that's not
available on Windows. This means that users building with libxml2 will
no longer need to move things around from the standard libxml2
installation for MSVC builds.
Backpatch to all live branches.
For CREATE/ALTER PUBLICATION/SUBSCRIPTION, use similar option style as
other statements that use a WITH clause for options.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Certain recovery tests use the Perl IPC::Run module's start/kill_kill
method of processing. On at least some versions of perl this causes the
whole process and its caller to crash. If we ever find a better way of
doing these tests they can be re-enabled on this platform. This does not
affect Mingw or Cygwin builds, which use a different perl and a
different shell and so are not affected.
Lag tracking is called for each commit, but we introduce
a pacing delay to ensure we don't swamp the lag tracker.
Author: Petr Jelinek, with minor pacing delay code from me
Increase from the historical value of 32 to 64. We are up to 31 callers
of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be
exercised in one process that would leave only one slot for add-on modules.
It's probably not possible for that to happen, but still we clearly need
more daylight here. (At some point it might be worth making the array
dynamically resizable; but since we've never heard a complaint of "out of
syscache_callback_list slots" happening in the field, I doubt it's worth
it yet.)
Back-patch as far as 9.4, which is where we increased the companion limit
MAX_RELCACHE_CALLBACKS (cf commit f01d1ae3a). It's not as urgent in
released branches, which have only a couple dozen call sites in core, but
it still seems that somebody might hit the limit before these branches die.
Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us
Per discussion, "location" is a rather vague term that could refer to
multiple concepts. "LSN" is an unambiguous term for WAL locations and
should be preferred. Some function names, view column names, and function
output argument names used "lsn" already, but others used "location",
as well as yet other terms such as "wal_position". Since we've already
renamed a lot of things in this area from "xlog" to "wal" for v10,
we may as well incur a bit more compatibility pain and make these names
all consistent.
David Rowley, minor additional docs hacking by me
Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com
This reverts commits fa2fa99552 and 42f50cb8fa.
While the functionality that was intended to be provided by these
commits is desired, the patch didn't actually solve as many of the
problematic situations as we hoped, and it created a bunch of its own
problems. Since we're going to require more extensive changes soon for
other reasons and users have been working around these problems for a
long time already, there is no point in spending effort in fixing this
halfway measure.
Per complaint from Tom Lane.
Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us
(Commit fa2fa99552 had already been reverted in branches 9.5 as
f858524ee4 and 9.6 as e9e44a0953, so this touches master only.
Commit 42f50cb8fa was not present in the older branches.)
Previously, the memory used by the logical replication apply worker for
processing messages would never be freed, so that could end up using a
lot of memory. To improve that, change the existing ApplyContext memory
context to ApplyMessageContext and reset that after every
message (similar to MessageContext used elsewhere). For consistency of
naming, rename the ApplyCacheContext to ApplyContext.
Author: Stas Kelvich <s.kelvich@postgrespro.ru>
It turned out this approach had problems, because a DROP command should
not have any options other than CASCADE and RESTRICT. Instead, always
attempt to drop the slot if there is one configured, but also add an
ALTER SUBSCRIPTION action to set the slot to NONE.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
not of the type of the column the statistics are for.
This bug is at least partly the fault of sloppy specification comments
for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
of the stavalues entries, not of the underlying column. (I double-checked
other callers and they seem to get this right.) Adjust the comments to be
more correct.
Per buildfarm.
Security: CVE-2017-7484
Previously it would allow an invalid connection string to be set.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
This new arrangement ensures that statistics are reported right after
commit of transactions. The previous arrangement didn't get this quite
right and could lead to assertion failures.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Erik Rijkers <er@xs4all.nl>
Both views replace the umoptions field with NULL when the user does not
meet qualifications to see it. They used different qualifications, and
pg_user_mappings documented qualifications did not match its implemented
qualifications. Make its documentation and implementation match those
of user_mapping_options. One might argue for stronger qualifications,
but these have long, documented tenure. pg_user_mappings has always
exhibited this problem, so back-patch to 9.2 (all supported versions).
Michael Paquier and Feike Steenbergen. Reviewed by Jeff Janes.
Reported by Andrew Wheelwright.
Security: CVE-2017-7486
Commit 65c3bf19fd moved handling of the,
already then, deprecated requiressl parameter into conninfo_storeval().
The default PGREQUIRESSL environment variable was however lost in the
change resulting in a potentially silent accept of a non-SSL connection
even when set. Its documentation remained. Restore its implementation.
Also amend the documentation to mark PGREQUIRESSL as deprecated for
those not following the link to requiressl. Back-patch to 9.3, where
commit 65c3bf1 first appeared.
Behavior has been more complex when the user provides both deprecated
and non-deprecated settings. Before commit 65c3bf1, libpq operated
according to the first of these found:
requiressl=1
PGREQUIRESSL=1
sslmode=*
PGSSLMODE=*
(Note requiressl=0 didn't override sslmode=*; it would only suppress
PGREQUIRESSL=1 or a previous requiressl=1. PGREQUIRESSL=0 had no effect
whatsoever.) Starting with commit 65c3bf1, libpq ignored PGREQUIRESSL,
and order of precedence changed to this:
last of requiressl=* or sslmode=*
PGSSLMODE=*
Starting now, adopt the following order of precedence:
last of requiressl=* or sslmode=*
PGSSLMODE=*
PGREQUIRESSL=1
This retains the 65c3bf1 behavior for connection strings that contain
both requiressl=* and sslmode=*. It retains the 65c3bf1 change that
either connection string option overrides both environment variables.
For the first time, PGSSLMODE has precedence over PGREQUIRESSL; this
avoids reducing security of "PGREQUIRESSL=1 PGSSLMODE=verify-full"
configurations originating under v9.3 and later.
Daniel Gustafsson
Security: CVE-2017-7485