Commit Graph

45512 Commits

Author SHA1 Message Date
Alvaro Herrera
6872c2be6a Update FSM on WAL replay of page all-visible/frozen
We aren't very strict about keeping FSM up to date on WAL replay,
because per-page freespace values aren't critical in replicas (can't
write to heap in a replica; and if the replica is promoted, the values
would be updated by VACUUM anyway).  However, VACUUM since 9.6 can skip
processing pages marked all-visible or all-frozen, and if such pages are
recorded in FSM with wrong values, those values are blindly propagated
to FSM's upper layers by VACUUM's FreeSpaceMapVacuum.  (This rationale
assumes that crashes are not very frequent, because those would cause
outdated FSM to occur in the primary.)

Even when the FSM is outdated in standby, things are not too bad
normally, because, most per-page FSM values will be zero (other than
those propagated with the base-backup that created the standby); only
once the remaining free space is less than 0.2*BLCKSZ the per-page value
is maintained by WAL replay of heap ins/upd/del.  However, if
wal_log_hints=on causes complete FSM pages to be propagated to a standby
via full-page images, many too-optimistic per-page values can end up
being registered in the standby.

Incorrect per-page values aren't critical in most cases, since an
inserter that is given a page that doesn't actually contain the claimed
free space will update FSM with the correct value, and retry until it
finds a usable page.  However, if there are many such updates to do, an
inserter can spend a long time doing them before a usable page is found;
in a heavily trafficked insert-only table with many concurrent inserters
this has been observed to cause several second stalls, causing visible
application malfunction.

To fix this problem, it seems sufficient to have heap_xlog_visible
(replay of setting all-visible and all-frozen VM bits for a heap page)
update the FSM value for the page being processed.  This fixes the
per-page counters together with making the page skippable to vacuum, so
when vacuum does FreeSpaceMapVacuum, the values propagated to FSM upper
layers are the correct ones, avoiding the problem.

While at it, apply the same fix to heap_xlog_clean (replay of tuple
removal by HOT pruning and vacuum).  This makes any space freed by the
cleaning available earlier than the next vacuum in the promoted replica.

Backpatch to 9.6, where this problem was diagnosed on an insert-only
table with all-frozen pages, which were introduced as a concept in that
release.  Theoretically it could apply with all-visible pages to older
branches, but there's been no report of that and it doesn't backpatch
cleanly anyway.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20180802172857.5skoexsilnjvgruk@alvherre.pgsql
2018-08-15 18:09:29 -03:00
Tom Lane
d7ed4eea53 Clean up assorted misuses of snprintf()'s result value.
Fix a small number of places that were testing the result of snprintf()
but doing so incorrectly.  The right test for buffer overrun, per C99,
is "result >= bufsize" not "result > bufsize".  Some places were also
checking for failure with "result == -1", but the standard only says
that a negative value is delivered on failure.

(Note that this only makes these places correct if snprintf() delivers
C99-compliant results.  But at least now these places are consistent
with all the other places where we assume that.)

Also, make psql_start_test() and isolation_start_test() check for
buffer overrun while constructing their shell commands.  There seems
like a higher risk of overrun, with more severe consequences, here
than there is for the individual file paths that are made elsewhere
in the same functions, so this seemed like a worthwhile change.

Also fix guc.c's do_serialize() to initialize errno = 0 before
calling vsnprintf.  In principle, this should be unnecessary because
vsnprintf should have set errno if it returns a failure indication ...
but the other two places this coding pattern is cribbed from don't
assume that, so let's be consistent.

These errors are all very old, so back-patch as appropriate.  I think
that only the shell command overrun cases are even theoretically
reachable in practice, but there's not much point in erroneous error
checks.

Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
2018-08-15 16:29:32 -04:00
Bruce Momjian
995133410d pg_upgrade: fix shutdown check for standby servers
Commit 244142d32a only tested for the
pg_controldata output for primary servers, but standby servers have
different "Database cluster state" output, so check for that too.

Diagnosed-by: Michael Paquier

Discussion: https://postgr.es/m/20180810164240.GM13638@paquier.xyz

Backpatch-through: 9.3
2018-08-14 17:19:02 -04:00
Peter Eisentraut
c6eedb4d86 doc: Update broken links
Discussion: https://www.postgresql.org/message-id/flat/153044458767.13254.16049977382403131287%40wrigleys.postgresql.org
2018-08-14 22:56:07 +02:00
Tom Lane
d39079a8b2 Remove duplicate function declarations.
Christoph Berg

Discussion: https://postgr.es/m/20180814165536.GB21152@msg.df7cb.de
2018-08-14 14:25:14 -04:00
Peter Eisentraut
4c9611e576 Remove obsolete comment
The sequence name is no longer stored in the sequence relation, since
1753b1b027.
2018-08-13 21:08:01 +02:00
Tom Lane
998c736641 Fix libpq's implementation of per-host connection timeouts.
Commit 5f374fe7a attempted to turn the connect_timeout from an overall
maximum time limit into a per-host limit, but it didn't do a great job of
that.  The timer would only get restarted if we actually detected timeout
within connectDBComplete(), not if we changed our attention to a new host
for some other reason.  In that case the old timeout continued to run,
possibly causing a premature timeout failure for the new host.

Fix that, and also tweak the logic so that if we do get a timeout,
we advance to the next available IP address, not to the next host name.
There doesn't seem to be a good reason to assume that all the IP
addresses supplied for a given host name will necessarily fail the
same way as the current one.  Moreover, this conforms better to the
admittedly-vague documentation statement that the timeout is "per
connection attempt".  I changed that to "per host name or IP address"
to be clearer.  (Note that reconnections to the same server, such as for
switching protocol version or SSL status, don't get their own separate
timeout; that was true before and remains so.)

Also clarify documentation about the interpretation of connect_timeout
values less than 2.

This seems like a bug, so back-patch to v10 where this logic came in.

Tom Lane, reviewed by Fabien Coelho

Discussion: https://postgr.es/m/5735.1533828184@sss.pgh.pa.us
2018-08-13 13:07:52 -04:00
Michael Paquier
943576bddc Make autovacuum more aggressive to remove orphaned temp tables
Commit dafa084, added in 10, made the removal of temporary orphaned
tables more aggressive.  This commit makes an extra step into the
aggressiveness by adding a flag in each backend's MyProc which tracks
down any temporary namespace currently in use.  The flag is set when the
namespace gets created and can be reset if the temporary namespace has
been created in a transaction or sub-transaction which is aborted.  The
flag value assignment is assumed to be atomic, so this can be done in a
lock-less fashion like other flags already present in PGPROC like
databaseId or backendId, still the fact that the temporary namespace and
table created are still locked until the transaction creating those
commits acts as a barrier for other backends.

This new flag gets used by autovacuum to discard more aggressively
orphaned tables by additionally checking for the database a backend is
connected to as well as its temporary namespace in-use, removing
orphaned temporary relations even if a backend reuses the same slot as
one which created temporary relations in a past session.

The base idea of this patch comes from Robert Haas, has been written in
its first version by Tsunakawa Takayuki, then heavily reviewed by me.

Author: Tsunakawa Takayuki
Reviewed-by: Michael Paquier, Kyotaro Horiguchi, Andres Freund
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05
Backpatch: 11-, as PGPROC gains a new flag and we don't want silent ABI
breakages on already released versions.
2018-08-13 11:49:12 +02:00
Amit Kapila
f34d4b79fa Adjust comment atop ExecShutdownNode.
After commits a315b967cc and b805b63ac2, part of the comment atop
ExecShutdownNode is redundant.  Adjust it.

Author: Amit Kapila
Backpatch-through: 10 where both the mentioned commits are present.
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-13 10:12:39 +05:30
Amit Kapila
c054afd0a2 Prohibit shutting down resources if there is a possibility of back up.
Currently, we release the asynchronous resources as soon as it is evident
that no more rows will be needed e.g. when a Limit is filled.  This can be
problematic especially for custom and foreign scans where we can scan
backward.  Fix that by disallowing the shutting down of resources in such
cases.

Reported-by: Robert Haas
Analysed-by: Robert Haas and Amit Kapila
Author: Amit Kapila
Reviewed-by: Robert Haas
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-13 08:33:55 +05:30
Andrew Gierth
78f70e07e2 Avoid query-lifetime memory leaks in XMLTABLE (bug #15321)
Multiple calls to XMLTABLE in a query (e.g. laterally applying it to a
table with an xml column, an important use-case) were leaking large
amounts of memory into the per-query context, blowing up memory usage.

Repair by reorganizing memory context usage in nodeTableFuncscan; use
the usual per-tuple context for row-by-row evaluations instead of
perValueCxt, and use the explicitly created context -- renamed from
perValueCxt to perTableCxt -- for arguments and state for each
individual table-generation operation.

Backpatch to PG10 where this code was introduced.

Original report by IRC user begriffs; analysis and patch by me.
Reviewed by Tom Lane and Pavel Stehule.

Discussion: https://postgr.es/m/153394403528.10284.7530399040974170549@wrigleys.postgresql.org
2018-08-13 02:03:12 +01:00
Tom Lane
badaa0c50d Fix bogus loop logic in 013_crash_restart test's pump_until subroutine.
The pump_nb() step might've already received the desired data, so we must
check for that at the top of the loop not the bottom.  Otherwise, the
call to pump() will sit with nothing to do until the timeout elapses.
pump_until then falls out with apparent success ... but the timeout has
been used up, causing the next call of pump_until to report a timeout
failure.  I believe this explains the intermittent timeout failures
we've seen in the buildfarm ever since this test went in.  I was able
to reproduce the problem on gaur semi-repeatably, and this appears to
fix it.

In passing, remove a duplicate assignment, fix one stdin-assignment to
look like the rest, and document the test's dependency on test_decoding.
2018-08-12 18:05:49 -04:00
Tom Lane
0ff8f521d4 Fix wrong order of operations in inheritance_planner.
When considering a partitioning parent rel, we should stop processing that
subroot as soon as we've done adjust_appendrel_attrs and any securityQuals
updates.  The rest of this is unnecessary, and indeed adding duplicate
subquery RTEs to the subroot is *wrong*.  As the code stood, the children
of that partition ended up with two sets of copied subquery RTEs, confusing
matters greatly.  Even more hilarity ensued if all of the children got
excluded by constraint exclusion, so that the extra RTEs didn't make it
back into the parent rtable.

Per fuzz testing by Andreas Seltenreich.  Back-patch to v11 where this
got broken (by commit 0a480502b, it looks like).

Discussion: https://postgr.es/m/87va8g7vq0.fsf@ansel.ydns.eu
2018-08-11 15:53:20 -04:00
Andrew Dunstan
afff44303c Revert changes in execMain.c from commit 16828d5c02
These changes were put in at some stage of the development process, but
are unnecessary and should not have made it into the final patch. Mea
culpa.

Per gripe from Andreas Freund

Backpatch to REL_11_STABLE
2018-08-10 16:09:13 -04:00
Peter Geoghegan
9353d94a9b Handle parallel index builds on mapped relations.
Commit 9da0cc3528, which introduced parallel CREATE INDEX, failed to
propagate relmapper.c backend local cache state to parallel worker
processes.  This could result in parallel index builds against mapped
catalog relations where the leader process (participating as a worker)
scans the new, pristine relfilenode, while worker processes scan the
obsolescent relfilenode.  When this happened, the final index structure
was typically not consistent with the owning table's structure.  The
final index structure could contain entries formed from both heap
relfilenodes.  Only rebuilds on mapped catalog relations that occur as
part of a VACUUM FULL or CLUSTER could become corrupt in practice, since
their mapped relation relfilenode swap is what allows the inconsistency
to arise.

On master, fix the problem by propagating the required relmapper.c
backend state as part of standard parallel initialization (Cf. commit
29d58fd3).  On v11, simply disallow builds against mapped catalog
relations by deeming them parallel unsafe.

Author: Peter Geoghegan
Reported-By: "death lock"
Reviewed-By: Tom Lane, Amit Kapila
Bug: #15309
Discussion: https://postgr.es/m/153329671686.1405.18298309097348420351@wrigleys.postgresql.org
Backpatch: 11-, where parallel CREATE INDEX was introduced.
2018-08-10 13:01:33 -07:00
Alexander Korotkov
1b9d1b08fe Fix typo in SP-GiST error message
Error message didn't match the actual check.  Fix that.  Compression of leaf
SP-GiST values was introduced in 11.  So, backpatch.

Discussion: https://postgr.es/m/20180810.100742.15469435.horiguchi.kyotaro%40lab.ntt.co.jp
Author: Kyotaro Horiguchi
Backpatch-through: 11
2018-08-10 17:34:07 +03:00
Alexander Korotkov
dc444801ba Add missing documentation for argument of amcostestimate()
5262f7a4fc have introduced parallel index scan.  In order to estimate the
number of parallel workers, it adds extra argument to amcostestimate() index
access method API function.  However, this extra argument was missed in the
documentation.  This commit fixes that.

Discussion: https://postgr.es/m/4128fdb4-8b63-2e05-38f6-3125f8c27263%40lab.ntt.co.jp
Author: Tatsuro Yamada, Alexander Korotkov
Backpatch-through: 10
2018-08-10 14:20:11 +03:00
Alvaro Herrera
58a36f91b3 Add RECURSIVE to documentation index
Author: Daniel Vérité <daniel@manitou-mail.org>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://postgr.es/m/76d905d7-7eb7-4574-b6ec-a0ca3a1523c0@manitou-mail.org
2018-08-09 16:19:32 -04:00
Tom Lane
a015ae54a7 Document need to clear MAKELEVEL when invoking PG build from a makefile.
Since commit 3b8f6e75f, failure to do this would lead to
submake-generated-headers not doing anything, so that references to
generated or symlinked headers would fail.  Previous to that, the
omission only led to temp-install not doing anything, which apparently
affects many fewer people (doesn't anybody use "make check" in their
build rules??).  Hence, backpatch to v11 but not further.

Per complaints from Christoph Berg, Jakob Egger, and others.
2018-08-09 15:21:16 -04:00
Bruce Momjian
8c92638c00 docs: Only first instance of a PREPARE parameter sets data type
If the first reference to $1 is "($1 = col) or ($1 is null)", the data
type can be determined, but not for "($1 is null) or ($1 = col)".  This
change documents this.

Reported-by: Morgan Owens

Discussion: https://postgr.es/m/153233728858.1404.15268121695358514937@wrigleys.postgresql.org

Backpatch-through: 9.3
2018-08-09 10:13:15 -04:00
Heikki Linnakangas
83f2691a3f Spell "partitionwise" consistently.
I'm not sure which spelling is better, "partitionwise" or "partition-wise",
but everywhere else we spell it "partitionwise", so be consistent.

Tatsuro Yamada reported the one in README, I found the other one with grep.

Discussion: https://www.postgresql.org/message-id/d25ebf36-5a6d-8b2c-1ff3-d6f022a56000@lab.ntt.co.jp
2018-08-09 10:43:14 +03:00
Michael Paquier
87330e21c3 Restrict access to reindex of shared catalogs for non-privileged users
A database owner running a database-level REINDEX has the possibility to
also do the operation on shared system catalogs without being an owner
of them, which allows him to block resources it should not have access
to.  The same goes for a schema owner.  For example, PostgreSQL would go
unresponsive and even block authentication if a lock is waited for
pg_authid.  This commit makes sure that a user running a REINDEX SYSTEM,
DATABASE or SCHEMA only works on the following relations:
- The user is a superuser
- The user is the table owner
- The user is the database/schema owner, only if the relation worked on
is not shared.

Robert has worded most the documentation changes, and I have coded the
core part.

Reported-by: Lloyd Albin, Jeremy Schneider
Author: Michael Paquier, Robert Haas
Reviewed by: Nathan Bossart, Kyotaro Horiguchi
Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20180805211059.GA2185@paquier.xyz
Backpatch-through: 11- as the current behavior has been around for a
very long time and could be disruptive for already released branches.
2018-08-09 09:40:27 +02:00
Tom Lane
69d0e7e6b8 Remove bogus Assert in make_partitionedrel_pruneinfo().
This Assert thought that a given rel couldn't be both leaf and
non-leaf, but it turns out that in some unusual plan trees
that's wrong, so remove it.

The lack of testing for cases like that is quite concerning ---
there is little reason for confidence that there aren't other
bugs in the area.  But developing a stable test case seems
rather difficult, and in any case we don't need this Assert.

David Rowley

Discussion: https://postgr.es/m/CAJGNTeOkdk=UVuMugmKL7M=owgt4nNr1wjxMg1F+mHsXyLCzFA@mail.gmail.com
2018-08-08 20:02:33 -04:00
Peter Geoghegan
393e539c54 Doc: Correct description of amcheck example query.
The amcheck documentation incorrectly claimed that its example query
verifies every catalog index in the database.  In fact, the query only
verifies the 10 largest indexes (as determined by pg_class.relpages).
Adjust the description accordingly.

Backpatch: 10-, where contrib/amcheck was introduced.
2018-08-08 12:56:23 -07:00
Heikki Linnakangas
79f17d45e8 Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd
party libraries might have installed unsafe atexit() callbacks. After
receiving SIGQUIT, we really just want to exit as quickly as possible, so
we don't really want to run the atexit() callbacks anyway.

The original report by Jimmy Yih was a self-deadlock in startup_die().
However, this patch doesn't address that scenario; the signal handling
while waiting for the startup packet is more complicated. But at least this
alleviates similar problems in the SIGQUIT handlers, like that reported
by Asim R P later in the same thread.

Backpatch to 9.3 (all supported versions).

Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
2018-08-08 19:10:35 +03:00
Tom Lane
a3deecb1c9 Match RelOptInfos by relids not pointer equality.
Commit 1c2cb2744 added some code that tried to detect whether two
RelOptInfos were the "same" rel by pointer comparison; but it turns
out that inheritance_planner breaks that, through its shenanigans
with copying some relations forward into new subproblems.  Compare
relid sets instead.  Add a regression test case to exercise this
area.

Problem reported by Rushabh Lathia; diagnosis and fix by Amit Langote,
modified a bit by me.

Discussion: https://postgr.es/m/CAGPqQf3anJGj65bqAQ9edDr8gF7qig6_avRgwMT9MsZ19COUPw@mail.gmail.com
2018-08-08 11:44:50 -04:00
Tom Lane
ea1b659710 Don't record FDW user mappings as members of extensions.
CreateUserMapping has a recordDependencyOnCurrentExtension call that's
been there since extensions were introduced (very possibly my fault).
However, there's no support anywhere else for user mappings as members
of extensions, nor are they listed as a possible member object type in
the documentation.  Nor does it really seem like a good idea for user
mappings to belong to extensions when roles don't.  Hence, remove the
bogus call.

(As we saw in bug #15310, the lack of any pg_dump support for this case
ensures that any such membership record would silently disappear during
pg_upgrade.  So there's probably no need for us to do anything else
about cleaning up after this mistake.)

Discussion: https://postgr.es/m/27952.1533667213@sss.pgh.pa.us
2018-08-07 16:32:55 -04:00
Tom Lane
1b5438ec2a Fix incorrect initialization of BackendActivityBuffer.
Since commit c8e8b5a6e, this has been zeroed out using the wrong length.
In practice the length would always be too small, leading to not zeroing
the whole buffer rather than clobbering additional memory; and that's
pretty harmless, both because shmem would likely start out as zeroes
and because we'd reinitialize any given entry before use.  Still,
it's bogus, so fix it.

Reported by Petru-Florin Mihancea (bug #15312)

Discussion: https://postgr.es/m/153363913073.1303.6518849192351268091@wrigleys.postgresql.org
2018-08-07 16:00:55 -04:00
Tom Lane
187331fefd Fix pg_upgrade to handle event triggers in extensions correctly.
pg_dump with --binary-upgrade must emit ALTER EXTENSION ADD commands
for all objects that are members of extensions.  It forgot to do so for
event triggers, as per bug #15310 from Nick Barnes.  Back-patch to 9.3
where event triggers were introduced.

Haribabu Kommi

Discussion: https://postgr.es/m/153360083872.1395.4593932457718151600@wrigleys.postgresql.org
2018-08-07 15:43:48 -04:00
Tom Lane
f736430066 Ensure pg_dump_sort.c sorts null vs non-null namespace consistently.
The original coding here (which is, I believe, my fault) supposed that
it didn't need to concern itself with the possibility that one object
of a given type-priority has a namespace while another doesn't.  But
that's not reliably true anymore, if it ever was; and if it does happen
then it's possible that DOTypeNameCompare returns self-inconsistent
comparison results.  That leads to unspecified behavior in qsort()
and a resultant weird output order from pg_dump.

This should end up being only a cosmetic problem, because any ordering
constraints that actually matter should be enforced by the later
dependency-based sort.  Still, it's a bug, so back-patch.

Report and fix by Jacob Champion, though I editorialized on his
patch to the extent of making NULL sort after non-NULL, for consistency
with our usual sorting definitions.

Discussion: https://postgr.es/m/CABAq_6Hw+V-Kj7PNfD5tgOaWT_-qaYkc+SRmJkPLeUjYXLdxwQ@mail.gmail.com
2018-08-07 13:13:42 -04:00
Tom Lane
e62cc60fb9 Stamp 11beta3. 2018-08-06 16:02:42 -04:00
Peter Eisentraut
10dc69ef8f Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 9706d37387722f17626b41da7b83ea02691f735c
2018-08-06 20:09:07 +02:00
Tom Lane
749839c4d5 Last-minute updates for release notes.
Security: CVE-2018-10915, CVE-2018-10925
2018-08-06 13:13:40 -04:00
Tom Lane
f6f735f78d Fix failure to reset libpq's state fully between connection attempts.
The logic in PQconnectPoll() did not take care to ensure that all of
a PGconn's internal state variables were reset before trying a new
connection attempt.  If we got far enough in the connection sequence
to have changed any of these variables, and then decided to try a new
server address or server name, the new connection might be completed
with some state that really only applied to the failed connection.

While this has assorted bad consequences, the only one that is clearly
a security issue is that password_needed didn't get reset, so that
if the first server asked for a password and the second didn't,
PQconnectionUsedPassword() would return an incorrect result.  This
could be leveraged by unprivileged users of dblink or postgres_fdw
to allow them to use server-side login credentials that they should
not be able to use.

Other notable problems include the possibility of forcing a v2-protocol
connection to a server capable of supporting v3, or overriding
"sslmode=prefer" to cause a non-encrypted connection to a server that
would have accepted an encrypted one.  Those are certainly bugs but
it's harder to paint them as security problems in themselves.  However,
forcing a v2-protocol connection could result in libpq having a wrong
idea of the server's standard_conforming_strings setting, which opens
the door to SQL-injection attacks.  The extent to which that's actually
a problem, given the prerequisite that the attacker needs control of
the client's connection parameters, is unclear.

These problems have existed for a long time, but became more easily
exploitable in v10, both because it introduced easy ways to force libpq
to abandon a connection attempt at a late stage and then try another one
(rather than just giving up), and because it provided an easy way to
specify multiple target hosts.

Fix by rearranging PQconnectPoll's state machine to provide centralized
places to reset state properly when moving to a new target host or when
dropping and retrying a connection to the same host.

Tom Lane, reviewed by Noah Misch.  Our thanks to Andrew Krasichkov
for finding and reporting the problem.

Security: CVE-2018-10915
2018-08-06 10:53:35 -04:00
Tom Lane
c6db605c3e Release notes for 10.5, 9.6.10, 9.5.14, 9.4.19, 9.3.24. 2018-08-05 16:38:42 -04:00
Tom Lane
1d6c93f8f4 Doc: fix incorrectly stated argument list for pgcrypto's hmac() function.
The bytea variant takes (bytea, bytea, text).
Per unsigned report.

Discussion: https://postgr.es/m/153344327294.1404.654155870612982042@wrigleys.postgresql.org
2018-08-05 13:03:57 -04:00
Heikki Linnakangas
a2441558a6 Remove now unused check for HAVE_X509_GET_SIGNATURE_NID in test.
I removed the code that used this in the previous commit.

Spotted by Michael Paquier.
2018-08-05 17:17:15 +03:00
Heikki Linnakangas
1b7378b3d6 Remove support for tls-unique channel binding.
There are some problems with the tls-unique channel binding type. It's not
supported by all SSL libraries, and strictly speaking it's not defined for
TLS 1.3 at all, even though at least in OpenSSL, the functions used for it
still seem to work with TLS 1.3 connections. And since we had no
mechanism to negotiate what channel binding type to use, there would be
awkward interoperability issues if a server only supported some channel
binding types. tls-server-end-point seems feasible to support with any SSL
library, so let's just stick to that.

This removes the scram_channel_binding libpq option altogether, since there
is now only one supported channel binding type.

This also removes all the channel binding tests from the SSL test suite.
They were really just testing the scram_channel_binding option, which
is now gone. Channel binding is used if both client and server support it,
so it is used in the existing tests. It would be good to have some tests
specifically for channel binding, to make sure it really is used, and the
different combinations of a client and a server that support or doesn't
support it. The current set of settings we have make it hard to write such
tests, but I did test those things manually, by disabling
HAVE_BE_TLS_GET_CERTIFICATE_HASH and/or
HAVE_PGTLS_GET_PEER_CERTIFICATE_HASH.

I also removed the SCRAM_CHANNEL_BINDING_TLS_END_POINT constant. This is a
matter of taste, but IMO it's more readable to just use the
"tls-server-end-point" string.

Refactor the checks on whether the SSL library supports the functions
needed for tls-server-end-point channel binding. Now the server won't
advertise, and the client won't choose, the SCRAM-SHA-256-PLUS variant, if
compiled with an OpenSSL version too old to support it.

In the passing, add some sanity checks to check that the chosen SASL
mechanism, SCRAM-SHA-256 or SCRAM-SHA-256-PLUS, matches whether the SCRAM
exchange used channel binding or not. For example, if the client selects
the non-channel-binding variant SCRAM-SHA-256, but in the SCRAM message
uses channel binding anyway. It's harmless from a security point of view,
I believe, and I'm not sure if there are some other conditions that would
cause the connection to fail, but it seems better to be strict about these
things and check explicitly.

Discussion: https://www.postgresql.org/message-id/ec787074-2305-c6f4-86aa-6902f98485a4%40iki.fi
2018-08-05 13:44:26 +03:00
Tom Lane
87790fd1ea Update version 11 release notes.
Remove description of commit 1944cdc98, which has now been back-patched
so it's not relevant to v11 any longer.  Add descriptions of other
recent commits that seemed worth mentioning.

I marked the update as stopping at 2018-07-30, because it's unclear
whether d06eebce5 will be allowed to stay in v11, and I didn't feel like
putting effort into writing a description of it yet.  If it does stay,
I think it will deserve mention in the Source Code section.
2018-08-04 23:49:53 -04:00
Tom Lane
e7154b6acf Fix INSERT ON CONFLICT UPDATE through a view that isn't just SELECT *.
When expanding an updatable view that is an INSERT's target, the rewriter
failed to rewrite Vars in the ON CONFLICT UPDATE clause.  This accidentally
worked if the view was just "SELECT * FROM ...", as the transformation
would be a no-op in that case.  With more complicated view targetlists,
this omission would often lead to "attribute ... has the wrong type" errors
or even crashes, as reported by Mario De Frutos Dieguez.

Fix by adding code to rewriteTargetView to fix up the data structure
correctly.  The easiest way to update the exclRelTlist list is to rebuild
it from scratch looking at the new target relation, so factor the code
for that out of transformOnConflictClause to make it sharable.

In passing, avoid duplicate permissions checks against the EXCLUDED
pseudo-relation, and prevent useless view expansion of that relation's
dummy RTE.  The latter is only known to happen (after this patch) in cases
where the query would fail later due to not having any INSTEAD OF triggers
for the view.  But by exactly that token, it would create an unintended
and very poorly tested state of the query data structure, so it seems like
a good idea to prevent it from happening at all.

This has been broken since ON CONFLICT was introduced, so back-patch
to 9.5.

Dean Rasheed, based on an earlier patch by Amit Langote;
comment-kibitzing and back-patching by me

Discussion: https://postgr.es/m/CAFYwGJ0xfzy8jaK80hVN2eUWr6huce0RU8AgU04MGD00igqkTg@mail.gmail.com
2018-08-04 19:38:58 -04:00
Michael Paquier
58673b4a5f Reset properly errno before calling write()
6cb3372 enforces errno to ENOSPC when less bytes than what is expected
have been written when it is unset, though it forgot to properly reset
errno before doing a system call to write(), causing errno to
potentially come from a previous system call.

Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/31797.1533326676@sss.pgh.pa.us
2018-08-05 05:31:56 +09:00
Noah Misch
75224ac20e Make "kerberos" test suite independent of "localhost" name resolution.
This suite malfunctioned if the canonical name of "localhost" was
something other than "localhost", such as "localhost.localdomain".  Use
hostaddr=127.0.0.1 and a fictitious host=, so the resolver's answers for
"localhost" don't affect the outcome.  Back-patch to v11, which
introduced this test suite.

Discussion: https://postgr.es/m/20180801050903.GA1392916@rfd.leadboat.com
2018-08-03 20:53:40 -07:00
Peter Geoghegan
b9612e5cfa Add table relcache invalidation to index builds.
It's necessary to make sure that owning tables have a relcache
invalidation prior to advancing the command counter to make
newly-entered catalog tuples for the index visible.  inval.c must be
able to maintain the consistency of the local caches in the event of
transaction abort.  There is usually only a problem when CREATE INDEX
transactions abort, since there is a generic invalidation once we reach
index_update_stats().

This bug is of long standing.  Problems were made much more likely by
the addition of parallel CREATE INDEX (commit 9da0cc3528), but it is
strongly suspected that similar problems can be triggered without
involving plan_create_index_workers().  (plan_create_index_workers()
triggers a relcache build or rebuild, which previously only happened in
rare edge cases.)

Author: Peter Geoghegan
Reported-By: Luca Ferrari
Diagnosed-By: Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAKoxK+5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf=HEQ@mail.gmail.com
Backpatch: 9.3-
2018-08-03 14:45:02 -07:00
Alvaro Herrera
a395817893 Add 'n' to list of possible values to pg_default_acl.defaclobjtype
This was missed in commit ab89e465cb20; backpatch to v10.

Author: Fabien Coelho <coelho@cri.ensmp.fr>
Discussion: https://postgr.es/m/alpine.DEB.2.21.1807302243001.13230@lancre
2018-08-03 16:45:08 -04:00
Alvaro Herrera
d25c48d0c9 Fix pg_replication_slot example output
The example output of pg_replication_slot is wrong.  Correct it and make
the output stable by explicitly listing columns to output.

Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20180731.190909.42582169.horiguchi.kyotaro@lab.ntt.co.jp
2018-08-03 16:34:59 -04:00
Tom Lane
d8b2beb269 Remove no-longer-appropriate special case in psql's \conninfo code.
\conninfo prints the results of PQhost() and some other libpq functions.
It used to override the PQhost() result with the hostaddr parameter if
that'd been given, but that's unhelpful when multiple hosts were listed
in the connection string.  Furthermore, it seems unnecessary in the wake
of commit 1944cdc98, since PQhost does any useful substitution itself.
So let's just remove the extra code and print PQhost()'s result without
any editorialization.

Back-patch to v10, as 1944cdc98 (just) was.

Discussion: https://postgr.es/m/23287.1533227021@sss.pgh.pa.us
2018-08-03 12:20:47 -04:00
Tom Lane
6efc301671 Change libpq's internal uses of PQhost() to inspect host field directly.
Commit 1944cdc98 changed PQhost() to return the hostaddr value when that
is specified and host isn't.  This is a good idea in general, but
fe-auth.c and related files contain PQhost() calls for which it isn't.
Specifically, when we compare SSL certificates or other server identity
information to the host field, we do not want to use hostaddr instead;
that's not what's documented, that's not what happened pre-v10, and
it doesn't seem like a good idea.

Instead, we can just look at connhost[].host directly.  This does what
we want in v10 and up; in particular, if neither host nor hostaddr
were given, the host field will be replaced with the default host name.
That seems useful, and it's likely the reason that these places were
coded to call PQhost() originally (since pre-v10, the stored field was
not replaced with the default).

Back-patch to v10, as 1944cdc98 (just) was.

Discussion: https://postgr.es/m/23287.1533227021@sss.pgh.pa.us
2018-08-03 12:12:10 -04:00
Amit Kapila
dac7fe13bb Fix buffer usage stats for parallel nodes.
The buffer usage stats is accounted only for the execution phase of the
node.  For Gather and Gather Merge nodes, such stats are accumulated at
the time of shutdown of workers which is done after execution of node due
to which we missed to account them for such nodes.  Fix it by treating
nodes as running while we shut down them.

We can also miss accounting for a Limit node when Gather or Gather Merge
is beneath it, because it can finish the execution before shutting down
such nodes.  So we allow a Limit node to shut down the resources before it
completes the execution.

In the passing fix the gather node code to allow workers to shut down as
soon as we find that all the tuples from the workers have been retrieved.
The original code use to do that, but is accidently removed by commit
01edb5c7fc.

Reported-by: Adrien Nayrat
Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-03 11:16:25 +05:30
Amit Kapila
ef305bd59d Match the buffer usage tracking for leader and worker backends.
In the leader backend, we don't track the buffer usage for ExecutorStart
phase whereas in worker backend we track it for ExecutorStart phase as
well.  This leads to different value for buffer usage stats for the
parallel and non-parallel query.  Change the code so that worker backend
also starts tracking buffer usage after ExecutorStart.

Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-03 09:29:45 +05:30
Tom Lane
1b54e91faa Fix run-time partition pruning for appends with multiple source rels.
The previous coding here supposed that if run-time partitioning applied to
a particular Append/MergeAppend plan, then all child plans of that node
must be members of a single partitioning hierarchy.  This is totally wrong,
since an Append could be formed from a UNION ALL: we could have multiple
hierarchies sharing the same Append, or child plans that aren't part of any
hierarchy.

To fix, restructure the related plan-time and execution-time data
structures so that we can have a separate list or array for each
partitioning hierarchy.  Also track subplans that are not part of any
hierarchy, and make sure they don't get pruned.

Per reports from Phil Florent and others.  Back-patch to v11, since
the bug originated there.

David Rowley, with a lot of cosmetic adjustments by me; thanks also
to Amit Langote for review.

Discussion: https://postgr.es/m/HE1PR03MB17068BB27404C90B5B788BCABA7B0@HE1PR03MB1706.eurprd03.prod.outlook.com
2018-08-01 19:42:53 -04:00