Commit Graph

30153 Commits

Author SHA1 Message Date
Noah Misch
0170b10dff Restore PGREQUIRESSL recognition in libpq.
Commit 65c3bf19fd moved handling of the,
already then, deprecated requiressl parameter into conninfo_storeval().
The default PGREQUIRESSL environment variable was however lost in the
change resulting in a potentially silent accept of a non-SSL connection
even when set.  Its documentation remained.  Restore its implementation.
Also amend the documentation to mark PGREQUIRESSL as deprecated for
those not following the link to requiressl.  Back-patch to 9.3, where
commit 65c3bf1 first appeared.

Behavior has been more complex when the user provides both deprecated
and non-deprecated settings.  Before commit 65c3bf1, libpq operated
according to the first of these found:

  requiressl=1
  PGREQUIRESSL=1
  sslmode=*
  PGSSLMODE=*

(Note requiressl=0 didn't override sslmode=*; it would only suppress
PGREQUIRESSL=1 or a previous requiressl=1.  PGREQUIRESSL=0 had no effect
whatsoever.)  Starting with commit 65c3bf1, libpq ignored PGREQUIRESSL,
and order of precedence changed to this:

  last of requiressl=* or sslmode=*
  PGSSLMODE=*

Starting now, adopt the following order of precedence:

  last of requiressl=* or sslmode=*
  PGSSLMODE=*
  PGREQUIRESSL=1

This retains the 65c3bf1 behavior for connection strings that contain
both requiressl=* and sslmode=*.  It retains the 65c3bf1 change that
either connection string option overrides both environment variables.
For the first time, PGSSLMODE has precedence over PGREQUIRESSL; this
avoids reducing security of "PGREQUIRESSL=1 PGSSLMODE=verify-full"
configurations originating under v9.3 and later.

Daniel Gustafsson

Security: CVE-2017-7485
2017-05-08 07:24:24 -07:00
Peter Eisentraut
e2d4ef8de8 Add security checks to selectivity estimation functions
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables.  Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof.  If neither is satisfied, planning
will proceed as if there are no statistics available.

At least one of these is satisfied in most cases in practice.  The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>

Security: CVE-2017-7484
2017-05-08 09:26:32 -04:00
Heikki Linnakangas
eb61136dc7 Remove support for password_encryption='off' / 'plain'.
Storing passwords in plaintext hasn't been a good idea for a very long
time, if ever. Now seems like a good time to finally forbid it, since we're
messing with this in PostgreSQL 10 anyway.

Remove the CREATE/ALTER USER UNENCRYPTED PASSSWORD 'foo' syntax, since
storing passwords unencrypted is no longer supported. ENCRYPTED PASSWORD
'foo' is still accepted, but ENCRYPTED is now just a noise-word, it does
the same as just PASSWORD 'foo'.

Likewise, remove the --unencrypted option from createuser, but accept
--encrypted as a no-op for backward compatibility. AFAICS, --encrypted was
a no-op even before this patch, because createuser encrypted the password
before sending it to the server even if --encrypted was not specified. It
added the ENCRYPTED keyword to the SQL command, but since the password was
already in encrypted form, it didn't make any difference. The documentation
was not clear on whether that was intended or not, but it's moot now.

Also, while password_encryption='on' is still accepted as an alias for
'md5', it is now marked as hidden, so that it is not listed as an accepted
value in error hints, for example. That's not directly related to removing
'plain', but it seems better this way.

Reviewed by Michael Paquier

Discussion: https://www.postgresql.org/message-id/16e9b768-fd78-0b12-cfc1-7b6b7f238fde@iki.fi
2017-05-08 11:26:07 +03:00
Simon Riggs
1f30295eab Remove poorly worded and duplicated comment
Move line of code to avoid need for duplicated comment

Brought to attention by Masahiko Sawada
2017-05-08 08:49:28 +01:00
Heikki Linnakangas
0186ded546 Fix memory leaks if random salt generation fails.
In the backend, this is just to silence coverity warnings, but in the
frontend, it's a genuine leak, even if extremely rare.

Spotted by Coverity, patch by Michael Paquier.
2017-05-07 19:58:21 +03:00
Tom Lane
a54d5875fe Guard against null t->tm_zone in strftime.c.
The upstream IANA code does not guard against null TM_ZONE pointers in this
function, but in our code there is such a check in the other pre-existing
use of t->tm_zone.  We do have some places that set pg_tm.tm_zone to NULL.
I'm not entirely sure it's possible to reach strftime with such a value,
but I'm not sure it isn't either, so be safe.

Per Coverity complaint.
2017-05-07 12:33:12 -04:00
Tom Lane
d4e59c5521 Install the "posixrules" timezone link in MSVC builds.
Somehow, we'd missed ever doing this.  The consequences aren't too
severe: basically, the timezone library would fall back on its hardwired
notion of the DST transition dates to use for a POSIX-style zone name,
rather than obeying US/Eastern which is the intended behavior.  The net
effect would only be to obey current US DST law further back than it
ought to apply; so it's not real surprising that nobody noticed.

David Rowley, per report from Amit Kapila

Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com
2017-05-07 11:57:41 -04:00
Tom Lane
5788a5670e Restore fullname[] contents before falling through in pg_open_tzfile().
Fix oversight in commit af2c5aa88: if the shortcut open() doesn't work,
we need to reset fullname[] to be just the name of the toplevel tzdata
directory before we fall through into the pre-existing code.  This failed
to be exposed in my (tgl's) testing because the fall-through path is
actually never taken under normal circumstances.

David Rowley, per report from Amit Kapila

Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com
2017-05-07 11:34:31 -04:00
Stephen Frost
09f8421819 pg_dump: Don't leak memory in buildDefaultACLCommands()
buildDefaultACLCommands() didn't destroy the string buffer created in
certain cases, leading to a memory leak.  Fix by destroying the buffer
before returning from the function.

Spotted by Coverity.

Author: Michael Paquier

Back-patch to 9.6 where buildDefaultACLCommands() was added.
2017-05-06 22:58:12 -04:00
Stephen Frost
aa5d3c0b3f RLS: Fix ALL vs. SELECT+UPDATE policy usage
When we add the SELECT-privilege based policies to the RLS with check
options (such as for an UPDATE statement, or when we have INSERT ...
RETURNING), we need to be sure and use the 'USING' case if the policy is
actually an 'ALL' policy (which could have both a USING clause and an
independent WITH CHECK clause).

This could result in policies acting differently when built using ALL
(when the ALL had both USING and WITH CHECK clauses) and when building
the policies independently as SELECT and UPDATE policies.

Fix this by adding an explicit boolean to add_with_check_options() to
indicate when the USING policy should be used, even if the policy has
both USING and WITH CHECK policies on it.

Reported by: Rod Taylor

Back-patch to 9.5 where RLS was introduced.
2017-05-06 21:46:35 -04:00
Andres Freund
b58c433ef9 Fix duplicated words in comment.
Reported-By: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn3rY2N0gTWndaApD113T+O8L6oz8cm7_F3P8y4awdoOg@mail.gmail.com
Backpatch: no, only present in master
2017-05-06 17:03:45 -07:00
Andres Freund
e6c44eef55 Fix off-by-one possibly leading to skipped XLOG_RUNNING_XACTS records.
Since 6ef2eba3f5 ("Skip checkpoints, archiving on idle systems."),
GetLastImportantRecPtr() is used to avoid performing superfluous
checkpoints, xlog switches, running-xact records when the system is
idle.  Unfortunately the check concerning running-xact records had a
off-by-one error, leading to such records being potentially skipped
when only a single record has been inserted since the last
running-xact record.

An alternative approach would have been to change
GetLastImportantRecPtr()'s definition to point to the end of records,
but that would make the checkpoint code more complicated.

Author: Andres Freund
Discussion: https://postgr.es/m/20170505012447.wsrympaxnfis6ojt@alap3.anarazel.de
Backpatch: no, code only present in master
2017-05-06 16:55:07 -07:00
Tom Lane
b3a47cdfd6 Suppress compiler warning about unportable pointer value.
Setting a pointer value to "0xdeadbeef" draws a warning from some
compilers, and for good reason.  Be less cute and just set it to NULL.

In passing make some other cosmetic adjustments nearby.

Discussion: https://postgr.es/m/CAJrrPGdW3EkU-CRobvVKYf3fJuBdgWyuGeAbNzAQ4yBh+bfb_Q@mail.gmail.com
2017-05-05 12:46:04 -04:00
Alvaro Herrera
14722c69f9 Allow MSVC to build with Tcl 8.6.
Commit eaba54c20c added support for Tcl 8.6 for configure-supported
platforms after verifying that pltcl works without further changes, but
the MSVC tooling wasn't updated accordingly.  Update MSVC to match,
restructuring the code to avoid duplicating the logic for every Tcl
version supported.

Backpatch to all live branches, like eaba54c20c.  In 9.4 and previous,
change the patch to use backslashes rather than forward, as in the rest
of the file.

Reported by Paresh More, who also tested the patch I provided.
Discussion: https://postgr.es/m/CAAgiCNGVw3ssBtSi3ZNstrz5k00ax=UV+_ZEHUeW_LMSGL2sew@mail.gmail.com
2017-05-05 12:38:29 -03:00
Peter Eisentraut
086221cf6b Prevent panic during shutdown checkpoint
When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and throws
a PANIC if so.  At that point, only walsenders are still active, so one
might think this could not happen, but walsenders can also generate WAL,
for instance in BASE_BACKUP and certain variants of
CREATE_REPLICATION_SLOT.  So they can trigger this panic if such a
command is run while the shutdown checkpoint is being written.

To fix this, divide the walsender shutdown into two phases.  First, the
postmaster sends a SIGUSR2 signal to all walsenders.  The walsenders
then put themselves into the "stopping" state.  In this state, they
reject any new commands.  (For simplicity, we reject all new commands,
so that in the future we do not have to track meticulously which
commands might generate WAL.)  The checkpointer waits for all walsenders
to reach this state before proceeding with the shutdown checkpoint.
After the shutdown checkpoint is done, the postmaster sends
SIGINT (previously unused) to the walsenders.  This triggers the
existing shutdown behavior of sending out the shutdown checkpoint record
and then terminating.

Author: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-05 10:31:42 -04:00
Magnus Hagander
499ae5f5db Fix wording in pg_upgrade docs
Author: Daniel Gustafsson
2017-05-05 12:42:21 +02:00
Magnus Hagander
28d1c8ccc8 Build pgoutput.dll in MSVC build
Without this, logical replication obviously does not work on Windows

MauMau, with clean.bet additions from me per note from Michael Paquier
2017-05-05 12:08:48 +02:00
Heikki Linnakangas
0557a5dc2c Make SCRAM salts and nonces longer.
The salt is stored base64-encoded. With the old 10 bytes raw length, it was
always padded to 16 bytes after encoding. We might as well use 12 raw bytes
for the salt, and it's still encoded into 16 bytes.

Similarly for the random nonces, use a raw length that's divisible by 3, so
that there's no padding after base64 encoding. Make the nonces longer while
we're at it. 10 bytes was probably enough to prevent replay attacks, but
there's no reason to be skimpy here.

Per suggestion from Álvaro Hernández Tortosa.

Discussion: https://www.postgresql.org/message-id/df8c6e27-4d8e-5281-96e5-131a4e638fc8@8kdata.com
2017-05-05 10:02:13 +03:00
Heikki Linnakangas
e6e9c4da3a Misc cleanup of SCRAM code.
* Remove is_scram_verifier() function. It was unused.
* Fix sanitize_char() function, used in error messages on protocol
  violations, to print bytes >= 0x7F correctly.
* Change spelling of scram_MockSalt() function to be more consistent with
  the surroundings.
* Change a few more references to "server proof" to "server signature" that
  I missed in commit d981074c24.
2017-05-05 10:01:44 +03:00
Heikki Linnakangas
344a113079 Don't use SCRAM-specific "e=invalid-proof" on invalid password.
Instead, send the same FATAL message as with other password-based
authentication mechanisms. This gives a more user-friendly message:

psql: FATAL:  password authentication failed for user "test"

instead of:

psql: error received from server in SASL exchange: invalid-proof

Even before this patch, the server sent that FATAL message, after the
SCRAM-specific "e=invalid-proof" message. But libpq would stop at the
SCRAM error message, and not process the ErrorResponse that would come
after that. We could've taught libpq to check for an ErrorResponse after
failed authentication, but it's simpler to modify the server to send only
the ErrorResponse. The SCRAM specification allows for aborting the
authentication at any point, using an application-defined error mechanism,
like PostgreSQL's ErrorResponse. Using the e=invalid-proof message is
optional.

Reported by Jeff Janes.

Discussion: https://www.postgresql.org/message-id/CAMkU%3D1w3jQ53M1OeNfN8Cxd9O%2BA_9VONJivTbYoYRRdRsLT6vA@mail.gmail.com
2017-05-05 10:01:41 +03:00
Stephen Frost
44c528810a Change the way pg_dump retrieves partitioning info
This gets rid of the code that issued separate queries to retrieve the
partitioning parent-child relationship, parent partition key, and child
partition bound information.  With this patch, the information is
retrieved instead using the queries issued from getTables() and
getInherits(), which is both more efficient than the previous approach
and doesn't require any new code.

Since the partitioning parent-child relationship is now retrieved with
the same old code that handles inheritance, partition attributes receive
a proper flagInhAttrs() treatment (that it didn't receive before), which
is needed so that the inherited NOT NULL constraints are not emitted if
we already emitted it for the parent.

Also, fix a bug in pg_dump's --binary-upgrade code, which caused pg_dump
to emit invalid command to attach a partition to its parent.

Author: Amit Langote, with some additional changes by me.
2017-05-04 22:17:52 -04:00
Tom Lane
3f074845a8 Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan.
GiST's getNextNearest() function attempts to pfree the previously-returned
tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older
branches).  However, if we are rescanning a plan node after ending a
previous scan early, those tuple pointers could be pointing to garbage,
because they would be pointing into the scan's pageDataCxt or queueCxt
which has been reset.  In a debug build this reliably results in a crash,
although I think it might sometimes accidentally fail to fail in
production builds.

To fix, clear the pointer field anyplace we reset a context it might
be pointing into.  This may be overkill --- I think probably only the
queueCxt case is involved in this bug, so that resetting in gistrescan()
would be sufficient --- but dangling pointers are generally bad news,
so let's avoid them.

Another plausible answer might be to just not bother with the pfree in
getNextNearest().  The reconstructed tuples would go away anyway in the
context resets, and I'm far from convinced that freeing them a bit earlier
really saves anything meaningful.  I'll stick with the original logic in
this patch, but if we find more problems in the same area we should
consider that approach.

Per bug #14641 from Denis Smirnov.  Back-patch to 9.5 where this
logic was introduced.

Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
2017-05-04 13:59:39 -04:00
Heikki Linnakangas
20bf7b2b0a Fix PQencryptPasswordConn to work with older server versions.
password_encryption was a boolean before version 10, so cope with "on" and
"off".

Also, change the behavior with "plain", to treat it the same as "md5".
We're discussing removing the password_encryption='plain' option from the
server altogether, which will make this the only reasonable choice, but
even if we kept it, it seems best to never send the password in cleartext.
2017-05-04 12:28:25 +03:00
Peter Eisentraut
0de791ed76 Fix cursor_to_xml in tableforest false mode
It only produced <row> elements but no wrapping <table> element.

By contrast, cursor_to_xmlschema produced a schema that is now correct
but did not previously match the XML data produced by cursor_to_xml.

In passing, also fix a minor misunderstanding about moving cursors in
the tests related to this.

Reported-by: filip@jirsak.org
Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
2017-05-03 21:41:10 -04:00
Tom Lane
4dd4104342 Remove useless and rather expensive stanza in matview regression test.
This removes a test case added by commit b69ec7cc9, which was intended
to exercise a corner case involving the rule used at that time that
materialized views were unpopulated iff they had physical size zero.
We got rid of that rule very shortly later, in commit 1d6c72a55, but
kept the test case.  However, because the case now asks what VACUUM
will do to a zero-sized physical file, it would be pretty surprising
if the answer were ever anything but "nothing" ... and if things were
indeed that broken, surely we'd find it out from other tests.  Since
the test involves a table that's fairly large by regression-test
standards (100K rows), it's quite slow to run.  Dropping it should
save some buildfarm cycles, so let's do that.

Discussion: https://postgr.es/m/32386.1493831320@sss.pgh.pa.us
2017-05-03 19:37:01 -04:00
Alvaro Herrera
a93077ef46 Add pg_dump tests for CREATE STATISTICS
CREATE STATISTICS pg_dump support code was not covered at all by
previous tests.

Discussion: https://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
2017-05-03 15:52:00 -03:00
Alvaro Herrera
698923d658 pg_dump/t/002: append terminating semicolon to SQL commands
It's easy to overlook the need for one, and its lack is annoying for the
next developer wanting to create a new test.  Rather than expect every
individual command to add the semicolon, just append one automatically.

Discussion: http://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
2017-05-03 15:12:09 -03:00
Heikki Linnakangas
8f8b9be51f Add PQencryptPasswordConn function to libpq, use it in psql and createuser.
The new function supports creating SCRAM verifiers, in addition to md5
hashes. The algorithm is chosen based on password_encryption, by default.

This fixes the issue reported by Jeff Janes, that there was previously
no way to create a SCRAM verifier with "\password".

Michael Paquier and me

Discussion: https://www.postgresql.org/message-id/CAMkU%3D1wfBgFPbfAMYZQE78p%3DVhZX7nN86aWkp0QcCp%3D%2BKxZ%3Dbg%40mail.gmail.com
2017-05-03 11:19:07 +03:00
Tom Lane
af2c5aa88d Improve performance of timezone loading, especially pg_timezone_names view.
tzparse() would attempt to load the "posixrules" timezone database file on
each call.  That might seem like it would only be an issue when selecting a
POSIX-style zone name rather than a zone defined in the timezone database,
but it turns out that each zone definition file contains a POSIX-style zone
string and tzload() will call tzparse() to parse that.  Thus, when scanning
the whole timezone file tree as we do in the pg_timezone_names view,
"posixrules" was read repetitively for each zone definition file.  Fix
that by caching the file on first use within any given process.  (We cache
other zone definitions for the life of the process, so there seems little
reason not to cache this one as well.)  This probably won't help much in
processes that never run pg_timezone_names, but even one additional SET
of the timezone GUC would come out ahead.

An even worse problem for pg_timezone_names is that pg_open_tzfile()
has an inefficient way of identifying the canonical case of a zone name:
it basically re-descends the directory tree to the zone file.  That's not
awful for an individual "SET timezone" operation, but it's pretty horrid
when we're inspecting every zone in the database.  And it's pointless too
because we already know the canonical spelling, having just read it from
the filesystem.  Fix by teaching pg_open_tzfile() to avoid the directory
search if it's not asked for the canonical name, and backfilling the
proper result in pg_tzenumerate_next().

In combination these changes seem to make the pg_timezone_names view
about 3x faster to read, for me.  Since a scan of pg_timezone_names
has up to now been one of the slowest queries in the regression tests,
this should help some little bit for buildfarm cycle times.

Back-patch to all supported branches, not so much because it's likely
that users will care much about the view's performance as because
tracking changes in the upstream IANA timezone code is really painful
if we don't keep all the branches in sync.

Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us
2017-05-02 21:50:35 -04:00
Tom Lane
23c6eb0336 Remove create_singleton_array(), hard-coding the case in its sole caller.
create_singleton_array() was not really as useful as we perhaps thought
when we added it.  It had never accreted more than one call site, and is
only saving a dozen lines of code at that one, which is considerably less
bulk than the function itself.  Moreover, because of its insistence on
using the caller's fn_extra cache space, it's arguably a coding hazard.
text_to_array_internal() does not currently use fn_extra in any other way,
but if it did it would be subtly broken, since the conflicting fn_extra
uses could be needed within a single query, in the seldom-tested case that
the field separator varies during the query.  The same objection seems
likely to apply to any other potential caller.

The replacement code is a bit uglier, because it hardwires knowledge of
the storage parameters of type TEXT, but it's not like we haven't got
dozens or hundreds of other places that do the same.  Uglier seems like
a good tradeoff for smaller, faster, and safer.

Per discussion with Neha Khatri.

Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
2017-05-02 20:41:37 -04:00
Tom Lane
9209e07605 Ensure commands in extension scripts see the results of preceding DDL.
Due to a missing CommandCounterIncrement() call, parsing of a non-utility
command in an extension script would not see the effects of the immediately
preceding DDL command, unless that command's execution ends with
CommandCounterIncrement() internally ... which some do but many don't.
Report by Philippe Beaudoin, diagnosis by Julien Rouhaud.

Rather remarkably, this bug has evaded detection since extensions were
invented, so back-patch to all supported branches.

Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
2017-05-02 18:06:09 -04:00
Alvaro Herrera
93bbeec6a2 extstats: change output functions to emit valid JSON
Manipulating extended statistics is more convenient as JSON than the
current ad-hoc format, so let's change before it's too late.

Discussion: https://postgr.es/m/20170420193828.k3fliiock5hdnehn@alvherre.pgsql
2017-05-02 18:49:32 -03:00
Robert Haas
0d1e1f0ea4 Fix typos in comments.
Etsuro Fujita

Discussion: http://postgr.es/m/00e88999-684d-d79a-70e4-908c937a0126@lab.ntt.co.jp
2017-05-02 14:47:46 -04:00
Peter Eisentraut
3d092fe540 Avoid unnecessary catalog updates in ALTER SEQUENCE
ALTER SEQUENCE can do nontransactional changes to the sequence (RESTART
clause) and transactional updates to the pg_sequence catalog (most other
clauses).  When just calling RESTART, the code would still needlessly do
a catalog update without any changes.  This would entangle that
operation in the concurrency issues of a catalog update (causing either
locking or concurrency errors, depending on how that issue is to be
resolved).

Fix by keeping track during options parsing whether a catalog update is
needed, and skip it if not.

Reported-by: Jason Petersen <jason@citusdata.com>
2017-05-02 10:41:48 -04:00
Andrew Dunstan
9a0d2008c3 Fix perl thinko in commit fed6df486d
Report and fix from Vaishnavi Prabakaran

Backpatch to 9.4 like original.
2017-05-02 08:20:11 -04:00
Magnus Hagander
34fc616738 Change hot_standby default value to 'on'
This goes together with the changes made to enable replication on the
sending side by default (wal_level, max_wal_senders etc) by making the
receiving stadby node also enable it by default.

Huong Dangminh
2017-05-02 11:12:30 +02:00
Peter Eisentraut
a99448ab45 Don't wake up logical replication launcher unnecessarily
In CREATE SUBSCRIPTION, only wake up the launcher when the subscription
is enabled.

Author: Fujii Masao <masao.fujii@gmail.com>
2017-05-01 22:50:32 -04:00
Tom Lane
54affb41e7 Improve function header comment for create_singleton_array().
Mentioning the caller is neither future-proof nor an adequate substitute
for giving an API specification.  Per gripe from Neha Khatri, though
I changed the patch around some.

Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
2017-05-01 15:31:41 -04:00
Tom Lane
92a43e4857 Reduce semijoins with unique inner relations to plain inner joins.
If the inner relation can be proven unique, that is it can have no more
than one matching row for any row of the outer query, then we might as
well implement the semijoin as a plain inner join, allowing substantially
more freedom to the planner.  This is a form of outer join strength
reduction, but it can't be implemented in reduce_outer_joins() because
we don't have enough info about the individual relations at that stage.
Instead do it much like remove_useless_joins(): once we've built base
relations, we can make another pass over the SpecialJoinInfo list and
get rid of any entries representing reducible semijoins.

This is essentially a followon to the inner-unique patch (commit 9c7f5229a)
and makes use of the proof machinery that that patch created.  We need only
minor refactoring of innerrel_is_unique's API to support this usage.

Per performance complaint from Teodor Sigaev.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:53:42 -04:00
Tom Lane
2057a58d16 Fix mis-optimization of semijoins with more than one LHS relation.
The inner-unique patch (commit 9c7f5229a) supposed that if we're
considering a JOIN_UNIQUE_INNER join path, we can always set inner_unique
for the join, because the inner path produced by create_unique_path should
be unique relative to the outer relation.  However, that's true only if
we're considering joining to the whole outer relation --- otherwise we may
be applying only some of the join quals, and so the inner path might be
non-unique from the perspective of this join.  Adjust the test to only
believe that we can set inner_unique if we have the whole semijoin LHS on
the outer side.

There is more that can be done in this area, but this commit is only
intended to provide the minimal fix needed to get correct plans.

Per report from Teodor Sigaev.  Thanks to David Rowley for preliminary
investigation.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:39:11 -04:00
Tom Lane
74a20d0ab7 Update time zone data files to tzdata release 2017b.
DST law changes in Chile, Haiti, and Mongolia.  Historical corrections for
Ecuador, Kazakhstan, Liberia, and Spain.

The IANA crew continue their campaign to replace invented time zone
abbrevations with numeric GMT offsets.  This update changes numerous zones
in South America, the Pacific and Indian oceans, and some Asian and Middle
Eastern zones.  I kept these abbreviations in the tznames/ data files,
however, so that we will still accept them for input.  (We may want to
start trimming those files someday, but I think we should wait for the
upstream dust to settle before deciding what to do.)

In passing, add MESZ (Mitteleuropaeische Sommerzeit) to the tznames lists;
since we accept MEZ (Mitteleuropaeische Zeit) it seems rather strange not
to take the other one.  And fix some incorrect, or at least obsolete,
comments that certain abbreviations are not traceable to the IANA data.
2017-05-01 11:53:11 -04:00
Robert Haas
bdac9836d3 libpq: Fix inadvertent change in .pgpass lookup behavior.
Commit 274bb2b385 caused password file
lookups to use the hostaddr in preference to the host, but that was
not intended and the documented behavior is the opposite.

Report and patch by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/20170428.165432.60857995.horiguchi.kyotaro@lab.ntt.co.jp
2017-05-01 11:29:00 -04:00
Andrew Dunstan
fed6df486d Allow vcregress.pl to run an arbitrary TAP test set
Currently only provision for running the bin checks in a single step is
provided for. Now these tests can be run individually, as well as tests
in other locations (e.g. src.test/recover).

Also provide for suppressing unnecessary temp installs by setting the
NO_TEMP_INSTALL environment variable just as the Makefiles do.

Backpatch to 9.4.
2017-05-01 10:57:51 -04:00
Peter Eisentraut
9414e41ea7 Fix logical replication launcher wake up and reset
After the logical replication launcher was told to wake up at
commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake
up was not reset, so it would be woken up at every following commit as
well.  So fix that by resetting the flag.

Also, we don't need to wake up anything if the transaction was rolled
back.  Just reset the flag in that case.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-01 10:18:09 -04:00
Robert Haas
e180c8aa8c Fire per-statement triggers on partitioned tables.
Even though no actual tuples are ever inserted into a partitioned
table (the actual tuples are in the partitions, not the partitioned
table itself), we still need to have a ResultRelInfo for the
partitioned table, or per-statement triggers won't get fired.

Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.

Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
2017-05-01 08:23:01 -04:00
Tom Lane
e18b2c480d Sync our copy of the timezone library with IANA release tzcode2017b.
zic no longer mishandles some transitions in January 2038 when it
attempts to work around Qt bug 53071.  This fixes a bug affecting
Pacific/Tongatapu that was introduced in zic 2016e.  localtime.c
now contains a workaround, useful when loading a file generated by
a buggy zic.

There are assorted cosmetic changes as well, notably relocation
of a bunch of #defines.
2017-04-30 15:13:51 -04:00
Tom Lane
12d11432b4 Fix possible null pointer dereference or invalid warning message.
Thinko in commit de4389712: this warning message references the wrong
"LogicalRepWorker *" variable.  This would often result in a core dump,
but if it didn't, the message would show the wrong subscription OID.

In passing, adjust the message text to format a subscription OID
similarly to how that's done elsewhere in the function; and fix
grammatical issues in some nearby messages.

Per Coverity testing.
2017-04-30 12:21:02 -04:00
Tom Lane
c23844212d Micro-optimize some slower queries in the opr_sanity regression test.
Convert the binary_coercible() and physically_coercible() functions from
SQL to plpgsql.  It's not that plpgsql is inherently better at doing
queries; if you simply convert the previous single SQL query into one
RETURN expression, it's no faster.  The problem with the existing code
is that it fools the plancache into deciding that it's worth re-planning
the query every time, since constant-folding with a concrete value for $2
allows elimination of at least one sub-SELECT.  In reality that's using the
planner to do the equivalent of a few runtime boolean tests, causing the
function to run much slower than it should.  Splitting the AND/OR logic
into separate plpgsql statements allows each if-expression to acquire a
static plan.

Also, get rid of some uses of obj_description() in favor of explicitly
joining to pg_description, allowing the joins to be optimized better.
(Someday we might improve the SQL-function-inlining logic enough that
this happens automatically, but today is not that day.)

Together, these changes reduce the runtime of the opr_sanity regression
test by about a factor of two on one of my slower machines.  They don't
seem to help as much on a fast machine, but this should at least benefit
the buildfarm.
2017-04-29 20:14:52 -04:00
Robert Haas
6a4dda44e0 Fix VALIDATE CONSTRAINT to consider NO INHERIT attribute.
Currently, trying to validate a NO INHERIT constraint on the parent will
search for the constraint in child tables (where it is not supposed to
exist), wrongly causing a "constraint does not exist" error.

Amit Langote, per a report from Hans Buschmann.

Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
2017-04-28 14:48:38 -04:00
Peter Eisentraut
e4fddfd492 psql: Support identity columns in sequence display
Where the footer for an owned serial sequence would say "Owned by", put
something analogous for a sequence belonging to an identity column.

Reported-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
2017-04-28 14:43:36 -04:00
Robert Haas
5e1ccd4844 In load_relcache_init_file, initialize rd_pdcxt.
Oversight noted by Gao Zeng Qi.

Discussion: http://postgr.es/m/CAFmBtr1N3-SbepJbnGpaYp=jw-FvWMnYY7-bTtRgvjvbyB8YJA@mail.gmail.com
2017-04-28 14:05:13 -04:00
Robert Haas
c1e0e7e1d7 Speed up dropping tables with many partitions.
We need to lock the parent, but we don't need a relcache entry
for it.

Gao Zeng Qi, reviewed by Amit Langote

Discussion: http://postgr.es/m/CAFmBtr0ukqJjRJEhPWL5wt4rNMrJUUxggVAGXPR3SyYh3E+HDQ@mail.gmail.com
2017-04-28 14:02:24 -04:00
Robert Haas
504c2205ab Fix crash when partitioned column specified twice.
Amit Langote, reviewed by Beena Emerson

Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
2017-04-28 13:52:17 -04:00
Peter Eisentraut
e3cf708016 Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval (currently 5s by default).  This avoids
restarting failing workers in a tight loop.

We keep the last start times in a hash table last_start_times that is
separate from the table_states list, because that list is cleared out on
syscache invalidation, which happens whenever a table finishes syncing.
The hash table is kept until all tables have finished syncing.

A future project might be to unify these two and keep everything in one
data structure, but for now this is a less invasive change to accomplish
the original purpose.

For the test suite, set wal_retrieve_retry_interval to its minimum
value, to not increase the test suite run time.

Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
2017-04-28 13:47:46 -04:00
Heikki Linnakangas
d981074c24 Misc SCRAM code cleanups.
* Move computation of SaltedPassword to a separate function from
  scram_ClientOrServerKey(). This saves a lot of cycles in libpq, by
  computing SaltedPassword only once per authentication. (Computing
  SaltedPassword is expensive by design.)

* Split scram_ClientOrServerKey() into two functions. Improves
  readability, by making the calling code less verbose.

* Rename "server proof" to "server signature", to better match the
  nomenclature used in RFC 5802.

* Rename SCRAM_SALT_LEN to SCRAM_DEFAULT_SALT_LEN, to make it more clear
  that the salt can be of any length, and the constant only specifies how
  long a salt we use when we generate a new verifier. Also rename
  SCRAM_ITERATIONS_DEFAULT to SCRAM_DEFAULT_ITERATIONS, for consistency.

These things caught my eye while working on other upcoming changes.
2017-04-28 15:22:38 +03:00
Stephen Frost
b9a3ef55b2 Remove unnecessairly duplicated gram.y productions
Declarative partitioning duplicated the TypedTableElement productions,
evidently to remove the need to specify WITH OPTIONS when creating
partitions.  Instead, simply make WITH OPTIONS optional in the
TypedTableElement production and remove all of the duplicate
PartitionElement-related productions.  This change simplifies the
syntax and makes WITH OPTIONS optional when adding defaults, constraints
or storage parameters to columns when creating either typed tables or
partitions.

Also update pg_dump to no longer include WITH OPTIONS, since it's not
necessary, and update the documentation to reflect that WITH OPTIONS is
now optional.
2017-04-27 20:14:39 -04:00
Andres Freund
ab9c43381e Don't build full initial logical decoding snapshot if NOEXPORT_SNAPSHOT.
Earlier commits (56e19d938d and 2bef06d516) make it cheaper to
create a logical slot if not exporting the initial snapshot.  If
NOEXPORT_SNAPSHOT is specified, we can skip the overhead, not just
when creating a slot via sql (which can't export snapshots).  As
NOEXPORT_SNAPSHOT has only recently been introduced, this shouldn't be
backpatched.
2017-04-27 15:52:31 -07:00
Andres Freund
56e19d938d Don't use on-disk snapshots for exported logical decoding snapshot.
Logical decoding stores historical snapshots on disk, so that logical
decoding can restart without having to reconstruct a snapshot from
scratch (for which the resources are not guaranteed to be present
anymore).  These serialized snapshots were also used when creating a
new slot via the walsender interface, which can export a "full"
snapshot (i.e. one that can read all tables, not just catalog ones).

The problem is that the serialized snapshots are only useful for
catalogs and not for normal user tables.  Thus the use of such a
serialized snapshot could result in an inconsistent snapshot being
exported, which could lead to queries returning wrong data.  This
would only happen if logical slots are created while another logical
slot already exists.

Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 15:29:15 -07:00
Tom Lane
7834d20b57 Avoid slow shutdown of pg_basebackup.
pg_basebackup's child process did not pay any attention to the pipe
from its parent while waiting for input from the source server.
If no server data was arriving, it would only wake up and check the
pipe every standby_message_timeout or so.  This creates a problem
since the parent process might determine and send the desired stop
position only after the server has reached end-of-WAL and stopped
sending data.  In the src/test/recovery regression tests, the timing
is repeatably such that it takes nearly 10 seconds for the child
process to realize that it should shut down.  It's not clear how
often that would happen in real-world cases, but it sure seems like
a bug --- and if the user turns off standby_message_timeout or sets
it very large, the delay could be a lot worse.

To fix, expand the StreamCtl API to allow the pipe input FD to be
passed down to the low-level wait routine, and watch both sockets
when sleeping.

(Note: AFAICS this issue doesn't affect the Windows port, since
it doesn't rely on a pipe to transfer the stop position to the
child thread.)

Discussion: https://postgr.es/m/6456.1493263884@sss.pgh.pa.us
2017-04-27 18:27:02 -04:00
Fujii Masao
9f11fcec66 Fix bug so logical rep launcher saves correctly time of last startup of worker.
Previously the logical replication launcher stored the last timestamp
when it started the worker, in the local variable "last_start_time",
in order to check whether wal_retrive_retry_interval elapsed since
the last startup of worker. If it has elapsed, the launcher sees
pg_subscription and starts new worker if necessary. This is for
limitting the startup of worker to once a wal_retrieve_retry_interval.

The bug was that the variable "last_start_time" was defined and
always initialized with 0 at the beginning of the launcher's main loop.
So even if it's set to the last timestamp in later phase of the loop,
it's always reset to 0. Therefore the launcher could not check
correctly whether wal_retrieve_retry_interval elapsed since
the last startup.

This patch moves the variable "last_start_time" outside the main loop
so that it will not be reset.

Reviewed-by: Petr Jelinek
Discussion: http://postgr.es/m/CAHGQGwGJrPO++XM4mFENAwpy1eGXKsGdguYv43GUgLgU-x8nTQ@mail.gmail.com
2017-04-28 06:35:00 +09:00
Tom Lane
82ebbeb0ab Cope with glibc too old to have epoll_create1().
Commit fa31b6f4e supposed that we didn't have to worry about that
anymore, but it seems that RHEL5 is like that, and that's still
a supported platform.  Put back the prior coding under an #ifdef,
adding an explicit fcntl() to retain the desired CLOEXEC property.

Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us
2017-04-27 17:13:53 -04:00
Andres Freund
2bef06d516 Preserve required !catalog tuples while computing initial decoding snapshot.
The logical decoding machinery already preserved all the required
catalog tuples, which is sufficient in the course of normal logical
decoding, but did not guarantee that non-catalog tuples were preserved
during computation of the initial snapshot when creating a slot over
the replication protocol.

This could cause a corrupted initial snapshot being exported.  The
time window for issues is usually not terribly large, but on a busy
server it's perfectly possible to it hit it.  Ongoing decoding is not
affected by this bug.

To avoid increased overhead for the SQL API, only retain additional
tuples when a logical slot is being created over the replication
protocol.  To do so this commit changes the signature of
CreateInitDecodingContext(), but it seems unlikely that it's being
used in an extension, so that's probably ok.

In a drive-by fix, fix handling of
ReplicationSlotsComputeRequiredXmin's already_locked argument, which
should only apply to ProcArrayLock, not ReplicationSlotControlLock.

Reported-By: Erik Rijkers
Analyzed-By: Petr Jelinek
Author: Petr Jelinek, heavily editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 13:13:36 -07:00
Tom Lane
fa31b6f4e9 Make latch.c more paranoid about child-process cases.
Although the postmaster doesn't currently create a self-pipe or any
latches, there's discussion of it doing so in future.  It's also
conceivable that a shared_preload_libraries extension would try to
create such a thing in the postmaster process today.  In that case
the self-pipe FDs would be inherited by forked child processes.
latch.c was entirely unprepared for such a case and could suffer an
assertion failure, or worse try to use the inherited pipe if somebody
called WaitLatch without having called InitializeLatchSupport in that
process.  Make it keep track of whether InitializeLatchSupport has been
called in the *current* process, and do the right thing if state has
been inherited from a parent.

Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe,
as well as epoll event sets).  This ensures that child processes spawned
in backends, the archiver, etc cannot accidentally or intentionally mess
with these FDs.  It also ensures that we end up with the right state
for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't
know to close the postmaster's self-pipe FDs.

Back-patch to 9.6, mainly to keep latch.c looking similar in all branches
it exists in.

Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us
2017-04-27 15:07:36 -04:00
Simon Riggs
49e9281549 Rework handling of subtransactions in 2PC recovery
The bug fixed by 0874d4f3e1
caused us to question and rework the handling of
subtransactions in 2PC during and at end of recovery.
Patch adds checks and tests to ensure no further bugs.

This effectively removes the temporary measure put in place
by 546c13e11b.

Author: Simon Riggs
Reviewed-by: Tom Lane, Michael Paquier
Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com
2017-04-27 14:41:22 +02:00
Simon Riggs
0352c15e5a Additional tests for subtransactions in recovery
Tests for normal and prepared transactions

Author: Nikhil Sontakke, placed in new test file by me
2017-04-27 14:26:57 +02:00
Peter Eisentraut
6c9bd27aec Fix typo in comment
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-04-26 21:13:01 -04:00
Tom Lane
aa1351f1ee Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long.  However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once.  On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives.  (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.

As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.

There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).

Backpatch to 9.6 where parallel query was added.  The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.

Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 16:17:34 -04:00
Stephen Frost
0c76c2463e pg_get_partkeydef: return NULL for non-partitions
Our general rule for pg_get_X(oid) functions is to simply return NULL
when passed an invalid or inappropriate OID.  Teach pg_get_partkeydef to
do this also, making it easier for users to use this function when
querying against tables with both partitions and non-partitions (such as
pg_class).

As a concrete example, this makes pg_dump's life a little easier.

Author: Amit Langote
2017-04-26 14:59:22 -04:00
Tom Lane
49da00677d Silence compiler warning induced by commit de4389712.
Smarter compilers can see that "slot" can't be used uninitialized,
but some popular ones cannot.  Noted by Jeff Janes.
2017-04-26 14:01:26 -04:00
Peter Eisentraut
61ecc90be6 Fix query that gets remote relation info
Publisher relation can be incorrectly chosen, if there are more than
one relation in different schemas with the same name.

Author: Euler Taveira <euler@timbira.com.br>
2017-04-26 12:07:22 -04:00
Peter Eisentraut
e495c1683f Spelling fixes in code comments
Author: Euler Taveira <euler@timbira.com.br>
2017-04-26 12:07:11 -04:00
Fujii Masao
1f8b060121 Fix typo in comment.
Author: Masahiko Sawada
2017-04-27 00:03:07 +09:00
Peter Eisentraut
de43897122 Fix various concurrency issues in logical replication worker launching
The code was originally written with assumption that launcher is the
only process starting the worker.  However that hasn't been true since
commit 7c4f52409 which failed to modify the worker management code
adequately.

This patch adds an in_use field to the LogicalRepWorker struct to
indicate whether the worker slot is being used and uses proper locking
everywhere this flag is set or read.

However if the parent process dies while the new worker is starting and
the new worker fails to attach to shared memory, this flag would never
get cleared.  We solve this rare corner case by adding a sort of garbage
collector for in_use slots.  This uses another field in the
LogicalRepWorker struct named launch_time that contains the time when
the worker was started.  If any request to start a new worker does not
find free slot, we'll check for workers that were supposed to start but
took too long to actually do so, and reuse their slot.

In passing also fix possible race conditions when stopping a worker that
hasn't finished starting yet.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-04-26 10:45:59 -04:00
Stephen Frost
9139aa1942 Allow ALTER TABLE ONLY on partitioned tables
There is no need to forbid ALTER TABLE ONLY on partitioned tables,
when no partitions exist yet.  This can be handy for users who are
building up their partitioned table independently and will create actual
partitions later.

In addition, this is how pg_dump likes to operate in certain instances.

Author: Amit Langote, with some error message word-smithing by me
2017-04-25 16:57:43 -04:00
Peter Eisentraut
a3f17b9c31 Wake up launcher when enabling a subscription
Otherwise one would have to wait up to DEFAULT_NAPTIME_PER_CYCLE until
the subscription worker is considered for starting.

There is a small race condition:  If one enables a subscription right
after disabling it, the launcher might not have registered the stopping
when receiving the wakeup signal for the re-enabling.  The start will
then not happen right away but after the full cycle time.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
2017-04-25 14:40:33 -04:00
Fujii Masao
346199dcab Set the priorities of all quorum synchronous standbys to 1.
In quorum-based synchronous replication, all the standbys listed in
synchronous_standby_names equally have chances to be chosen
as synchronous standbys. So they should have the same priority.
However, previously, quorum standbys whose names appear earlier
in the list were given higher priority values though the difference of
those priority values didn't affect the selection of synchronous standbys.
Users could see those "meaningless" priority values in pg_stat_replication
and this was confusing.

This commit gives all the quorum synchronous standbys the same
highest priority, i.e., 1, in order to remove such confusion.

Author: Fujii Masao
Reviewed-by: Masahiko Sawada, Kyotaro Horiguchi
Discussion: http://postgr.es/m/CAHGQGwEKOw=SmPLxJzkBsH6wwDBgOnVz46QjHbtsiZ-d-2RGUg@mail.gmail.com
2017-04-26 01:07:13 +09:00
Robert Haas
914ae8d3cb Adjust outdated comment.
Commit 5dfc198146 removed the only
existing caller of hash_freeze, but left behind a comment indicating
that hash_freeze was still used.  Adjust.

Kyotaro Horiguchi

Discussion: http://postgr.es/m/20170424.165541.230634914.horiguchi.kyotaro@lab.ntt.co.jp
2017-04-25 10:58:45 -04:00
Fujii Masao
7cc14ae9d8 Update copyright in recently added files.
This commit also fixes copyright line missed by the automated script.

Author: Masahiko Sawada
2017-04-25 23:38:41 +09:00
Tom Lane
64925603c9 Revert "Use pselect(2) not select(2), if available, to wait in postmaster's loop."
This reverts commit 81069a9efc.

Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional.  Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble).  If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.

Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us
2017-04-24 18:29:03 -04:00
Tom Lane
81069a9efc Use pselect(2) not select(2), if available, to wait in postmaster's loop.
Traditionally we've unblocked signals, called select(2), and then blocked
signals again.  The code expects that the select() will be cancelled with
EINTR if an interrupt occurs; but there's a race condition, which is that
an already-pending signal will be delivered as soon as we unblock, and then
when we reach select() there will be nothing preventing it from waiting.
This can result in a long delay before we perform any action that
ServerLoop was supposed to have taken in response to the signal.  As with
the somewhat-similar symptoms fixed by commit 893902085, the main practical
problem is slow launching of parallel workers.  The window for trouble is
usually pretty short, corresponding to one iteration of ServerLoop; but
it's not negligible.

To fix, use pselect(2) in place of select(2) where available, as that's
designed to solve exactly this problem.  Where not available, we continue
to use the old way, and are no worse off than before.

pselect(2) has been required by POSIX since about 2001, so most modern
platforms should have it.  A bigger portability issue is that some
implementations are said to be non-atomic, ie pselect() isn't really
any different from unblock/select/reblock.  Still, we're no worse off
than before on such a platform.

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point this
could be reverted, since we'd be using a self-pipe to solve the race
condition.  But that's not happening before v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 14:03:14 -04:00
Tom Lane
8939020853 Run the postmaster's signal handlers without SA_RESTART.
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop().  The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed.  However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2).  Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.

Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.

We can fix this by running the postmaster's signal handlers without
SA_RESTART.  That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough.  (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch.  But that's not happening before
v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 13:00:30 -04:00
Fujii Masao
cbc2270e3f Get rid of extern declarations of non-existent functions.
Those extern declartions were mistakenly added by commit 7c4f52409.

Author: Petr Jelinek
2017-04-25 01:31:42 +09:00
Tom Lane
4fe04244b5 Fix postmaster's handling of fork failure for a bgworker process.
This corner case didn't behave nicely at all: the postmaster would
(partially) update its state as though the process had started
successfully, and be quite confused thereafter.  Fix it to act
like the worker had crashed, instead.

In passing, refactor so that do_start_bgworker contains all the
state-change logic for bgworker launch, rather than just some of it.

Back-patch as far as 9.4.  9.3 contains similar logic, but it's just
enough different that I don't feel comfortable applying the patch
without more study; and the use of bgworkers in 9.3 was so small
that it doesn't seem worth the extra work.

transam/parallel.c is still entirely unprepared for the possibility
of bgworker startup failure, but that seems like material for a
separate patch.

Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us
2017-04-24 12:16:58 -04:00
Tom Lane
4b34624daa Code review for commands/statscmds.c.
Fix machine-dependent sorting of column numbers.  (Odd behavior
would only materialize for column numbers above 255, but that's
certainly legal.)

Fix poor choice of SQLSTATE for some errors, and improve error message
wording.  (Notably, "is not a scalar type" is a totally misleading way
to explain "does not have a default btree opclass".)

Avoid taking AccessExclusiveLock on the associated relation during DROP
STATISTICS.  That's neither necessary nor desirable, and it could easily
have put us into situations where DROP fails (compare commit 68ea2b7f9).

Adjust/improve comments.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAKJS1f-GmCfPvBbAEaM5xoVOaYdVgVN1gicALSoYQ77z-+vLbw@mail.gmail.com
2017-04-24 11:15:15 -04:00
Andres Freund
b182a4ae2f Don't include sys/poll.h anymore.
poll.h is mandated by Single Unix Spec v2, the usual baseline for
postgres on unix.  None of the unixoid buildfarms animals has
sys/poll.h but not poll.h.  Therefore there's not much point to test
for sys/poll.h's existence and include it optionally.

Author: Andres Freund, per suggestion from Tom Lane
Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us
2017-04-23 16:11:35 -07:00
Andres Freund
eb97aa7e65 Zero padding in replication origin's checkpointed on disk-state.
This seems to be largely cosmetic, avoiding valgrind bleats and the
like. The uninitialized padding influences the CRC of the on-disk
entry, but because it's also used when verifying the CRC, that doesn't
cause spurious failures.  Backpatch nonetheless.

It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql
doesn't exercise the checkpoint path, but checkpoints are fairly
expensive on weaker machines, and we'd have to stop/start for that to
be meaningful.

Author: Andres Freund
Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
Backpatch: 9.5, where replication origins were introduced
2017-04-23 15:54:41 -07:00
Andres Freund
e84d243b1c Initialize all memory for logical replication relation cache.
As reported by buildfarm animal skink / valgrind, some of the
variables weren't always initialized.  To avoid further mishaps use
memset to ensure the entire entry is initialized.

Author: Petr Jelinek
Reported-By: Andres Freund
Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
Backpatch: none, code new in master
2017-04-23 15:54:41 -07:00
Andres Freund
61c21ddad0 Remove select(2) backed latch implementation.
poll(2) is required by Single Unix Spec v2, the usual baseline for
postgres (leaving windows aside).  There's not been any buildfarm
animals without poll(2) for a long while, leaving the select(2)
implementation to be largely untested.

On windows, including mingw, poll() is not available, but we have a
special case implementation for windows anyway.

Author: Andres Freund
Discussion: https://postgr.es/m/20170420003611.7r2sdvehesdyiz2i@alap3.anarazel.de
2017-04-23 15:31:41 -07:00
Simon Riggs
546c13e11b Workaround for RecoverPreparedTransactions()
Force overwriteOK = true while we investigate deeper fix

Proposed by Tom Lane as temporary measure, accepted by me
2017-04-23 22:12:01 +01:00
Simon Riggs
8463880872 Fix LagTrackerRead() for timeline increments
Bug was masked by error in running 004_timeline_switch.pl that was
fixed recently in 7d68f2281a.

Detective work by Alvaro Herrera and Tom Lane

Author: Thomas Munro
2017-04-23 21:35:41 +01:00
Tom Lane
0874d4f3e1 Fix order of arguments to SubTransSetParent().
ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions)
mixed up the parent and child XIDs when calling SubTransSetParent to
record the transactions' relationship in pg_subtrans.

Remarkably, analysis by Simon Riggs suggests that this doesn't lead to
visible problems (at least, not in non-Assert builds).  That might
explain why we'd not noticed it before.  Nonetheless, it's surely wrong.

This code was born broken, so back-patch to all supported branches.

Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
2017-04-23 13:11:06 -04:00
Andrew Dunstan
33f3bbc6d3 Fix TAP infrastructure to support Mingw better
archive_command and restore_command need to refer to Windows paths, not
Msys virtual file system paths, as postgres is completely unaware of the
latter, so prefix them with the Windows path to the virtual file system
root. Clean psql and pg_recvlogical output of carriage returns.
2017-04-23 09:21:38 -04:00
Tom Lane
7d68f2281a Make PostgresNode.pm check server status more carefully.
PostgresNode blithely ignored the exit status of pg_ctl, and in general
made no effort to be sure that the server was running when it should be.
This caused it to miss server crashes, which is a serious shortcoming
in a test scaffold.  Make it complain if pg_ctl fails, and modify the
start and stop logic to complain if the server doesn't start, or doesn't
stop, when expected.

Also, have it turn off the "restart_after_crash" configuration parameter
in created clusters, as bitter experience has shown that leaving that on
can mask crashes too.

We might at some point need variant functions that allow for, eg,
server start failure to be expected.  But no existing test case appears
to want that, and it surely shouldn't be the default behavior.

Note that this *will* break the buildfarm, as it will expose known
bugs that the previous testing failed to.  I'm committing it despite
that, to verify that we get the expected failures in the buildfarm
not just in manual testing.

Back-patch into 9.6 where PostgresNode was introduced.  (The 9.6
branch is not expected to show any failures.)

Discussion: https://postgr.es/m/21432.1492886428@sss.pgh.pa.us
2017-04-22 18:18:25 -04:00
Tom Lane
8a19c1a373 Make PostgresNode::append_conf append a newline automatically.
Although the documentation for append_conf said clearly that it didn't
add a newline, many test authors seem to have forgotten that ... or maybe
they just consulted the example at the top of the POD documentation,
which clearly shows adding a config entry without bothering to add a
trailing newline.  The worst part of that is that it works, as long as
you don't do it more than once, since the backend isn't picky about
whether config files end with newlines.  So there's not a strong forcing
function reminding test authors not to do it like that.  Upshot is that
this is a terribly fragile way to go about things, and there's at least
one existing test case that is demonstrably broken and not testing what
it thinks it is.

Let's just make append_conf append a newline, instead; that is clearly
way safer than the old definition.

I also cleaned up a few call sites that were unnecessarily ugly.
(I left things alone in places where it's plausible that additional
config lines would need to be added someday.)

Back-patch the change in append_conf itself to 9.6 where it was added,
as having a definitional inconsistency between branches would obviously
be pretty hazardous for back-patching TAP tests.  The other changes are
just cosmetic and don't need to be back-patched.

Discussion: https://postgr.es/m/19751.1492892376@sss.pgh.pa.us
2017-04-22 16:58:15 -04:00
Andrew Dunstan
f92562adba Require sufficiently modern version of Test::More for TAP tests
Ancient versions of Test::More don't support the note() function used in
some TAP tests, so we require the minimum version of the module that
does.
2017-04-22 10:04:01 -04:00
Tom Lane
5041cdf2b7 Partially revert commit 536d47bd9d.
Per buildfarm, the "#ifdef F_SETFD" removed in that commit actually
is needed on Windows, because fcntl() isn't available at all on that
platform, unless using Cygwin.  We could perhaps spell it more like
"#ifdef HAVE_FCNTL", or "#ifndef WIN32", but it's not clear that
those choices are better.

It does seem that we don't need the bogus manual definition of
FD_CLOEXEC, though, so keep that change.

Discussion: https://postgr.es/m/26254.1492805635@sss.pgh.pa.us
2017-04-22 02:06:16 -04:00
Tom Lane
3e51725b38 Avoid depending on non-POSIX behavior of fcntl(2).
The POSIX standard does not say that the success return value for
fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1.
We had several calls that were making the stronger assumption.  Adjust
them to test specifically for -1 for strict spec compliance.

The standard further leaves open the possibility that the O_NONBLOCK
flag bit is not the only active one in F_SETFL's argument.  Formally,
therefore, one ought to get the current flags with F_GETFL and store
them back with only the O_NONBLOCK bit changed when trying to change
the nonblock state.  In port/noblock.c, we were doing the full pushup
in pg_set_block but not in pg_set_noblock, which is just weird.  Make
both of them do it properly, since they have little business making
any assumptions about the socket they're handed.  The other places
where we're issuing F_SETFL are working with FDs we just got from
pipe(2), so it's reasonable to assume the FDs' properties are all
default, so I didn't bother adding F_GETFL steps there.

Also, while pg_set_block deserves some points for trying to do things
right, somebody had decided that it'd be even better to cast fcntl's
third argument to "long".  Which is completely loony, because POSIX
clearly says the third argument for an F_SETFL call is "int".

Given the lack of field complaints, these missteps apparently are not
of significance on any common platforms.  But they're still wrong,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
2017-04-21 15:56:16 -04:00
Heikki Linnakangas
68e61ee72e Change the on-disk format of SCRAM verifiers to conform to RFC 5803.
It doesn't make any immediate difference to PostgreSQL, but might as well
follow the standard, since one exists. (I looked at RFC 5803 earlier, but
didn't fully understand it back then.)

The new format uses Base64 instead of hex to encode StoredKey and
ServerKey, which makes the verifiers slightly smaller. Using the same
encoding for the salt and the keys also means that you only need one
encoder/decoder instead of two. Although we have code in the backend to
do both, we are talking about teaching libpq how to create SCRAM verifiers
for PQencodePassword(), and libpq doesn't currently have any code for hex
encoding.

Bump catversion, because this renders any existing SCRAM verifiers in
pg_authid invalid.

Discussion: https://www.postgresql.org/message-id/351ba574-85ea-d9b8-9689-8c928dd0955d@iki.fi
2017-04-21 22:51:57 +03:00
Tom Lane
536d47bd9d Remove long-obsolete catering for platforms without F_SETFD/FD_CLOEXEC.
SUSv2 mandates that <fcntl.h> provide both F_SETFD and FD_CLOEXEC,
so it seems pretty unlikely that any platforms remain without those.
Remove the #ifdef-ery installed by commit 7627b91cd to see if the
buildfarm agrees.

Discussion: https://postgr.es/m/21444.1492798101@sss.pgh.pa.us
2017-04-21 14:48:29 -04:00
Peter Eisentraut
dcb39c37c1 Synchronize table list before creating slot in CREATE SUBSCRIPTION
This way a failure to synchronize the table list will not leave an
unused slot on the publisher.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-04-21 08:37:03 -04:00