Commit Graph

30356 Commits

Author SHA1 Message Date
Peter Eisentraut 073ce405d6 Fix table syncing with different column order
Logical replication supports replicating between tables with different
column order.  But this failed for the initial table sync because of a
logic error in how the column list for the internal COPY command was
composed.  Fix that and also add a test.

Also fix a minor omission in the column name mapping cache.  When
creating the mapping list, it would not skip locally dropped columns.
So if a remote column had the same name as a locally dropped
column (...pg.dropped...), then the expected error would not occur.
2017-05-24 19:40:30 -04:00
Peter Eisentraut 92ecb148e5 Improve logical replication worker log messages
Reduce some redundant messages to DEBUG1.  Be clearer about the
distinction between apply workers and table synchronization workers.
Add subscription and table name where possible.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
2017-05-24 18:57:56 -04:00
Robert Haas 85c2b9a15a Code review of get_qual_for_list.
We need not consider the case where both nulltest1 and nulltest2 are
NULL; the partition either accepts nulls or it does not.

Jeevan Ladhe.  I added an assertion.
2017-05-24 16:45:58 -04:00
Tom Lane 9ae2661fe1 Tighten checks for whitespace in functions that parse identifiers etc.
This patch replaces isspace() calls with scanner_isspace() in functions
that are likely to be presented with non-ASCII input.  isspace() has
the small advantage that it will correctly recognize no-break space
in single-byte encodings (such as LATIN1); but it cannot work successfully
for any multibyte character, and depending on platform it might return
false positive results for some fragments of multibyte characters.  That's
disastrous for functions that are trying to discard whitespace between
valid strings, as noted in bug #14662 from Justin Muise.  Even treating
no-break space as whitespace is pretty questionable for the usages touched
here, because the core scanner would think it is an identifier character.

Affected functions are parse_ident(), parseNameAndArgTypes (underlying
regprocedurein() and siblings), SplitIdentifierString (used for parsing
GUCs and options that are qualified names or lists of names), and
SplitDirectoriesString (used for parsing GUCs that are lists of
directories).

All the functions adjusted here are parsing SQL identifiers and similar
constructs, so it's reasonable to insist that their definition of
whitespace match the core scanner.  So we can hope that this won't cause
many backwards-compatibility problems.  I've left alone isspace() calls
in places that aren't really expecting any non-ASCII input characters,
such as float8in().

Back-patch to all supported branches.

Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
2017-05-24 15:28:34 -04:00
Magnus Hagander f61bd73993 Update URLs in pgindent source and README
Website and buildfarm is https, not http, and the ftp protocol will be
shut down shortly.
2017-05-23 13:58:11 -04:00
Heikki Linnakangas 1c9b6e818f Verify that the server constructed the SCRAM nonce correctly.
The nonce consists of client and server nonces concatenated together. The
client checks the nonce contained the client nonce, but it would get fooled
if the server sent a truncated or even empty nonce.

Reported by Steven Fackler to security@postgresql.org. Neither me or Steven
are sure what harm a malicious server could do with this, but let's fix it.
2017-05-23 05:55:19 -04:00
Michael Meskes d951db2eff Synced ecpg's pg_type.h with the one used in the backend.
Patch by Vinayak Pokale.
2017-05-23 09:48:51 +02:00
Magnus Hagander 312bac54cc Fix typo in comment
Author: Masahiko Sawada
2017-05-22 09:10:02 +02:00
Tom Lane d761fe2182 Fix precision and rounding issues in money multiplication and division.
The cash_div_intX functions applied rint() to the result of the division.
That's not merely useless (because the result is already an integer) but
it causes precision loss for values larger than 2^52 or so, because of
the forced conversion to float8.

On the other hand, the cash_mul_fltX functions neglected to apply rint() to
their multiplication results, thus possibly causing off-by-one outputs.

Per C standard, arithmetic between any integral value and a float value is
performed in float format.  Thus, cash_mul_flt4 and cash_div_flt4 produced
answers good to only about six digits, even when the float value is exact.
We can improve matters noticeably by widening the float inputs to double.
(It's tempting to consider using "long double" arithmetic if available,
but that's probably too much of a stretch for a back-patched fix.)

Also, document that cash_div_intX operators truncate rather than round.

Per bug #14663 from Richard Pistole.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us
2017-05-21 13:05:16 -04:00
Tom Lane 5c837ddd70 Rethink flex flags for syncrep_scanner.l.
Using flex's -i switch to achieve case-insensitivity is not a very safe
practice, because the scanner's behavior may then depend on the locale
that flex was invoked in.  In the particular example at hand, that's
not academic: the possible matches for "FIRST" will be different in a
Turkish locale than elsewhere.  Do it the hard way instead, as our
other scanners do.

Also, drop use of -b -CF -p, because this scanner is only used when
parsing the contents of a GUC variable.  That's not done often, and
the amount of text to be parsed can be expected to be trivial, so
prioritizing scanner speed over code size seems like quite the wrong
tradeoff.  Using flex's default optimization options reduces the
size of syncrep_gram.o by more than 50%.

The case-insensitivity problem is new in HEAD (cf commit 3901fd70c).
The poor choice of optimization flags exists also in 9.6, but it doesn't
seem important enough to back-patch.

Discussion: https://postgr.es/m/24403.1495225931@sss.pgh.pa.us
2017-05-19 18:05:20 -04:00
Robert Haas a95410e2ec pg_upgrade: Handle hash index upgrades more smoothly.
Mark any old hash indexes as invalid so that they don't get used, and
create a script to run REINDEX on all of them.  Without this, we'd
still try to use any upgraded hash indexes, but it would fail.

Amit Kapila, reviewed by me.  Per a suggestion from Tom Lane.

Discussion: http://postgr.es/m/CAA4eK1Jidtagm7Q81q-WoegOVgkotv0OxvHOjFxcvFRP4X=mSw@mail.gmail.com
2017-05-19 16:49:38 -04:00
Peter Eisentraut e807d8b163 Fix mistake in error message
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Author: Dilip Kumar <dilipbalaut@gmail.com>
2017-05-19 16:30:02 -04:00
Robert Haas 5f374fe7a8 libpq: Try next host if one of them times out.
If one host in a multi-host connection string times out, move on to
the next specified host instead of giving up entirely.

Takayuki Tsunakawa, reviewed by Michael Paquier.  I added
a minor adjustment to the documentation.

Discussion: http://postgr.es/m/0A3221C70F24FB45833433255569204D1F6F42F5@G01JPEXMBYT05
2017-05-19 16:19:51 -04:00
Robert Haas aa41bc794c Capitalize SHOW when testing whether target_session_attrs=read-write.
This makes it also work for replication connections.

Report and patch by Daisuke Higuchi.

Discussion: http://postgr.es/m/1803D792815FC24D871C00D17AE95905B1A34A@g01jpexmbkw24
2017-05-19 15:48:10 -04:00
Robert Haas b522759508 Copy partitioned_rels lists to avoid shared substructure.
Otherwise, set_plan_refs() can get applied to the same list
multiple times through different references, leading to chaos.

Amit Langote, Dilip Kumar, and Robert Haas, reviewed by Ashutosh
Bapat.  Original report by Sveinn Sveinsson.

Discussion: http://postgr.es/m/20170517141151.1435.79890@wrigleys.postgresql.org
2017-05-19 15:26:05 -04:00
Tom Lane cf5389f5b5 Fix misspelled struct tag.
This was evidently intended to match the struct's typedef name,
but it didn't quite.  Noted while testing find_typedefs.
2017-05-19 15:05:58 -04:00
Robert Haas ac8d7e1b83 Fix corruption of tableElts list by MergeAttributes().
Since commit e7b3349a8a, MergeAttributes
destructively modifies the input List, to which the caller's
CreateStmt still points.  One may wonder whether this was already a
bug, but commit f0e44751d7 made things
noticeably worse by adding additional destructive modifications so
that the caller's List might, in the case of creation a partitioned
table, no longer even be structurally valid.  Restore the status quo
ante by assigning the return value of MergeAttributes back to
stmt->tableElts in the caller.

In most of the places where DefineRelation is called, it doesn't
matter what stmt->tableElts points to here or whether it's valid or
not, because the caller doesn't use the statement for anything after
DefineRelation returns anyway.  However, ProcessUtilitySlow passes it
to EventTriggerCollectSimpleCommand, and that function tries to invoke
copyObject on it.  If any of the CreateStmt's substructure is invalid
at that point, undefined behavior will result.

One might wonder whether this whole area needs further revision -
perhaps DefineRelation() ought not to be destructively modifying the
caller-provided CreateStmt at all.  However, that would be a behavior
change for any event triggers using C code to inspect the CreateStmt,
so for now, just fix the crash.

Report by Amit Langote, who provided a somewhat different patch for it.

Discussion: http://postgr.es/m/bf6a39a7-100a-74bd-1156-3c16a1429d88@lab.ntt.co.jp
2017-05-19 15:02:16 -04:00
Peter Eisentraut 7f17ae0ad0 Fix argument name differences
Different names were used between function declaration and definition.
2017-05-19 14:47:56 -04:00
Heikki Linnakangas 866490a6b7 Fix compilation with --with-bsd-auth.
Commit 8d3b9cce81 added extra arguments to the sendAuthRequest function,
but neglected this caller inside #ifdef USE_BSD_AUTH.

Per report from Pierre-Emmanuel André.

Discussion: https://www.postgresql.org/message-id/20170519090336.whzmjzrsap6ktbgg@digipea.digitick.local
2017-05-19 12:21:55 +03:00
Heikki Linnakangas 94884e1c27 Make slab allocator work on platforms with MAXIMUM_ALIGNOF < sizeof(int).
Notably, m68k only needs 2-byte alignment. Per report from Christoph Berg.

Discussion: https://www.postgresql.org/message-id/20170517193957.fwntkgi6epuso5l2@msg.df7cb.de
2017-05-18 22:22:13 +03:00
Robert Haas 3ec76ff1f2 Don't explicitly mark range partitioning columns NOT NULL.
This seemed like a good idea originally because there's no way to mark
a range partition as accepting NULL, but that now seems more like a
current limitation than something we want to lock down for all time.
For example, there's a proposal to add the notion of a default
partition which accepts all rows not otherwise routed, which directly
conflicts with the idea that a range-partitioned table should never
allow nulls anywhere.  So let's change this while we still can, by
putting the NOT NULL test into the partition constraint instead of
changing the column properties.

Amit Langote and Robert Haas, reviewed by Amit Kapila

Discussion: http://postgr.es/m/8e2dd63d-c6fb-bb74-3c2b-ed6d63629c9d@lab.ntt.co.jp
2017-05-18 13:49:31 -04:00
Heikki Linnakangas 2df537e43f Fix typo in comment.
Daniel Gustafsson
2017-05-18 10:33:16 +03:00
Peter Eisentraut 157239d2cd pg_dump: Fix dumping of slot_name = NONE
It previously wrote out slot_name = '', which was incorrect.

Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
2017-05-17 21:19:14 -04:00
Peter Eisentraut 6234569851 Improve CREATE SUBSCRIPTION option parsing
When creating a subscription with slot_name = NONE, we failed to check
that also create_slot = false and enabled = false were set.  This
created an invalid subscription and could later lead to a crash if a
NULL slot name was accessed.  Add more checks around that for
robustness.

Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-05-17 20:47:37 -04:00
Bruce Momjian ce55481032 Post-PG 10 beta1 pgperltidy run 2017-05-17 19:01:23 -04:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Bruce Momjian 8a94332478 Update typedefs list in prep. for post-PG10 beta1 pgindent run 2017-05-17 15:52:16 -04:00
Bruce Momjian df238b43d7 Add download URL for perltidy version v20090616 2017-05-17 15:29:37 -04:00
Robert Haas b2e4399baa Code review for make_partition_op_expr.
It's better to use the actual keynum here rather than 0, because
someday someone might try to make list partitioning work with
multiple partitioning columns.

Jeevan Ladhe

Discussion: http://postgr.es/m/CAOgcT0M6-mx+dSX47JGJuJP1CKr4XssBFVmKNETt0OZYWpFr+w@mail.gmail.com
2017-05-17 14:31:48 -04:00
Tom Lane 05b5feb60e Revert changes to pg_basebackup and pg_waldump usage() code.
Partially revert commit c079673dcb.
There were complaints that splitting switch descriptions would
complicate translation efforts.  There are probably ways to resolve
the formatting problem without doing that, but undo it while we're
discussing.
2017-05-17 13:04:03 -04:00
Robert Haas 236d6d462d Remove redundant has_null member from PartitionBoundInfoData.
Jeevan Ladhe, with some changes by me.

Discussion: http://postgr.es/m/CAOgcT0NZ_30-pjBpW2OgneV1ammArHkZDZ8B_KFC3q+_Xb2H9A@mail.gmail.com
2017-05-17 12:50:01 -04:00
Peter Eisentraut 3db22794b7 Add more tests for CREATE SUBSCRIPTION
Add some tests for parsing different option combinations.  Fix some of
the resulting error messages for recent changes in option naming.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-05-17 12:24:48 -04:00
Tom Lane 9485516ea2 Make psql handle EOF during COPY FROM STDIN properly on all platforms.
When stdin is a terminal, it's possible to end a COPY FROM STDIN with
a keyboard EOF signal (typically control-D), and then keep on issuing
SQL commands.  One would expect another COPY FROM STDIN to work as well,
but on some platforms it did not.  This turns out to be because we were
not resetting the stream's feof() flag, and BSD-ish versions of fread()
and fgets() won't attempt to read more data if that's set.

The misbehavior is observed on BSDen (including macOS), but not Linux,
Windows, or SysV-ish Unixen, which makes this a portability bug not
just a missing feature.

Add a clearerr() call to fix the behavior, and improve the prompt that's
issued when copying from a TTY to mention that EOF signals work.

It's been like this forever, so back-patch to all supported branches.

Thomas Munro

Discussion: https://postgr.es/m/CAEepm=0MCGfYf=JAMiYhO6JPtv9-3ZfBo8fcGeCZ8oMzaw+Z+Q@mail.gmail.com
2017-05-17 12:24:19 -04:00
Peter Eisentraut 944dc0f9ce Check relkind of tables in CREATE/ALTER SUBSCRIPTION
We used to only check for a supported relkind on the subscriber during
replication, which is needed to ensure that the setup is valid and we
don't crash.  But it's also useful to tell the user immediately when
CREATE or ALTER SUBSCRIPTION is executed that the relation being added
to the subscription is not of a supported relkind.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-05-16 22:57:16 -04:00
Peter Eisentraut 0fbfb65d4b psql: publication/subscription tab completion fixes 2017-05-16 22:19:21 -04:00
Tom Lane c079673dcb Preventive maintenance in advance of pgindent run.
Reformat various places in which pgindent will make a mess, and
fix a few small violations of coding style that I happened to notice
while perusing the diffs from a pgindent dry run.

There is one actual bug fix here: the need-to-enlarge-the-buffer code
path in icu_convert_case was obviously broken.  Perhaps it's unreachable
in our usage?  Or maybe this is just sadly undertested.
2017-05-16 20:36:35 -04:00
Tom Lane ddd243584a Fix leakage of memory context header in find_all_inheritors().
Commit 827d6f977 contained the same misunderstanding of hash_create's API
as commit 090010f2e.  As in 5d00b764c, remove the unnecessary layer of
memory context.  (This bug is less significant than the other one, since
the extra context would be under a relatively short-lived context, but
it's still a bug.)
2017-05-16 19:33:31 -04:00
Kevin Grittner a19ea9c660 Revert "Add a test for transition table usage in FOR EACH ROW trigger."
This reverts commit 4a03f935b3.
2017-05-16 17:15:33 -05:00
Kevin Grittner 4a03f935b3 Add a test for transition table usage in FOR EACH ROW trigger. 2017-05-16 16:09:55 -05:00
Tom Lane 8b0b6303e9 Try to ensure that stats collector's receive buffer size is at least 100KB.
Since commit 4e37b3e15, buildfarm member frogmouth has been failing
occasionally with symptoms indicating that some expected stats data is
getting dropped.  The reason that that commit changed the behavior seems
probably to be that more data is getting shoved at the collector in a short
span of time.  In current sources, the stats test's first session sends
about 9KB of data while exiting, which is probably the same as what was
sent just before wait_for_stats() in the previous test design.  But now,
the test's second session is starting up concurrently, and it sends another
2KB (presumably reflecting its initial catalog accesses).  Since frogmouth
is running on Windows XP, which reputedly has a default socket receive
buffer size of only 8KB, it is not very surprising if this has put us over
the threshold where the receive buffer can overflow and drop messages.

The same mechanism could very easily explain the intermittent stats test
failures we've been seeing for years, since background processes such
as the bgwriter will sometimes send data concurrently with all this, and
could thus cause occasional buffer overflows.

Hence, insert some code into pgstat_init() to increase the stats socket's
receive buffer size to 100KB if it's less than that.  (On failure, emit a
LOG message, but keep going.)  Modern systems seem to have default sizes
in the range of 100KB-250KB, but older platforms don't.  I couldn't find
any platforms that wouldn't accept 100KB, so in theory this won't cause
any portability problems.

If this is successful at reducing the buildfarm failure rate in HEAD,
we should back-patch it, because it's certain that similar buffer overflows
happen in the field on platforms with small buffer sizes.  Going forward,
there might be an argument for trying to increase the buffer size even
more, but let's take a baby step first.

Discussion: https://postgr.es/m/22173.1494788088@sss.pgh.pa.us
2017-05-16 15:24:52 -04:00
Robert Haas 59f40566ca Fix relcache leak when row triggers on partitions are fired by COPY.
Thomas Munro, reviewed by Amit Langote

Discussion: http://postgr.es/m/CAEepm=15Jss-yhFApuKzxcoCuFnb8TR8iQiWMjG=CLYPx48QLw@mail.gmail.com
2017-05-16 12:46:32 -04:00
Tom Lane 91102dab44 In SSL tests, don't scribble on permissions of a repo file.
Modifying the permissions of a persistent file isn't really much nicer
than modifying its contents, even if git doesn't currently notice it.
Adjust the test script to make a copy and set the permissions of that
instead.

Michael Paquier, per a gripe from me.  Back-patch to 9.5 where these
tests were introduced.

Discussion: https://postgr.es/m/14836.1494885946@sss.pgh.pa.us
2017-05-15 23:27:51 -04:00
Tom Lane 5ad367a35b Stamp 10beta1. 2017-05-15 17:20:59 -04:00
Robert Haas 0ad226f2ae Add missing apostrophe.
Masahiko Sawada

Discussion: http://postgr.es/m/CAD21AoAzaR_XV7j7Wk9-QYXaFoT8H4egKwXvFY63wc8Lw2C9cg@mail.gmail.com
2017-05-15 15:41:15 -04:00
Tom Lane e3f67a5a17 Update oidjoins regression test for v10. 2017-05-15 14:04:11 -04:00
Peter Eisentraut b1ff33fd9b Add assertion to quiet Coverity 2017-05-15 13:59:58 -04:00
Peter Eisentraut 82d24bab75 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 398beeef4921df0956f917becd7b5669d2a8a5c4
2017-05-15 12:19:54 -04:00
Tom Lane 4041808b5b Fix bogus syntax for CREATE PUBLICATION commands emitted by pg_dump.
Original coding was careless about where to insert commas.

Masahiko Sawada

Discussion: https://postgr.es/m/3427593a-61aa-b17e-64ef-383b7742d6d9@enterprisedb.com
2017-05-15 11:48:39 -04:00
Tom Lane 12590c5d33 Fix unsafe reference into relcache in constructed CommentStmt.
The CommentStmt made by RebuildConstraintComment() has to pstrdup the
relation name, else it will contain a dangling pointer after that
relcache entry is flushed.  (I'm less sure that pstrdup'ing conname
is necessary, but let's be safe.)  Failure to do this leads to weird
errors or crashes, as reported by Marko Elezovic.

Bug introduced by commit e42375fc8, so back-patch to 9.5 as that was.

Fix by David Rowley, regression test by Michael Paquier

Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com
2017-05-15 11:33:44 -04:00
Peter Eisentraut f8dc1985fd Fix ALTER SEQUENCE locking
In 1753b1b027, the pg_sequence system
catalog was introduced.  This made sequence metadata changes
transactional, while the actual sequence values are still behaving
nontransactionally.  This requires some refinement in how ALTER
SEQUENCE, which operates on both, locks the sequence and the catalog.

The main problems were:

- Concurrent ALTER SEQUENCE causes "tuple concurrently updated" error,
  caused by updates to pg_sequence catalog.

- Sequence WAL writes and catalog updates are not protected by same
  lock, which could lead to inconsistent recovery order.

- nextval() disregarding uncommitted ALTER SEQUENCE changes.

To fix, nextval() and friends now lock the sequence using
RowExclusiveLock instead of AccessShareLock.  ALTER SEQUENCE locks the
sequence using ShareRowExclusiveLock.  This means that nextval() and
ALTER SEQUENCE block each other, and ALTER SEQUENCE on the same sequence
blocks itself.  (This was already the case previously for the OWNER TO,
RENAME, and SET SCHEMA variants.)  Also, rearrange some code so that the
entire AlterSequence is protected by the lock on the sequence.

As an exception, use reduced locking for ALTER SEQUENCE ... RESTART.
Since that is basically a setval(), it does not require the full locking
of other ALTER SEQUENCE actions.  So check whether we are only running a
RESTART and run with less locking if so.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Jason Petersen <jason@citusdata.com>
Reported-by: Andres Freund <andres@anarazel.de>
2017-05-15 10:19:57 -04:00
Magnus Hagander b1c45afb01 Fix typo in comment
Michael Paquier
2017-05-15 11:08:02 +02:00
Tom Lane eda4ef8151 stats regression test's wait_for_stats() must check timestamp too.
pg_stat_get_snapshot_timestamp() returns the timestamp seen in the "global"
stats file.  Because pgstat_write_statsfiles() writes per-DB stats files
before the global file (or at least before renaming it into place), there
is a window where the test backend can see all the stats updates that
wait_for_stats() was checking for (all of which come from the per-DB file)
but also see the same global stats file it had seen at the start of the
test script.  This results in a failure in only the "snapshot_newer" query,
as reported by a couple of buildfarm members recently.

I suspect that this ought to be back-patched.  Commit 4e37b3e15 has
evidently increased the probability of this window getting hit, but
it's not apparent why it could not have been hit before.  I'll refrain
for the moment though.
2017-05-14 23:33:18 -04:00
Tom Lane 5d00b764cd Make pgstat tabstat lookup hash table less fragile.
Code review for commit 090010f2e.

Fix cases where an elog(ERROR) partway through a function would leave the
persistent data structures in a corrupt state.  pgstat_report_stat got this
wrong by invalidating PgStat_TableEntry structs before removing hashtable
entries pointing to them, and get_tabstat_entry got it wrong by ignoring
the possibility of palloc failure after it had already created a hashtable
entry.

Also, avoid leaking a memory context per transaction, which the previous
code did through misunderstanding hash_create's API.  We do not need to
create a context to hold the hash table; hash_create will do that.
(The leak wasn't that large, amounting to only a memory context header
per iteration, but it's still surprising that nobody noticed it yet.)
2017-05-14 22:52:49 -04:00
Tom Lane 7606bbb3de Make stats regression test more robust in the face of parallel query.
Commit 60690a6fe attempted to fix the wait_for_stats() function in this
test so that it would wait properly if the tenk2 scans were done in
parallel workers instead of the main session (typically as a consequence of
force_parallel_mode being turned on).  However, we made it test for whether
the main session's actions had been reported by looking for inserts on
'trunc_stats_test'.  This is the Wrong Thing, because those aren't the last
updates we expect the main session to do.  As shown by recent failures on
buildfarm member frogmouth, it's entirely likely that the trunc_stats_test
updates will be reported in a separate message from later updates, which
means there can be a window in which wait_for_stats() will exit but not all
the updates we are expecting to see will have arrived.  We should test for
the last updates we're expecting, namely those on 'trunc_stats_test4'.

Unfortunately, I doubt that this explains frogmouth's failures, because
there's no reason to believe that it's running the tenk2 queries in
parallel.  Still, the test is wrong on its own terms, so fix and back-patch
to 9.6 where parallel query came in.
2017-05-14 21:39:10 -04:00
Robert Haas edbe2a2936 Attempt to fix compiler warning.
Per a report from Tom Lane, newer versions of gcc apparently think
that partexprs_item_saved can be used uninitialized.  Try to convince
them otherwise.
2017-05-14 20:59:28 -04:00
Tom Lane e84c019598 Fix maintenance hazards caused by ill-considered use of default: cases.
Remove default cases from assorted switches over ObjectClass and some
related enum types, so that we'll get compiler warnings when someone
adds a new enum value without accounting for it in all these places.

In passing, re-order some switch cases as needed to match the declaration
of enum ObjectClass.  OK, that's just neatnik-ism, but I dislike code
that looks like it was assembled with the help of a dartboard.

Discussion: https://postgr.es/m/20170512221010.nglatgt5azzdxjlj@alvherre.pgsql
2017-05-14 13:32:59 -04:00
Tom Lane b5b0db19b8 Fix handling of extended statistics during ALTER COLUMN TYPE.
ALTER COLUMN TYPE on a column used by a statistics object fails since
commit 928c4de30, because the relevant switch in ATExecAlterColumnType
is unprepared for columns to have dependencies from OCLASS_STATISTIC_EXT
objects.

Although the existing types of extended statistics don't actually need us
to do any work for a column type change, it seems completely indefensible
that that assumption is hidden behind the failure of an unrelated module
to contain any code for the case.  Hence, create and call an API function
in statscmds.c where the assumption can be explained, and where we could
add code to deal with the problem when it inevitably becomes real.

Also, the reason this wasn't handled before, neither for extended stats
nor for the last half-dozen new OCLASS kinds :-(, is that the default:
in that switch suppresses compiler warnings, allowing people to miss the
need to consider it when adding an OCLASS.  We don't really need a default
because surely getObjectClass should only return valid values of the enum;
so remove it, and add the missed OCLASS entries where they should be.

Discussion: https://postgr.es/m/20170512221010.nglatgt5azzdxjlj@alvherre.pgsql
2017-05-14 12:22:25 -04:00
Tom Lane f674743487 Remove no-longer-needed fields of Hash plan nodes.
skewColType/skewColTypmod are no longer used in the wake of commit
9aab83fc5, and seem unlikely to be wanted in future, so let's drop 'em.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
2017-05-14 11:07:40 -04:00
Tom Lane f04c9a6146 Standardize terminology for pg_statistic_ext entries.
Consistently refer to such an entry as a "statistics object", not just
"statistics" or "extended statistics".  Previously we had a mismash of
terms, accompanied by utter confusion as to whether the term was
singular or plural.  That's not only grating (at least to the ear of
a native English speaker) but could be outright misleading, eg in error
messages that seemed to be referring to multiple objects where only one
could be meant.

This commit fixes the code and a lot of comments (though I may have
missed a few).  I also renamed two new SQL functions,
pg_get_statisticsextdef -> pg_get_statisticsobjdef
pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible
to conform better with this terminology.

I have not touched the SGML docs other than fixing those function
names; the docs certainly need work but it seems like a separable task.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-14 10:55:01 -04:00
Andres Freund 29c7d5e484 Specify --outputdir for isolation install check, not just plain check.
This should probably have been part of 60f826c5e6.

Reported-By: Andrew Gierth
2017-05-13 15:13:17 -07:00
Andres Freund 524dbc1433 Avoid superfluous work for commits during logical slot creation.
Before 955a684e04 logical decoding snapshot maintenance needed to
cope with transactions it might not have seen in their entirety. For
such transactions we'd to assume they modified the catalog (could have
happened before we were watching), and thus a new snapshot had to be
built, and distributed to concurrently running transactions.

That's problematic because building a new snapshot isn't that cheap ,
especially as the the array of committed transactions needs to be
sorted.  When creating a slot on a server with a lot of transactions,
this could make logical slot creation infeasibly expensive.

After 955a684e04 there's no need to deal with transaction that
aren't guaranteed to be fully observable.  That allows to avoid
building snapshots for transactions that haven't modified catalog,
even before reaching consistency.

While this isn't necessarily a bugfix, slot creation being impossible
in some production workloads, is severe enough to warrant
backpatching.

Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
2017-05-13 15:06:40 -07:00
Andres Freund 955a684e04 Fix race condition leading to hanging logical slot creation.
The snapshot assembly during the creation of logical slots relied
waiting for transactions in xl_running_xacts to end, by checking for
their commit/abort records.  Unfortunately, despite locking, it is
possible to see an xl_running_xact record listing transactions as
ready, that have already WAL-logged an commit/abort record, as the
locking just prevents the ProcArray to be adjusted, and the commit
record has to be logged first.

That lead to either delayed or hanging snapshot creation, because
snapbuild.c would wait "forever" to see commit/abort records for some
transactions.  That hang resolved only if a xl_running_xacts record
without any running transactions happened to be logged, far from
certain on a busy server.

It's impractical to prevent that via more heavyweight locking, the
likelihood of deadlocks and significantly increased contention would
be too big.

Instead change the initial snapshot creation to be solely based on
tracking the oldest running transaction via
xl_running_xacts->oldestRunningXid - that actually ends up
significantly simplifying the code.  That has two disadvantages:
1) Because we cannot fully "trust" the contents of xl_running_xacts,
   we cannot use it to build the initial snapshot.  Instead we have to
   wait twice for all running transactions to finish.
2) Previously a slot, unless the race occurred, could be created when
   the all transaction perceived as running based on commit/abort
   records, now we have to wait for the next xl_running_xacts record.
To address that, trigger logging new xl_running_xacts record from
within snapbuild.c exactly when necessary.

Unfortunately snabuild.c's SnapBuild is stored on disk, one of the
stupider ideas of a certain Mr Freund, so we can't change it in a
minor release.  As this is going to be backpatched, we have to hack
around a bit to keep on-disk compatibility.  A later commit will
rejigger that on master.

Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
2017-05-13 14:21:00 -07:00
Tom Lane 9aab83fc50 Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.
The mess cleaned up in commit da0759600 is clear evidence that it's a
bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
to provide the correct type OID for the array elements in the slot.
Moreover, we weren't even getting any performance benefit from that,
since get_attstatsslot() was extracting the real type OID from the array
anyway.  So we ought to get rid of that requirement; indeed, it would
make more sense for get_attstatsslot() to pass back the type OID it found,
in case the caller isn't sure what to expect, which is likely in binary-
compatible-operator cases.

Another problem with the current implementation is that if the stats array
element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
for each element.  That seemed acceptable when the code was written because
we were targeting O(10) array sizes --- but these days, stats arrays are
almost always bigger than that, sometimes much bigger.  We can save a
significant number of cycles by doing one palloc/memcpy/pfree of the whole
array.  Indeed, in the now-probably-common case where the array is toasted,
that happens anyway so this method is basically free.  (Note: although the
catcache code will inline any out-of-line toasted values, it doesn't
decompress them.  At the other end of the size range, it doesn't expand
short-header datums either.  In either case, DatumGetArrayTypeP would have
to make a copy.  We do end up using an extra array copy step if the element
type is pass-by-value and the array length is neither small enough for a
short header nor large enough to have suffered compression.  But that
seems like a very acceptable price for winning in pass-by-ref cases.)

Hence, redesign to take these insights into account.  While at it,
convert to an API in which we fill a struct rather than passing a bunch
of pointers to individual output arguments.  That will make it less
painful if we ever want further expansion of what get_attstatsslot can
pass back.

It's certainly arguable that this is new development and not something to
push post-feature-freeze.  However, I view it as primarily bug-proofing
and therefore something that's better to have sooner not later.  Since
we aren't quite at beta phase yet, let's put it in.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
2017-05-13 15:14:39 -04:00
Robert Haas 1848b73d45 Teach \d+ to show partitioning constraints.
The fact that we didn't have this in the first place is likely why
the problem fixed by f8bffe9e6d
escaped detection.

Patch by Amit Langote, reviewed and slightly adjusted by me.

Discussion: http://postgr.es/m/CA+TgmoYWnV2GMnYLG-Czsix-E1WGAbo4D+0tx7t9NdfYBDMFsA@mail.gmail.com
2017-05-13 12:04:53 -04:00
Robert Haas f8bffe9e6d Fix multi-column range partitioning constraints.
The old logic was just plain wrong.

Report by Olaf Gawenda.  Patch by Amit Langote, reviewed by
Beena Emerson and by me.  Minor adjustments by me also.
2017-05-13 11:36:41 -04:00
Tom Lane 4e37b3e15c Avoid hard-wired sleep delays in stats regression test.
On faster machines, the overall runtime for running the core regression
tests is under twenty seconds these days, of which the hard-wired delays
in the stats test are a significant fraction.  But on closer inspection,
it seems like we shouldn't need those.

The initial 2-second delay is there only to reduce the risk of the test's
stats messages not getting sent due to contention.  But analysis of the
last ten years' worth of buildfarm runs shows no evidence that such
failures actually occur.  (We do see failures that look like stats
messages not getting sent, particularly on Windows; but there is little
reason to believe that the initial delay reduces their frequency.)

The later 1-second delay is there to ensure that our session's stats
will have gotten sent.  But we could also do that by starting a fresh
session, which takes well under 1 second even on very slow machines.

Hence, let's remove both delays and see what happens.  The first delay
was the only test of pg_sleep_for() in the regression tests, but we can
move that responsibility into wait_for_stats().

Discussion: https://postgr.es/m/17795.1493869423@sss.pgh.pa.us
2017-05-13 09:42:12 -04:00
Andrew Dunstan 8d9f060977 Use a better way of skipping all subscription tests on Windows
This way we only need to specify the number of tests in one place, and
the output is also less verbose.
2017-05-13 02:49:32 -04:00
Alvaro Herrera d99d58cdc8 Complete tab completion for DROP STATISTICS
Tab-completing DROP STATISTICS would only work if you started writing
the schema name containing the statistics object, because the visibility
clause was missing.  To add it, we need to add SQL-callable support for
testing visibility of a statistics object, like all other object types
already have.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-13 01:05:48 -03:00
Tom Lane 2df5d46555 Avoid searching for callback functions in CallSyscacheCallbacks().
We have now grown enough registerable syscache-invalidation callback
functions that the original assumption that there would be few of them
is causing performance problems.  In particular, let's fix things so that
CallSyscacheCallbacks doesn't have to search the whole array to find
which callback(s) to invoke for a given cache ID.  Preserve the original
behavior that callbacks are called in order of registration, just in
case there's someplace that depends on that (which I doubt).

In support of this, export the number of syscaches from syscache.h.
People could have found that out anyway from the enum, but adding a
#define makes that much safer.

This provides a useful additional speedup in Mathieu Fenniak's
logical-decoding test case, although we're reaching the point of
diminishing returns there.  I think any further improvement will have
to come from reducing the number of cache invalidations that are
triggered in the first place.  Still, we can hope that this change
gives some incremental benefit for all invalidation scenarios.

Back-patch to 9.4 where logical decoding was introduced.

Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
2017-05-12 19:05:27 -04:00
Tom Lane 8085a4f751 Reduce initial size of RelfilenodeMapHash.
A test case provided by Mathieu Fenniak shows that hash_seq_search'ing
this hashtable can consume a very significant amount of overhead during
logical decoding, which triggers frequent cache invalidation.  Testing
suggests that the actual population of the hashtable is often no more
than a few dozen entries, so we can cut the overhead just by dropping
the initial number of buckets down from 1024 --- I chose to cut it to 64.
(In situations where we do have a significant number of entries, we
shouldn't get any real penalty from doing this, as the dynahash.c code
will resize the hashtable automatically.)

This gives a further factor-of-two savings in Mathieu's test case.
That may be overly optimistic for real-world benefit, as real cases
may have larger average table populations, but it's hard to see it
turning into a net negative for any workload.

Back-patch to 9.4 where relfilenodemap.c was introduced.

Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
2017-05-12 18:30:17 -04:00
Alvaro Herrera 5e2af609e1 getObjectDescription: support extended statistics
This was missed in 7b504eb282.

Remove the "default:" clause in the switch, to avoid this problem in the
future.  Other switches involving the same enum should probably be
changed in the same way, but are not touched by this patch.

Discussion: https://postgr.es/m/20170512204800.iqt2uwyx3c32j45r@alvherre.pgsql
2017-05-12 19:22:50 -03:00
Tom Lane 50ee1c7462 Avoid searching for the target catcache in CatalogCacheIdInvalidate.
A test case provided by Mathieu Fenniak shows that the initial search for
the target catcache in CatalogCacheIdInvalidate consumes a very significant
amount of overhead in cases where cache invalidation is triggered but has
little useful work to do.  There is no good reason for that search to exist
at all, as the index array maintained by syscache.c allows direct lookup of
the catcache from its ID.  We just need a frontend function in syscache.c,
matching the division of labor for most other cache-accessing operations.

While there's more that can be done in this area, this patch alone reduces
the runtime of Mathieu's example by 2X.  We can hope that it offers some
useful benefit in other cases too, although usually cache invalidation
overhead is not such a striking fraction of the total runtime.

Back-patch to 9.4 where logical decoding was introduced.  It might be
worth going further back, but presently the only case we know of where
cache invalidation is really a significant burden is in logical decoding.
Also, older branches have fewer catcaches, reducing the possible benefit.

(Note: although this nominally changes catcache's API, we have always
documented CatalogCacheIdInvalidate as a private function, so I would
have little sympathy for an external module calling it directly.  So
backpatching should be fine.)

Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
2017-05-12 18:17:29 -04:00
Tom Lane 928c4de309 Fix dependencies for extended statistics objects.
A stats object ought to have a dependency on each individual column
it reads, not the entire table.  Doing this honestly lets us get rid
of the hard-wired logic in RemoveStatisticsExt, which seems to have
been misguidedly modeled on RemoveStatistics; and it will be far easier
to extend to multiple tables later.

Also, add overlooked dependency on owner, and make the dependency on
schema be NORMAL like every other such dependency.

There remains some unfinished work here, which is to allow statistics
objects to be extension members.  That takes more effort than just
adding the dependency call, though, so I left it out for now.

initdb forced because this changes the set of pg_depend records that
should exist for a statistics object.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-12 16:26:31 -04:00
Alvaro Herrera bc085205c8 Change CREATE STATISTICS syntax
Previously, we had the WITH clause in the middle of the command, where
you'd specify both generic options as well as statistic types.  Few
people liked this, so this commit changes it to remove the WITH keyword
from that clause and makes it accept statistic types only.  (We
currently don't have any generic options, but if we invent in the
future, we will gain a new WITH clause, probably at the end of the
command).

Also, the column list is now specified without parens, which makes the
whole command look more similar to a SELECT command.  This change will
let us expand the command to supporting expressions (not just columns
names) as well as multiple tables and their join conditions.

Tom added lots of code comments and fixed some parts of the CREATE
STATISTICS reference page, too; more changes in this area are
forthcoming.  He also fixed a potential problem in the alter_generic
regression test, reducing verbosity on a cascaded drop to avoid
dependency on message ordering, as we do in other tests.

Tom also closed a security bug: we documented that table ownership was
required in order to create a statistics object on it, but didn't
actually implement it.

Implement tab-completion for statistics objects.  This can stand some
more improvement.

Authors: Alvaro Herrera, with lots of cleanup by Tom Lane
Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
2017-05-12 14:59:35 -03:00
Peter Eisentraut d496a65790 Standardize "WAL location" terminology
Other previously used terms were "WAL position" or "log position".
2017-05-12 13:51:27 -04:00
Peter Eisentraut c1a7f64b4a Replace "transaction log" with "write-ahead log"
This makes documentation and error messages match the renaming of "xlog"
to "wal" in APIs and file naming.
2017-05-12 11:52:43 -04:00
Andrew Dunstan 56b6ef893f Honor PROVE_FLAGS environment setting
On MSVC builds and on back branches that means removing the hardcoded
--verbose setting. On master for Unix that means removing the empty
setting in the global Makefile so that the value can be acquired from
the environment as well as from the make arguments.

Backpatch to 9.4 where we introduced TAP tests
2017-05-12 11:11:49 -04:00
Andrew Dunstan b757e01f62 Add libxml2 include path for MSVC builds
On Unix this path is detected via the use of xml2-config, but that's not
available on Windows. This means that users building with libxml2 will
no longer need to move things around from the standard libxml2
installation for MSVC builds.

Backpatch to all live branches.
2017-05-12 10:21:13 -04:00
Peter Eisentraut 96e1cb4c0f pg_dump: Add --no-publications option
Author: Michael Paquier <michael.paquier@gmail.com>
2017-05-12 09:15:40 -04:00
Peter Eisentraut b807f59828 Rework the options syntax for logical replication commands
For CREATE/ALTER PUBLICATION/SUBSCRIPTION, use similar option style as
other statements that use a WITH clause for options.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-05-12 08:57:49 -04:00
Andrew Dunstan 734cb4c2e7 Avoid tests which crash the calling process on Windows
Certain recovery tests use the Perl IPC::Run module's start/kill_kill
method of processing. On at least some versions of perl this causes the
whole process and its caller to crash. If we ever find a better way of
doing these tests they can be re-enabled on this platform. This does not
affect Mingw or Cygwin builds, which use a different perl and a
different shell and so are not affected.
2017-05-12 06:41:23 -04:00
Simon Riggs 024711bb54 Lag tracking for logical replication
Lag tracking is called for each commit, but we introduce
a pacing delay to ensure we don't swamp the lag tracker.

Author: Petr Jelinek, with minor pacing delay code from me
2017-05-12 10:50:56 +01:00
Tom Lane 596a7c8df7 Increase MAX_SYSCACHE_CALLBACKS to provide more room for extensions.
Increase from the historical value of 32 to 64.  We are up to 31 callers
of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be
exercised in one process that would leave only one slot for add-on modules.
It's probably not possible for that to happen, but still we clearly need
more daylight here.  (At some point it might be worth making the array
dynamically resizable; but since we've never heard a complaint of "out of
syscache_callback_list slots" happening in the field, I doubt it's worth
it yet.)

Back-patch as far as 9.4, which is where we increased the companion limit
MAX_RELCACHE_CALLBACKS (cf commit f01d1ae3a).  It's not as urgent in
released branches, which have only a couple dozen call sites in core, but
it still seems that somebody might hit the limit before these branches die.

Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us
2017-05-11 14:51:21 -04:00
Tom Lane d10c626de4 Rename WAL-related functions and views to use "lsn" not "location".
Per discussion, "location" is a rather vague term that could refer to
multiple concepts.  "LSN" is an unambiguous term for WAL locations and
should be preferred.  Some function names, view column names, and function
output argument names used "lsn" already, but others used "location",
as well as yet other terms such as "wal_position".  Since we've already
renamed a lot of things in this area from "xlog" to "wal" for v10,
we may as well incur a bit more compatibility pain and make these names
all consistent.

David Rowley, minor additional docs hacking by me

Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com
2017-05-11 11:49:59 -04:00
Alvaro Herrera b66adb7b0c Revert "Permit dump/reload of not-too-large >1GB tuples"
This reverts commits fa2fa99552 and 42f50cb8fa.

While the functionality that was intended to be provided by these
commits is desired, the patch didn't actually solve as many of the
problematic situations as we hoped, and it created a bunch of its own
problems.  Since we're going to require more extensive changes soon for
other reasons and users have been working around these problems for a
long time already, there is no point in spending effort in fixing this
halfway measure.

Per complaint from Tom Lane.
Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us

(Commit fa2fa99552 had already been reverted in branches 9.5 as
f858524ee4 and 9.6 as e9e44a0953, so this touches master only.
Commit 42f50cb8fa was not present in the older branches.)
2017-05-10 18:41:27 -03:00
Peter Eisentraut b83f4e4a25 psql: Add missing translation markers 2017-05-10 10:15:14 -04:00
Robert Haas 622c82279d Avoid theoretical infinite loop loading relcache partition key.
Amit Langote, per report from 甄明洋

Discussion: http://postgr.es/m/57bd1e1.1886.15bd7b79cee.Coremail.18612389267@yeah.net
2017-05-09 23:53:35 -04:00
Robert Haas a5775991bb Remove no-longer-needed compatibility code for hash indexes.
Because commit ea69a0dead bumped the
HASH_VERSION, we don't need to worry about PostgreSQL 10 seeing
bucket pages from earlier versions.

Amit Kapila

Discussion: http://postgr.es/m/CAA4eK1LAo4DGwh+mi-G3U8Pj1WkBBeFL38xdCnUHJv1z4bZFkQ@mail.gmail.com
2017-05-09 23:44:21 -04:00
Robert Haas df1a4eba94 Fix typos in comments.
Etsuro Fujita

Discussion: http://postgr.es/m/968d99bf-0fa8-085b-f0a1-a379f8d661ff@lab.ntt.co.jp
2017-05-09 23:40:08 -04:00
Robert Haas 9e6104c667 Prohibit transition tables on views and foreign tables.
Thomas Munro, per off-list report from Prabhat Sabu.  Changes
to the message wording for consistency with the existing
relkind check for partitioned tables by me.

Discussion: http://postgr.es/m/CAEepm=2xJFFpGM+N=gpWx-9Nft2q1oaFZX07_y23AHCrJQLt0g@mail.gmail.com
2017-05-09 23:34:02 -04:00
Robert Haas 29fd3d9da0 Don't permit transition tables with TRUNCATE triggers.
Prior to this prohibition, such a trigger caused a crash.

Thomas Munro, per a report from Neha Sharma.  I added a
regression test.

Discussion: http://postgr.es/m/CAEepm=0VR5W-N38eTkO_FqJbGqQ_ykbBRmzmvHyxDhy1p=0Csw@mail.gmail.com
2017-05-09 23:24:23 -04:00
Robert Haas 304007d9f1 Pass EXEC_FLAG_REWIND when initializing a tuplestore scan.
Since a rescan is possible, we must be able to rewind.

Thomas Munro, per a report from Prabhat Sabu

Discussion: http://postgr.es/m/CAEepm=2=Uv5fm=exqL+ygBxaO+-tgmC=o+63H4zYAXi9HtXf1w@mail.gmail.com
2017-05-09 23:13:21 -04:00
Robert Haas 3439f84475 Disallow finite partition bound following earlier UNBOUNDED column.
Amit Langote, per an observation by me.

Discussion: http://postgr.es/m/CA+TgmoYWnV2GMnYLG-Czsix-E1WGAbo4D+0tx7t9NdfYBDMFsA@mail.gmail.com
2017-05-09 22:41:12 -04:00
Peter Eisentraut 489b96e80b Improve memory use in logical replication apply
Previously, the memory used by the logical replication apply worker for
processing messages would never be freed, so that could end up using a
lot of memory.  To improve that, change the existing ApplyContext memory
context to ApplyMessageContext and reset that after every
message (similar to MessageContext used elsewhere).  For consistency of
naming, rename the ApplyCacheContext to ApplyContext.

Author: Stas Kelvich <s.kelvich@postgrespro.ru>
2017-05-09 14:51:49 -04:00
Alvaro Herrera e0bf16060b Ignore PQcancel errors properly
Add a (void) cast to all PQcancel() calls that purposefully don't check
the return value, to keep compilers and static checkers happy.

Per Coverity.
2017-05-09 14:58:51 -03:00
Peter Eisentraut 26aa1cf376 pg_dump: Add --no-subscriptions option
Author: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-05-09 10:58:06 -04:00
Peter Eisentraut 013c1178fd Remove the NODROP SLOT option from DROP SUBSCRIPTION
It turned out this approach had problems, because a DROP command should
not have any options other than CASCADE and RESTRICT.  Instead, always
attempt to drop the slot if there is one configured, but also add an
ALTER SUBSCRIPTION action to set the slot to NONE.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us
2017-05-09 10:20:42 -04:00
Bruce Momjian c4c493fd35 pgindent: use HTTP instead of FTP to retrieve pg_bsd_indent src
FTP support will be removed from ftp.postgresql.org in months, but http
still works.  Typedefs already used http.
2017-05-09 09:28:44 -04:00
Tom Lane da07596006 Further patch rangetypes_selfuncs.c's statistics slot management.
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
not of the type of the column the statistics are for.

This bug is at least partly the fault of sloppy specification comments
for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
of the stavalues entries, not of the underlying column.  (I double-checked
other callers and they seem to get this right.)  Adjust the comments to be
more correct.

Per buildfarm.

Security: CVE-2017-7484
2017-05-08 15:03:14 -04:00
Peter Eisentraut fe974cc5a6 Check connection info string in ALTER SUBSCRIPTION
Previously it would allow an invalid connection string to be set.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-05-08 14:01:00 -04:00
Peter Eisentraut 9a591c1bcc Fix statistics reporting in logical replication workers
This new arrangement ensures that statistics are reported right after
commit of transactions.  The previous arrangement didn't get this quite
right and could lead to assertion failures.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Erik Rijkers <er@xs4all.nl>
2017-05-08 12:10:22 -04:00
Tom Lane b6576e5914 Fix possibly-uninitialized variable.
Oversight in e2d4ef8de et al (my fault not Peter's).  Per buildfarm.

Security: CVE-2017-7484
2017-05-08 11:18:40 -04:00
Noah Misch 3eefc51053 Match pg_user_mappings limits to information_schema.user_mapping_options.
Both views replace the umoptions field with NULL when the user does not
meet qualifications to see it.  They used different qualifications, and
pg_user_mappings documented qualifications did not match its implemented
qualifications.  Make its documentation and implementation match those
of user_mapping_options.  One might argue for stronger qualifications,
but these have long, documented tenure.  pg_user_mappings has always
exhibited this problem, so back-patch to 9.2 (all supported versions).

Michael Paquier and Feike Steenbergen.  Reviewed by Jeff Janes.
Reported by Andrew Wheelwright.

Security: CVE-2017-7486
2017-05-08 07:24:24 -07:00
Noah Misch 0170b10dff Restore PGREQUIRESSL recognition in libpq.
Commit 65c3bf19fd moved handling of the,
already then, deprecated requiressl parameter into conninfo_storeval().
The default PGREQUIRESSL environment variable was however lost in the
change resulting in a potentially silent accept of a non-SSL connection
even when set.  Its documentation remained.  Restore its implementation.
Also amend the documentation to mark PGREQUIRESSL as deprecated for
those not following the link to requiressl.  Back-patch to 9.3, where
commit 65c3bf1 first appeared.

Behavior has been more complex when the user provides both deprecated
and non-deprecated settings.  Before commit 65c3bf1, libpq operated
according to the first of these found:

  requiressl=1
  PGREQUIRESSL=1
  sslmode=*
  PGSSLMODE=*

(Note requiressl=0 didn't override sslmode=*; it would only suppress
PGREQUIRESSL=1 or a previous requiressl=1.  PGREQUIRESSL=0 had no effect
whatsoever.)  Starting with commit 65c3bf1, libpq ignored PGREQUIRESSL,
and order of precedence changed to this:

  last of requiressl=* or sslmode=*
  PGSSLMODE=*

Starting now, adopt the following order of precedence:

  last of requiressl=* or sslmode=*
  PGSSLMODE=*
  PGREQUIRESSL=1

This retains the 65c3bf1 behavior for connection strings that contain
both requiressl=* and sslmode=*.  It retains the 65c3bf1 change that
either connection string option overrides both environment variables.
For the first time, PGSSLMODE has precedence over PGREQUIRESSL; this
avoids reducing security of "PGREQUIRESSL=1 PGSSLMODE=verify-full"
configurations originating under v9.3 and later.

Daniel Gustafsson

Security: CVE-2017-7485
2017-05-08 07:24:24 -07:00
Peter Eisentraut e2d4ef8de8 Add security checks to selectivity estimation functions
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables.  Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof.  If neither is satisfied, planning
will proceed as if there are no statistics available.

At least one of these is satisfied in most cases in practice.  The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>

Security: CVE-2017-7484
2017-05-08 09:26:32 -04:00
Heikki Linnakangas eb61136dc7 Remove support for password_encryption='off' / 'plain'.
Storing passwords in plaintext hasn't been a good idea for a very long
time, if ever. Now seems like a good time to finally forbid it, since we're
messing with this in PostgreSQL 10 anyway.

Remove the CREATE/ALTER USER UNENCRYPTED PASSSWORD 'foo' syntax, since
storing passwords unencrypted is no longer supported. ENCRYPTED PASSWORD
'foo' is still accepted, but ENCRYPTED is now just a noise-word, it does
the same as just PASSWORD 'foo'.

Likewise, remove the --unencrypted option from createuser, but accept
--encrypted as a no-op for backward compatibility. AFAICS, --encrypted was
a no-op even before this patch, because createuser encrypted the password
before sending it to the server even if --encrypted was not specified. It
added the ENCRYPTED keyword to the SQL command, but since the password was
already in encrypted form, it didn't make any difference. The documentation
was not clear on whether that was intended or not, but it's moot now.

Also, while password_encryption='on' is still accepted as an alias for
'md5', it is now marked as hidden, so that it is not listed as an accepted
value in error hints, for example. That's not directly related to removing
'plain', but it seems better this way.

Reviewed by Michael Paquier

Discussion: https://www.postgresql.org/message-id/16e9b768-fd78-0b12-cfc1-7b6b7f238fde@iki.fi
2017-05-08 11:26:07 +03:00
Simon Riggs 1f30295eab Remove poorly worded and duplicated comment
Move line of code to avoid need for duplicated comment

Brought to attention by Masahiko Sawada
2017-05-08 08:49:28 +01:00
Heikki Linnakangas 0186ded546 Fix memory leaks if random salt generation fails.
In the backend, this is just to silence coverity warnings, but in the
frontend, it's a genuine leak, even if extremely rare.

Spotted by Coverity, patch by Michael Paquier.
2017-05-07 19:58:21 +03:00
Tom Lane a54d5875fe Guard against null t->tm_zone in strftime.c.
The upstream IANA code does not guard against null TM_ZONE pointers in this
function, but in our code there is such a check in the other pre-existing
use of t->tm_zone.  We do have some places that set pg_tm.tm_zone to NULL.
I'm not entirely sure it's possible to reach strftime with such a value,
but I'm not sure it isn't either, so be safe.

Per Coverity complaint.
2017-05-07 12:33:12 -04:00
Tom Lane d4e59c5521 Install the "posixrules" timezone link in MSVC builds.
Somehow, we'd missed ever doing this.  The consequences aren't too
severe: basically, the timezone library would fall back on its hardwired
notion of the DST transition dates to use for a POSIX-style zone name,
rather than obeying US/Eastern which is the intended behavior.  The net
effect would only be to obey current US DST law further back than it
ought to apply; so it's not real surprising that nobody noticed.

David Rowley, per report from Amit Kapila

Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com
2017-05-07 11:57:41 -04:00
Tom Lane 5788a5670e Restore fullname[] contents before falling through in pg_open_tzfile().
Fix oversight in commit af2c5aa88: if the shortcut open() doesn't work,
we need to reset fullname[] to be just the name of the toplevel tzdata
directory before we fall through into the pre-existing code.  This failed
to be exposed in my (tgl's) testing because the fall-through path is
actually never taken under normal circumstances.

David Rowley, per report from Amit Kapila

Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com
2017-05-07 11:34:31 -04:00
Stephen Frost 09f8421819 pg_dump: Don't leak memory in buildDefaultACLCommands()
buildDefaultACLCommands() didn't destroy the string buffer created in
certain cases, leading to a memory leak.  Fix by destroying the buffer
before returning from the function.

Spotted by Coverity.

Author: Michael Paquier

Back-patch to 9.6 where buildDefaultACLCommands() was added.
2017-05-06 22:58:12 -04:00
Stephen Frost aa5d3c0b3f RLS: Fix ALL vs. SELECT+UPDATE policy usage
When we add the SELECT-privilege based policies to the RLS with check
options (such as for an UPDATE statement, or when we have INSERT ...
RETURNING), we need to be sure and use the 'USING' case if the policy is
actually an 'ALL' policy (which could have both a USING clause and an
independent WITH CHECK clause).

This could result in policies acting differently when built using ALL
(when the ALL had both USING and WITH CHECK clauses) and when building
the policies independently as SELECT and UPDATE policies.

Fix this by adding an explicit boolean to add_with_check_options() to
indicate when the USING policy should be used, even if the policy has
both USING and WITH CHECK policies on it.

Reported by: Rod Taylor

Back-patch to 9.5 where RLS was introduced.
2017-05-06 21:46:35 -04:00
Andres Freund b58c433ef9 Fix duplicated words in comment.
Reported-By: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn3rY2N0gTWndaApD113T+O8L6oz8cm7_F3P8y4awdoOg@mail.gmail.com
Backpatch: no, only present in master
2017-05-06 17:03:45 -07:00
Andres Freund e6c44eef55 Fix off-by-one possibly leading to skipped XLOG_RUNNING_XACTS records.
Since 6ef2eba3f5 ("Skip checkpoints, archiving on idle systems."),
GetLastImportantRecPtr() is used to avoid performing superfluous
checkpoints, xlog switches, running-xact records when the system is
idle.  Unfortunately the check concerning running-xact records had a
off-by-one error, leading to such records being potentially skipped
when only a single record has been inserted since the last
running-xact record.

An alternative approach would have been to change
GetLastImportantRecPtr()'s definition to point to the end of records,
but that would make the checkpoint code more complicated.

Author: Andres Freund
Discussion: https://postgr.es/m/20170505012447.wsrympaxnfis6ojt@alap3.anarazel.de
Backpatch: no, code only present in master
2017-05-06 16:55:07 -07:00
Tom Lane b3a47cdfd6 Suppress compiler warning about unportable pointer value.
Setting a pointer value to "0xdeadbeef" draws a warning from some
compilers, and for good reason.  Be less cute and just set it to NULL.

In passing make some other cosmetic adjustments nearby.

Discussion: https://postgr.es/m/CAJrrPGdW3EkU-CRobvVKYf3fJuBdgWyuGeAbNzAQ4yBh+bfb_Q@mail.gmail.com
2017-05-05 12:46:04 -04:00
Alvaro Herrera 14722c69f9 Allow MSVC to build with Tcl 8.6.
Commit eaba54c20c added support for Tcl 8.6 for configure-supported
platforms after verifying that pltcl works without further changes, but
the MSVC tooling wasn't updated accordingly.  Update MSVC to match,
restructuring the code to avoid duplicating the logic for every Tcl
version supported.

Backpatch to all live branches, like eaba54c20c.  In 9.4 and previous,
change the patch to use backslashes rather than forward, as in the rest
of the file.

Reported by Paresh More, who also tested the patch I provided.
Discussion: https://postgr.es/m/CAAgiCNGVw3ssBtSi3ZNstrz5k00ax=UV+_ZEHUeW_LMSGL2sew@mail.gmail.com
2017-05-05 12:38:29 -03:00
Peter Eisentraut 086221cf6b Prevent panic during shutdown checkpoint
When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and throws
a PANIC if so.  At that point, only walsenders are still active, so one
might think this could not happen, but walsenders can also generate WAL,
for instance in BASE_BACKUP and certain variants of
CREATE_REPLICATION_SLOT.  So they can trigger this panic if such a
command is run while the shutdown checkpoint is being written.

To fix this, divide the walsender shutdown into two phases.  First, the
postmaster sends a SIGUSR2 signal to all walsenders.  The walsenders
then put themselves into the "stopping" state.  In this state, they
reject any new commands.  (For simplicity, we reject all new commands,
so that in the future we do not have to track meticulously which
commands might generate WAL.)  The checkpointer waits for all walsenders
to reach this state before proceeding with the shutdown checkpoint.
After the shutdown checkpoint is done, the postmaster sends
SIGINT (previously unused) to the walsenders.  This triggers the
existing shutdown behavior of sending out the shutdown checkpoint record
and then terminating.

Author: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-05 10:31:42 -04:00
Magnus Hagander 499ae5f5db Fix wording in pg_upgrade docs
Author: Daniel Gustafsson
2017-05-05 12:42:21 +02:00
Magnus Hagander 28d1c8ccc8 Build pgoutput.dll in MSVC build
Without this, logical replication obviously does not work on Windows

MauMau, with clean.bet additions from me per note from Michael Paquier
2017-05-05 12:08:48 +02:00
Heikki Linnakangas 0557a5dc2c Make SCRAM salts and nonces longer.
The salt is stored base64-encoded. With the old 10 bytes raw length, it was
always padded to 16 bytes after encoding. We might as well use 12 raw bytes
for the salt, and it's still encoded into 16 bytes.

Similarly for the random nonces, use a raw length that's divisible by 3, so
that there's no padding after base64 encoding. Make the nonces longer while
we're at it. 10 bytes was probably enough to prevent replay attacks, but
there's no reason to be skimpy here.

Per suggestion from Álvaro Hernández Tortosa.

Discussion: https://www.postgresql.org/message-id/df8c6e27-4d8e-5281-96e5-131a4e638fc8@8kdata.com
2017-05-05 10:02:13 +03:00
Heikki Linnakangas e6e9c4da3a Misc cleanup of SCRAM code.
* Remove is_scram_verifier() function. It was unused.
* Fix sanitize_char() function, used in error messages on protocol
  violations, to print bytes >= 0x7F correctly.
* Change spelling of scram_MockSalt() function to be more consistent with
  the surroundings.
* Change a few more references to "server proof" to "server signature" that
  I missed in commit d981074c24.
2017-05-05 10:01:44 +03:00
Heikki Linnakangas 344a113079 Don't use SCRAM-specific "e=invalid-proof" on invalid password.
Instead, send the same FATAL message as with other password-based
authentication mechanisms. This gives a more user-friendly message:

psql: FATAL:  password authentication failed for user "test"

instead of:

psql: error received from server in SASL exchange: invalid-proof

Even before this patch, the server sent that FATAL message, after the
SCRAM-specific "e=invalid-proof" message. But libpq would stop at the
SCRAM error message, and not process the ErrorResponse that would come
after that. We could've taught libpq to check for an ErrorResponse after
failed authentication, but it's simpler to modify the server to send only
the ErrorResponse. The SCRAM specification allows for aborting the
authentication at any point, using an application-defined error mechanism,
like PostgreSQL's ErrorResponse. Using the e=invalid-proof message is
optional.

Reported by Jeff Janes.

Discussion: https://www.postgresql.org/message-id/CAMkU%3D1w3jQ53M1OeNfN8Cxd9O%2BA_9VONJivTbYoYRRdRsLT6vA@mail.gmail.com
2017-05-05 10:01:41 +03:00
Stephen Frost 44c528810a Change the way pg_dump retrieves partitioning info
This gets rid of the code that issued separate queries to retrieve the
partitioning parent-child relationship, parent partition key, and child
partition bound information.  With this patch, the information is
retrieved instead using the queries issued from getTables() and
getInherits(), which is both more efficient than the previous approach
and doesn't require any new code.

Since the partitioning parent-child relationship is now retrieved with
the same old code that handles inheritance, partition attributes receive
a proper flagInhAttrs() treatment (that it didn't receive before), which
is needed so that the inherited NOT NULL constraints are not emitted if
we already emitted it for the parent.

Also, fix a bug in pg_dump's --binary-upgrade code, which caused pg_dump
to emit invalid command to attach a partition to its parent.

Author: Amit Langote, with some additional changes by me.
2017-05-04 22:17:52 -04:00
Tom Lane 3f074845a8 Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan.
GiST's getNextNearest() function attempts to pfree the previously-returned
tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older
branches).  However, if we are rescanning a plan node after ending a
previous scan early, those tuple pointers could be pointing to garbage,
because they would be pointing into the scan's pageDataCxt or queueCxt
which has been reset.  In a debug build this reliably results in a crash,
although I think it might sometimes accidentally fail to fail in
production builds.

To fix, clear the pointer field anyplace we reset a context it might
be pointing into.  This may be overkill --- I think probably only the
queueCxt case is involved in this bug, so that resetting in gistrescan()
would be sufficient --- but dangling pointers are generally bad news,
so let's avoid them.

Another plausible answer might be to just not bother with the pfree in
getNextNearest().  The reconstructed tuples would go away anyway in the
context resets, and I'm far from convinced that freeing them a bit earlier
really saves anything meaningful.  I'll stick with the original logic in
this patch, but if we find more problems in the same area we should
consider that approach.

Per bug #14641 from Denis Smirnov.  Back-patch to 9.5 where this
logic was introduced.

Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
2017-05-04 13:59:39 -04:00
Heikki Linnakangas 20bf7b2b0a Fix PQencryptPasswordConn to work with older server versions.
password_encryption was a boolean before version 10, so cope with "on" and
"off".

Also, change the behavior with "plain", to treat it the same as "md5".
We're discussing removing the password_encryption='plain' option from the
server altogether, which will make this the only reasonable choice, but
even if we kept it, it seems best to never send the password in cleartext.
2017-05-04 12:28:25 +03:00
Peter Eisentraut 0de791ed76 Fix cursor_to_xml in tableforest false mode
It only produced <row> elements but no wrapping <table> element.

By contrast, cursor_to_xmlschema produced a schema that is now correct
but did not previously match the XML data produced by cursor_to_xml.

In passing, also fix a minor misunderstanding about moving cursors in
the tests related to this.

Reported-by: filip@jirsak.org
Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
2017-05-03 21:41:10 -04:00
Tom Lane 4dd4104342 Remove useless and rather expensive stanza in matview regression test.
This removes a test case added by commit b69ec7cc9, which was intended
to exercise a corner case involving the rule used at that time that
materialized views were unpopulated iff they had physical size zero.
We got rid of that rule very shortly later, in commit 1d6c72a55, but
kept the test case.  However, because the case now asks what VACUUM
will do to a zero-sized physical file, it would be pretty surprising
if the answer were ever anything but "nothing" ... and if things were
indeed that broken, surely we'd find it out from other tests.  Since
the test involves a table that's fairly large by regression-test
standards (100K rows), it's quite slow to run.  Dropping it should
save some buildfarm cycles, so let's do that.

Discussion: https://postgr.es/m/32386.1493831320@sss.pgh.pa.us
2017-05-03 19:37:01 -04:00
Alvaro Herrera a93077ef46 Add pg_dump tests for CREATE STATISTICS
CREATE STATISTICS pg_dump support code was not covered at all by
previous tests.

Discussion: https://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
2017-05-03 15:52:00 -03:00
Alvaro Herrera 698923d658 pg_dump/t/002: append terminating semicolon to SQL commands
It's easy to overlook the need for one, and its lack is annoying for the
next developer wanting to create a new test.  Rather than expect every
individual command to add the semicolon, just append one automatically.

Discussion: http://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
2017-05-03 15:12:09 -03:00
Heikki Linnakangas 8f8b9be51f Add PQencryptPasswordConn function to libpq, use it in psql and createuser.
The new function supports creating SCRAM verifiers, in addition to md5
hashes. The algorithm is chosen based on password_encryption, by default.

This fixes the issue reported by Jeff Janes, that there was previously
no way to create a SCRAM verifier with "\password".

Michael Paquier and me

Discussion: https://www.postgresql.org/message-id/CAMkU%3D1wfBgFPbfAMYZQE78p%3DVhZX7nN86aWkp0QcCp%3D%2BKxZ%3Dbg%40mail.gmail.com
2017-05-03 11:19:07 +03:00
Tom Lane af2c5aa88d Improve performance of timezone loading, especially pg_timezone_names view.
tzparse() would attempt to load the "posixrules" timezone database file on
each call.  That might seem like it would only be an issue when selecting a
POSIX-style zone name rather than a zone defined in the timezone database,
but it turns out that each zone definition file contains a POSIX-style zone
string and tzload() will call tzparse() to parse that.  Thus, when scanning
the whole timezone file tree as we do in the pg_timezone_names view,
"posixrules" was read repetitively for each zone definition file.  Fix
that by caching the file on first use within any given process.  (We cache
other zone definitions for the life of the process, so there seems little
reason not to cache this one as well.)  This probably won't help much in
processes that never run pg_timezone_names, but even one additional SET
of the timezone GUC would come out ahead.

An even worse problem for pg_timezone_names is that pg_open_tzfile()
has an inefficient way of identifying the canonical case of a zone name:
it basically re-descends the directory tree to the zone file.  That's not
awful for an individual "SET timezone" operation, but it's pretty horrid
when we're inspecting every zone in the database.  And it's pointless too
because we already know the canonical spelling, having just read it from
the filesystem.  Fix by teaching pg_open_tzfile() to avoid the directory
search if it's not asked for the canonical name, and backfilling the
proper result in pg_tzenumerate_next().

In combination these changes seem to make the pg_timezone_names view
about 3x faster to read, for me.  Since a scan of pg_timezone_names
has up to now been one of the slowest queries in the regression tests,
this should help some little bit for buildfarm cycle times.

Back-patch to all supported branches, not so much because it's likely
that users will care much about the view's performance as because
tracking changes in the upstream IANA timezone code is really painful
if we don't keep all the branches in sync.

Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us
2017-05-02 21:50:35 -04:00
Tom Lane 23c6eb0336 Remove create_singleton_array(), hard-coding the case in its sole caller.
create_singleton_array() was not really as useful as we perhaps thought
when we added it.  It had never accreted more than one call site, and is
only saving a dozen lines of code at that one, which is considerably less
bulk than the function itself.  Moreover, because of its insistence on
using the caller's fn_extra cache space, it's arguably a coding hazard.
text_to_array_internal() does not currently use fn_extra in any other way,
but if it did it would be subtly broken, since the conflicting fn_extra
uses could be needed within a single query, in the seldom-tested case that
the field separator varies during the query.  The same objection seems
likely to apply to any other potential caller.

The replacement code is a bit uglier, because it hardwires knowledge of
the storage parameters of type TEXT, but it's not like we haven't got
dozens or hundreds of other places that do the same.  Uglier seems like
a good tradeoff for smaller, faster, and safer.

Per discussion with Neha Khatri.

Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
2017-05-02 20:41:37 -04:00
Tom Lane 9209e07605 Ensure commands in extension scripts see the results of preceding DDL.
Due to a missing CommandCounterIncrement() call, parsing of a non-utility
command in an extension script would not see the effects of the immediately
preceding DDL command, unless that command's execution ends with
CommandCounterIncrement() internally ... which some do but many don't.
Report by Philippe Beaudoin, diagnosis by Julien Rouhaud.

Rather remarkably, this bug has evaded detection since extensions were
invented, so back-patch to all supported branches.

Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
2017-05-02 18:06:09 -04:00
Alvaro Herrera 93bbeec6a2 extstats: change output functions to emit valid JSON
Manipulating extended statistics is more convenient as JSON than the
current ad-hoc format, so let's change before it's too late.

Discussion: https://postgr.es/m/20170420193828.k3fliiock5hdnehn@alvherre.pgsql
2017-05-02 18:49:32 -03:00
Robert Haas 0d1e1f0ea4 Fix typos in comments.
Etsuro Fujita

Discussion: http://postgr.es/m/00e88999-684d-d79a-70e4-908c937a0126@lab.ntt.co.jp
2017-05-02 14:47:46 -04:00
Peter Eisentraut 3d092fe540 Avoid unnecessary catalog updates in ALTER SEQUENCE
ALTER SEQUENCE can do nontransactional changes to the sequence (RESTART
clause) and transactional updates to the pg_sequence catalog (most other
clauses).  When just calling RESTART, the code would still needlessly do
a catalog update without any changes.  This would entangle that
operation in the concurrency issues of a catalog update (causing either
locking or concurrency errors, depending on how that issue is to be
resolved).

Fix by keeping track during options parsing whether a catalog update is
needed, and skip it if not.

Reported-by: Jason Petersen <jason@citusdata.com>
2017-05-02 10:41:48 -04:00
Andrew Dunstan 9a0d2008c3 Fix perl thinko in commit fed6df486d
Report and fix from Vaishnavi Prabakaran

Backpatch to 9.4 like original.
2017-05-02 08:20:11 -04:00
Magnus Hagander 34fc616738 Change hot_standby default value to 'on'
This goes together with the changes made to enable replication on the
sending side by default (wal_level, max_wal_senders etc) by making the
receiving stadby node also enable it by default.

Huong Dangminh
2017-05-02 11:12:30 +02:00
Peter Eisentraut a99448ab45 Don't wake up logical replication launcher unnecessarily
In CREATE SUBSCRIPTION, only wake up the launcher when the subscription
is enabled.

Author: Fujii Masao <masao.fujii@gmail.com>
2017-05-01 22:50:32 -04:00
Tom Lane 54affb41e7 Improve function header comment for create_singleton_array().
Mentioning the caller is neither future-proof nor an adequate substitute
for giving an API specification.  Per gripe from Neha Khatri, though
I changed the patch around some.

Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
2017-05-01 15:31:41 -04:00
Tom Lane 92a43e4857 Reduce semijoins with unique inner relations to plain inner joins.
If the inner relation can be proven unique, that is it can have no more
than one matching row for any row of the outer query, then we might as
well implement the semijoin as a plain inner join, allowing substantially
more freedom to the planner.  This is a form of outer join strength
reduction, but it can't be implemented in reduce_outer_joins() because
we don't have enough info about the individual relations at that stage.
Instead do it much like remove_useless_joins(): once we've built base
relations, we can make another pass over the SpecialJoinInfo list and
get rid of any entries representing reducible semijoins.

This is essentially a followon to the inner-unique patch (commit 9c7f5229a)
and makes use of the proof machinery that that patch created.  We need only
minor refactoring of innerrel_is_unique's API to support this usage.

Per performance complaint from Teodor Sigaev.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:53:42 -04:00
Tom Lane 2057a58d16 Fix mis-optimization of semijoins with more than one LHS relation.
The inner-unique patch (commit 9c7f5229a) supposed that if we're
considering a JOIN_UNIQUE_INNER join path, we can always set inner_unique
for the join, because the inner path produced by create_unique_path should
be unique relative to the outer relation.  However, that's true only if
we're considering joining to the whole outer relation --- otherwise we may
be applying only some of the join quals, and so the inner path might be
non-unique from the perspective of this join.  Adjust the test to only
believe that we can set inner_unique if we have the whole semijoin LHS on
the outer side.

There is more that can be done in this area, but this commit is only
intended to provide the minimal fix needed to get correct plans.

Per report from Teodor Sigaev.  Thanks to David Rowley for preliminary
investigation.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:39:11 -04:00
Tom Lane 74a20d0ab7 Update time zone data files to tzdata release 2017b.
DST law changes in Chile, Haiti, and Mongolia.  Historical corrections for
Ecuador, Kazakhstan, Liberia, and Spain.

The IANA crew continue their campaign to replace invented time zone
abbrevations with numeric GMT offsets.  This update changes numerous zones
in South America, the Pacific and Indian oceans, and some Asian and Middle
Eastern zones.  I kept these abbreviations in the tznames/ data files,
however, so that we will still accept them for input.  (We may want to
start trimming those files someday, but I think we should wait for the
upstream dust to settle before deciding what to do.)

In passing, add MESZ (Mitteleuropaeische Sommerzeit) to the tznames lists;
since we accept MEZ (Mitteleuropaeische Zeit) it seems rather strange not
to take the other one.  And fix some incorrect, or at least obsolete,
comments that certain abbreviations are not traceable to the IANA data.
2017-05-01 11:53:11 -04:00
Robert Haas bdac9836d3 libpq: Fix inadvertent change in .pgpass lookup behavior.
Commit 274bb2b385 caused password file
lookups to use the hostaddr in preference to the host, but that was
not intended and the documented behavior is the opposite.

Report and patch by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/20170428.165432.60857995.horiguchi.kyotaro@lab.ntt.co.jp
2017-05-01 11:29:00 -04:00
Andrew Dunstan fed6df486d Allow vcregress.pl to run an arbitrary TAP test set
Currently only provision for running the bin checks in a single step is
provided for. Now these tests can be run individually, as well as tests
in other locations (e.g. src.test/recover).

Also provide for suppressing unnecessary temp installs by setting the
NO_TEMP_INSTALL environment variable just as the Makefiles do.

Backpatch to 9.4.
2017-05-01 10:57:51 -04:00
Peter Eisentraut 9414e41ea7 Fix logical replication launcher wake up and reset
After the logical replication launcher was told to wake up at
commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake
up was not reset, so it would be woken up at every following commit as
well.  So fix that by resetting the flag.

Also, we don't need to wake up anything if the transaction was rolled
back.  Just reset the flag in that case.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-01 10:18:09 -04:00
Robert Haas e180c8aa8c Fire per-statement triggers on partitioned tables.
Even though no actual tuples are ever inserted into a partitioned
table (the actual tuples are in the partitions, not the partitioned
table itself), we still need to have a ResultRelInfo for the
partitioned table, or per-statement triggers won't get fired.

Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.

Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
2017-05-01 08:23:01 -04:00
Tom Lane e18b2c480d Sync our copy of the timezone library with IANA release tzcode2017b.
zic no longer mishandles some transitions in January 2038 when it
attempts to work around Qt bug 53071.  This fixes a bug affecting
Pacific/Tongatapu that was introduced in zic 2016e.  localtime.c
now contains a workaround, useful when loading a file generated by
a buggy zic.

There are assorted cosmetic changes as well, notably relocation
of a bunch of #defines.
2017-04-30 15:13:51 -04:00
Tom Lane 12d11432b4 Fix possible null pointer dereference or invalid warning message.
Thinko in commit de4389712: this warning message references the wrong
"LogicalRepWorker *" variable.  This would often result in a core dump,
but if it didn't, the message would show the wrong subscription OID.

In passing, adjust the message text to format a subscription OID
similarly to how that's done elsewhere in the function; and fix
grammatical issues in some nearby messages.

Per Coverity testing.
2017-04-30 12:21:02 -04:00
Tom Lane c23844212d Micro-optimize some slower queries in the opr_sanity regression test.
Convert the binary_coercible() and physically_coercible() functions from
SQL to plpgsql.  It's not that plpgsql is inherently better at doing
queries; if you simply convert the previous single SQL query into one
RETURN expression, it's no faster.  The problem with the existing code
is that it fools the plancache into deciding that it's worth re-planning
the query every time, since constant-folding with a concrete value for $2
allows elimination of at least one sub-SELECT.  In reality that's using the
planner to do the equivalent of a few runtime boolean tests, causing the
function to run much slower than it should.  Splitting the AND/OR logic
into separate plpgsql statements allows each if-expression to acquire a
static plan.

Also, get rid of some uses of obj_description() in favor of explicitly
joining to pg_description, allowing the joins to be optimized better.
(Someday we might improve the SQL-function-inlining logic enough that
this happens automatically, but today is not that day.)

Together, these changes reduce the runtime of the opr_sanity regression
test by about a factor of two on one of my slower machines.  They don't
seem to help as much on a fast machine, but this should at least benefit
the buildfarm.
2017-04-29 20:14:52 -04:00
Robert Haas 6a4dda44e0 Fix VALIDATE CONSTRAINT to consider NO INHERIT attribute.
Currently, trying to validate a NO INHERIT constraint on the parent will
search for the constraint in child tables (where it is not supposed to
exist), wrongly causing a "constraint does not exist" error.

Amit Langote, per a report from Hans Buschmann.

Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
2017-04-28 14:48:38 -04:00
Peter Eisentraut e4fddfd492 psql: Support identity columns in sequence display
Where the footer for an owned serial sequence would say "Owned by", put
something analogous for a sequence belonging to an identity column.

Reported-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
2017-04-28 14:43:36 -04:00
Robert Haas 5e1ccd4844 In load_relcache_init_file, initialize rd_pdcxt.
Oversight noted by Gao Zeng Qi.

Discussion: http://postgr.es/m/CAFmBtr1N3-SbepJbnGpaYp=jw-FvWMnYY7-bTtRgvjvbyB8YJA@mail.gmail.com
2017-04-28 14:05:13 -04:00
Robert Haas c1e0e7e1d7 Speed up dropping tables with many partitions.
We need to lock the parent, but we don't need a relcache entry
for it.

Gao Zeng Qi, reviewed by Amit Langote

Discussion: http://postgr.es/m/CAFmBtr0ukqJjRJEhPWL5wt4rNMrJUUxggVAGXPR3SyYh3E+HDQ@mail.gmail.com
2017-04-28 14:02:24 -04:00
Robert Haas 504c2205ab Fix crash when partitioned column specified twice.
Amit Langote, reviewed by Beena Emerson

Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
2017-04-28 13:52:17 -04:00
Peter Eisentraut e3cf708016 Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval (currently 5s by default).  This avoids
restarting failing workers in a tight loop.

We keep the last start times in a hash table last_start_times that is
separate from the table_states list, because that list is cleared out on
syscache invalidation, which happens whenever a table finishes syncing.
The hash table is kept until all tables have finished syncing.

A future project might be to unify these two and keep everything in one
data structure, but for now this is a less invasive change to accomplish
the original purpose.

For the test suite, set wal_retrieve_retry_interval to its minimum
value, to not increase the test suite run time.

Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
2017-04-28 13:47:46 -04:00
Heikki Linnakangas d981074c24 Misc SCRAM code cleanups.
* Move computation of SaltedPassword to a separate function from
  scram_ClientOrServerKey(). This saves a lot of cycles in libpq, by
  computing SaltedPassword only once per authentication. (Computing
  SaltedPassword is expensive by design.)

* Split scram_ClientOrServerKey() into two functions. Improves
  readability, by making the calling code less verbose.

* Rename "server proof" to "server signature", to better match the
  nomenclature used in RFC 5802.

* Rename SCRAM_SALT_LEN to SCRAM_DEFAULT_SALT_LEN, to make it more clear
  that the salt can be of any length, and the constant only specifies how
  long a salt we use when we generate a new verifier. Also rename
  SCRAM_ITERATIONS_DEFAULT to SCRAM_DEFAULT_ITERATIONS, for consistency.

These things caught my eye while working on other upcoming changes.
2017-04-28 15:22:38 +03:00
Stephen Frost b9a3ef55b2 Remove unnecessairly duplicated gram.y productions
Declarative partitioning duplicated the TypedTableElement productions,
evidently to remove the need to specify WITH OPTIONS when creating
partitions.  Instead, simply make WITH OPTIONS optional in the
TypedTableElement production and remove all of the duplicate
PartitionElement-related productions.  This change simplifies the
syntax and makes WITH OPTIONS optional when adding defaults, constraints
or storage parameters to columns when creating either typed tables or
partitions.

Also update pg_dump to no longer include WITH OPTIONS, since it's not
necessary, and update the documentation to reflect that WITH OPTIONS is
now optional.
2017-04-27 20:14:39 -04:00
Andres Freund ab9c43381e Don't build full initial logical decoding snapshot if NOEXPORT_SNAPSHOT.
Earlier commits (56e19d938d and 2bef06d516) make it cheaper to
create a logical slot if not exporting the initial snapshot.  If
NOEXPORT_SNAPSHOT is specified, we can skip the overhead, not just
when creating a slot via sql (which can't export snapshots).  As
NOEXPORT_SNAPSHOT has only recently been introduced, this shouldn't be
backpatched.
2017-04-27 15:52:31 -07:00
Andres Freund 56e19d938d Don't use on-disk snapshots for exported logical decoding snapshot.
Logical decoding stores historical snapshots on disk, so that logical
decoding can restart without having to reconstruct a snapshot from
scratch (for which the resources are not guaranteed to be present
anymore).  These serialized snapshots were also used when creating a
new slot via the walsender interface, which can export a "full"
snapshot (i.e. one that can read all tables, not just catalog ones).

The problem is that the serialized snapshots are only useful for
catalogs and not for normal user tables.  Thus the use of such a
serialized snapshot could result in an inconsistent snapshot being
exported, which could lead to queries returning wrong data.  This
would only happen if logical slots are created while another logical
slot already exists.

Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 15:29:15 -07:00
Tom Lane 7834d20b57 Avoid slow shutdown of pg_basebackup.
pg_basebackup's child process did not pay any attention to the pipe
from its parent while waiting for input from the source server.
If no server data was arriving, it would only wake up and check the
pipe every standby_message_timeout or so.  This creates a problem
since the parent process might determine and send the desired stop
position only after the server has reached end-of-WAL and stopped
sending data.  In the src/test/recovery regression tests, the timing
is repeatably such that it takes nearly 10 seconds for the child
process to realize that it should shut down.  It's not clear how
often that would happen in real-world cases, but it sure seems like
a bug --- and if the user turns off standby_message_timeout or sets
it very large, the delay could be a lot worse.

To fix, expand the StreamCtl API to allow the pipe input FD to be
passed down to the low-level wait routine, and watch both sockets
when sleeping.

(Note: AFAICS this issue doesn't affect the Windows port, since
it doesn't rely on a pipe to transfer the stop position to the
child thread.)

Discussion: https://postgr.es/m/6456.1493263884@sss.pgh.pa.us
2017-04-27 18:27:02 -04:00
Fujii Masao 9f11fcec66 Fix bug so logical rep launcher saves correctly time of last startup of worker.
Previously the logical replication launcher stored the last timestamp
when it started the worker, in the local variable "last_start_time",
in order to check whether wal_retrive_retry_interval elapsed since
the last startup of worker. If it has elapsed, the launcher sees
pg_subscription and starts new worker if necessary. This is for
limitting the startup of worker to once a wal_retrieve_retry_interval.

The bug was that the variable "last_start_time" was defined and
always initialized with 0 at the beginning of the launcher's main loop.
So even if it's set to the last timestamp in later phase of the loop,
it's always reset to 0. Therefore the launcher could not check
correctly whether wal_retrieve_retry_interval elapsed since
the last startup.

This patch moves the variable "last_start_time" outside the main loop
so that it will not be reset.

Reviewed-by: Petr Jelinek
Discussion: http://postgr.es/m/CAHGQGwGJrPO++XM4mFENAwpy1eGXKsGdguYv43GUgLgU-x8nTQ@mail.gmail.com
2017-04-28 06:35:00 +09:00
Tom Lane 82ebbeb0ab Cope with glibc too old to have epoll_create1().
Commit fa31b6f4e supposed that we didn't have to worry about that
anymore, but it seems that RHEL5 is like that, and that's still
a supported platform.  Put back the prior coding under an #ifdef,
adding an explicit fcntl() to retain the desired CLOEXEC property.

Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us
2017-04-27 17:13:53 -04:00
Andres Freund 2bef06d516 Preserve required !catalog tuples while computing initial decoding snapshot.
The logical decoding machinery already preserved all the required
catalog tuples, which is sufficient in the course of normal logical
decoding, but did not guarantee that non-catalog tuples were preserved
during computation of the initial snapshot when creating a slot over
the replication protocol.

This could cause a corrupted initial snapshot being exported.  The
time window for issues is usually not terribly large, but on a busy
server it's perfectly possible to it hit it.  Ongoing decoding is not
affected by this bug.

To avoid increased overhead for the SQL API, only retain additional
tuples when a logical slot is being created over the replication
protocol.  To do so this commit changes the signature of
CreateInitDecodingContext(), but it seems unlikely that it's being
used in an extension, so that's probably ok.

In a drive-by fix, fix handling of
ReplicationSlotsComputeRequiredXmin's already_locked argument, which
should only apply to ProcArrayLock, not ReplicationSlotControlLock.

Reported-By: Erik Rijkers
Analyzed-By: Petr Jelinek
Author: Petr Jelinek, heavily editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 13:13:36 -07:00
Tom Lane fa31b6f4e9 Make latch.c more paranoid about child-process cases.
Although the postmaster doesn't currently create a self-pipe or any
latches, there's discussion of it doing so in future.  It's also
conceivable that a shared_preload_libraries extension would try to
create such a thing in the postmaster process today.  In that case
the self-pipe FDs would be inherited by forked child processes.
latch.c was entirely unprepared for such a case and could suffer an
assertion failure, or worse try to use the inherited pipe if somebody
called WaitLatch without having called InitializeLatchSupport in that
process.  Make it keep track of whether InitializeLatchSupport has been
called in the *current* process, and do the right thing if state has
been inherited from a parent.

Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe,
as well as epoll event sets).  This ensures that child processes spawned
in backends, the archiver, etc cannot accidentally or intentionally mess
with these FDs.  It also ensures that we end up with the right state
for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't
know to close the postmaster's self-pipe FDs.

Back-patch to 9.6, mainly to keep latch.c looking similar in all branches
it exists in.

Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us
2017-04-27 15:07:36 -04:00
Simon Riggs 49e9281549 Rework handling of subtransactions in 2PC recovery
The bug fixed by 0874d4f3e1
caused us to question and rework the handling of
subtransactions in 2PC during and at end of recovery.
Patch adds checks and tests to ensure no further bugs.

This effectively removes the temporary measure put in place
by 546c13e11b.

Author: Simon Riggs
Reviewed-by: Tom Lane, Michael Paquier
Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com
2017-04-27 14:41:22 +02:00
Simon Riggs 0352c15e5a Additional tests for subtransactions in recovery
Tests for normal and prepared transactions

Author: Nikhil Sontakke, placed in new test file by me
2017-04-27 14:26:57 +02:00
Peter Eisentraut 6c9bd27aec Fix typo in comment
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-04-26 21:13:01 -04:00
Tom Lane aa1351f1ee Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long.  However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once.  On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives.  (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.

As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.

There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).

Backpatch to 9.6 where parallel query was added.  The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.

Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 16:17:34 -04:00
Stephen Frost 0c76c2463e pg_get_partkeydef: return NULL for non-partitions
Our general rule for pg_get_X(oid) functions is to simply return NULL
when passed an invalid or inappropriate OID.  Teach pg_get_partkeydef to
do this also, making it easier for users to use this function when
querying against tables with both partitions and non-partitions (such as
pg_class).

As a concrete example, this makes pg_dump's life a little easier.

Author: Amit Langote
2017-04-26 14:59:22 -04:00
Tom Lane 49da00677d Silence compiler warning induced by commit de4389712.
Smarter compilers can see that "slot" can't be used uninitialized,
but some popular ones cannot.  Noted by Jeff Janes.
2017-04-26 14:01:26 -04:00
Peter Eisentraut 61ecc90be6 Fix query that gets remote relation info
Publisher relation can be incorrectly chosen, if there are more than
one relation in different schemas with the same name.

Author: Euler Taveira <euler@timbira.com.br>
2017-04-26 12:07:22 -04:00
Peter Eisentraut e495c1683f Spelling fixes in code comments
Author: Euler Taveira <euler@timbira.com.br>
2017-04-26 12:07:11 -04:00
Fujii Masao 1f8b060121 Fix typo in comment.
Author: Masahiko Sawada
2017-04-27 00:03:07 +09:00
Peter Eisentraut de43897122 Fix various concurrency issues in logical replication worker launching
The code was originally written with assumption that launcher is the
only process starting the worker.  However that hasn't been true since
commit 7c4f52409 which failed to modify the worker management code
adequately.

This patch adds an in_use field to the LogicalRepWorker struct to
indicate whether the worker slot is being used and uses proper locking
everywhere this flag is set or read.

However if the parent process dies while the new worker is starting and
the new worker fails to attach to shared memory, this flag would never
get cleared.  We solve this rare corner case by adding a sort of garbage
collector for in_use slots.  This uses another field in the
LogicalRepWorker struct named launch_time that contains the time when
the worker was started.  If any request to start a new worker does not
find free slot, we'll check for workers that were supposed to start but
took too long to actually do so, and reuse their slot.

In passing also fix possible race conditions when stopping a worker that
hasn't finished starting yet.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-04-26 10:45:59 -04:00
Stephen Frost 9139aa1942 Allow ALTER TABLE ONLY on partitioned tables
There is no need to forbid ALTER TABLE ONLY on partitioned tables,
when no partitions exist yet.  This can be handy for users who are
building up their partitioned table independently and will create actual
partitions later.

In addition, this is how pg_dump likes to operate in certain instances.

Author: Amit Langote, with some error message word-smithing by me
2017-04-25 16:57:43 -04:00
Peter Eisentraut a3f17b9c31 Wake up launcher when enabling a subscription
Otherwise one would have to wait up to DEFAULT_NAPTIME_PER_CYCLE until
the subscription worker is considered for starting.

There is a small race condition:  If one enables a subscription right
after disabling it, the launcher might not have registered the stopping
when receiving the wakeup signal for the re-enabling.  The start will
then not happen right away but after the full cycle time.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
2017-04-25 14:40:33 -04:00
Fujii Masao 346199dcab Set the priorities of all quorum synchronous standbys to 1.
In quorum-based synchronous replication, all the standbys listed in
synchronous_standby_names equally have chances to be chosen
as synchronous standbys. So they should have the same priority.
However, previously, quorum standbys whose names appear earlier
in the list were given higher priority values though the difference of
those priority values didn't affect the selection of synchronous standbys.
Users could see those "meaningless" priority values in pg_stat_replication
and this was confusing.

This commit gives all the quorum synchronous standbys the same
highest priority, i.e., 1, in order to remove such confusion.

Author: Fujii Masao
Reviewed-by: Masahiko Sawada, Kyotaro Horiguchi
Discussion: http://postgr.es/m/CAHGQGwEKOw=SmPLxJzkBsH6wwDBgOnVz46QjHbtsiZ-d-2RGUg@mail.gmail.com
2017-04-26 01:07:13 +09:00
Robert Haas 914ae8d3cb Adjust outdated comment.
Commit 5dfc198146 removed the only
existing caller of hash_freeze, but left behind a comment indicating
that hash_freeze was still used.  Adjust.

Kyotaro Horiguchi

Discussion: http://postgr.es/m/20170424.165541.230634914.horiguchi.kyotaro@lab.ntt.co.jp
2017-04-25 10:58:45 -04:00
Fujii Masao 7cc14ae9d8 Update copyright in recently added files.
This commit also fixes copyright line missed by the automated script.

Author: Masahiko Sawada
2017-04-25 23:38:41 +09:00
Tom Lane 64925603c9 Revert "Use pselect(2) not select(2), if available, to wait in postmaster's loop."
This reverts commit 81069a9efc.

Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional.  Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble).  If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.

Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us
2017-04-24 18:29:03 -04:00
Tom Lane 81069a9efc Use pselect(2) not select(2), if available, to wait in postmaster's loop.
Traditionally we've unblocked signals, called select(2), and then blocked
signals again.  The code expects that the select() will be cancelled with
EINTR if an interrupt occurs; but there's a race condition, which is that
an already-pending signal will be delivered as soon as we unblock, and then
when we reach select() there will be nothing preventing it from waiting.
This can result in a long delay before we perform any action that
ServerLoop was supposed to have taken in response to the signal.  As with
the somewhat-similar symptoms fixed by commit 893902085, the main practical
problem is slow launching of parallel workers.  The window for trouble is
usually pretty short, corresponding to one iteration of ServerLoop; but
it's not negligible.

To fix, use pselect(2) in place of select(2) where available, as that's
designed to solve exactly this problem.  Where not available, we continue
to use the old way, and are no worse off than before.

pselect(2) has been required by POSIX since about 2001, so most modern
platforms should have it.  A bigger portability issue is that some
implementations are said to be non-atomic, ie pselect() isn't really
any different from unblock/select/reblock.  Still, we're no worse off
than before on such a platform.

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point this
could be reverted, since we'd be using a self-pipe to solve the race
condition.  But that's not happening before v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 14:03:14 -04:00
Tom Lane 8939020853 Run the postmaster's signal handlers without SA_RESTART.
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop().  The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed.  However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2).  Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.

Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.

We can fix this by running the postmaster's signal handlers without
SA_RESTART.  That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough.  (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch.  But that's not happening before
v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 13:00:30 -04:00
Fujii Masao cbc2270e3f Get rid of extern declarations of non-existent functions.
Those extern declartions were mistakenly added by commit 7c4f52409.

Author: Petr Jelinek
2017-04-25 01:31:42 +09:00
Tom Lane 4fe04244b5 Fix postmaster's handling of fork failure for a bgworker process.
This corner case didn't behave nicely at all: the postmaster would
(partially) update its state as though the process had started
successfully, and be quite confused thereafter.  Fix it to act
like the worker had crashed, instead.

In passing, refactor so that do_start_bgworker contains all the
state-change logic for bgworker launch, rather than just some of it.

Back-patch as far as 9.4.  9.3 contains similar logic, but it's just
enough different that I don't feel comfortable applying the patch
without more study; and the use of bgworkers in 9.3 was so small
that it doesn't seem worth the extra work.

transam/parallel.c is still entirely unprepared for the possibility
of bgworker startup failure, but that seems like material for a
separate patch.

Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us
2017-04-24 12:16:58 -04:00
Tom Lane 4b34624daa Code review for commands/statscmds.c.
Fix machine-dependent sorting of column numbers.  (Odd behavior
would only materialize for column numbers above 255, but that's
certainly legal.)

Fix poor choice of SQLSTATE for some errors, and improve error message
wording.  (Notably, "is not a scalar type" is a totally misleading way
to explain "does not have a default btree opclass".)

Avoid taking AccessExclusiveLock on the associated relation during DROP
STATISTICS.  That's neither necessary nor desirable, and it could easily
have put us into situations where DROP fails (compare commit 68ea2b7f9).

Adjust/improve comments.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAKJS1f-GmCfPvBbAEaM5xoVOaYdVgVN1gicALSoYQ77z-+vLbw@mail.gmail.com
2017-04-24 11:15:15 -04:00
Andres Freund b182a4ae2f Don't include sys/poll.h anymore.
poll.h is mandated by Single Unix Spec v2, the usual baseline for
postgres on unix.  None of the unixoid buildfarms animals has
sys/poll.h but not poll.h.  Therefore there's not much point to test
for sys/poll.h's existence and include it optionally.

Author: Andres Freund, per suggestion from Tom Lane
Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us
2017-04-23 16:11:35 -07:00
Andres Freund eb97aa7e65 Zero padding in replication origin's checkpointed on disk-state.
This seems to be largely cosmetic, avoiding valgrind bleats and the
like. The uninitialized padding influences the CRC of the on-disk
entry, but because it's also used when verifying the CRC, that doesn't
cause spurious failures.  Backpatch nonetheless.

It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql
doesn't exercise the checkpoint path, but checkpoints are fairly
expensive on weaker machines, and we'd have to stop/start for that to
be meaningful.

Author: Andres Freund
Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
Backpatch: 9.5, where replication origins were introduced
2017-04-23 15:54:41 -07:00
Andres Freund e84d243b1c Initialize all memory for logical replication relation cache.
As reported by buildfarm animal skink / valgrind, some of the
variables weren't always initialized.  To avoid further mishaps use
memset to ensure the entire entry is initialized.

Author: Petr Jelinek
Reported-By: Andres Freund
Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
Backpatch: none, code new in master
2017-04-23 15:54:41 -07:00
Andres Freund 61c21ddad0 Remove select(2) backed latch implementation.
poll(2) is required by Single Unix Spec v2, the usual baseline for
postgres (leaving windows aside).  There's not been any buildfarm
animals without poll(2) for a long while, leaving the select(2)
implementation to be largely untested.

On windows, including mingw, poll() is not available, but we have a
special case implementation for windows anyway.

Author: Andres Freund
Discussion: https://postgr.es/m/20170420003611.7r2sdvehesdyiz2i@alap3.anarazel.de
2017-04-23 15:31:41 -07:00
Simon Riggs 546c13e11b Workaround for RecoverPreparedTransactions()
Force overwriteOK = true while we investigate deeper fix

Proposed by Tom Lane as temporary measure, accepted by me
2017-04-23 22:12:01 +01:00
Simon Riggs 8463880872 Fix LagTrackerRead() for timeline increments
Bug was masked by error in running 004_timeline_switch.pl that was
fixed recently in 7d68f2281a.

Detective work by Alvaro Herrera and Tom Lane

Author: Thomas Munro
2017-04-23 21:35:41 +01:00
Tom Lane 0874d4f3e1 Fix order of arguments to SubTransSetParent().
ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions)
mixed up the parent and child XIDs when calling SubTransSetParent to
record the transactions' relationship in pg_subtrans.

Remarkably, analysis by Simon Riggs suggests that this doesn't lead to
visible problems (at least, not in non-Assert builds).  That might
explain why we'd not noticed it before.  Nonetheless, it's surely wrong.

This code was born broken, so back-patch to all supported branches.

Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
2017-04-23 13:11:06 -04:00
Andrew Dunstan 33f3bbc6d3 Fix TAP infrastructure to support Mingw better
archive_command and restore_command need to refer to Windows paths, not
Msys virtual file system paths, as postgres is completely unaware of the
latter, so prefix them with the Windows path to the virtual file system
root. Clean psql and pg_recvlogical output of carriage returns.
2017-04-23 09:21:38 -04:00
Tom Lane 7d68f2281a Make PostgresNode.pm check server status more carefully.
PostgresNode blithely ignored the exit status of pg_ctl, and in general
made no effort to be sure that the server was running when it should be.
This caused it to miss server crashes, which is a serious shortcoming
in a test scaffold.  Make it complain if pg_ctl fails, and modify the
start and stop logic to complain if the server doesn't start, or doesn't
stop, when expected.

Also, have it turn off the "restart_after_crash" configuration parameter
in created clusters, as bitter experience has shown that leaving that on
can mask crashes too.

We might at some point need variant functions that allow for, eg,
server start failure to be expected.  But no existing test case appears
to want that, and it surely shouldn't be the default behavior.

Note that this *will* break the buildfarm, as it will expose known
bugs that the previous testing failed to.  I'm committing it despite
that, to verify that we get the expected failures in the buildfarm
not just in manual testing.

Back-patch into 9.6 where PostgresNode was introduced.  (The 9.6
branch is not expected to show any failures.)

Discussion: https://postgr.es/m/21432.1492886428@sss.pgh.pa.us
2017-04-22 18:18:25 -04:00
Tom Lane 8a19c1a373 Make PostgresNode::append_conf append a newline automatically.
Although the documentation for append_conf said clearly that it didn't
add a newline, many test authors seem to have forgotten that ... or maybe
they just consulted the example at the top of the POD documentation,
which clearly shows adding a config entry without bothering to add a
trailing newline.  The worst part of that is that it works, as long as
you don't do it more than once, since the backend isn't picky about
whether config files end with newlines.  So there's not a strong forcing
function reminding test authors not to do it like that.  Upshot is that
this is a terribly fragile way to go about things, and there's at least
one existing test case that is demonstrably broken and not testing what
it thinks it is.

Let's just make append_conf append a newline, instead; that is clearly
way safer than the old definition.

I also cleaned up a few call sites that were unnecessarily ugly.
(I left things alone in places where it's plausible that additional
config lines would need to be added someday.)

Back-patch the change in append_conf itself to 9.6 where it was added,
as having a definitional inconsistency between branches would obviously
be pretty hazardous for back-patching TAP tests.  The other changes are
just cosmetic and don't need to be back-patched.

Discussion: https://postgr.es/m/19751.1492892376@sss.pgh.pa.us
2017-04-22 16:58:15 -04:00
Andrew Dunstan f92562adba Require sufficiently modern version of Test::More for TAP tests
Ancient versions of Test::More don't support the note() function used in
some TAP tests, so we require the minimum version of the module that
does.
2017-04-22 10:04:01 -04:00
Tom Lane 5041cdf2b7 Partially revert commit 536d47bd9d.
Per buildfarm, the "#ifdef F_SETFD" removed in that commit actually
is needed on Windows, because fcntl() isn't available at all on that
platform, unless using Cygwin.  We could perhaps spell it more like
"#ifdef HAVE_FCNTL", or "#ifndef WIN32", but it's not clear that
those choices are better.

It does seem that we don't need the bogus manual definition of
FD_CLOEXEC, though, so keep that change.

Discussion: https://postgr.es/m/26254.1492805635@sss.pgh.pa.us
2017-04-22 02:06:16 -04:00
Tom Lane 3e51725b38 Avoid depending on non-POSIX behavior of fcntl(2).
The POSIX standard does not say that the success return value for
fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1.
We had several calls that were making the stronger assumption.  Adjust
them to test specifically for -1 for strict spec compliance.

The standard further leaves open the possibility that the O_NONBLOCK
flag bit is not the only active one in F_SETFL's argument.  Formally,
therefore, one ought to get the current flags with F_GETFL and store
them back with only the O_NONBLOCK bit changed when trying to change
the nonblock state.  In port/noblock.c, we were doing the full pushup
in pg_set_block but not in pg_set_noblock, which is just weird.  Make
both of them do it properly, since they have little business making
any assumptions about the socket they're handed.  The other places
where we're issuing F_SETFL are working with FDs we just got from
pipe(2), so it's reasonable to assume the FDs' properties are all
default, so I didn't bother adding F_GETFL steps there.

Also, while pg_set_block deserves some points for trying to do things
right, somebody had decided that it'd be even better to cast fcntl's
third argument to "long".  Which is completely loony, because POSIX
clearly says the third argument for an F_SETFL call is "int".

Given the lack of field complaints, these missteps apparently are not
of significance on any common platforms.  But they're still wrong,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
2017-04-21 15:56:16 -04:00