Commit Graph

29278 Commits

Author SHA1 Message Date
Kevin Grittner 8c48375e5f Implement syntax for transition tables in AFTER triggers.
This is infrastructure for the complete SQL standard feature.  No
support is included at this point for execution nodes or PLs.  The
intent is to add that soon.

As this patch leaves things, standard syntax can create tuplestores
to contain old and/or new versions of rows affected by a statement.
References to these tuplestores are in the TriggerData structure.
C triggers can access the tuplestores directly, so they are usable,
but they cannot yet be referenced within a SQL statement.
2016-11-04 10:49:50 -05:00
Peter Eisentraut 69d590fffb pg_xlogdump: Add NLS
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-11-04 10:40:05 -04:00
Peter Eisentraut 59fa9d2d9d pg_test_timing: Add NLS
Also straighten out use of time unit abbreviations a bit.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-11-04 10:40:05 -04:00
Peter Eisentraut a39255d766 pg_test_fsync: Add NLS
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-11-04 10:40:05 -04:00
Peter Eisentraut 632fbe772c pg_archivecleanup: Add NLS
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-11-04 10:40:05 -04:00
Robert Haas f2e6a2ccf1 Add API to check if an existing exclusive lock allows cleanup.
LockBufferForCleanup() acquires a cleanup lock unconditionally, and
ConditionalLockBufferForCleanup() acquires a cleanup lock if it is
possible to do so without waiting; this patch adds a new API,
IsBufferCleanupOK(), which tests whether an exclusive lock already
held happens to be a cleanup lock.  This is possible because a cleanup
lock simply means an exclusive lock plus the assurance any other pins
on the buffer are newer than our own pin.  Therefore, just as the
existing functions decide that the exclusive lock that they've just
taken is a cleanup lock if they observe the pin count to be 1, this
new function allows us to observe that the pin count is 1 on a buffer
we've already locked.

This is useful in situations where a backend definitely wishes to
modify the buffer and also wishes to perform cleanup operations if
possible.  The patch to eliminate heavyweight locking by hash indexes
uses this, and it may have other applications as well.

Amit Kapila, per a suggestion from me.  Some comment adjustments by me
as well.
2016-11-04 09:32:24 -04:00
Tom Lane 1f87181e12 Sync our copy of the timezone library with IANA tzcode master.
This patch absorbs some unreleased fixes for symlink manipulation bugs
introduced in tzcode 2016g.  Ordinarily I'd wait around for a released
version, but in this case it seems like we could do with extra testing,
in particular checking whether it works in EDB's VMware build environment.
This corresponds to commit aec59156abbf8472ba201b6c7ca2592f9c10e077 in
https://github.com/eggert/tz.

Per a report from Sandeep Thakkar, building in an environment where hard
links are not supported in the timezone data installation directory failed,
because upstream code refactoring had broken the case of symlinking from an
existing symlink.  Further experimentation also showed that the symlinks
were sometimes made incorrectly, with too many or too few "../"'s in the
symlink contents.

This should get back-patched, but first let's see what the buildfarm
makes of it.  I'm not too sure about the new dependency on linkat(2).

Report: <CANFyU94_p6mqRQc2i26PFp5QAOQGB++AjGX=FO8LDpXw0GSTjw@mail.gmail.com>
Discussion: http://mm.icann.org/pipermail/tz/2016-November/024431.html
2016-11-03 22:24:34 -04:00
Peter Eisentraut a0f357e570 psql: Split up "Modifiers" column in \d and \dD
Make separate columns "Collation", "Nullable", "Default".

Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
2016-11-03 14:02:46 -04:00
Robert Haas 1d15d0db50 psql: Tab-complete LOCK [TABLE] ... IN {ACCESS|ROW|SHARE}.
Suggest the lock modes that begin with the word in question.

Thomas Munro, reviewed by Marllius Ribeiro.  Comments tweaked by me.
2016-11-03 11:42:13 -04:00
Robert Haas 274bb2b385 libpq: Allow connection strings and URIs to specify multiple hosts.
It's also possible to specify a separate port for each host.

Previously, we'd loop over every address returned by looking up the
host name; now, we'll try every address for every host name.

Patch by me.  Victor Wagner wrote an earlier patch for this feature,
which I read, but I didn't use any of his code.  Review by Mithun Cy.
2016-11-03 09:25:20 -04:00
Tom Lane 770671062f Don't make FK-based selectivity estimates in inheritance situations.
The foreign-key-aware logic for estimation of join sizes (added in commit
100340e2d) blindly tried to apply the concept to rels that are actually
parents of inheritance trees.  This is just plain wrong so far as the
referenced relation is concerned, since the inheritance scan may well
produce lots of rows that are not participating in the constraint.  It's
wrong for the referencing relation too, for the same reason; although on
that end we could conceivably detect whether all members of the inheritance
tree have equivalent FK constraints pointing to the same referenced rel,
and then proceed more or less as we do now.  But pending somebody writing
code to do that, we must disable this, because it's producing completely
silly estimates when there's an FK linking the heads of inheritance trees.

Per bug #14404 from Clinton Adams.  Back-patch to 9.6 where the new
estimation logic came in.

Report: <20161028200412.15987.96482@wrigleys.postgresql.org>
2016-11-02 15:50:15 -04:00
Tom Lane da8f3ebf30 Don't convert Consts into Vars during setrefs.c processing.
While converting expressions in an upper-level plan node so that they
reference Vars and expressions provided by the input plan node(s),
don't convert plain Const items, even if there happens to be a matching
Const in the input.  It's silly to do so because a Var is more expensive to
execute than a Const.  Moreover, converting can fool ExecCheckPlanOutput's
check that an insert or update query inserts nulls into dropped columns,
leading to "query provides a value for a dropped column" errors during
INSERT or UPDATE on a table with a dropped column.  We could solve this
by making that check more complicated, but I don't see the point; this fix
should save a marginal number of cycles, and it also makes for less messy
EXPLAIN output, as shown by the ensuing regression test result changes.

Per report from Pavel Hanák.  I have not incorporated a test case based
on that example, as there doesn't seem to be a simple way of checking
this in isolation without making a bunch of assumptions about other
planner and SQL-function behavior.

Back-patch to 9.6.  This setrefs.c behavior exists much further back,
but there is not currently reason to think that it causes problems
before 9.6.

Discussion: <83shraampf.fsf@is-it.eu>
2016-11-02 14:32:13 -04:00
Peter Eisentraut 3a47c704fb Add make rules to download raw Unicode mapping files
This serves as implicit documentation and is handy if someone wants to
tweak things.  The rules are not part of a normal build, like this
entire directory.
2016-11-01 11:54:58 -04:00
Robert Haas 6bb9a6177d Remove declarations for pq_putmessage_hook and pq_flush_hook.
Commit 2bd9e412f9 added these in error.
They were part of an earlier design for that patch and survived in the
committed version only by inadvertency.

Julien Rouhaud
2016-10-31 09:14:46 -04:00
Tom Lane 5ec81aceec Fix nasty performance problem in tsquery_rewrite().
tsquery_rewrite() tries to find matches to subsets of AND/OR conditions;
for example, in the query 'a | b | c' the substitution subquery 'a | c'
should match and lead to replacement of the first and third items.
That's fine, but the matching algorithm apparently takes about O(2^N)
for an N-clause query (I say "apparently" because the code is also both
unintelligible and uncommented).  We could probably do better than that
even without any extra assumptions --- but actually, we know that the
subclauses are sorted, indeed are depending on that elsewhere in this very
same function.  So we can just scan the two lists a single time to detect
matches, as though we were doing a merge join.

Also do a re-flattening call (QTNTernary()) in tsquery_rewrite_query, just
to make sure that the tree fits the expectations of the next search cycle.
I didn't try to devise a test case for this, but I'm pretty sure that the
oversight could have led to failure to match in some cases where a match
would be expected.

Improve comments, and also stick a CHECK_FOR_INTERRUPTS into
dofindsubquery, just in case it's still too slow for somebody.

Per report from Andreas Seltenreich.  Back-patch to all supported branches.

Discussion: <8760oasf2y.fsf@credativ.de>
2016-10-30 17:35:42 -04:00
Tom Lane 24ebc444c6 Fix bogus tree-flattening logic in QTNTernary().
QTNTernary() contains logic to flatten, eg, '(a & b) & c' into 'a & b & c',
which is all well and good, but it tries to do that to NOT nodes as well,
so that '!!a' gets changed to '!a'.  Explicitly restrict the conversion to
be done only on AND and OR nodes, and add a test case illustrating the bug.

In passing, provide some comments for the sadly naked functions in
tsquery_util.c, and simplify some baroque logic in QTNFree(), which
I think may have been leaking some items it intended to free.

Noted while investigating a complaint from Andreas Seltenreich.
Back-patch to all supported versions.
2016-10-30 15:24:40 -04:00
Tom Lane 9a00f03e47 Improve speed of aggregates that use array_append as transition function.
In the previous coding, if an aggregate's transition function returned an
expanded array, nodeAgg.c and nodeWindowAgg.c would always copy it and thus
force it into the flat representation.  This led to ping-ponging between
flat and expanded formats, which costs a lot.  For an aggregate using
array_append as transition function, I measured about a 15X slowdown
compared to the pre-9.5 code, when working on simple int[] arrays.
Of course, the old code was already O(N^2) in this usage due to copying
flat arrays all the time, but it wasn't quite this inefficient.

To fix, teach nodeAgg.c and nodeWindowAgg.c to allow expanded transition
values without copying, so long as the transition function takes care to
return the transition value already properly parented under the aggcontext.
That puts a bit of extra responsibility on the transition function, but
doing it this way allows us to not need any extra logic in the fast path
of advance_transition_function (ie, with a pass-by-value transition value,
or with a modified-in-place pass-by-reference value).  We already know
that that's a hot spot so I'm loath to add any cycles at all there.  Also,
while only array_append currently knows how to follow this convention,
this solution allows other transition functions to opt-in without needing
to have a whitelist in the core aggregation code.

(The reason we would need a whitelist is that currently, if you pass a
R/W expanded-object pointer to an arbitrary function, it's allowed to do
anything with it including deleting it; that breaks the core agg code's
assumption that it should free discarded values.  Returning a value under
aggcontext is the transition function's signal that it knows it is an
aggregate transition function and will play nice.  Possibly the API rules
for expanded objects should be refined, but that would not be a
back-patchable change.)

With this fix, an aggregate using array_append is no longer O(N^2), so it's
much faster than pre-9.5 code rather than much slower.  It's still a bit
slower than the bespoke infrastructure for array_agg, but the differential
seems to be only about 10%-20% rather than orders of magnitude.

Discussion: <6315.1477677885@sss.pgh.pa.us>
2016-10-30 12:27:41 -04:00
Magnus Hagander a775406ec4 Fix memory leak in tar file padding
Spotted by Coverity, patch by Michael Paquier
2016-10-30 14:10:39 +01:00
Robert Haas 33839b5ffb Fix leftover reference to background writer performing checkpoints.
This was changed in PostgreSQL 9.2, but somehow this comment never
got updated.
2016-10-28 09:09:00 -04:00
Peter Eisentraut ce4dc97056 Remove invitation to report a bug about unknown encoding
The error message when we couldn't determine the encoding from a locale
said to report a bug about that.  That might have been appropriate when
this code was first added, but by now this works pretty solidly and any
encodings we don't recognize we probably just don't support.  We still
print the warning, but no longer invite the bug report.
2016-10-27 18:43:46 -04:00
Peter Eisentraut eaed88ce12 Add function name to PyArg_ParseTuple()
This causes the supplied function name to appear in any error message,
making the error message friendlier and relieving us from having to
provide our own in some cases.
2016-10-27 15:41:29 -04:00
Peter Eisentraut 84d457edaf Format PL/Python module contents test vertically
It makes it readable again and makes merges more manageable.
2016-10-27 15:41:29 -04:00
Robert Haas 4f714b2fd2 If the stats collector dies during Hot Standby, restart it.
This bug exists as far back as 9.0, when Hot Standby was introduced,
so back-patch to all supported branches.

Report and patch by Takayuki Tsunakawa, reviewed by Michael Paquier
and Kuntal Ghosh.
2016-10-27 14:27:40 -04:00
Robert Haas f267c1c244 Fix possible pg_basebackup failure on standby with "include WAL".
If a restartpoint flushed no dirty buffers, it could fail to update
the minimum recovery point, leading to a minimum recovery point prior
to the starting REDO location.  perform_base_backup() would interpret
that as meaning that no WAL files at all needed to be included in the
backup, failing an internal sanity check.  To fix, have restartpoints
always update the minimum recovery point to just after the checkpoint
record itself, so that the file (or files) containing the checkpoint
record will always be included in the backup.

Code by Amit Kapila, per a design suggestion by me, with some
additional work on the code comment by me.  Test case by Michael
Paquier.  Report by Kyotaro Horiguchi.
2016-10-27 11:19:51 -04:00
Peter Eisentraut c32fe432af Avoid using a C++ keyword in header file
per cpluspluscheck
2016-10-26 22:41:56 -04:00
Bruce Momjian 586a46c22c Properly indent postgresql.conf comments to align
A few comments were misaligned.
2016-10-26 21:16:50 -04:00
Tom Lane a522fc3d80 Fix incorrect trigger-property updating in ALTER CONSTRAINT.
The code to change the deferrability properties of a foreign-key constraint
updated all the associated triggers to match; but a moment's examination of
the code that creates those triggers in the first place shows that only
some of them should track the constraint's deferrability properties.  This
leads to odd failures in subsequent exercise of the foreign key, as the
triggers are fired at the wrong times.  Fix that, and add a regression test
comparing the trigger properties produced by ALTER CONSTRAINT with those
you get by creating the constraint as-intended to begin with.

Per report from James Parks.  Back-patch to 9.4 where this ALTER
functionality was introduced.

Report: <CAJ3Xv+jzJ8iNNUcp4RKW8b6Qp1xVAxHwSXVpjBNygjKxcVuE9w@mail.gmail.com>
2016-10-26 17:05:06 -04:00
Tom Lane 19b2094d96 Fix not-HAVE_SYMLINK code in zic.c.
I broke this in commit f3094920a.  Apparently it's dead code anyway,
at least as far as our buildfarm is concerned (and the upstream IANA
code doesn't worry at all about symlink() not being present).
But as long as the rest of our code is willing to guard against not
having symlink(), this should too.  Noted while investigating a
tangentially-related complaint from Sandeep Thakkar.

Back-patch to keep branches in sync.
2016-10-26 13:40:41 -04:00
Heikki Linnakangas 8eb6337f9f Remove platform-dependent PL/python test case.
Turns out that the output format of Python Decimal isn't totally platform-
independent either. There are other tests for multi-dimensional arrays,
so rather than try to fix this test case, just remove it.

Per buildfarm member prairiedog.
2016-10-26 17:09:18 +03:00
Heikki Linnakangas cfd9c87a54 Only treat Python Lists as array dimensions.
Instead of treating all python sequence types as array dimensions, except
for tuples and various kinds of strings, only treat Python lists as
dimensions. The PyBytes_Check() function used previously is only available
on Python 2.6 and newer, and it was a bit fiddly anyway. The list of
exceptions would require adjustment if Python got a new kind of a sequence
similar to bytes/unicodes/strings, so only checking for Lists seems more
future-proof. The documentation only mentioned using Lists, so this is
closer to what was documented, anyway.

This should fix the buildfarm failures on systems building with Python 2.5,
although I don't have Python 2.5 installed myself to test with.
2016-10-26 14:44:55 +03:00
Heikki Linnakangas 73c8e8506c Avoid using platform-dependent floats in test case.
The number of decimals printed for floats varied in this test case, as
noted by several buildfarm members. There's nothing special about floats
and arrays in the code being tested, so replace the floats with numerics to
make the output platform-independent.
2016-10-26 14:17:07 +03:00
Heikki Linnakangas e131ba4fe5 Fix regression test to also work with Python 2.
Per buildfarm.
2016-10-26 11:18:04 +03:00
Heikki Linnakangas 56f39009c5 Fix typos in comments.
Vinayak Pokale
2016-10-26 11:12:31 +03:00
Heikki Linnakangas 510e1b8ecf Give a hint, when [] is incorrectly used for a composite type in array.
That used to be accepted, so let's try to give a hint to users on why
their PL/python functions no longer work.

Reviewed by Pavel Stehule.

Discussion: <CAH38_tmbqwaUyKs9yagyRra=SMaT45FPBxk1pmTYcM0TyXGG7Q@mail.gmail.com>
2016-10-26 10:56:56 +03:00
Heikki Linnakangas 94aceed317 Support multi-dimensional arrays in PL/python.
Multi-dimensional arrays can now be used as arguments to a PL/python function
(used to throw an error), and they can be returned as nested Python lists.

This makes a backwards-incompatible change to the handling of composite
types in arrays. Previously, you could return an array of composite types
as "[[col1, col2], [col1, col2]]", but now that is interpreted as a two-
dimensional array. Composite types in arrays must now be returned as
Python tuples, not lists, to resolve the ambiguity. I.e. "[(col1, col2),
(col1, col2)]".

To avoid breaking backwards-compatibility, when not necessary, () is still
accepted for arrays at the top-level, but it is always treated as a
single-dimensional array. Likewise, [] is still accepted for composite types,
when they are not in an array. Update the documentation to recommend using []
for arrays, and () for composite types, with a mention that those other things
are also accepted in some contexts.

This needs to be mentioned in the release notes.

Alexey Grishchenko, Dave Cramer and me. Reviewed by Pavel Stehule.

Discussion: <CAH38_tmbqwaUyKs9yagyRra=SMaT45FPBxk1pmTYcM0TyXGG7Q@mail.gmail.com>
2016-10-26 10:56:30 +03:00
Peter Eisentraut 8c035e55c4 pg_dump: Simplify internal archive version handling
The ArchiveHandle structure contained the archive format version number
twice, once as a single field and once split into components.  Simplify
that by just keeping the single field and adding some macros to extract
the components.  Introduce some macros for composing version numbers, to
eliminate the repeated use of magic formulas.  Drop the unused trailing
zero byte from the run-time composite version representation.

reviewed by Tom Lane
2016-10-25 17:02:22 -04:00
Magnus Hagander 78d109150b Free walmethods before exiting
Not strictly necessary since we quite after, but could become important
in the future if we do restarts etc.

Michael Paquier with nitpicking from me
2016-10-25 19:00:12 +02:00
Magnus Hagander 8c46f0c9ce Don't fsync() files when --no-sync is specified
Michael Paquier
2016-10-25 19:00:01 +02:00
Bruce Momjian 10c064ce4d Consistently mention 'SELECT pg_reload_conf()' in config files
Previously we only mentioned SIGHUP and 'pg_ctl reload' in
postgresql.conf and pg_hba.conf.
2016-10-25 11:26:15 -04:00
Magnus Hagander 2dde01ccbf Use ssize_t where signed results can happen
Noted by Alexander Korotkov
2016-10-24 20:10:18 +02:00
Alvaro Herrera 00f15338b2 Preserve commit timestamps across clean restart
An oversight in setting the boundaries of known commit timestamps during
startup caused old commit timestamps to become inaccessible after a
server restart.

Author and reporter: Julien Rouhaud
Review, test code: Craig Ringer
2016-10-24 09:45:48 -03:00
Tom Lane 8f1fb7d621 Avoid testing tuple visibility without buffer lock.
INSERT ... ON CONFLICT (specifically ExecCheckHeapTupleVisible) contains
another example of this unsafe coding practice.  It is much harder to get
a failure out of it than the case fixed in commit 6292c2339, because in
most scenarios any hint bits that could be set would have already been set
earlier in the command.  However, Konstantin Knizhnik reported a failure
with a custom transaction manager, and it's clearly possible to get a
failure via a race condition in async-commit mode.

For lack of a reproducible example, no regression test case in this
commit.

I did some testing with Asserts added to tqual.c's functions, and can say
that running "make check-world" exposed these two bugs and no others.
The Asserts are messy enough that I've not added them to the code for now.

Report: <57EE93C8.8080504@postgrespro.ru>
Related-Discussion: <CAO3NbwOycQjt2Oqy2VW-eLTq2M5uGMyHnGm=RNga4mjqcYD7gQ@mail.gmail.com>
2016-10-23 19:14:32 -04:00
Tom Lane a6c0a5b6e8 Don't throw serialization errors for self-conflicts in INSERT ON CONFLICT.
A transaction that conflicts against itself, for example
	INSERT INTO t(pk) VALUES (1),(1) ON CONFLICT DO NOTHING;
should behave the same regardless of isolation level.  It certainly
shouldn't throw a serialization error, as retrying will not help.
We got this wrong due to the ON CONFLICT logic not considering the case,
as reported by Jason Dusek.

Core of this patch is by Peter Geoghegan (based on an earlier patch by
Thomas Munro), though I didn't take his proposed code refactoring for fear
that it might have unexpected side-effects.  Test cases by Thomas Munro
and myself.

Report: <CAO3NbwOycQjt2Oqy2VW-eLTq2M5uGMyHnGm=RNga4mjqcYD7gQ@mail.gmail.com>
Related-Discussion: <57EE93C8.8080504@postgrespro.ru>
2016-10-23 18:36:13 -04:00
Tom Lane 6292c23391 Avoid testing tuple visibility without buffer lock in RI_FKey_check().
Despite the argumentation I wrote in commit 7a2fe85b0, it's unsafe to do
this, because in corner cases it's possible for HeapTupleSatisfiesSelf
to try to set hint bits on the target tuple; and at least since 8.2 we
have required the buffer content lock to be held while setting hint bits.

The added regression test exercises one such corner case.  Unpatched, it
causes an assertion failure in assert-enabled builds, or otherwise would
cause a hint bit change in a buffer we don't hold lock on, which given
the right race condition could result in checksum failures or other data
consistency problems.  The odds of a problem in the field are probably
pretty small, but nonetheless back-patch to all supported branches.

Report: <19391.1477244876@sss.pgh.pa.us>
2016-10-23 15:01:24 -04:00
Magnus Hagander eade082b12 Rename walmethod fsync method to sync
Using the name fsync clashed with the #define we have on Windows that
redefines it to _commit. Naming it sync should remove that conflict.

Per all the Windows buildfarm members
2016-10-23 18:04:34 +02:00
Magnus Hagander a5c17c1dce Fix obviously too quickly applied fix to zlib issue 2016-10-23 16:07:31 +02:00
Magnus Hagander 9ae6713cdf Fix walmethods.c build without libz
Per numerous buildfarm manuals
2016-10-23 16:00:42 +02:00
Magnus Hagander d97a59a4c5 Remove extra comma at end of enum list
C99-specific feature, and wasn't intentional in the first place.

Per buildfarm member mylodon
2016-10-23 15:57:25 +02:00
Magnus Hagander 56c7d8d455 Allow pg_basebackup to stream transaction log in tar mode
This will write the received transaction log into a file called
pg_wal.tar(.gz) next to the other tarfiles instead of writing it to
base.tar. When using fetch mode, the transaction log is still written to
base.tar like before, and when used against a pre-10 server, the file
is named pg_xlog.tar.

To do this, implement a new concept of a "walmethod", which is
responsible for writing the WAL. Two implementations exist, one that
writes to a plain directory (which is also used by pg_receivexlog) and
one that writes to a tar file with optional compression.

Reviewed by Michael Paquier
2016-10-23 15:23:11 +02:00
Robert Haas 919c811ca1 Fix comment formatting. 2016-10-21 12:04:21 -04:00
Robert Haas 7012b132d0 postgres_fdw: Push down aggregates to remote servers.
Now that the upper planner uses paths, and now that we have proper hooks
to inject paths into the upper planning process, it's possible for
foreign data wrappers to arrange to push aggregates to the remote side
instead of fetching all of the rows and aggregating them locally.  This
figures to be a massive win for performance, so teach postgres_fdw to
do it.

Jeevan Chalke and Ashutosh Bapat.  Reviewed by Ashutosh Bapat with
additional testing by Prabhat Sahu.  Various mostly cosmetic changes
by me.
2016-10-21 09:54:29 -04:00
Tom Lane 709e461bef Fix EXPLAIN so that it doesn't emit invalid XML in corner cases.
With track_io_timing = on, EXPLAIN (ANALYZE, BUFFERS) will emit fields
named like "I/O Read Time".  The slash makes that invalid as an XML
element name, so that adding FORMAT XML would produce invalid XML.

We already have code in there to translate spaces to dashes, so let's
generalize that to convert anything that isn't a valid XML name character,
viz letters, digits, hyphens, underscores, and periods.  We could just
reject slashes, which would run a bit faster.  But the fact that this went
unnoticed for so long doesn't give me a warm feeling that we'd notice the
next creative violation, so let's make it a permanent fix.

Reported by Markus Winand, though this isn't his initial patch proposal.

Back-patch to 9.2 where track_io_timing was added.  The problem is only
latent in 9.1, so I don't feel a need to fix it there.

Discussion: <E0BF6A45-68E8-45E6-918F-741FB332C6BB@winand.at>
2016-10-20 17:17:50 -04:00
Tom Lane 5e21b68111 Sync our copy of the timezone library with IANA release tzcode2016h.
This absorbs a fix for a symlink-manipulation bug in zic that was
introduced in 2016g.  It probably isn't interesting for our use-case,
but I'm not quite sure, so let's update while we're at it.
2016-10-20 15:40:07 -04:00
Tom Lane d8fc45bd0f Update time zone data files to tzdata release 2016h.
(Didn't I just do this?  Oh well.)

DST law changes in Palestine.  Historical corrections for Turkey.
Switch to numeric abbreviations for Asia/Colombo.
2016-10-20 15:20:11 -04:00
Robert Haas f82ec32ac3 Rename "pg_xlog" directory to "pg_wal".
"xlog" is not a particularly clear abbreviation for "write-ahead log",
and it sometimes confuses users into believe that the contents of the
"pg_xlog" directory are not critical data, leading to unpleasant
consequences.  So, rename the directory to "pg_wal".

This patch modifies pg_upgrade and pg_basebackup to understand both
the old and new directory layouts; the former is necessary given the
purpose of the tool, while the latter merely avoids an unnecessary
backward-compatibility break.

We may wish to consider renaming other programs, switches, and
functions which still use the old "xlog" naming to also refer to
"wal".  However, that's still under discussion, so let's do just this
much for now.

Discussion: CAB7nPqTeC-8+zux8_-4ZD46V7YPwooeFxgndfsq5Rg8ibLVm1A@mail.gmail.com

Michael Paquier
2016-10-20 11:32:18 -04:00
Robert Haas ec7db2b483 Remove a comment which is now incorrect.
Before 5d305d86bd, this comment was
correct, but now it says we do something which we don't actually do.
Accordingly, remove the comment.
2016-10-20 10:24:51 -04:00
Tom Lane 23ed2ba812 Another portability fix for tzcode2016g update.
clang points out that SIZE_MAX wouldn't fit into an int, which means
this comparison is pretty useless.  Per report from Thomas Munro.
2016-10-19 23:32:08 -04:00
Tom Lane ad90ac4d67 Windows portability fix.
Per buildfarm.
2016-10-19 19:28:11 -04:00
Tom Lane f3094920a5 Sync our copy of the timezone library with IANA release tzcode2016g.
This is mostly to absorb some corner-case fixes in zic for year-2037
timestamps.  The other changes that have been made are unlikely to affect
our usage, but nonetheless we may as well take 'em.
2016-10-19 18:55:52 -04:00
Tom Lane a3215431ab Suppress "Factory" zone in pg_timezone_names view for tzdata >= 2016g.
IANA got rid of the really silly "abbreviation" and replaced it with one
that's only moderately silly.  But it's still pointless, so keep on not
showing it.
2016-10-19 18:11:49 -04:00
Tom Lane ecbac3e6e0 Update time zone data files to tzdata release 2016g.
DST law changes in Turkey.  Historical corrections for America/Los_Angeles,
Europe/Kirov, Europe/Moscow, Europe/Samara, and Europe/Ulyanovsk.
Rename Asia/Rangoon to Asia/Yangon, with a backward compatibility link.

The IANA crew continue their campaign to replace invented time zone
abbrevations with numeric GMT offsets.  This update changes numerous zones
in Antarctica and the former Soviet Union, for instance Antarctica/Casey
now reports "+08" not "AWST" in the pg_timezone_names view.  I kept these
abbreviations in the tznames/ data files, however, so that we will still
accept them for input.  (We may want to start trimming those files someday,
but today is not that day.)

An exception is that since IANA no longer claims that "AMT" is in use
in Armenia for GMT+4, I replaced it in the Default file with GMT-4,
corresponding to Amazon Time which is in use in South America.  It may be
that that meaning is also invented and IANA will drop it in a future
update; but for now, it seems silly to give pride of place to a meaning
not traceable to IANA over one that is.
2016-10-19 17:56:38 -04:00
Peter Eisentraut 9ffe4a8b4c Make getrusage() output a little more readable
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Peter Geoghegan <pg@heroku.com>
2016-10-19 09:53:16 -04:00
Peter Eisentraut e5a9bcb529 Use pg_ctl promote -w in TAP tests
Switch TAP tests to use the new wait mode of pg_ctl promote.  This
allows avoiding extra logic with poll_query_until() to be sure that a
promoted standby is ready for read-write queries.

From: Michael Paquier <michael.paquier@gmail.com>
2016-10-19 09:18:50 -04:00
Peter Eisentraut 5d58c07a44 initdb pg_basebackup: Rename --noxxx options to --no-xxx
--noclean and --nosync were the only options spelled without a hyphen,
so change this for consistency with other options.  The options in
pg_basebackup have not been in a release, so we just rename them.  For
initdb, we retain the old variants.

Vik Fearing and me
2016-10-19 08:48:48 -04:00
Peter Eisentraut caf936b09f pg_ctl: Add long option for -o
Now all normally used options are covered by long options as well.
2016-10-19 08:48:48 -04:00
Peter Eisentraut 0be22457d7 pg_ctl: Add long options for -w and -W
From: Vik Fearing <vik@2ndquadrant.fr>
2016-10-19 08:48:22 -04:00
Heikki Linnakangas 917dc7d239 Fix WAL-logging of FSM and VM truncation.
When a relation is truncated, it is important that the FSM is truncated as
well. Otherwise, after recovery, the FSM can return a page that has been
truncated away, leading to errors like:

ERROR:  could not read block 28991 in file "base/16390/572026": read only 0
of 8192 bytes

We were using MarkBufferDirtyHint() to dirty the buffer holding the last
remaining page of the FSM, but during recovery, that might in fact not
dirty the page, and the FSM update might be lost.

To fix, use the stronger MarkBufferDirty() function. MarkBufferDirty()
requires us to do WAL-logging ourselves, to protect from a torn page, if
checksumming is enabled.

Also fix an oversight in visibilitymap_truncate: it also needs to WAL-log
when checksumming is enabled.

Analysis by Pavan Deolasee.

Discussion: <CABOikdNr5vKucqyZH9s1Mh0XebLs_jRhKv6eJfNnD2wxTn=_9A@mail.gmail.com>
2016-10-19 14:26:05 +03:00
Robert Haas b801e12008 Improve regression test coverage for hash indexes.
On my system, this improves coverage for src/backend/access/hash from
61.3% of lines to 88.2% of lines, and from 83.5% of functions to 97.5%
of functions, which is pretty good for 36 lines of tests.

Mithun Cy, reviewing by Amit Kapila and Álvaro Herrera
2016-10-18 15:57:58 -04:00
Andres Freund 90d3da11c9 Fix a few typos in simplehash.h.
Author: Erik Rijkers
Discussion: <274e4c8ac545d6622735f97c1f6c354b@xs4all.nl>
2016-10-18 10:55:56 -07:00
Robert Haas fca41acb86 Fix typo in comment.
Amit Langote
2016-10-18 13:43:27 -04:00
Tom Lane 6f13a682c8 Fix cidin() to handle values above 2^31 platform-independently.
CommandId is declared as uint32, and values up to 4G are indeed legal.
cidout() handles them properly by treating the value as unsigned int.
But cidin() was just using atoi(), which has platform-dependent behavior
for values outside the range of signed int, as reported by Bart Lengkeek
in bug #14379.  Use strtoul() instead, as xidin() does.

In passing, make some purely cosmetic changes to make xidin/xidout
look more like cidin/cidout; the former didn't have a monopoly on
best practice IMO.

Neither xidin nor cidin make any attempt to throw error for invalid input.
I didn't change that here, and am not sure it's worth worrying about
since neither is really a user-facing type.  The point is just to ensure
that indubitably-valid inputs work as expected.

It's been like this for a long time, so back-patch to all supported
branches.

Report: <20161018152550.1413.6439@wrigleys.postgresql.org>
2016-10-18 12:24:46 -04:00
Heikki Linnakangas faae1c918e Revert "Replace PostmasterRandom() with a stronger way of generating randomness."
This reverts commit 9e083fd468. That was a
few bricks shy of a load:

* Query cancel stopped working
* Buildfarm member pademelon stopped working, because the box doesn't have
  /dev/urandom nor /dev/random.

This clearly needs some more discussion, and a quite different patch, so
revert for now.
2016-10-18 16:28:23 +03:00
Robert Haas 7d3235ba42 By default, set log_line_prefix = '%m [%p] '.
This value might not be to everyone's taste; in particular, some
people might prefer %t to %m, and others may want %u, %d, or other
fields.  However, it's a vast improvement on the old default of ''.

Christoph Berg
2016-10-17 16:34:48 -04:00
Heikki Linnakangas d8589946dd Fix use-after-free around DISTINCT transition function calls.
Have tuplesort_gettupleslot() copy the contents of its current table slot
as needed. This is based on an approach taken by tuplestore_gettupleslot().
In the future, tuplesort_gettupleslot() may also be taught to avoid copying
the tuple where caller can determine that that is safe (the
tuplestore_gettupleslot() interface already offers this option to callers).

Patch by Peter Geoghegan. Fixes bug #14344, reported by Regina Obe.

Report: <20160929035538.20224.39628@wrigleys.postgresql.org>

Backpatch-through: 9.6
2016-10-17 12:13:16 +03:00
Heikki Linnakangas 9e083fd468 Replace PostmasterRandom() with a stronger way of generating randomness.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.

pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
- /dev/random

Original patch by Magnus Hagander, with further work by Michael Paquier
and me.

Discussion: <CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com>
2016-10-17 11:52:50 +03:00
Andres Freund 5dfc198146 Use more efficient hashtable for execGrouping.c to speed up hash aggregation.
The more efficient hashtable speeds up hash-aggregations with more than
a few hundred groups significantly. Improvements of over 120% have been
measured.

Due to the the different hash table queries that not fully
determined (e.g. GROUP BY without ORDER BY) may change their result
order.

The conversion is largely straight-forward, except that, due to the
static element types of simplehash.h type hashes, the additional data
some users store in elements (e.g. the per-group working data for hash
aggregaters) is now stored in TupleHashEntryData->additional.  The
meaning of BuildTupleHashTable's entrysize (renamed to additionalsize)
has been changed to only be about the additionally stored size.  That
size is only used for the initial sizing of the hash-table.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 17:22:51 -07:00
Andres Freund 75ae538bc3 Use more efficient hashtable for tidbitmap.c to speed up bitmap scans.
Use the new simplehash.h to speed up tidbitmap.c uses. For bitmap scan
heavy queries speedups of over 100% have been measured. Both lossy and
exact scans benefit, but the wins are bigger for mostly exact scans.

The conversion is mostly trivial, except that tbm_lossify() now restarts
lossifying at the point it previously stopped. Otherwise the hash table
becomes unbalanced because the scan in done in hash-order, leaving the
end of the hashtable more densely filled then the beginning. That caused
performance issues with dynahash as well, but due to the open chaining
they were less pronounced than with the linear adressing from
simplehash.h.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 16:08:11 -07:00
Andres Freund b30d3ea824 Add a macro templatized hashtable.
dynahash.c hash tables aren't quite fast enough for some
use-cases. There are several reasons for lacking performance:
- the use of chaining for collision handling makes them cache
  inefficient, that's especially an issue when the tables get bigger.
- as the element sizes for dynahash are only determined at runtime,
  offset computations are somewhat expensive
- hash and element comparisons are indirect function calls, causing
  unnecessary pipeline stalls
- it's two level structure has some benefits (somewhat natural
  partitioning), but increases the number of indirections
to fix several of these the hash tables have to be adjusted to the
individual use-case at compile-time. C unfortunately doesn't provide a
good way to do compile code generation (like e.g. c++'s templates for
all their weaknesses do).  Thus the somewhat ugly approach taken here is
to allow for code generation using a macro-templatized header file,
which generates functions and types based on a prefix and other
parameters.

Later patches use this infrastructure to use such hash tables for
tidbitmap.c (bitmap scans) and execGrouping.c (hash aggregation,
...). In queries where these use up a large fraction of the time, this
has been measured to lead to performance improvements of over 100%.

There are other cases where this could be useful (e.g. catcache.c).

The hash table design chosen is a variant of linear open-addressing. The
biggest disadvantage of simple linear addressing schemes are highly
variable lookup times due to clustering, and deletions leaving a lot of
tombstones around.  To address these issues a variant of "robin hood"
hashing is employed.  Robin hood hashing optimizes chaining lengths by
moving elements close to their optimal bucket ("rich" elements), out of
the way if a to-be-inserted element is further away from its optimal
position (i.e. it's "poor").  While that can make insertions slower, the
average lookup performance is a lot better, and higher fill factors can
be used in a still performant manner.  To avoid tombstones - which
normally solve the issue that a deleted node's presence is relevant to
determine whether a lookup needs to continue looking or is done -
buckets following a deleted element are shifted backwards, unless
they're empty or already at their optimal position.

There's further possible improvements that can be made to this
implementation. Amongst others:
- Use distance as a termination criteria during searches. This is
  generally a good idea, but I've been able to see the overhead of
  distance calculations in some cases.
- Consider combining the 'empty' status into the hashvalue, and enforce
  storing the hashvalue. That could, in some cases, increase memory
  density and remove a few instructions.
- Experiment further with the, very conservatively choosen, fillfactor.
- Make maximum size of hashtable configurable, to allow storing very
  very large tables. That'd require 64bit hash values to be more common
  than now, though.
- some smaller memcpy calls could be optimized to copy larger chunks
But since the new implementation is already considerably faster than
dynahash it seem sensible to start using it.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 16:07:38 -07:00
Andres Freund aa3ca5e3dd Add likely/unlikely() branch hint macros.
These are useful for very hot code paths. Because it's easy to guess
wrongly about likelihood, and because such likelihoods change over time,
they should be used sparingly.

Past tests have shown it'd be a good idea to use them in some places,
e.g. in error checks around ereports that ERROR out, but that's work for
later.

Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 16:05:30 -07:00
Tom Lane 32fdf42cf5 Fix assorted integer-overflow hazards in varbit.c.
bitshiftright() and bitshiftleft() would recursively call each other
infinitely if the user passed INT_MIN for the shift amount, due to integer
overflow in negating the shift amount.  To fix, clamp to -VARBITMAXLEN.
That doesn't change the results since any shift distance larger than the
input bit string's length produces an all-zeroes result.

Also fix some places that seemed inadequately paranoid about input typmods
exceeding VARBITMAXLEN.  While a typmod accepted by anybit_typmodin() will
certainly be much less than that, at least some of these spots are
reachable with user-chosen integer values.

Andreas Seltenreich and Tom Lane

Discussion: <87d1j2zqtz.fsf@credativ.de>
2016-10-14 16:28:34 -04:00
Tom Lane 81e82a2bd4 Fix handling of pgstat counters for TRUNCATE in a prepared transaction.
pgstat_twophase_postcommit is supposed to duplicate the math in
AtEOXact_PgStat, but it had missed out the bit about clearing
t_delta_live_tuples/t_delta_dead_tuples for a TRUNCATE.

It's harder than you might think to replicate the issue here, because
those counters would only be nonzero when a previous transaction in
the same backend had added/deleted tuples in the truncated table,
and those counts hadn't been sent to the stats collector yet.

Evident oversight in commit d42358efb.  I've not added a regression
test for this; we tried to add one in d42358efb, and had to revert it
because it was too timing-sensitive for the buildfarm.

Back-patch to 9.5 where d42358efb came in.

Stas Kelvich

Discussion: <EB57BF68-C06D-4737-BDDC-4BA778F4E62B@postgrespro.ru>
2016-10-13 19:46:05 -04:00
Tom Lane 3cca13cbfc Fix another bug in merging of inherited CHECK constraints.
It's not good for an inherited child constraint to be marked connoinherit;
that would result in the constraint not propagating to grandchild tables,
if any are created later.  The code mostly prevented this from happening
but there was one case that was missed.

This is somewhat related to commit e55a946a8, which also tightened checks
on constraint merging.  Hence, back-patch to 9.2 like that one.  This isn't
so much because there's a concrete feature-related reason to stop there,
as to avoid having more distinct behaviors than we have to in this area.

Amit Langote

Discussion: <b28ee774-7009-313d-dd55-5bdd81242c41@lab.ntt.co.jp>
2016-10-13 17:05:14 -04:00
Tom Lane c08521eb55 Remove dead code in pg_dump.
I'm not sure if this provision for "pg_backup" behaving a bit differently
from "pg_dump" ever did anything useful in a released version.  But it's
definitely dead code now.

Michael Paquier
2016-10-13 16:08:16 -04:00
Tom Lane cb775768e3 Try to find out the actual hugepage size when making a MAP_HUGETLB request.
Even if Linux's mmap() is okay with a partial-hugepage request, munmap()
is not, as reported by Chris Richards.  Therefore it behooves us to try
a bit harder to find out the actual hugepage size, instead of assuming
that we can skate by with a guess.

For the moment, just look into /proc/meminfo to find out the default
hugepage size, and use that.  Later, on kernels that support requests
for nondefault sizes, we might try to consider other alternatives.
But that smells more like a new feature than a bug fix, especially if
we want to provide any way for the DBA to control it, so leave it for
another day.

I set this up to allow easy addition of platform-specific code for
non-Linux platforms, if needed; but right now there are no reports
suggesting that we need to work harder on other platforms.

Back-patch to 9.4 where hugepage support was introduced.

Discussion: <31056.1476303954@sss.pgh.pa.us>
2016-10-13 15:06:46 -04:00
Tom Lane 15fc5e1581 Clean up handling of anonymous mmap'd shared-memory segment.
Fix detaching of the mmap'd segment to have its own on_shmem_exit callback,
rather than piggybacking on the one for detaching from the SysV segment.
That was confusing, and given the distance between the two attach calls,
it was trouble waiting to happen.

Make the detaching calls idempotent by clearing AnonymousShmem to show
we've already unmapped.  I spent quite a bit of time yesterday trying
to find a path that would allow the munmap()'s to be done twice, and
while I did not succeed, it seems silly that there's even a question.

Make the #ifdef logic less confusing by separating "do we want to use
anonymous shmem" from EXEC_BACKEND.  Even though there's no current
scenario where those conditions are different, it is not helpful for
different places in the same file to be testing EXEC_BACKEND for what
are fundamentally different reasons.

Don't do on_exit_reset() in StartBackgroundWorker().  At best that's
useless (InitPostmasterChild would have done it already) and at worst
it could zap some callback that's unrelated to shared memory.

Improve comments, and simplify the huge_pages enablement logic slightly.

Back-patch to 9.4 where hugepage support was introduced.
Arguably this should go into 9.3 as well, but the code looks
significantly different there, and I doubt it's worth the
trouble of adapting the patch given I can't show a live bug.
2016-10-13 13:59:56 -04:00
Tom Lane 0a4bf6b192 Fix pg_dumpall regression test to be locale-independent.
The expected results in commit b4fc64578 seem to have been generated
in a non-C locale, which just points up the fact that the ORDER BY
clause was locale-sensitive.

Per buildfarm.
2016-10-13 10:46:22 -04:00
Tom Lane 9c4cc9e2c7 Fix broken jsonb_set() logic for replacing array elements.
Commit 0b62fd036 did a fairly sloppy job of refactoring setPath()
to support jsonb_insert() along with jsonb_set().  In its defense,
though, there was no regression test case exercising the case of
replacing an existing element in a jsonb array.

Per bug #14366 from Peng Sun.  Back-patch to 9.6 where bug was introduced.

Report: <20161012065349.1412.47858@wrigleys.postgresql.org>
2016-10-13 00:25:48 -04:00
Andres Freund ccbb852cd6 Fix further hash table order dependent tests.
Similar to 0137caf273, this makes contrib and pl tests less dependant on
hash-table order.  After this commit, at least some order affecting
changes to execGrouping.c don't result in regression test changes
anymore.
2016-10-12 18:31:45 -07:00
Andres Freund b4fc645787 Make pg_dumpall's database ACL query independent of hash table order.
Previously GRANT order on databases was not well defined, due to the use
of EXCEPT without an ORDER BY.  Add an ORDER BY, adapt test output.

I don't, at the moment, see reason to backpatch this.
2016-10-12 18:29:57 -07:00
Tom Lane 4f52fd3c6d Revert addition of PGDLLEXPORT in PG_FUNCTION_INFO_V1 macro.
This turns out not to be as harmless as I thought: MSVC will complain
if it sees an "extern" declaration without PGDLLEXPORT and then one with.
(Seems fairly silly, given that this can be changed after the fact by the
linker, but there you have it.)  Therefore, contrib modules that have
extern's for V1 functions in header files are falling over in the
buildfarm, since none of those externs are marked PGDLLEXPORT.

We might or might not conclude that we're willing to plaster those
declarations with PGDLLEXPORT in HEAD, but in any case there's no way we're
going to ship this change in the back branches.  Third-party authors would
not thank us for breaking their code in a minor release.  Hence, revert
the addition of PGDLLEXPORT (but let's keep the extra info in the comment).
If we do the other changes we can revert this commit in HEAD.

Per buildfarm.
2016-10-12 18:01:43 -04:00
Tom Lane c0a3b211bc pg_dump's getTypes() needn't retrieve typinput or typoutput anymore.
Commit 64f3524e2 removed the stanza of code that examined these values.
I failed to notice they were unnecessary because my compiler didn't
warn about the un-read variables.  Noted by Peter Eisentraut.
2016-10-12 15:11:31 -04:00
Tom Lane 5c80642aa8 Remove unnecessary int2vector-specific hash function and equality operator.
These functions were originally added in commit d8cedf67a to support
use of int2vector columns as catcache lookup keys.  However, there are
no catcaches that use such columns.  (Indeed I now think it must always
have been dead code: a catcache with such a key column would need an
underlying unique index on the column, but we've never had an int2vector
btree opclass.)

Getting rid of the int2vector-specific operator and function does not
lose any functionality, because operations on int2vectors will now fall
back to the generic anyarray support.  This avoids a wart that a btree
index on an int2vector column (made using anyarray_ops) would fail to
match equality searches, because int2vectoreq wasn't a member of the
opclass.  We don't really care much about that, since int2vector is not
meant as a type for users to use, but it's silly to have extra code and
less functionality.

If we ever do want a catcache to be indexed by an int2vector column,
we'd need to put back full btree and hash opclasses for int2vector,
comparable to the support for oidvector.  (The anyarray code can't be
used at such a low level, because it needs to do catcache lookups.)
But we'll deal with that if/when the need arises.

Also worth noting is that removal of the hash int2vector_ops opclass will
break any user-created hash indexes on int2vector columns.  While hash
anyarray_ops would serve the same purpose, it would probably not compute
the same hash values and thus wouldn't be on-disk-compatible.  Given that
int2vector isn't a user-facing type and we're planning other incompatible
changes in hash indexes for v10 anyway, this doesn't seem like something
to worry about, but it's probably worth mentioning here.

Amit Langote

Discussion: <d9bb74f8-b194-7307-9ebd-90645d377e45@lab.ntt.co.jp>
2016-10-12 14:54:08 -04:00
Tom Lane 8518583cdb Provide DLLEXPORT markers for C functions via PG_FUNCTION_INFO_V1 macro.
This isn't really necessary for our own code, because we use a .DEF file
in MSVC builds (see gendef.pl), or --export-all-symbols in MinGW and
Cygwin builds, to ensure that all global symbols in loadable modules
will be exported on Windows.  However, third-party authors might use
different build processes that need this marker, and it's harmless
enough for our own builds.

To some extent, this is an oversight in commit e7128e8db, so back-patch
to 9.4 where that was added.

Laurenz Albe

Discussion: <A737B7A37273E048B164557ADEF4A58B539300BD@ntex2010a.host.magwien.gv.at>
2016-10-12 12:45:50 -04:00
Tom Lane 64f3524e2c Remove pg_dump/pg_dumpall support for dumping from pre-8.0 servers.
The need for dumping from such ancient servers has decreased to about nil
in the field, so let's remove all the code that catered to it.  Aside
from removing a lot of boilerplate variant queries, this allows us to not
have to cope with servers that don't have (a) schemas or (b) pg_depend.
That means we can get rid of assorted squishy code around that.  There
may be some nonobvious additional simplifications possible, but this patch
already removes about 1500 lines of code.

I did not remove the ability for pg_restore to read custom-format archives
generated by these old versions (and light testing says that that does
still work).  If you have an old server, you probably also have a pg_dump
that will work with it; but you have an old custom-format backup file,
that might be all you have.

It'd be possible at this point to remove fmtQualifiedId()'s version
argument, but I refrained since that would affect code outside pg_dump.

Discussion: <2661.1475849167@sss.pgh.pa.us>
2016-10-12 12:20:02 -04:00
Heikki Linnakangas bb55dd6059 Fix copy-pasto in comment.
Amit Langote
2016-10-12 12:07:54 +03:00
Heikki Linnakangas b75f467b6e Simplify the code for logical tape read buffers.
Pass the buffer size as argument to LogicalTapeRewindForRead, rather than
setting it earlier with the separate LogicTapeAssignReadBufferSize call.
This way, the buffer size is set closer to where it's actually used, which
makes the code easier to understand.

This makes the calculation for how much memory to use for the buffers less
precise. We now use the same amount of memory for every tape, rounded down
to the nearest BLCKSZ boundary, instead of using one more block for some
tapes, to get the total up to exact amount of memory available. That should
be OK, merging isn't too sensitive to the exact amount of memory used.

Reviewed by Peter Geoghegan

Discussion: <0f607c4b-df23-353e-bf56-c0389d28495f@iki.fi>
2016-10-12 12:05:45 +03:00
Tom Lane 2f1eaf87e8 Drop server support for FE/BE protocol version 1.0.
While this isn't a lot of code, it's been essentially untestable for
a very long time, because libpq doesn't support anything older than
protocol 2.0, and has not since release 6.3.  There's no reason to
believe any other client-side code still uses that protocol, either.

Discussion: <2661.1475849167@sss.pgh.pa.us>
2016-10-11 12:19:18 -04:00
Tom Lane 2b860f52ed Remove "sco" and "unixware" ports.
SCO OpenServer and SCO UnixWare are more or less dead platforms.
We have never had a buildfarm member testing the "sco" port, and
the last "unixware" member was last heard from in 2012, so it's
fair to doubt that the code even compiles anymore on either one.
Remove both ports.  We can always undo this if someone shows up
with an interest in maintaining and testing these platforms.

Discussion: <17177.1476136994@sss.pgh.pa.us>
2016-10-11 11:26:04 -04:00
Andres Freund 0137caf273 Make regression tests less dependent on hash table order.
Upcoming changes to the hash table code used, among others, for grouping
and set operations will change the output order for a few queries. To
make it less likely that actual bugs are hidden between regression test
ordering changes, and to make the tests robust against platform
dependant ordering, add ORDER BYs guaranteeing the output order.

As it's possible that some of the changes expose platform dependant
ordering, push this earlier, to let the buildfarm shake out potentially
unstable results.

Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-10 13:41:57 -07:00
Tom Lane 886f6c5ccd In PQsendQueryStart(), avoid leaking any left-over async result.
Ordinarily there would not be an async result sitting around at this
point, but it appears that in corner cases there can be.  Considering
all the work we're about to launch, it's hardly going to cost anything
noticeable to check.

It's been like this forever, so back-patch to all supported branches.

Report: <CAD-Qf1eLUtBOTPXyFQGW-4eEsop31tVVdZPu4kL9pbQ6tJPO8g@mail.gmail.com>
2016-10-10 10:35:58 -04:00
Heikki Linnakangas 6fb12cbcd6 Remove some unnecessary #includes.
Amit Langote
2016-10-10 12:22:58 +03:00
Peter Eisentraut 52f0142eb4 Add a noreturn attribute to help static analyzers 2016-10-09 21:36:42 -04:00
Tom Lane ecb0d20a9d Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.
We've had support for using unnamed POSIX semaphores instead of System V
semaphores for quite some time, but it was not used by default on any
platform.  Since many systems have rather small limits on the number of
SysV semaphores allowed, it seems desirable to switch to POSIX semaphores
where they're available and don't create performance or kernel resource
problems.  Experimentation by me shows that unnamed POSIX semaphores
are at least as good as SysV semaphores on Linux, and we previously had
a report from Maksym Sobolyev that FreeBSD is significantly worse with
SysV semaphores than POSIX ones.  So adjust those two platforms to use
unnamed POSIX semaphores, if configure can find the necessary library
functions.  If this goes well, we may switch other platforms as well,
but it would be advisable to test them individually first.

It's not currently contemplated that we'd encourage users to select
a semaphore API for themselves, but anyone who wants to experiment
can add PREFERRED_SEMAPHORES=UNNAMED_POSIX (or NAMED_POSIX, or SYSV)
to their configure command line to do so.

I also tweaked configure to report which API it's selected, mainly
so that we can tell that from buildfarm reports.

I did not touch the user documentation's discussion about semaphores;
that will need some adjustment once the dust settles.

Discussion: <8536.1475704230@sss.pgh.pa.us>
2016-10-09 18:03:45 -04:00
Tom Lane ac4a9d92fc Fix incorrect handling of polymorphic aggregates used as window functions.
The transfunction was told that its first argument and result were
of the window function output type, not the aggregate state type.
This'd only matter if the transfunction consults get_fn_expr_argtype,
which typically only polymorphic functions would do.

Although we have several regression tests around polymorphic aggs,
none of them detected this mistake --- in fact, they still didn't
fail when I injected the same mistake into nodeAgg.c.  So add some
more tests covering both plain agg and window-function-agg cases.

Per report from Sebastian Luque.  Back-patch to 9.6 where the error
was introduced (by sloppy refactoring in commit 804163bc2, looks like).

Report: <87int2qkat.fsf@gmail.com>
2016-10-09 12:49:37 -04:00
Tom Lane e55a946a81 Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent.  This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error.  This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too.  The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true.  In this way, either order of adding the constraints
has the same end result.

Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated.  In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case.  (Note: valid child and
not-valid parent seems fine, so continue to allow that.)

Per report from Benedikt Grundmann.  Back-patch to 9.2 where we introduced
possibly-not-valid check constraints.  The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.

Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
2016-10-08 19:29:27 -04:00
Tom Lane 8811f5d3a4 libpqwalreceiver needs to link with libintl when using --enable-nls.
The need for this was previously obscured even on picky platforms
by the hack we used to support direct cross-module references in
the transforms contrib modules.  Now that that hack is gone, the
undefined symbol is exposed, as reported by Robert Haas.

Back-patch to 9.5 where we started to use -Wl,-undefined,dynamic_lookup.
I'm a bit surprised that the older branches don't seem to contain
any gettext references in this module, but since they don't fail
at build time, they must not.  (We might be able to get away with
leaving this alone in 9.5/9.6, but I think it's cleaner if the
reference gets resolved at link time.)

Report: <CA+TgmoaHJKU5kcWZcYduATYVT7Mnx+8jUnycaYYL7OtCwCigug@mail.gmail.com>
2016-10-07 21:12:25 -04:00
Andres Freund b0779abb3a Fix fallback implementation of pg_atomic_write_u32().
I somehow had assumed that in the spinlock (in turn possibly using
semaphores) based fallback atomics implementation 32 bit writes could be
done without a lock. As far as the write goes that's correct, since
postgres supports only platforms with single-copy atomicity for aligned
32bit writes.  But writing without holding the spinlock breaks
read-modify-write operations like pg_atomic_compare_exchange_u32(),
since they'll potentially "miss" a concurrent write, which can't happen
in actual hardware implementations.

In 9.6+ when using the fallback atomics implementation this could lead
to buffer header locks not being properly marked as released, and
potentially some related state corruption.  I don't see a related danger
in 9.5 (earliest release with the API), because pg_atomic_write_u32()
wasn't used in a concurrent manner there.

The state variable of local buffers, before this change, were
manipulated using pg_atomic_write_u32(), to avoid unnecessary
synchronization overhead. As that'd not be the case anymore, introduce
and use pg_atomic_unlocked_write_u32(), which does not correctly
interact with RMW operations.

This bug only caused issues when postgres is compiled on platforms
without atomics support (i.e. no common new platform), or when compiled
with --disable-atomics, which explains why this wasn't noticed in
testing.

Reported-By: Tom Lane
Discussion: <14947.1475690465@sss.pgh.pa.us>
Backpatch: 9.5-, where the atomic operations API was introduced.
2016-10-07 16:55:15 -07:00
Heikki Linnakangas 0aec7f9aec Remove bogus mapping from UTF-8 to SJIS conversion table.
0xc19c is not a valid UTF-8 byte sequence. It doesn't do any harm, AFAICS,
but it's surely not intentional. No backpatching though, just to be sure.

In the passing, also add a file header comment to the file, like the
UCS_to_SJIS.pl script would produce. (The file was originally created with
UCS_to_SJIS.pl, but has been modified by hand since then. That's
questionable, but I'll leave fixing that for later.)

Kyotaro Horiguchi

Discussion: <20160907.155050.233844095.horiguchi.kyotaro@lab.ntt.co.jp>
2016-10-07 23:56:42 +03:00
Heikki Linnakangas d668b03378 Make TAP test suites to work, when @INC does not contain current dir.
Recent Perl and/or new Linux distributions are starting to remove "." from
the @INC list by default. That breaks pg_rewind and ssl test suites, which
use helper perl modules that reside in the same directory. To fix, add the
current source directory explicitly to prove's include dir.

The vcregress.pl script probably also needs something like this, but I
wasn't able to remove '.' from @INC on Windows to test this, and don't want
to try doing that blindly.

Discussion: <20160908204529.flg6nivjuwp5vaoy@alap3.anarazel.de>
2016-10-07 21:49:49 +03:00
Tom Lane 4806f26f9e Fix pg_dump to work against pre-9.0 servers again.
getBlobs' queries for pre-9.0 servers were broken in two ways:
the 7.x/8.x query uses DISTINCT so it can't have unspecified-type
NULLs in the target list, and both that query and the 7.0 one
failed to provide the correct output column labels, so that the
subsequent code to extract data from the PGresult would fail.

Back-patch to 9.6 where the breakage was introduced (by commit 23f34fa4b).

Amit Langote and Tom Lane

Discussion: <0a3e7a0e-37bd-8427-29bd-958135862f0a@lab.ntt.co.jp>
2016-10-07 09:51:18 -04:00
Heikki Linnakangas 0d4d7d6185 Don't allow both --source-server and --source-target args to pg_rewind.
They are supposed to be mutually exclusive, but there was no check for
that.

Michael Banck

Discussion: <20161007103414.GD12247@nighthawk.caipicrew.dd-dns.de>
2016-10-07 14:35:17 +03:00
Heikki Linnakangas 275bf98601 Clear OpenSSL error queue after failed X509_STORE_load_locations() call.
Leaving the error in the error queue used to be harmless, because the
X509_STORE_load_locations() call used to be the last step in
initialize_SSL(), and we would clear the queue before the next
SSL_connect() call. But previous commit moved things around. The symptom
was that if a CRL file was not found, and one of the subsequent
initialization steps, like loading the client certificate or private key,
failed, we would incorrectly print the "no such file" error message from
the earlier X509_STORE_load_locations() call as the reason.

Backpatch to all supported versions, like the previous patch.
2016-10-07 12:51:52 +03:00
Heikki Linnakangas 8bb14cdd33 Don't share SSL_CTX between libpq connections.
There were several issues with the old coding:

1. There was a race condition, if two threads opened a connection at the
   same time. We used a mutex around SSL_CTX_* calls, but that was not
   enough, e.g. if one thread SSL_CTX_load_verify_locations() with one
   path, and another thread set it with a different path, before the first
   thread got to establish the connection.

2. Opening two different connections, with different sslrootcert settings,
   seemed to fail outright with "SSL error: block type is not 01". Not sure
   why.

3. We created the SSL object, before calling SSL_CTX_load_verify_locations
   and SSL_CTX_use_certificate_chain_file on the SSL context. That was
   wrong, because the options set on the SSL context are propagated to the
   SSL object, when the SSL object is created. If they are set after the
   SSL object has already been created, they won't take effect until the
   next connection. (This is bug #14329)

At least some of these could've been fixed while still using a shared
context, but it would've been more complicated and error-prone. To keep
things simple, let's just use a separate SSL context for each connection,
and accept the overhead.

Backpatch to all supported versions.

Report, analysis and test case by Kacper Zuk.

Discussion: <20160920101051.1355.79453@wrigleys.postgresql.org>
2016-10-07 12:20:39 +03:00
Heikki Linnakangas d7eb76b908 Disable synchronous commits in pg_rewind.
If you point pg_rewind to a server that is using synchronous replication,
with "pg_rewind --source-server=...", and the replication is not working
for some reason, pg_rewind will get stuck because it creates a temporary
table, which needs to be replicated. You could call broken replication a
pilot error, but pg_rewind is often used in special circumstances, when
there are changes to the replication setup.

We don't do any "real" updates, and we don't care about fsyncing or
replicating the operations on the temporary tables, so fix that by
setting synchronous_commit off.

Michael Banck, Michael Paquier. Backpatch to 9.5, where pg_rewind was
introduced.

Discussion: <20161005143938.GA12247@nighthawk.caipicrew.dd-dns.de>
2016-10-06 13:24:46 +03:00
Heikki Linnakangas b56fb691b0 Fix excessive memory consumption in the new sort pre-reading code.
LogicalTapeRewind() should not allocate large read buffer, if the tape
is completely empty. The calling code relies on that, for its
calculation of how much memory to allocate for the read buffers. That
lead to massive overallocation of memory, if maxTapes was high, but
only a few tapes were actually used.

Reported by Tomas Vondra

Discussion: <7303da46-daf7-9c68-3cc1-9f83235cf37e@2ndquadrant.com>
2016-10-06 09:46:40 +03:00
Tom Lane bfe2e84781 Remove -Wl,-undefined,dynamic_lookup in macOS build.
We don't need this anymore, and it prevents build-time error checking
that's usually good to have, so remove it.  Undoes one change of commit
cac765820.

Unfortunately, it's much harder to get a similar effect on other common
platforms, because we don't want the linker to throw errors for symbols
that will be resolved in the core backend.  Only macOS and AIX expect the
core backend executable to be available while linking loadable modules,
so only these platforms can usefully throw errors for unresolved symbols
at link time.

Discussion: <2652.1475512158@sss.pgh.pa.us>
2016-10-05 23:03:55 -04:00
Robert Haas 61f9e7ba3c Update obsolete comments and perldoc.
Loose ends from commit 2a0f89cd71.

Daniel Gustafsson
2016-10-05 13:09:52 -04:00
Robert Haas eb3bc0bd1a Re-alphabetize #include directives.
Thomas Munro
2016-10-05 08:24:25 -04:00
Robert Haas d2ce38e204 Rename WAIT_* constants to PG_WAIT_*.
Windows apparently has a constant named WAIT_TIMEOUT, and some of these
other names are pretty generic, too.  Insert "PG_" at the front of each
name in order to disambiguate.

Michael Paquier
2016-10-05 08:04:52 -04:00
Tom Lane eda04886c1 Avoid direct cross-module links in hstore_plperl and ltree_plpython, too.
Just turning the crank on the project started in commit d51924be8.
These cases turn out to be exact subsets of the boilerplate needed
for hstore_plpython.

Discussion: <2652.1475512158@sss.pgh.pa.us>
2016-10-04 17:49:07 -04:00
Robert Haas 6c9c95ed1b Fix another Windows compile break.
Commit 6f3bd98ebf is still making
the buildfarm unhappy.  This time it's mastodon that is complaining.
2016-10-04 13:14:19 -04:00
Robert Haas 9445d1121d Fix Windows compile break in 6f3bd98ebf. 2016-10-04 12:18:05 -04:00
Heikki Linnakangas d4fca5e6c7 Fix another outdated comment.
Preloading is done by logtape.c now.
2016-10-04 19:16:00 +03:00
Robert Haas 23843dcb60 Remove trailing commas from enums.
Buildfarm member mylodon doesn't like them.  Actually, I don't like
them either, but I failed to notice these before pushing commit
6f3bd98ebf.
2016-10-04 11:50:34 -04:00
Robert Haas 976a1ce910 Adjust worker_spi for 6f3bd98ebf. 2016-10-04 11:18:43 -04:00
Robert Haas 6f3bd98ebf Extend framework from commit 53be0b1ad to report latch waits.
WaitLatch, WaitLatchOrSocket, and WaitEventSetWait now taken an
additional wait_event_info parameter; legal values are defined in
pgstat.h.  This makes it possible to uniquely identify every point in
the core code where we are waiting for a latch; extensions can pass
WAIT_EXTENSION.

Because latches were the major wait primitive not previously covered
by this patch, it is now possible to see information in
pg_stat_activity on a large number of important wait events not
previously addressed, such as ClientRead, ClientWrite, and SyncRep.

Unfortunately, many of the wait events added by this patch will fail
to appear in pg_stat_activity because they're only used in background
processes which don't currently appear in pg_stat_activity.  We should
fix this either by creating a separate view for such information, or
else by deciding to include them in pg_stat_activity after all.

Michael Paquier and Robert Haas, reviewed by Alexander Korotkov and
Thomas Munro.
2016-10-04 11:01:42 -04:00
Heikki Linnakangas c86c2d9d57 Update comment.
mergepreread()/mergeprereadone() don't exist anymore, the function that
does roughly the same is now called mergereadnext().
2016-10-04 09:47:54 +03:00
Andres Freund 61633f7904 Correct logical decoding restore behaviour for subtransactions.
Before initializing iteration over a subtransaction's changes, the last
few changes were not spilled to disk. That's correct if the transaction
didn't spill to disk, but otherwise... This bug can lead to missed or
misorderd subtransaction contents when they were spilled to disk.

Move spilling of the remaining in-memory changes to
ReorderBufferIterTXNInit(), where it can easily be applied to the top
transaction and, if present, subtransactions.

Since this code had too many bugs already, noticeably increase test
coverage.

Fixes: #14319
Reported-By: Huan Ruan
Discussion: <20160909012610.20024.58169@wrigleys.postgresql.org>
Backport: 9,4-, where logical decoding was added
2016-10-03 22:11:36 -07:00
Tom Lane d51924be88 Convert contrib/hstore_plpython to not use direct linking to other modules.
Previously, on most platforms, we allowed hstore_plpython's references
to hstore and plpython to be unresolved symbols at link time, trusting
the dynamic linker to resolve them when the module is loaded.  This
has a number of problems, the worst being that the dynamic linker
does not know where the references come from and can do nothing but
fail if those other modules haven't been loaded.  We've more or less
gotten away with that for the limited use-case of datatype transform
modules, but even there, it requires some awkward hacks, most recently
commit 83c249200.

Instead, let's not treat these references as linker-resolvable at all,
but use function pointers that are manually filled in by the module's
_PG_init function.  There are few enough contact points that this
doesn't seem unmaintainable, at least for these use-cases.  (Note that
the same technique wouldn't work at all for decoupling from libpython
itself, but fortunately that's just a standard shared library and can
be linked to normally.)

This is an initial patch that just converts hstore_plpython.  If the
buildfarm doesn't find any fatal problems, I'll work on the other
transform modules soon.

Tom Lane, per an idea of Andres Freund's.

Discussion: <2652.1475512158@sss.pgh.pa.us>
2016-10-03 22:27:11 -04:00
Tom Lane 6bc811c992 Show a sensible value in pg_settings.unit for GUC_UNIT_XSEGS variables.
Commit 88e982302 invented GUC_UNIT_XSEGS for min_wal_size and max_wal_size,
but neglected to make it display sensibly in pg_settings.unit (by adding a
case to the switch in GetConfigOptionByNum).  Fix that, and adjust said
switch to throw a run-time error the next time somebody forgets.

In passing, avoid using a static buffer for the output string --- the rest
of this function pstrdup's from a local buffer, and I see no very good
reason why the units code should do it differently and less safely.

Per report from Otar Shavadze.  Back-patch to 9.5 where the new unit type
was added.

Report: <CAG-jOyA=iNFhN+yB4vfvqh688B7Tr5SArbYcFUAjZi=0Exp-Lg@mail.gmail.com>
2016-10-03 16:40:25 -04:00
Stephen Frost 814b9e9b8e Fix RLS with COPY (col1, col2) FROM tab
Attempting to COPY a subset of columns from a table with RLS enabled
would fail due to an invalid query being constructed (using a single
ColumnRef with the list of fields to exact in 'fields', but that's for
the different levels of an indirection for a single column, not for
specifying multiple columns).

Correct by building a ColumnRef and then RestTarget for each column
being requested and then adding those to the targetList for the select
query.  Include regression tests to hopefully catch if this is broken
again in the future.

Patch-By: Adam Brightwell
Reviewed-By: Michael Paquier
2016-10-03 16:22:57 -04:00
Tom Lane 83c2492002 Enforce a specific order for probing library loadability in pg_upgrade.
pg_upgrade checks whether all the shared libraries used in the old cluster
are also available in the new one by issuing LOAD for each library name.
Previously, it cared not what order it did the LOADs in.  Ideally it
should not have to care, but currently the transform modules in contrib
fail unless both the language and datatype modules they depend on are
loaded first.  A backend-side solution for that looks possible but
probably not back-patchable, so as a stopgap measure, let's do the LOAD
tests in order by library name length.  That should fix the problem for
reasonably-named transform modules, eg "hstore_plpython" will be loaded
after both "hstore" and "plpython".  (Yeah, it's a hack.)

In a larger sense, having a predictable order of these probes is a good
thing, since it will make upgrades predictably work or not work in the
face of inter-library dependencies.  Also, this patch replaces O(N^2)
de-duplication logic with O(N log N) logic, which could matter in
installations with very many databases.  So I don't foresee reverting this
even after we have a proper fix for the library-dependency problem.

In passing, improve a couple of SQL queries used here.

Per complaint from Andrew Dunstan that pg_upgrade'ing the transform contrib
modules failed.  Back-patch to 9.5 where transform modules were introduced.

Discussion: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
2016-10-03 10:07:49 -04:00
Heikki Linnakangas e94568ecc1 Change the way pre-reading in external sort's merge phase works.
Don't pre-read tuples into SortTuple slots during merge. Instead, use the
memory for larger read buffers in logtape.c. We're doing the same number
of READTUP() calls either way, but managing the pre-read SortTuple slots
is much more complicated. Also, the on-tape representation is more compact
than SortTuples, so we can fit more pre-read tuples into the same amount
of memory this way. And we have better cache-locality, when we use just a
small number of SortTuple slots.

Now that we only hold one tuple from each tape in the SortTuple slots, we
can greatly simplify the "batch memory" management. We now maintain a
small set of fixed-sized slots, to hold the tuples, and fall back to
palloc() for larger tuples. We use this method during all merge phases,
not just the final merge, and also when randomAccess is requested, and
also in the TSS_SORTEDONTAPE case. In other words, it's used whenever we
do an external sort.

Reviewed by Peter Geoghegan and Claudio Freire.

Discussion: <CAM3SWZTpaORV=yQGVCG8Q4axcZ3MvF-05xe39ZvORdU9JcD6hQ@mail.gmail.com>
2016-10-03 13:37:49 +03:00
Tom Lane e8bdee2770 Add ALTER EXTENSION ADD/DROP ACCESS METHOD, and use it in pg_upgrade.
Without this, an extension containing an access method is not properly
dumped/restored during pg_upgrade --- the AM ends up not being a member
of the extension after upgrading.

Another oversight in commit 473b93287, reported by Andrew Dunstan.

Report: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
2016-10-02 14:31:28 -04:00
Tom Lane 728ceba938 Avoid leaking FDs after an fsync failure.
Fixes errors introduced in commit bc34223bc, as detected by Coverity.

In passing, report ENOSPC for a short write while padding a new wal file in
open_walfile, make certain that close_walfile closes walfile in all cases,
and improve a couple of comments.

Michael Paquier and Tom Lane
2016-10-02 12:33:46 -04:00
Tom Lane 3b90e38c5d Do ClosePostmasterPorts() earlier in SubPostmasterMain().
In standard Unix builds, postmaster child processes do ClosePostmasterPorts
immediately after InitPostmasterChild, that is almost immediately after
being spawned.  This is important because we don't want children holding
open the postmaster's end of the postmaster death watch pipe.

However, in EXEC_BACKEND builds, SubPostmasterMain was postponing this
responsibility significantly, in order to make it slightly more convenient
to pass the right flag value to ClosePostmasterPorts.  This is bad,
particularly seeing that process_shared_preload_libraries() might invoke
nearly-arbitrary code.  Rearrange so that we do it as soon as we've
fetched the socket FDs via read_backend_variables().

Also move the comment explaining about randomize_va_space to before the
call of PGSharedMemoryReAttach, which is where it's relevant.  The old
placement was appropriate when the reattach happened inside
CreateSharedMemoryAndSemaphores, but that was a long time ago.

Back-patch to 9.3; the patch doesn't apply cleanly before that, and
it doesn't seem worth a lot of effort given that we've had no actual
field complaints traceable to this.

Discussion: <4157.1475178360@sss.pgh.pa.us>
2016-10-01 17:15:09 -04:00
Tom Lane ea046f08d1 Fix misstatement in comment in Makefile.shlib.
There is no need for "all: all-lib" to be placed before inclusion of
Makefile.shlib.  Makefile.global is what ensures that "all" is the
default target, and we already document that that has to be included
first.  Per comment from Pavel Raiskup.

Discussion: <1925924.izSMJEZO3x@unused-4-107.brq.redhat.com>
2016-10-01 13:45:16 -04:00
Tom Lane 7107d58ec5 Fix misplacement of submake-generated-headers prerequisites.
The sequence "configure; cd src/pl/plpython; make -j" failed due to
trying to compile plpython's .o files before the generated headers
finished building.  (This is an important real-world case, since it's
the typical second step when building both plpython2 and plpython3.)
This happens because the submake-generated-headers target is not
placed in a way to make it a prerequisite to compiling the .o files.
Fix that.

Checking other uses of submake-generated-headers, I noted that the one
attached to pg_regress was similarly misplaced; but it's actually not
needed at all for pg_regress.o, rather regress.o, so move it to be a
prerequisite of that.

Back-patch to 9.6 where submake-generated-headers was introduced
(by commit 548af97fc).  It's not immediately clear to me why the
previous coding didn't have the same issue; but since we've not
had field reports of plpython make failing, leave it alone in the
older branches.

Pavel Raiskup and Tom Lane

Discussion: <1925924.izSMJEZO3x@unused-4-107.brq.redhat.com>
2016-10-01 13:35:13 -04:00
Peter Eisentraut a4327296df Set log_line_prefix and application name in test drivers
Before pg_regress runs psql, set the application name to the test name.
Similarly, set the application name to the test file name in the TAP
tests.  Also, set a default log_line_prefix that show the application
name, as well as the PID and a time stamp.

That way, the server log output can be correlated to the test input
files, making debugging a bit easier.
2016-09-30 21:32:33 -04:00
Tom Lane f002ed2b8e Improve error reporting in pg_upgrade's file copying/linking/rewriting.
The previous design for this had copyFile(), linkFile(), and
rewriteVisibilityMap() returning strerror strings, with the caller
producing one-size-fits-all error messages based on that.  This made it
impossible to produce messages that described the failures with any degree
of precision, especially not short-read problems since those don't set
errno at all.

Since pg_upgrade has no intention of continuing after any error in this
area, let's fix this by just letting these functions call pg_fatal() for
themselves, making it easy for each point of failure to have a suitable
error message.  Taking this approach also allows dropping cleanup code
that was unnecessary and was often rather sloppy about preserving errno.
To not lose relevant info that was reported before, pass in the schema name
and table name of the current table so that they can be included in the
error reports.

An additional problem was the use of getErrorText(), which was flat out
wrong for all but a couple of call sites, because it unconditionally did
"_dosmaperr(GetLastError())" on Windows.  That's only appropriate when
reporting an error from a Windows-native API, which only a couple of
the callers were actually doing.  Thus, even the reported strerror string
would be unrelated to the actual failure in many cases on Windows.
To fix, get rid of getErrorText() altogether, and just have call sites
do strerror(errno) instead, since that's the way all the rest of our
frontend programs do it.  Add back the _dosmaperr() calls in the two
places where that's actually appropriate.

In passing, make assorted messages hew more closely to project style
guidelines, notably by removing initial capitals in not-complete-sentence
primary error messages.  (I didn't make any effort to clean up places
I didn't have another reason to touch, though.)

Per discussion of a report from Thomas Kellerer.  Back-patch to 9.6,
but no further; given the relative infrequency of reports of problems
here, it's not clear it's worth adapting the patch to older branches.

Patch by me, but with credit to Alvaro Herrera for spotting the issue
with getErrorText's misuse of _dosmaperr().

Discussion: <nsjrbh$8li$1@blaine.gmane.org>
2016-09-30 20:40:56 -04:00
Tom Lane 5afcd2aa74 Fix multiple portability issues in pg_upgrade's rewriteVisibilityMap().
This is new code in 9.6, and evidently we missed out testing it as
thoroughly as it should have been.  Bugs fixed here:

1. Use binary not text mode to open the files on Windows.  Before, if
the visibility map chanced to contain two bytes that looked like \r\n,
Windows' read() would convert that to \n, which both corrupts the map
data and causes the file to look shorter than it should.  Unless you
were *very* unlucky and had an exact multiple of 8K such occurrences
in each VM file, this would cause pg_upgrade to report a failure,
though with a rather obscure error message.

2. The code for copying rebuilt bytes into the output was simply wrong.
It chanced to work okay on little-endian machines but would emit the
bytes in the wrong order on big-endian, leading to silent corruption
of the visibility map data.

3. The code was careless about alignment of the working buffers.  Given
all three of an alignment-picky architecture, a compiler that chooses
to put the new_vmbuf[] local variable at an odd starting address, and
a checksum-enabled database, pg_upgrade would dump core.

Point one was reported by Thomas Kellerer, the other two detected by
code-reading.

Point two is much the nastiest of these issues from an impact standpoint,
though fortunately it affects only a minority of users.  The Windows issue
will definitely bite people, but it seems quite unlikely that there would
be undetected corruption from that.

In addition, I failed to resist the temptation to do some minor cosmetic
adjustments, mostly improving the comments.

It would be a good idea to try to improve the error reporting here, but
that seems like material for a separate patch.

Discussion: <nsjrbh$8li$1@blaine.gmane.org>
2016-09-30 20:40:55 -04:00
Peter Eisentraut cd03890d0b Fix breakage in previous change 2016-09-30 15:27:51 -04:00
Peter Eisentraut 330b48b94b Separate enum from struct
Otherwise the enum symbols are not visible outside the struct in C++.

Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
2016-09-30 15:11:47 -04:00
Magnus Hagander 3d39244e6e Retry opening new segments in pg_xlogdump --folllow
There is a small window between when the server closes out the existing
segment and the new one is created. Put a loop around the open call in
this case to make sure we wait for the new file to actually appear.
2016-09-30 11:22:00 +02:00
Peter Eisentraut f2af8dc5ba Fix compiler warnings
This was missed in bf5bb2e85b, because the
code is only visible under PG_FLUSH_DATA_WORKS.
2016-09-29 12:00:00 -04:00
Peter Eisentraut 728a3e73e9 Switch pg_basebackup commands in Postgres.pm to use --nosync
On slow machines, this greatly reduces the I/O pressure induced by the
tests.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Peter Eisentraut 6ed2d8584c pg_basebackup: Add --nosync option
This is useful for testing, similar to initdb's --nosync.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Peter Eisentraut bc34223bc1 pg_basebackup pg_receivexlog: Issue fsync more carefully
Several places weren't careful about fsyncing in the way.  See 1d4a0ab1
and 606e0f98 for details about required fsyncs.

This adds a couple of functions in src/common/ that have an equivalent
in the backend: durable_rename(), fsync_parent_path()

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Peter Eisentraut bf5bb2e85b Move fsync routines of initdb into src/common/
The intention is to used those in other utilities such as pg_basebackup
and pg_receivexlog.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Peter Eisentraut 6ad8ac6026 Exclude additional directories in pg_basebackup
The list of files and directories that pg_basebackup excludes from the
backup was somewhat incomplete and unorganized.  Change that with having
the exclusion driven from tables.  Clean up some code around it.  Also
document the exclusions in more detail so that users of pg_start_backup
can make use of it as well.

The contents of these directories are now excluded from the backup:
pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans

Also fix a bug that a pg_repl_slot or pg_stat_tmp being a symlink would
cause a corrupt tar header to be created.  Now such symlinks are
included in the backup as empty directories.  Bug found by Ashutosh
Sharma <ashu.coek88@gmail.com>.

From: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-28 12:00:00 -04:00
Alvaro Herrera b82d5a2c7c Silence compiler warnings
Reported by Peter Eisentraut.  Coding suggested by Tom Lane.
2016-09-28 19:31:58 -03:00
Tom Lane 83bed06be4 Rationalize format-picture caching logic in formatting.c.
Add a validity flag to DCHCacheEntry and NUMCacheEntry entries, and
do not set it true until after we've parsed the supplied format string.
This allows dealing with possible errors while parsing the format
without the baroque hack that was there before (which only covered
errors within NUMDesc_prepare, anyway).  We can get rid of the PG_TRY in
NUMDesc_prepare, as well as last_NUMCacheEntry and NUM_cache_remove.
(Essentially, this reverts commit ff783fbae in favor of a less fragile
solution; the problems with that approach are well illustrated by later
hacking such as 55f927a46.)

In passing, define the size of these caches as DCH_CACHE_ENTRIES not
DCH_CACHE_FIELDS + 1 (whoever thought that was a good definition?)
and likewise for the NUM cache.  Also const-ify format string parameters
where convenient, and merge duplicated cache lookup logic.

This is primarily driven by a proposed patch from Artur Zakirov,
which introduced some ereport's into format string parsing for
the datetime case.  He proposed preventing the creation of invalid
cache entries by parsing the format string first into a local-variable
array, and then copying that to a cache entry.  That seemed a bit
ugly to me, and anyway randomly different from the way the identical
problem had been solved for the numeric case.  Let's make the two
sets of code more similar not less so.

I'm not sure whether we'll adopt the new error conditions Artur proposes,
but this patch seems like good code cleanup and future-proofing in any
case.  The existing code is critically (and undocumented-ly) dependent on
no elog being thrown out of several nontrivial functions, which is trouble
waiting to happen, though it doesn't seem to be actively broken today.

Discussion: <b2a39359-3282-b402-f4a3-057aae500ee7@postgrespro.ru>
2016-09-28 17:08:40 -04:00
Tom Lane d3cd36a133 Make to_timestamp() and to_date() range-check fields of their input.
Historically, something like to_date('2009-06-40','YYYY-MM-DD') would
return '2009-07-10' because there was no prohibition on out-of-range
month or day numbers.  This has been widely panned, and it also turns
out that Oracle throws an error in such cases.  Since these functions
are nominally Oracle-compatibility features, let's change that.

There's no particular restriction on year (modulo the fact that the
scanner may not believe that more than 4 digits are year digits,
a matter to be addressed separately if at all).  But we now check month,
day, hour, minute, second, and fractional-second fields, as well as
day-of-year and second-of-day fields if those are used.

Currently, no checks are made on ISO-8601-style week numbers or day
numbers; it's not very clear what the appropriate rules would be there,
and they're probably so little used that it's not worth sweating over.

Artur Zakirov, reviewed by Amul Sul, further adjustments by me

Discussion: <1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com>
See-Also: <57786490.9010201@wars-nicht.de>
2016-09-28 14:36:17 -04:00
Peter Eisentraut 967ed9205b Remove dead line of code 2016-09-28 12:00:00 -04:00
Robert Haas 4929704acb worker_spi: Call pgstat_report_stat.
Without this, statistics changes accumulated by the worker never get
reported to the stats collector, which is bad.

Julien Rouhaud
2016-09-28 12:42:48 -04:00
Peter Eisentraut e79e6c4da1 Fix CRC check handling in get_controlfile
The previous patch broke this by returning NULL for a failed CRC check,
which pg_controldata would then try to read.  Fix by returning the
result of the CRC check in a separate argument.

Michael Paquier and myself
2016-09-28 12:00:00 -04:00
Robert Haas 308985b0b4 Fix dangling pointer problem in ReorderBufferSerializeChange.
Commit 3fe3511d05 introduced a new
case into this function, but neglected to ensure that the "ondisk"
pointer got updated after a possible reallocation as the code does
in other cases.

Stas Kelvich, per diagnosis by Konstantin Knizhnik.
2016-09-28 11:19:46 -04:00
Heikki Linnakangas babe05bc2b Turn password_encryption GUC into an enum.
This makes the parameter easier to extend, to support other password-based
authentication protocols than MD5. (SCRAM is being worked on.)

The GUC still accepts on/off as aliases for "md5" and "plain", although
we may want to remove those once we actually add support for another
password hash type.

Michael Paquier, reviewed by David Steele, with some further edits by me.

Discussion: <CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com>
2016-09-28 12:22:44 +03:00
Tom Lane 72daabc7a3 Disallow pushing volatile quals past set-returning functions.
Pushing an upper-level restriction clause into an unflattened
subquery-in-FROM is okay when the subquery contains no SRFs in its
targetlist, or when it does but the SRFs are unreferenced by the clause
*and the clause is not volatile*.  Otherwise, we're changing the number
of times the clause is evaluated, which is bad for volatile quals, and
possibly changing the result, since a volatile qual might succeed for some
SRF output rows and not others despite not referencing any of the changing
columns.  (Indeed, if the clause is something like "random() > 0.5", the
user is probably expecting exactly that behavior.)

We had most of these restrictions down, but not the one about the upper
clause not being volatile.  Fix that, and add a regression test to
illustrate the expected behavior.

Although this is definitely a bug, it doesn't seem like back-patch
material, since possibly some users don't realize that the broken
behavior is broken and are relying on what happens now.  Also, while
the added test is quite cheap in the wake of commit a4c35ea1c, it would
be much more expensive (or else messier) in older branches.

Per report from Tom van Tilburg.

Discussion: <CAP3PPDiucxYCNev52=YPVkrQAPVF1C5PFWnrQPT7iMzO1fiKFQ@mail.gmail.com>
2016-09-27 18:43:36 -04:00
Tom Lane 0109ab2760 Make struct ParallelSlot private within pg_dump/parallel.c.
The only field of this struct that other files have any need to touch
is the pointer to the TocEntry a worker is working on.  (Well,
pg_backup_archiver.c is actually looking at workerStatus too, but that
can be finessed by specifying that the TocEntry pointer is NULL for a
non-busy worker.)

Hence, move out the TocEntry pointers to a separate array within
struct ParallelState, and then we can make struct ParallelSlot private.

I noted the possibility of this previously, but hadn't got round to
actually doing it.

Discussion: <1188.1464544443@sss.pgh.pa.us>
2016-09-27 14:29:12 -04:00
Tom Lane fb03d08a89 Rationalize parallel dump/restore's handling of worker cmd/status messages.
The existing APIs for creating and parsing command and status messages are
rather messy; for example, archive-format modules have to provide code
for constructing command messages, which is entirely pointless since
the code to read them is hard-wired in WaitForCommands() and hence
no format-specific variation is actually possible.  But there's little
foreseeable reason to need format-specific variation anyway.

The situation for status messages is no better; at least those are both
constructed and parsed by format-specific code, but said code is quite
redundant since there's no actual need for format-specific variation.

To add insult to injury, the first API involves returning pointers to
static buffers, which is bad, while the second involves returning pointers
to malloc'd strings, which is safer but randomly inconsistent.

Hence, get rid of the MasterStartParallelItem and MasterEndParallelItem
APIs, and instead write centralized functions that construct and parse
command and status messages.  If we ever do need more flexibility, these
functions can be the standard implementations of format-specific
callback methods, but that's a long way off if it ever happens.

Tom Lane, reviewed by Kevin Grittner

Discussion: <17340.1464465717@sss.pgh.pa.us>
2016-09-27 13:56:04 -04:00
Tom Lane b7b8cc0cfc Redesign parallel dump/restore's wait-for-workers logic.
The ListenToWorkers/ReapWorkerStatus APIs were messy and hard to use.
Instead, make DispatchJobForTocEntry register a callback function that
will take care of state cleanup, doing whatever had been done by the caller
of ReapWorkerStatus in the old design.  (This callback is essentially just
the old mark_work_done function in the restore case, and a trivial test for
worker failure in the dump case.)  Then we can have ListenToWorkers call
the callback immediately on receipt of a status message, and return the
worker to WRKR_IDLE state; so the WRKR_FINISHED state goes away.

This allows us to design a unified wait-for-worker-messages loop:
WaitForWorkers replaces EnsureIdleWorker and EnsureWorkersFinished as well
as the mess in restore_toc_entries_parallel.  Also, we no longer need the
fragile API spec that the caller of DispatchJobForTocEntry is responsible
for ensuring there's an idle worker, since DispatchJobForTocEntry can just
wait until there is one.

In passing, I got rid of the ParallelArgs struct, which was a net negative
in terms of notational verboseness, and didn't seem to be providing any
noticeable amount of abstraction either.

Tom Lane, reviewed by Kevin Grittner

Discussion: <1188.1464544443@sss.pgh.pa.us>
2016-09-27 13:22:39 -04:00
Alvaro Herrera 51c3e9fade Include <sys/select.h> where needed
<sys/select.h> is required by POSIX.1-2001 to get the prototype of
select(2), but nearly no systems enforce that because older standards
let you get away with including some other headers.  Recent OpenBSD
hacking has removed that frail touch of friendliness, however, which
broke some compiles; fix all the way back to 9.1 by adding the required
standard.  Only vacuumdb.c was reported to fail, but it seems easier to
fix the whole lot in a fell swoop.

Per bug #14334 by Sean Farrell.
2016-09-27 01:05:21 -03:00
Peter Eisentraut 440c8d1bbc Fix some typos in comment 2016-09-26 12:00:00 -04:00
Tom Lane 9779bda86c Fix newly-introduced issues in pgbench.
The result of FD_ISSET() doesn't necessarily fit in a bool, though
assigning it to one might accidentally work depending on platform and which
socket FD number is being inquired of.  Rewrite to test it with if(),
rather than making any specific assumption about the result width,
to match the way every other such call in PG is written.

Don't break out of the input_mask-filling loop after finding the first
client that we're waiting for results from.  That mostly breaks parallel
query management.

Also, if we choose not to call select(), be sure to clear out any bits
the mask-filling loop might have set, so that we don't accidentally call
doCustom for clients we don't know have input.  Doing so would likely
be harmless, but it's a waste of cycles and doesn't seem to be intended.

Make this_usec wide enough.  (Yeah, the value would usually fit in an
int, but then why are we using int64 everywhere else?)

Minor cosmetic adjustments, mostly comment improvements.

Problems introduced by commit 12788ae49.  The first issue was discovered
by buildfarm testing, the others by code review.
2016-09-26 20:23:50 -04:00
Tom Lane fdc9186f7e Replace the built-in GIN array opclasses with a single polymorphic opclass.
We had thirty different GIN array opclasses sharing the same operators and
support functions.  That still didn't cover all the built-in types, nor
did it cover arrays of extension-added types.  What we want is a single
polymorphic opclass for "anyarray".  There were two missing features needed
to make this possible:

1. We have to be able to declare the index storage type as ANYELEMENT
when the opclass is declared to index ANYARRAY.  This just takes a few
more lines in index_create().  Although this currently seems of use only
for GIN, there's no reason to make index_create() restrict it to that.

2. We have to be able to identify the proper GIN compare function for
the index storage type.  This patch proceeds by making the compare function
optional in GIN opclass definitions, and specifying that the default btree
comparison function for the index storage type will be looked up when the
opclass omits it.  Again, that seems pretty generically useful.

Since the comparison function lookup is done in initGinState(), making
use of the second feature adds an additional cache lookup to GIN index
access setup.  It seems unlikely that that would be very noticeable given
the other costs involved, but maybe at some point we should consider
making GinState data persist longer than it now does --- we could keep it
in the index relcache entry, perhaps.

Rather fortuitously, we don't seem to need to do anything to get this
change to play nice with dump/reload or pg_upgrade scenarios: the new
opclass definition is automatically selected to replace existing index
definitions, and the on-disk data remains compatible.  Also, if a user has
created a custom opclass definition for a non-builtin type, this doesn't
break that, since CREATE INDEX will prefer an exact match to opcintype
over a match to ANYARRAY.  However, if there's anyone out there with
handwritten DDL that explicitly specifies _bool_ops or one of the other
replaced opclass names, they'll need to adjust that.

Tom Lane, reviewed by Enrique Meneses

Discussion: <14436.1470940379@sss.pgh.pa.us>
2016-09-26 14:52:44 -04:00
Heikki Linnakangas 12788ae49e Refactor script execution state machine in pgbench.
The doCustom() function had grown into quite a mess. Rewrite it, in a more
explicit state machine style, for readability.

This also fixes one minor bug: if a script consisted entirely of meta
commands, doCustom() never returned to the caller, so progress reports
with the -P option were not printed. I don't want to backpatch this
refactoring, and the bug is quite insignificant, so only commit this to
master, and leave the bug unfixed in back-branches.

Review and original bug report by Fabien Coelho.

Discussion: <alpine.DEB.2.20.1607090850120.3412@sto>
2016-09-26 10:56:02 +03:00
Tom Lane da6c4f6ca8 Refer to OS X as "macOS", except for the port name which is still "darwin".
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users.  Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern.  Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change.  (In short, this patch touches documentation and
comments, but no actual code.)

I didn't touch contrib/start-scripts/osx/, either.  I suspect those are
obsolete and due for a rewrite, anyway.

I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too.  Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
2016-09-25 15:40:57 -04:00
Tom Lane c3a0818460 Install TAP test infrastructure so it's available for extension testing.
When configured with --enable-tap-tests, "make install" will now install
the Perl support files for TAP testing where PGXS will find them.
This allows extensions to rely on $(prove_check) even when being built
out-of-tree.  Back-patch to 9.4 where we first started to support TAP
testing, to reduce the number of cases extension makefiles need to
consider.

Craig Ringer

Discussion: <CAMsr+YFXv+2qne6xJW7z_25mYBtktRX5rpkrgrb+DRgQ_FxgHQ@mail.gmail.com>
2016-09-23 15:50:00 -04:00
Tom Lane 12f6eadffd Fix incorrect logic for excluding range constructor functions in pg_dump.
Faulty AND/OR nesting in the WHERE clause of getFuncs' SQL query led to
dumping range constructor functions if they are part of an extension
and we're in binary-upgrade mode.  Actually, we don't want to dump them
separately even then, since CREATE TYPE AS RANGE will create the range's
constructor functions regardless.  Per report from Andrew Dunstan.

It looks like this mistake was introduced by me, in commit b985d4877, in
perhaps-overzealous refactoring to reduce code duplication.  I'm suitably
embarrassed.

Report: <34854939-02d7-f591-5677-ce2994104599@dunslane.net>
2016-09-23 13:49:26 -04:00
Tom Lane 959ea7fa76 Remove useless code.
Apparent copy-and-pasteo in standby_desc_invalidations() had two
entries for msg->id == SHAREDINVALRELMAP_ID.

Aleksander Alekseev

Discussion: <20160923090814.GB1238@e733>
2016-09-23 10:44:50 -04:00
Tom Lane 8e6b4ee21f Don't trust CreateFileMapping() to clear the error code on success.
We must test GetLastError() even when CreateFileMapping() returns a
non-null handle.  If that value were left over from some previous system
call, we might be fooled into thinking the segment already existed.
Experimentation on Windows 7 suggests that CreateFileMapping() clears
the error code on success, but it is not documented to do so, so let's
not rely on that happening in all Windows releases.

Amit Kapila

Discussion: <20811.1474390987@sss.pgh.pa.us>
2016-09-23 10:09:52 -04:00
Tom Lane 49a91b88e6 Avoid using PostmasterRandom() for DSM control segment ID.
Commits 470d886c3 et al intended to fix the problem that the postmaster
selected the same "random" DSM control segment ID on every start.  But
using PostmasterRandom() for that destroys the intended property that the
delay between random_start_time and random_stop_time will be unpredictable.
(Said delay is probably already more predictable than we could wish, but
that doesn't mean that reducing it by a couple orders of magnitude is OK.)
Revert the previous patch and add a comment warning against misuse of
PostmasterRandom.  Fix the original problem by calling srandom() early in
PostmasterMain, using a low-security seed that will later be overwritten
by PostmasterRandom.

Discussion: <20789.1474390434@sss.pgh.pa.us>
2016-09-23 09:54:11 -04:00
Peter Eisentraut 6fa51c79c7 pg_ctl: Add promote wait option to help output
pointed out by Masahiko Sawada <sawada.mshk@gmail.com>
2016-09-23 12:00:00 -04:00
Heikki Linnakangas 3c2d5d6600 Improve error message on MSVC if perl*.lib is not found.
John Harvey, reviewed by Michael Paquier

Discussion: <CABcP5fjEjgOsh097cWnQrsK9yCswo4DZxp-V47DKCH-MxY9Gig@mail.gmail.com>
2016-09-23 14:21:59 +03:00
Heikki Linnakangas 674e2de64d Fix typo in comment.
Daniel Gustafsson
2016-09-23 08:04:19 +03:00
Bruce Momjian 1ff0042165 C comment: fix function header comment
Fix for transformOnConflictClause().

Author: Tomonari Katsumata
2016-09-22 17:34:24 -04:00
Tom Lane 8023b5827f Remove nearly-unused SizeOfIptrData macro.
Past refactorings have removed all but one reference to SizeOfIptrData
(and that one place was in a pretty noncritical spot).  Since nobody's
complained, it seems probable that there are no supported compilers
that don't think sizeof(ItemPointerData) is 6.  If there are, we're
wasting MAXALIGN per heap tuple anyway, so it's rather silly to worry
about whether we can shave space in places like WAL records.

Pavan Deolasee

Discussion: <CABOikdOOawDda4hwLOT6zdA6MFfPLu3Z2YBZkX0JdayNS6JOeQ@mail.gmail.com>
2016-09-22 14:30:33 -04:00
Tom Lane 96dd77d349 Be sure to rewind the tuplestore read pointer in non-leader CTEScan nodes.
ExecInitCteScan supposed that it didn't have to do anything to the extra
tuplestore read pointer it gets from tuplestore_alloc_read_pointer.
However, it needs this read pointer to be positioned at the start of the
tuplestore, while tuplestore_alloc_read_pointer is actually defined as
cloning the current position of read pointer 0.  In normal situations
that accidentally works because we initialize the whole plan tree at once,
before anything gets read.  But it fails in an EvalPlanQual recheck, as
illustrated in bug #14328 from Dima Pavlov.  To fix, just forcibly rewind
the pointer after tuplestore_alloc_read_pointer.  The cost of doing so is
negligible unless the tuplestore is already in TSS_READFILE state, which
wouldn't happen in normal cases.  We could consider altering tuplestore's
API to make that case cheaper, but that would make for a more invasive
back-patch and it doesn't seem worth it.

This has been broken probably for as long as we've had CTEs, so back-patch
to all supported branches.

Discussion: <32468.1474548308@sss.pgh.pa.us>
2016-09-22 11:35:03 -04:00
Peter Eisentraut 8b845520fb Add tests for various connection string issues
Add tests for consistent support of connection strings in frontend
programs as well as proper handling of unusual characters in database
and user names.  These tests were developed for the issues of
CVE-2016-5424.

To allow testing of names with spaces, change the pg_regress
command-line options --create-role and --dbname to split their arguments
by comma only, not space or comma as before.  Only commas were actually
used in existing uses.

Noah Misch, Michael Paquier, Peter Eisentraut
2016-09-22 12:00:00 -04:00
Peter Eisentraut e7010ce479 pg_ctl: Add wait option to promote action
When waiting is selected for the promote action, look into pg_control
until the state changes, then use the PQping-based waiting until the
server is reachable.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut ebdf5bf7d1 Delay updating control file to "in production"
Move the updating of the control file to "in production" status until
the point where WAL writes are allowed.  Before, there could be a
significant gap between the control file update and write transactions
actually being allowed.  This makes it more reliable to use the control
status to verify the end of a promotion.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut c1dc51d484 pg_ctl: Detect current standby state from pg_control
pg_ctl used to determine whether a server was in standby mode by looking
for a recovery.conf file.  With this change, it instead looks into
pg_control, which is potentially more accurate.  There are also
occasional discussions about removing recovery.conf, so this removes one
dependency.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut eb5089a05b pg_ctl: Add tests for promote action
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut e767db2242 Make command_like output more compact
Consistently print the test name, not the full command, which can be
quite lenghty and include temporary directory names and other
distracting details.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut c91b34bab1 Fix typo
From: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Heikki Linnakangas 2a7f4f7643 Print test parameters like "foo: 123", and results like "foo = 123".
The way "latency average" was printed was differently if it was calculated
from the overall run time or was measured on a per-transaction basis.
Also, the per-script weight is a test parameter, rather than a result, so
use the "weight: %f" style for that.

Backpatch to 9.6, since the inconsistency on "latency average" was
introduced there.

Fabien Coelho

Discussion: <alpine.DEB.2.20.1607131015370.7486@sto>
2016-09-21 13:24:13 +03:00
Heikki Linnakangas 65c6556384 Fix pgbench's calculation of average latency, when -T is not used.
If the test duration was given in # of transactions (-t or no option),
rather as a duration (-T), the latency average was always printed as 0.
It has been broken ever since the display of latency average was added,
in 9.4.

Fabien Coelho

Discussion: <alpine.DEB.2.20.1607131015370.7486@sto>
2016-09-21 13:14:48 +03:00
Peter Eisentraut 46b55e7f85 pg_restore: Add -N option to exclude schemas
This is similar to the -N option in pg_dump, except that it doesn't take
a pattern, just like the existing -n option in pg_restore.

From: Michael Banck <michael.banck@credativ.de>
2016-09-20 12:00:00 -04:00
Peter Eisentraut 90c9648212 Re-add translation markers that were lost
When win32security.c was moved from src/backend/port/win32/security.c,
the message writing function was changed from write_stderr to log_error,
but nls.mk was not updated.  We could add log_error to GETTEXT_TRIGGERS,
but it's also used in src/common/exec.c in a different way and that
would create some confusion or a larger patch.  For now, just put an
explicit translation marker onto the strings that were previously
translated.
2016-09-20 12:00:00 -04:00
Robert Haas 470d886c32 Use PostmasterRandom(), not random(), for DSM control segment ID.
Otherwise, every startup gets the same "random" value, which is
definitely not what was intended.
2016-09-20 12:26:29 -04:00
Robert Haas 419113dfdc Retry DSM control segment creation if Windows indicates access denied.
Otherwise, attempts to run multiple postmasters running on the same
machine may fail, because Windows sometimes returns ERROR_ACCESS_DENIED
rather than ERROR_ALREADY_EXISTS when there is an existing segment.

Hitting this bug is much more likely because of another defect not
fixed by this patch, namely that dsm_postmaster_startup() uses
random() which returns the same value every time.  But that's not
a reason not to fix this.

Kyotaro Horiguchi and Amit Kapila, reviewed by Michael Paquier

Discussion: <CAA4eK1JyNdMeF-dgrpHozDecpDfsRZUtpCi+1AbtuEkfG3YooQ@mail.gmail.com>
2016-09-20 12:04:41 -04:00
Heikki Linnakangas 45310221a9 Fix outdated comments, GIST search queue is not an RBTree anymore.
The GiST search queue is implemented as a pairing heap rather than as
Red-Black Tree, since 9.5 (commit e7032610). I neglected these comments
in that commit.
2016-09-20 11:38:25 +03:00
Heikki Linnakangas 40c3fe4980 Fix latency calculation when there are \sleep commands in the script.
We can't use txn_scheduled to hold the sleep-until time for \sleep, because
that interferes with calculation of the latency of the transaction as whole.

Backpatch to 9.4, where this bug was introduced.

Fabien COELHO

Discussion: <alpine.DEB.2.20.1608231622170.7102@lancre>
2016-09-19 22:55:43 +03:00
Robert Haas 8614b39bca MSVC: Include pg_recvlogical in client-only install.
MauMau, reviewed by Michael Paquier
2016-09-19 14:25:57 -04:00
Heikki Linnakangas 3fcc98c990 Fix ecpg -? option on Windows, add -V alias for --version.
This makes the -? and -V options work consistently with other binaries.
--help and --version are now only recognized as the first option, i.e.
"ecpg --foobar --help" no longer prints the help, but that's consistent
with most of our other binaries, too.

Backpatch to all supported versions.

Haribabu Kommi

Discussion: <CAJrrPGfnRXvmCzxq6Dy=stAWebfNHxiL+Y_z7uqksZUCkW_waQ@mail.gmail.com>
2016-09-18 13:46:32 +03:00
Tom Lane d8c61c9765 Add debugging aid "bmsToString(Bitmapset *bms)".
This function has no direct callers at present, but it's convenient for
manual use in a debugger, rather than having to inspect memory and do
bit-counting in your head.

In passing, get rid of useless outBitmapset() wrapper around
_outBitmapset(); let's just export the function that does the work.
Likewise for outToken().

Ashutosh Bapat, tweaked a bit by me

Discussion: <CAFjFpRdiht8e1HTVirbubr4YzaON5iZTzFJjq909y4sU8M_6eA@mail.gmail.com>
2016-09-16 09:36:24 -04:00
Robert Haas 5225c66336 Clarify policy on marking inherited constraints as valid.
Amit Langote and Robert Haas
2016-09-15 17:24:54 -04:00
Heikki Linnakangas 5c6df67e0c Fix building with LibreSSL.
LibreSSL defines OPENSSL_VERSION_NUMBER to claim that it is version 2.0.0,
but it doesn't have the functions added in OpenSSL 1.1.0. Add autoconf
checks for the individual functions we need, and stop relying on
OPENSSL_VERSION_NUMBER.

Backport to 9.5 and 9.6, like the patch that broke this. In the
back-branches, there are still a few OPENSSL_VERSION_NUMBER checks left,
to check for OpenSSL 0.9.8 or 0.9.7. I left them as they were - LibreSSL
has all those functions, so they work as intended.

Per buildfarm member curculio.

Discussion: <2442.1473957669@sss.pgh.pa.us>
2016-09-15 22:52:51 +03:00
Robert Haas ffccee4736 Fix typo in comment.
Amit Langote
2016-09-15 11:35:57 -04:00
Tom Lane 5472ed3e9b Make min_parallel_relation_size's default value platform-independent.
The documentation states that the default value is 8MB, but this was
only true at BLCKSZ = 8kB, because the default was hard-coded as 1024.
Make the code match the docs by computing the default as 8MB/BLCKSZ.

Oversight in commit 75be66464, noted pursuant to a gripe from Peter E.

Discussion: <90634e20-097a-e4fd-67d5-fb2c42f0dd71@2ndquadrant.com>
2016-09-15 11:23:25 -04:00
Heikki Linnakangas 593d4e47db Support OpenSSL 1.1.0.
Changes needed to build at all:

- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
  pgcrypto, to use the resource owner mechanism to ensure that we don't
  leak OpenSSL handles, now that we can't embed them in other structs
  anymore.
- RAND_SSLeay() -> RAND_OpenSSL()

Changes that were needed to silence deprecation warnings, but were not
strictly necessary:

- RAND_pseudo_bytes() -> RAND_bytes().
- SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
- ASN1_STRING_data() -> ASN1_STRING_get0_data()
- DH_generate_parameters() -> DH_generate_parameters()
- Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
  riddance!)

Also change references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
immemorial.

Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
Regenerate the test certificates with that. The "openssl" binary, used to
generate the certificates, is also now more picky, and throws an error
if an X509 extension is specified in "req_extensions", but that section
is empty.

Backpatch to all supported branches, per popular demand. In back-branches,
we still support OpenSSL 0.9.7 and above. OpenSSL 0.9.6 should still work
too, but I didn't test it. In master, we only support 0.9.8 and above.

Patch by Andreas Karlsson, with additional changes by me.

Discussion: <20160627151604.GD1051@msg.df7cb.de>
2016-09-15 14:42:29 +03:00
Heikki Linnakangas c99dd5bfed Fix and clarify comments on replacement selection.
These were modified by the patch to only use replacement selection for the
first run in an external sort.
2016-09-15 11:51:43 +03:00
Peter Eisentraut 656df624c0 Add overflow checks to money type input function
The money type input function did not have any overflow checks at all.
There were some regression tests that purported to check for overflow,
but they actually checked for the overflow behavior of the int8 type
before casting to money.  Remove those unnecessary checks and add some
that actually check the money input function.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2016-09-14 12:00:00 -05:00
Tom Lane 0dac5b5174 Tweak targetlist-SRF tests some more.
Seems like it would be good to have a test case documenting the
existing behavior for non-top-level SRFs.
2016-09-14 19:48:50 -04:00
Robert Haas 6415ba502b Improve code comment for GatherPath's single_copy flag.
Discussion: 5934.1472642782@sss.pgh.pa.us
2016-09-14 15:43:26 -04:00
Tom Lane a163c006ca Tweak targetlist-SRF tests.
Add a test case showing that we don't support SRFs in window-function
arguments.  Remove a duplicate test case for SRFs in aggregate arguments.
2016-09-14 14:30:40 -04:00
Tom Lane 55c3391d1e Be pickier about converting between Name and Datum.
We were misapplying NameGetDatum() to plain C strings in some places.
This worked, because it was just a pointer cast anyway, but it's a type
cheat in some sense.  Use CStringGetDatum instead, and modify the
NameGetDatum macro so it won't compile if applied to something that's
not a pointer to NameData.  This should result in no changes to
generated code, but it is logically cleaner.

Mark Dilger, tweaked a bit by me

Discussion: <EFD8AC94-4C1F-40C1-A5EA-304080089C1B@gmail.com>
2016-09-13 17:17:48 -04:00
Tom Lane fdc79e1909 Fix executor/README to reflect disallowing SRFs in UPDATE.
The parenthetical comment here is obsoleted by commit a4c35ea1c.
Noted by Andres Freund.
2016-09-13 14:25:35 -04:00
Tom Lane a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Robert Haas 445a38aba2 Have heapam.h include lockdefs.h rather than lock.h.
lockdefs.h was only split from lock.h relatively recently, and
represents a minimal subset of the old lock.h.  heapam.h only needs
that smaller subset, so adjust it to include only that.  This requires
some corresponding adjustments elsewhere.

Peter Geoghegan
2016-09-13 09:21:35 -04:00
Andres Freund 0dba54f166 Remove user_relns() SRF from regression tests.
The output of the function changes whenever previous (or, as in this
case, concurrent) tests leave a table in place. That causes unneeded
churn.

This should fix failures due to the tests added bfe16d1a5, like on
lapwing, caused by the tsrf test running concurrently with misc. Those
could also have been addressed by using temp tables, but that test has
annoyed me before.

Discussion: <27626.1473729905@sss.pgh.pa.us>
2016-09-12 19:37:16 -07:00
Andres Freund 9f478b4f19 Address portability issues in bfe16d1a5 test output. 2016-09-12 18:15:10 -07:00
Andres Freund bfe16d1a5d Add more tests for targetlist SRFs.
We're considering changing the implementation of targetlist SRFs
considerably, and a lot of the current behaviour isn't tested in our
regression tests. Thus it seems useful to increase coverage to avoid
accidental behaviour changes.

It's quite possible that some of the plans here will require adjustments
to avoid falling afoul of ordering differences (e.g. hashed group
bys). The buildfarm will tell us.

Reviewed-By: Tom Lane
Discussion: <20160827214829.zo2dfb5jaikii5nw@alap3.anarazel.de>
2016-09-12 17:27:47 -07:00
Peter Eisentraut 9083353b15 pg_basebackup: Clean created directories on failure
Like initdb, clean up created data and xlog directories, unless the new
-n/--noclean option is specified.

Tablespace directories are not cleaned up, but a message is written
about that.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
2016-09-12 12:00:00 -04:00
Kevin Grittner 63c1a87194 Fix recent commit for tab-completion of database template.
The details of commit 52803098ab were
based on a misunderstanding of the role inheritance allowing use
of a database for a template.  While the CREATEDB privilege is not
inherited, the database ownership is privileges are.

Pointed out by Vitaly Burovoy and Tom Lane.
Fix provided by Tom Lane, reviewed by Vitaly Burovoy.
2016-09-12 09:22:57 -05:00
Simon Riggs 4068eb9918 Fix copy/pasto in file identification
Daniel Gustafsson
2016-09-12 09:01:58 +01:00
Simon Riggs fc3d4a44e9 Identify walsenders in pg_stat_activity
Following 8299471c37 walsender procs are now visible in pg_stat_activity.
Set query to ‘walsender’ for walsender procs to allow them to be identified.

Discussion:CAB7nPqS8c76KPSufK_HSDeYrbtg+zZ7D0EEkjeM6txSEuCB_jA@mail.gmail.com

Michael Paquier, issue raised by Fujii Masao, reviewed by Tom Lane
2016-09-12 08:57:14 +01:00
Simon Riggs c3c0d7bd70 Raise max setting of checkpoint_timeout to 1d
Previously checkpoint_timeout was capped at 3600s
New max setting is 86400s = 24h = 1d

Discussion: 32558.1454471895@sss.pgh.pa.us
2016-09-11 23:26:18 +01:00
Kevin Grittner 52803098ab psql tab completion for CREATE DATABASE ... TEMPLATE ...
Sehrope Sarkuni, reviewed by Merlin Moncure & Vitaly Burovoy
with some editing by me
2016-09-11 15:37:27 -05:00
Tom Lane 40b449ae84 Allow CREATE EXTENSION to follow extension update paths.
Previously, to update an extension you had to produce both a version-update
script and a new base installation script.  It's become more and more
obvious that that's tedious, duplicative, and error-prone.  This patch
attempts to improve matters by allowing the new base installation script
to be omitted.  CREATE EXTENSION will install a requested version if it
can find a base script and a chain of update scripts that will get there.
As in the existing update logic, shorter chains are preferred if there's
more than one possibility, with an arbitrary tie-break rule for chains
of equal length.

Also adjust the pg_available_extension_versions view to show such versions
as installable.

While at it, refactor the code so that CASCADE processing works for
extensions requested during ApplyExtensionUpdates().  Without this,
addition of a new requirement in an updated extension would require
creating a new base script, even if there was no other reason to do that.
(It would be easy at this point to add a CASCADE option to ALTER EXTENSION
UPDATE, to allow the same thing to happen during a manually-commanded
version update, but I have not done that here.)

Tom Lane, reviewed by Andres Freund

Discussion: <20160905005919.jz2m2yh3und2dsuy@alap3.anarazel.de>
2016-09-11 14:15:07 -04:00
Tom Lane 28e5e5648c Fix and simplify MSVC build's handling of xml/xslt/uuid dependencies.
Solution.pm mistakenly believed that the xml option requires the xslt
option, when actually the dependency is the other way around; and it
believed that libxml requires libiconv, which is not necessarily so,
so we shouldn't enforce it here.  Fix the option cross-checking logic.

Also, since AddProject already takes care of adding libxml and libxslt
include and library dependencies to every project, there's no need
for the custom code that did that in mkvcbuild.  While at it, let's
handle the similar dependencies for uuid in a similar fashion.

Given the lack of field complaints about these overly strict build
dependency requirements, there seems no need for a back-patch.

Michael Paquier

Discussion: <CAB7nPqR0+gpu3mRQvFjf-V-bMxmiSJ6NpTg9_WzVDL+a31cV2g@mail.gmail.com>
2016-09-11 12:46:55 -04:00
Heikki Linnakangas 24598337c8 Implement binary heap replace-top operation in a smarter way.
In external sort's merge phase, we maintain a binary heap holding the next
tuple from each input tape. On each step, the topmost tuple is returned,
and replaced with the next tuple from the same tape. We were doing the
replacement by deleting the top node in one operation, and inserting the
next tuple after that. However, you can do a "replace-top" operation more
efficiently, in one "sift-up". A deletion will always walk the heap from
top to bottom, but in a replacement, we can stop as soon as we find the
right place for the new tuple. This is particularly helpful, if the tapes
are not in completely random order, so that the next tuple from a tape is
likely to land near the top of the heap.

Peter Geoghegan, reviewed by Claudio Freire, with some editing by me.

Discussion: <CAM3SWZRhBhiknTF_=NjDSnNZ11hx=U_SEYwbc5vd=x7M4mMiCw@mail.gmail.com>
2016-09-11 16:27:27 +03:00
Tom Lane f2717c79ee Improve unreachability recognition in elog() macro.
Some experimentation with an older version of gcc showed that it is able
to determine whether "if (elevel_ >= ERROR)" is compile-time constant
if elevel_ is declared "const", but otherwise not so much.  We had
accounted for that in ereport() but were too miserly with braces to
make it so in elog().  I don't know how many currently-interesting
compilers have the same quirk, but in case it will save some code
space, let's make sure that elog() is on the same footing as ereport()
for this purpose.

Back-patch to 9.3 where we introduced pg_unreachable() calls into
elog/ereport.
2016-09-10 17:54:23 -04:00
Tom Lane ddc8893179 Fix miserable coding in pg_stat_get_activity().
Commit dd1a3bccc replaced a test on whether a subroutine returned a
null pointer with a test on whether &pointer->backendStatus was null.
This accidentally failed to fail, at least on common compilers, because
backendStatus is the first field in the struct; but it was surely trouble
waiting to happen.  Commit f91feba87 then messed things up further,
changing the logic to

	local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
	if (!local_beentry)
		continue;
	beentry = &local_beentry->backendStatus;
	if (!beentry)
	{

where the second "if" is now dead code, so that the intended behavior of
printing a row with "<backend information not available>" cannot occur.

I suspect this is all moot because pgstat_fetch_stat_local_beentry
will never actually return null in this function's usage, but it's still
very poor coding.  Repair back to 9.4 where the original problem was
introduced.
2016-09-10 13:49:04 -04:00
Tom Lane 24992c6db9 Rewrite PageIndexDeleteNoCompact into a form that only deletes 1 tuple.
The full generality of deleting an arbitrary number of tuples is no longer
needed, so let's save some code and cycles by replacing the original coding
with an implementation based on PageIndexTupleDelete.

We can always get back the old code from git if we need it again for new
callers (though I don't care for its willingness to mess with line pointers
it wasn't told to mess with).

Discussion: <552.1473445163@sss.pgh.pa.us>
2016-09-09 19:00:59 -04:00
Tom Lane 1a4be103a5 Convert PageAddItem into a macro to save a few cycles.
Nowadays this is just a backwards-compatibility wrapper around
PageAddItemExtended, so let's avoid the extra level of function call.
In addition, because pretty much all callers are passing constants
for the two bool arguments, compilers will be able to constant-fold
the conversion to a flags bitmask.

Discussion: <552.1473445163@sss.pgh.pa.us>
2016-09-09 18:17:07 -04:00
Tom Lane b1328d78f8 Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use it.
PageIndexTupleOverwrite performs approximately the same function as
PageIndexTupleDelete (or PageIndexDeleteNoCompact) followed by PageAddItem
targeting the same item pointer offset.  But in the case where the new
tuple is the same size as the old, it avoids shuffling other data around on
the page, because the new tuple is placed where the old one was rather than
being appended to the end of the page.  This has been shown to provide a
substantial speedup for some GiST use-cases.

Also, this change allows some API simplifications: we can get rid of
the rather klugy and error-prone PAI_ALLOW_FAR_OFFSET flag for
PageAddItemExtended, since that was used only to cover a corner case
for BRIN that's better expressed by using PageIndexTupleOverwrite.

Note that this patch causes a rather subtle WAL incompatibility: the
physical page content change represented by certain WAL records is now
different than it was before, because while the tuples have the same
itempointer line numbers, the tuples themselves are in different places.
I have not bumped the WAL version number because I think it doesn't matter
unless you are trying to do bitwise comparisons of original and replayed
pages, and in any case we're early in a devel cycle and there will probably
be more WAL changes before v10 gets out the door.

There is probably room to make use of PageIndexTupleOverwrite in SP-GiST
and GIN too, but that is left for a future patch.

Andrey Borodin, reviewed by Anastasia Lubennikova, whacked around a bit
by me

Discussion: <CAJEAwVGQjGGOj6mMSgMwGvtFd5Kwe6VFAxY=uEPZWMDjzbn4VQ@mail.gmail.com>
2016-09-09 18:02:36 -04:00
Alvaro Herrera 5c609a742f Fix locking a tuple updated by an aborted (sub)transaction
When heap_lock_tuple decides to follow the update chain, it tried to
also lock any version of the tuple that was created by an update that
was subsequently rolled back.  This is pointless, since for all intents
and purposes that tuple exists no more; and moreover it causes
misbehavior, as reported independently by Marko Tiikkaja and Marti
Raudsepp: some SELECT FOR UPDATE/SHARE queries may fail to return
the tuples, and assertion-enabled builds crash.

Fix by having heap_lock_updated_tuple test the xmin and return success
immediately if the tuple was created by an aborted transaction.

The condition where tuples become invisible occurs when an updated tuple
chain is followed by heap_lock_updated_tuple, which reports the problem
as HeapTupleSelfUpdated to its caller heap_lock_tuple, which in turn
propagates that code outwards possibly leading the calling code
(ExecLockRows) to believe that the tuple exists no longer.

Backpatch to 9.3.  Only on 9.5 and newer this leads to a visible
failure, because of commit 27846f02c176; before that, heap_lock_tuple
skips the whole dance when the tuple is already locked by the same
transaction, because of the ancient HeapTupleSatisfiesUpdate behavior.
Still, the buggy condition may also exist in more convoluted scenarios
involving concurrent transactions, so it seems safer to fix the bug in
the old branches too.

Discussion:
	https://www.postgresql.org/message-id/CABRT9RC81YUf1=jsmWopcKJEro=VoeG2ou6sPwyOUTx_qteRsg@mail.gmail.com
	https://www.postgresql.org/message-id/48d3eade-98d3-8b9a-477e-1a8dc32a724d@joh.to
2016-09-09 15:54:29 -03:00
Tom Lane 984d0a14e8 In PageIndexTupleDelete, don't assume stored item lengths are MAXALIGNed.
PageAddItem stores the item length as-is.  It MAXALIGN's the amount of
space actually allocated for each tuple, but not the stored length.
PageRepairFragmentation, PageIndexMultiDelete, and PageIndexDeleteNoCompact
are all on board with this and MAXALIGN item lengths after fetching them.
But PageIndexTupleDelete expects the stored length to be a MAXALIGN
multiple already.  This accidentally works for existing index AMs because
they all maxalign their tuple sizes internally; but we don't do that for
heap tuples, and it shouldn't be a requirement for index tuples either.

So, sync PageIndexTupleDelete with the rest of bufpage.c by having it
maxalign the item size after fetching.

Also add a check that pd_special is maxaligned, to ensure that the test
"(offset + size) > phdr->pd_special" is still doing the right thing.
(If offset and pd_special are aligned, it doesn't matter whether size is.)
Again, this is in sync with the rest of the routines here, except for
PageAddItem which doesn't test because it doesn't actually do anything
for which pd_special alignment matters.

This shouldn't have any immediate functional impact; it just adds the
flexibility to use PageIndexTupleDelete on index tuples with non-aligned
lengths.

Discussion: <3814.1473366762@sss.pgh.pa.us>
2016-09-09 12:21:09 -04:00
Peter Eisentraut e0013deb59 Make better use of existing enums in plpgsql
plpgsql.h defines a number of enums, but most of the code passes them
around as ints.  Update structs and function prototypes to take enum
types instead.  This clarifies the struct definitions in plpgsql.h in
particular.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-09-09 12:00:00 -04:00
Tom Lane 967a7b0fc9 Avoid reporting "cache lookup failed" for some user-reachable cases.
We have a not-terribly-thoroughly-enforced-yet project policy that internal
errors with SQLSTATE XX000 (ie, plain elog) should not be triggerable from
SQL.  record_in, domain_in, and PL validator functions all failed to meet
this standard, because they threw plain elog("cache lookup failed for XXX")
errors on bad OIDs, and those are all invokable from SQL.

For record_in, the best fix is to upgrade typcache.c (lookup_type_cache)
to throw a user-facing error for this case.  That seems consistent because
it was more than halfway there already, having user-facing errors for shell
types and non-composite types.  Having done that, tweak domain_in to rely
on the typcache to throw an appropriate error.  (This costs little because
InitDomainConstraintRef would fetch the typcache entry anyway.)

For the PL validator functions, we already have a single choke point at
CheckFunctionValidatorAccess, so just fix its error to be user-facing.

Dilip Kumar, reviewed by Haribabu Kommi

Discussion: <87wpxfygg9.fsf@credativ.de>
2016-09-09 09:20:34 -04:00
Simon Riggs ec253de1fd Fix corruption of 2PC recovery with subxacts
Reading 2PC state files during recovery was borked, causing corruptions during
recovery. Effect limited to servers with 2PC, subtransactions and
recovery/replication.

Stas Kelvich, reviewed by Michael Paquier and Pavan Deolasee
2016-09-09 11:55:12 +01:00
Andres Freund 45e191e3aa Improve scalability of md.c for large relations.
So far md.c used a linked list of segments. That proved to be a problem
when processing large relations, because every smgr.c/md.c level access
to a page incurred walking through a linked list of all preceding
segments. Thus making accessing pages O(#segments).

Replace the linked list of segments hanging off SMgrRelationData with an
array of opened segments. That allows O(1) access to individual
segments, if they've previously been opened.

Discussion: <20140331101001.GE13135@alap3.anarazel.de>
Reviewed-By: Peter Geoghegan, Tom Lane (in an older version)
2016-09-08 17:18:46 -07:00
Andres Freund 417fefaf08 Faster PageIsVerified() for the all zeroes case.
That's primarily useful for testing very large relations, using sparse
files.

Discussion: <20140331101001.GE13135@alap3.anarazel.de>
Reviewed-By: Peter Geoghegan
2016-09-08 17:02:43 -07:00
Andres Freund 769fd9d8e0 Fix mdtruncate() to close fd.c handle of deleted segments.
mdtruncate() forgot to FileClose() a segment's mdfd_vfd, when deleting
it. That lead to a fd.c handle to a truncated file being kept open until
backend exit.

The issue appears to have been introduced way back in 1a5c450f30,
before that the handle was closed inside FileUnlink().

The impact of this bug is limited - only VACUUM and ON COMMIT TRUNCATE
for temporary tables, truncate files in place (i.e. TRUNCATE itself is
not affected), and the relation has to be bigger than 1GB. The
consequences of a leaked fd.c handle aren't severe either.

Discussion: <20160908220748.oqh37ukwqqncbl3n@alap3.anarazel.de>
Backpatch: all supported releases
2016-09-08 16:51:09 -07:00
Alvaro Herrera 19acee8c5a Fix two src/test/modules Makefiles
commit_ts and test_pg_dump were declaring targets before including the
PGXS stanza, which meant that the "all" target customarily defined as
the first (and therefore default target) was not the default anymore.
Fix that by moving those target definitions to after PGXS.

commit_ts was initially good, but I broke it in commit 9def031bd2;
test_pg_dump was born broken, probably copying from commit_ts' mistake.

In passing, fix a comment mistake in test_pg_dump/Makefile.

Backpatch to 9.6.

Noted by Tom Lane.
2016-09-08 14:39:05 -03:00
Tom Lane df5d9bb8d5 Allow pg_dump to dump non-extension members of an extension-owned schema.
Previously, if a schema was created by an extension, a normal pg_dump run
(not --binary-upgrade) would summarily skip every object in that schema.
In a case where an extension creates a schema and then users create other
objects within that schema, this does the wrong thing: we want pg_dump
to skip the schema but still create the non-extension-owned objects.

There's no easy way to fix this pre-9.6, because in earlier versions the
"dump" status for a schema is just a bool and there's no way to distinguish
"dump me" from "dump my members".  However, as of 9.6 we do have enough
state to represent that, so this is a simple correction of the logic in
selectDumpableNamespace.

In passing, make some cosmetic fixes in nearby code.

Martín Marqués, reviewed by Michael Paquier

Discussion: <99581032-71de-6466-c325-069861f1947d@2ndquadrant.com>
2016-09-08 13:12:01 -04:00
Tom Lane e97e9c57bd Don't print database's tablespace in pg_dump -C --no-tablespaces output.
If the database has a non-default tablespace, we emitted a TABLESPACE
clause in the CREATE DATABASE command emitted by -C, even if
--no-tablespaces was also specified.  This seems wrong, and it's
inconsistent with what pg_dumpall does, so change it.  Per bug #14315
from Danylo Hlynskyi.

Back-patch to 9.5.  The bug is much older, but it'd be a more invasive
change before 9.5 because dumpDatabase() hasn't got an easy way to get
to the outputNoTablespaces flag.  Doesn't seem worth the work given
the lack of previous complaints.

Report: <20160908081953.1402.75347@wrigleys.postgresql.org>
2016-09-08 10:48:03 -04:00
Simon Riggs 67c6bd1ca3 Fix minor memory leak in Standby startup
StandbyRecoverPreparedTransactions() leaked the buffer
used for two phase state file. This was leaked once
at startup and at every shutdown checkpoint seen.

Backpatch to 9.6

Stas Kelvich
2016-09-08 10:32:58 +01:00
Noah Misch d299eb41df MSVC: Pass any user-set MSBFLAGS to MSBuild and VCBUILD.
This is particularly useful to pass /m, to perform a parallel build.

Christian Ullrich, reviewed by Michael Paquier.
2016-09-08 01:42:09 -04:00
Noah Misch 976a9bbd02 MSVC: Place gendef.pl temporary file in the target directory.
Until now, it used the current working directory.  This makes it safe
for simultaneous invocations of gendef.pl, with different target
directories, to run from a single current working directory, such as
$(top_srcdir).  The MSVC build system will soon rely on this.

Christian Ullrich, reviewed by Michael Paquier.
2016-09-08 01:40:53 -04:00
Tom Lane 0ab9c56d0f Support renaming an existing value of an enum type.
Not much to be said about this patch: it does what it says on the tin.

In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists
to clarify what it actually does.  In the discussion of this patch
we considered supporting other similar options, such as IF EXISTS
on the type as a whole or IF NOT EXISTS on the target name.  This
patch doesn't actually add any such feature, but it might happen later.

Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli

Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>
2016-09-07 16:11:56 -04:00
Tom Lane 4f405c8ef4 Add a HINT for client-vs-server COPY failure cases.
Users often get confused between COPY and \copy and try to use client-side
paths with COPY.  The server then cannot find the file (if remote), or sees
a permissions problem (if local), or some variant of that.  Emit a hint
about this in the most common cases.

In future we might want to expand the set of errnos for which the hint
gets printed, but be conservative for now.

Craig Ringer, reviewed by Christoph Berg and Tom Lane

Discussion: <CAMsr+YEqtD97qPEzQDqrCt5QiqPbWP_X4hmvy2pQzWC0GWiyPA@mail.gmail.com>
2016-09-06 23:55:55 -04:00
Peter Eisentraut 49eb0fd097 Add location field to DefElem
Add a location field to the DefElem struct, used to parse many utility
commands.  Update various error messages to supply error position
information.

To propogate the error position information in a more systematic way,
create a ParseState in standard_ProcessUtility() and pass that to
interested functions implementing the utility commands.  This seems
better than passing the query string and then reassembling a parse state
ad hoc, which violates the encapsulation of the ParseState type.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-09-06 12:00:00 -04:00
Tom Lane f032722f86 Guard against possible memory allocation botch in batchmemtuples().
Negative availMemLessRefund would be problematic.  It's not entirely
clear whether the case can be hit in the code as it stands, but this
seems like good future-proofing in any case.  While we're at it,
insist that the value be not merely positive but not tiny, so as to
avoid doing a lot of repalloc work for little gain.

Peter Geoghegan

Discussion: <CAM3SWZRVkuUB68DbAkgw=532gW0f+fofKueAMsY7hVYi68MuYQ@mail.gmail.com>
2016-09-06 15:50:31 -04:00
Tom Lane cdc70597c9 Teach appendShellString() to not quote strings containing "-".
Brain fade in commit a00c58314: I was thinking that a string starting with
"-" could be taken as a switch depending on command line syntax.  That's
true, but having appendShellString() quote it will not help, so we may as
well not do so.  Per complaint from Peter Eisentraut.
2016-09-06 14:53:31 -04:00
Tom Lane a2ee579b6d Repair whitespace in initdb message.
What used to be four spaces somehow turned into a tab and a couple of
spaces in commit a00c58314, no doubt from overhelpful emacs autoindent.
Noted by Peter Eisentraut.
2016-09-06 13:26:43 -04:00
Simon Riggs dcb12ce8d8 Fix VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL
lazy_truncate_heap() was waiting for
VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL, but in microseconds
not milliseconds as originally intended.

Found by code inspection.

Simon Riggs
2016-09-06 15:35:47 +01:00
Bruce Momjian 67e1e2aaff C comment: fix file name mention on line 1
Author: Amit Langote
2016-09-06 00:03:55 -04:00
Tom Lane 25794e841e Cosmetic code cleanup in commands/extension.c.
Some of the comments added by the CREATE EXTENSION CASCADE patch were
a bit sloppy, and I didn't care for redeclaring the same local variable
inside a nested block either.  No functional changes.
2016-09-05 18:53:33 -04:00
Tom Lane c54159d44c Make locale-dependent regex character classes work for large char codes.
Previously, we failed to recognize Unicode characters above U+7FF as
being members of locale-dependent character classes such as [[:alpha:]].
(Actually, the same problem occurs for large pg_wchar values in any
multibyte encoding, but UTF8 is the only case people have actually
complained about.)  It's impractical to get Spencer's original code to
handle character classes or ranges containing many thousands of characters,
because it insists on considering each member character individually at
regex compile time, whether or not the character will ever be of interest
at run time.  To fix, choose a cutoff point MAX_SIMPLE_CHR below which
we process characters individually as before, and deal with entire ranges
or classes as single entities above that.  We can actually make things
cheaper than before for chars below the cutoff, because the color map can
now be a simple linear array for those chars, rather than the multilevel
tree structure Spencer designed.  It's more expensive than before for
chars above the cutoff, because we must do a binary search in a list of
high chars and char ranges used in the regex pattern, plus call iswalpha()
and friends for each locale-dependent character class used in the pattern.
However, multibyte encodings are normally designed to give smaller codes
to popular characters, so that we can expect that the slow path will be
taken relatively infrequently.  In any case, the speed penalty appears
minor except when we have to apply iswalpha() etc. to high character codes
at runtime --- and the previous coding gave wrong answers for those cases,
so whether it was faster is moot.

Tom Lane, reviewed by Heikki Linnakangas

Discussion: <15563.1471913698@sss.pgh.pa.us>
2016-09-05 17:06:29 -04:00
Bruce Momjian f80049f76a C comment: align dashes in GroupState node header
Author: Jim Nasby
2016-09-05 13:09:54 -04:00
Tom Lane 15bc038f9b Relax transactional restrictions on ALTER TYPE ... ADD VALUE.
To prevent possibly breaking indexes on enum columns, we must keep
uncommitted enum values from getting stored in tables, unless we
can be sure that any such column is new in the current transaction.

Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE
from being executed at all in a transaction block, unless the target
enum type had been created in the current transaction.  This patch
removes that restriction, and instead insists that an uncommitted enum
value can't be referenced unless it belongs to an enum type created
in the same transaction as the value.  Per discussion, this should be
a bit less onerous.  It does require each function that could possibly
return a new enum value to SQL operations to check this restriction,
but there aren't so many of those that this seems unmaintainable.

Andrew Dunstan and Tom Lane

Discussion: <4075.1459088427@sss.pgh.pa.us>
2016-09-05 12:59:55 -04:00
Simon Riggs 016abf1fb8 Add debug check function LWLockHeldByMeInMode()
Tests whether my process holds a lock in given mode.
Add initial usage in MarkBufferDirty().

Thomas Munro
2016-09-05 10:38:08 +01:00
Simon Riggs d851bef2d6 Dirty replication slots when using sql interface
When pg_logical_slot_get_changes(...) sets confirmed_flush_lsn to the point at
which replay stopped, it doesn't dirty the replication slot.  So if the replay
didn't cause restart_lsn or catalog_xmin to change as well, this change will
not get written out to disk. Even on a clean shutdown.

If Pg crashes or restarts, a subsequent pg_logical_slot_get_changes(...) call
will see the same changes already replayed since it uses the slot's
confirmed_flush_lsn as the start point for fetching changes. The caller can't
specify a start LSN when using the SQL interface.

Mark the slot as dirty after reading changes using the SQL interface so that
users won't see repeated changes after a clean shutdown. Repeated changes still
occur when using the walsender interface or after an unclean shutdown.

Craig Ringer
2016-09-05 09:44:38 +01:00
Tom Lane b6182081be Remove duplicate code from ReorderBufferCleanupTXN().
Andres is apparently the only hacker who thinks this code is better as-is.
I (tgl) follow some of his logic, but the fact that it's setting off
warnings from static code analyzers seems like a sufficient reason to
put the complexity into a comment rather than the code.

Aleksander Alekseev

Discussion: <20160404190345.54d84ee8@fujitsu>
2016-09-04 20:49:44 -04:00
Tom Lane c7f68bea22 Add regression test coverage for non-default timezone abbreviation sets.
After further reflection about the mess cleaned up in commit 39b691f25,
I decided the main bit of test coverage that was still missing was to
check that the non-default abbreviation-set files we supply are usable.
Add that.

Back-patch to supported branches, just because it seems like a good
idea to keep this all in sync.
2016-09-04 20:02:16 -04:00
Tom Lane da6ea70c32 Remove vestigial references to "zic" in favor of "IANA database".
Commit b2cbced9e instituted a policy of referring to the timezone database
as the "IANA timezone database" in our user-facing documentation.
Propagate that wording into a couple of places that were still using "zic"
to refer to the database, which is definitely not right (zic is the
compilation tool, not the data).

Back-patch, not because this is very important in itself, but because
we routinely cherry-pick updates to the tznames files and I don't want
to risk future merge failures.
2016-09-04 19:42:08 -04:00
Tom Lane a2d75b67bc Remove useless pg_strdup() operations.
split_to_stringlist() doesn't modify its first argument nor expect it
to remain valid after exit, so there's no need to duplicate the optarg
string at the call sites.  Per Coverity.  (This has been wrong all along,
but commit 052cc223d changed the useless calls from "strdup" to
"pg_strdup", which apparently made Coverity think it's a new bug.
It's not, but it's also not worth back-patching.)
2016-09-04 12:33:58 -04:00
Heikki Linnakangas e21db14b8a Clarify the new Red-Black post-order traversal code a bit.
Coverity complained about the for(;;) loop, because it never actually
iterated. It was used just to be able to use "break" to exit it early. I
agree with Coverity, that's a bit confusing, so refactor the code to
use if-else instead.

While we're at it, use a local variable to hold the "current" node. That's
shorter and clearer than referring to "iter->last_visited" all the time.
2016-09-04 15:02:06 +03:00
Tom Lane 6591f4226c Improve readability of the output of psql's \timing command.
In addition to the existing decimal-milliseconds output value,
display the same value in mm:ss.fff format if it exceeds one second.
Tack on hours and even days fields if the interval is large enough.
This avoids needing mental arithmetic to convert the values into
customary time units.

Corey Huinker, reviewed by Gerdan Santos; bikeshedding by many

Discussion: <CADkLM=dbC4R8sbbuFXQVBFWoJGQkTEW8RWnC0PbW9nZsovZpJQ@mail.gmail.com>
2016-09-03 15:29:03 -04:00
Tom Lane 600dc4c0da Fix multiple bugs in numeric_poly_deserialize().
These were evidently introduced by yesterday's commit 9cca11c91,
which perhaps needs more review than it got.

Per report from Andreas Seltenreich and additional examination
of nearby code.

Report: <87oa45qfwq.fsf@credativ.de>
2016-09-03 14:18:55 -04:00
Tom Lane 60893786d5 Fix corrupt GIN_SEGMENT_ADDITEMS WAL records on big-endian hardware.
computeLeafRecompressWALData() tried to produce a uint16 WAL log field by
memcpy'ing the first two bytes of an int-sized variable.  That accidentally
works on little-endian hardware, but not at all on big-endian.  Replay then
thinks it's looking at an ADDITEMS action with zero entries, and reads the
first two bytes of the first TID therein as the next segno/action,
typically leading to "unexpected GIN leaf action" errors during replay.
Even if replay failed to crash, the resulting GIN index page would surely
be incorrect.  To fix, just declare the variable as uint16 instead.

Per bug #14295 from Spencer Thomason (much thanks to Spencer for turning
his problem into a self-contained test case).  This likely also explains
a previous report of the same symptom from Bernd Helmle.

Back-patch to 9.4 where the problem was introduced (by commit 14d02f0bb).

Discussion: <20160826072658.15676.7628@wrigleys.postgresql.org>
Possible-Report: <2DA7350F7296B2A142272901@eje.land.credativ.lan>
2016-09-03 13:28:53 -04:00
Simon Riggs 35250b6ad7 New recovery target recovery_target_lsn
Michael Paquier
2016-09-03 17:48:01 +01:00
Tom Lane 39b691f251 Don't require dynamic timezone abbreviations to match underlying time zone.
Previously, we threw an error if a dynamic timezone abbreviation did not
match any abbreviation recorded in the referenced IANA time zone entry.
That seemed like a good consistency check at the time, but it turns out
that a number of the abbreviations in the IANA database are things that
Olson and crew made up out of whole cloth.  Their current policy is to
remove such names in favor of using simple numeric offsets.  Perhaps
unsurprisingly, a lot of these made-up abbreviations have varied in meaning
over time, which meant that our commit b2cbced9e and later changes made
them into dynamic abbreviations.  So with newer IANA database versions
that don't mention these abbreviations at all, we fail, as reported in bug
#14307 from Neil Anderson.  It's worse than just a few unused-in-the-wild
abbreviations not working, because the pg_timezone_abbrevs view stops
working altogether (since its underlying function tries to compute the
whole view result in one call).

We considered deleting these abbreviations from our abbreviations list, but
the problem with that is that we can't stay ahead of possible future IANA
changes.  Instead, let's leave the abbreviations list alone, and treat any
"orphaned" dynamic abbreviation as just meaning the referenced time zone.
It will behave a bit differently than it used to, in that you can't any
longer override the zone's standard vs. daylight rule by using the "wrong"
abbreviation of a pair, but that's better than failing entirely.  (Also,
this solution can be interpreted as adding a small new feature, which is
that any abbreviation a user wants can be defined as referencing a time
zone name.)

Back-patch to all supported branches, since this problem affects all
of them when using tzdata 2016f or newer.

Report: <20160902031551.15674.67337@wrigleys.postgresql.org>
Discussion: <6189.1472820913@sss.pgh.pa.us>
2016-09-02 17:30:02 -04:00
Heikki Linnakangas ec136d19b2 Move code shared between libpq and backend from backend/libpq/ to common/.
When building libpq, ip.c and md5.c were symlinked or copied from
src/backend/libpq into src/interfaces/libpq, but now that we have a
directory specifically for routines that are shared between the server and
client binaries, src/common/, move them there.

Some routines in ip.c were only used in the backend. Keep those in
src/backend/libpq, but rename to ifaddr.c to avoid confusion with the file
that's now in common.

Fix the comment in src/common/Makefile to reflect how libpq actually links
those files.

There are two more files that libpq symlinks directly from src/backend:
encnames.c and wchar.c. I don't feel compelled to move those right now,
though.

Patch by Michael Paquier, with some changes by me.

Discussion: <69938195-9c76-8523-0af8-eb718ea5b36e@iki.fi>
2016-09-02 13:49:59 +03:00
Heikki Linnakangas 9cca11c915 Speed up SUM calculation in numeric aggregates.
This introduces a numeric sum accumulator, which performs better than
repeatedly calling add_var(). The performance comes from using wider digits
and delaying carry propagation, tallying positive and negative values
separately, and avoiding a round of palloc/pfree on every value. This
speeds up SUM(), as well as other standard aggregates like AVG() and
STDDEV() that also calculate a sum internally.

Reviewed-by: Andrey Borodin
Discussion: <c0545351-a467-5b76-6d46-4840d1ea8aa4@iki.fi>
2016-09-02 11:51:49 +03:00
Heikki Linnakangas 9f85784cae Support multiple iterators in the Red-Black Tree implementation.
While we don't need multiple iterators at the moment, the interface is
nicer and less dangerous this way.

Aleksander Alekseev, with some changes by me.
2016-09-02 08:39:39 +03:00
Kevin Grittner 76f9dd4fa8 Improve tab completion for BEGIN & START|SET TRANSACTION.
Andreas Karlsson with minor change by me for SET TRANSACTION
SNAPSHOT.
2016-09-01 16:10:30 -05:00
Tom Lane 6c03d981a6 Change API of ShmemAlloc() so it throws error rather than returning NULL.
A majority of callers seem to have believed that this was the API spec
already, because they omitted any check for a NULL result, and hence
would crash on an out-of-shared-memory failure.  The original proposal
was to just add such error checks everywhere, but that does nothing to
prevent similar omissions in future.  Instead, let's make ShmemAlloc()
throw the error (so we can remove the caller-side checks that do exist),
and introduce a new function ShmemAllocNoError() that has the previous
behavior of returning NULL, for the small number of callers that need
that and are prepared to do the right thing.  This also lets us remove
the rather wishy-washy behavior of printing a WARNING for out-of-shmem,
which never made much sense: either the caller has a strategy for
dealing with that, or it doesn't.  It's not ShmemAlloc's business to
decide whether a warning is appropriate.

The v10 release notes will need to call this out as a significant
source-code change.  It's likely that it will be a bug fix for
extension callers too, but if not, they'll need to change to using
ShmemAllocNoError().

This is nominally a bug fix, but the odds that it's fixing any live
bug are actually rather small, because in general the requests
being made by the unchecked callers were already accounted for in
determining the overall shmem size, so really they ought not fail.
Between that and the possible impact on extensions, no back-patch.

Discussion: <24843.1472563085@sss.pgh.pa.us>
2016-09-01 10:13:55 -04:00
Tom Lane 6f7c0ea32f Improve memory management for PL/Perl functions.
Unlike PL/Tcl, PL/Perl at least made an attempt to clean up after itself
when a function gets redefined.  But it was still using TopMemoryContext
for the fn_mcxt of argument/result I/O functions, resulting in the
potential for memory leaks depending on what those functions did, and the
retail alloc/free logic was pretty bulky as well.  Fix things to use a
per-function memory context like the other PLs now do.  Tweak a couple of
places where things were being done in a not-very-safe order (on the
principle that a memory leak is better than leaving global state
inconsistent after an error).  Also make some minor cosmetic adjustments,
mostly in field names, to make the code look similar to the way PL/Tcl does
now wherever it's essentially the same logic.

Michael Paquier and Tom Lane

Discussion: <CAB7nPqSOyAsHC6jL24J1B+oK3p=yyNoFU0Vs_B6fd2kdd5g5WQ@mail.gmail.com>
2016-08-31 19:54:58 -04:00
Tom Lane d062245b5b Improve memory management for PL/Tcl functions.
Formerly, the memory used to represent a PL/Tcl function was allocated with
malloc() or in TopMemoryContext, and we'd leak it all if the function got
redefined during the session.  Instead, create a per-function context and
keep everything in or under that context.  Add a reference-counting
mechanism (like the one plpgsql has long had) so that we can safely clean
up an old function definition, either immediately if it's not being
executed or at the end of the outermost execution.

Currently, we only detect that a cached function is obsolete when we next
attempt to call that function.  So this covers the updated-definition case
but leaves cruft around after DROP FUNCTION.  It's not clear whether it's
worth installing a syscache invalidation callback to watch for drops;
none of the other PLs do, so for now we won't do it here either.

Michael Paquier and Tom Lane

Discussion: <CAB7nPqSOyAsHC6jL24J1B+oK3p=yyNoFU0Vs_B6fd2kdd5g5WQ@mail.gmail.com>
2016-08-31 17:27:09 -04:00
Tom Lane 65a588b4c3 Try to fix portability issue in enum renumbering (again).
The hack embodied in commit 4ba61a487 no longer works after today's change
to allow DatumGetFloat4/Float4GetDatum to be inlined (commit 14cca1bf8).
Probably what's happening is that the faulty compilers are deciding that
the now-inlined assignment is a no-op and so they're not required to
round to float4 width.

We had a bunch of similar issues earlier this year in the degree-based
trig functions, and eventually settled on using volatile intermediate
variables as the least ugly method of forcing recalcitrant compilers
to do what the C standard says (cf commit 82311bcdd).  Let's see if
that method works here.

Discussion: <4640.1472664476@sss.pgh.pa.us>
2016-08-31 13:58:01 -04:00
Tom Lane 679226337a Remove no-longer-useful SSL-specific Port.count field.
Since we removed SSL renegotiation, there's no longer any reason to
keep track of the amount of data transferred over the link.

Daniel Gustafsson

Discussion: <FEA7F89C-ECDF-4799-B789-2F8DDCBA467F@yesql.se>
2016-08-31 09:24:19 -04:00
Heikki Linnakangas 14cca1bf8e Use static inline functions for float <-> Datum conversions.
Now that we are OK with using static inline functions, we can use them
to avoid function call overhead of pass-by-val versions of Float4GetDatum,
DatumGetFloat8, and Float8GetDatum. Those functions are only a few CPU
instructions long, but they could not be written into macros previously,
because we need a local union variable for the conversion.

I kept the pass-by-ref versions as regular functions. They are very simple
too, but they call palloc() anyway, so shaving a few instructions from the
function call doesn't seem so important there.

Discussion: <dbb82a4a-2c15-ba27-dd0a-009d2aa72b77@iki.fi>
2016-08-31 16:00:28 +03:00
Tom Lane 0e0f43d6fd Prevent starting a standalone backend with standby_mode on.
This can't really work because standby_mode expects there to be more
WAL arriving, which there will not ever be because there's no WAL
receiver process to fetch it.  Moreover, if standby_mode is on then
hot standby might also be turned on, causing even more strangeness
because that expects read-only sessions to be executing in parallel.
Bernd Helmle reported a case where btree_xlog_delete_get_latestRemovedXid
got confused, but rather than band-aiding individual problems it seems
best to prevent getting anywhere near this state in the first place.
Back-patch to all supported branches.

In passing, also fix some omissions of errcodes in other ereport's in
readRecoveryCommandFile().

Michael Paquier (errcode hacking by me)

Discussion: <00F0B2CEF6D0CEF8A90119D4@eje.credativ.lan>
2016-08-31 08:52:13 -04:00
Robert Haas 530fb68e0f Update comments to reflect code rearrangement.
Commit f9143d102f falsified these.

KaiGai Kohei
2016-08-31 12:36:18 +05:30
Tom Lane 052cc223d5 Fix a bunch of places that called malloc and friends with no NULL check.
Where possible, use palloc or pg_malloc instead; otherwise, insert
explicit NULL checks.

Generally speaking, these are places where an actual OOM is quite
unlikely, either because they're in client programs that don't
allocate all that much, or they're very early in process startup
so that we'd likely have had a fork() failure instead.  Hence,
no back-patch, even though this is nominally a bug fix.

Michael Paquier, with some adjustments by me

Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-08-30 18:22:43 -04:00
Tom Lane 9daec77e16 Simplify correct use of simple_prompt().
The previous API for this function had it returning a malloc'd string.
That meant that callers had to check for NULL return, which few of them
were doing, and it also meant that callers had to remember to free()
the string later, which required extra logic in most cases.

Instead, make simple_prompt() write into a buffer supplied by the caller.
Anywhere that the maximum required input length is reasonably small,
which is almost all of the callers, we can just use a local or static
array as the buffer instead of dealing with malloc/free.

A fair number of callers used "pointer == NULL" as a proxy for "haven't
requested the password yet".  Maintaining the same behavior requires
adding a separate boolean flag for that, which adds back some of the
complexity we save by removing free()s.  Nonetheless, this nets out
at a small reduction in overall code size, and considerably less code
than we would have had if we'd added the missing NULL-return checks
everywhere they were needed.

In passing, clean up the API comment for simple_prompt() and get rid
of a very-unnecessary malloc/free in its Windows code path.

This is nominally a bug fix, but it does not seem worth back-patching,
because the actual risk of an OOM failure in any of these places seems
pretty tiny, and all of them are client-side not server-side anyway.

This patch is by me, but it owes a great deal to Michael Paquier
who identified the problem and drafted a patch for fixing it the
other way.

Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-08-30 17:02:02 -04:00
Tom Lane 37f6fd1eaa Fix initdb misbehavior when user mis-enters superuser password.
While testing simple_prompt() revisions, I happened to notice that
current initdb behaves rather badly when --pwprompt is specified and
the user miskeys the second password.  It complains about the mismatch,
does "rm -rf" on the data directory, and exits.  The problem is that
since commit c4a8812cf, there's a standalone backend sitting waiting
for commands at that point.  It gets unhappy about its datadir having
gone away, and spews a PANIC message at the user, which is not nice.
(And the shell then adds to the mess with meaningless bleating about a
core dump...)  We don't really want that sort of thing to happen unless
there's an internal failure in initdb, which this surely is not.

The best fix seems to be to move the collection of the password
earlier, so that it's done essentially as part of argument collection,
rather than at the rather ad-hoc time it was done before.

Back-patch to 9.6 where the problem was introduced.
2016-08-30 15:25:01 -04:00
Alvaro Herrera 8e1e3f958f Split hash.h → hash_xlog.h
Since the hash AM is going to be revamped to have WAL, this is a good
opportunity to clean up the include file a little bit to avoid including
a lot of extra stuff in the future.

Author: Amit Kapila
2016-08-29 18:55:49 -03:00
Heikki Linnakangas 9b7cd59af1 Remove support for OpenSSL versions older than 0.9.8.
OpenSSL officially only supports 1.0.1 and newer. Some OS distributions
still provide patches for 0.9.8, but anything older than that is not
interesting anymore. Let's simplify things by removing compatibility code.

Andreas Karlsson, with small changes by me.
2016-08-29 20:16:02 +03:00
Tom Lane cf34fdbbe1 Make AllocSetContextCreate throw an error for bad context-size parameters.
The previous behavior was to silently change them to something valid.
That obscured the bugs fixed in commit ea268cdc9, and generally seems
less useful than complaining.  Unlike the previous commit, though,
we'll do this in HEAD only --- it's a bit too late to be possibly
breaking third-party code in 9.6.

Discussion: <CA+TgmobNcELVd3QmLD3tx=w7+CokRQiC4_U0txjz=WHpfdkU=w@mail.gmail.com>
2016-08-29 09:29:26 -04:00
Simon Riggs 49340627f9 Fix pg_receivexlog --synchronous
Make pg_receivexlog work correctly with --synchronous without slots

Backpatch to 9.5

Gabriele Bartolini, reviewed by Michael Paquier and Simon Riggs
2016-08-29 12:16:18 +01:00
Fujii Masao bd082231ed Fix typos in comments. 2016-08-29 16:06:40 +09:00
Fujii Masao bab7823a49 Fix pg_xlogdump so that it handles cross-page XLP_FIRST_IS_CONTRECORD record.
Previously pg_xlogdump failed to dump the contents of the WAL file
if the file starts with the continuation WAL record which spans
more than one pages. Since pg_xlogdump assumed that the continuation
record always fits on a page, it could not find the valid WAL record to
start reading from in that case.

This patch changes pg_xlogdump so that it can handle a continuation
WAL record which crosses a page boundary and find the valid record
to start reading from.

Back-patch to 9.3 where pg_xlogdump was introduced.

Author: Pavan Deolasee
Reviewed-By: Michael Paquier and Craig Ringer
Discussion: CABOikdPsPByMiG6J01DKq6om2+BNkxHTPkOyqHM2a4oYwGKsqQ@mail.gmail.com
2016-08-29 14:34:58 +09:00
Tom Lane b899ccbb49 Fix stray reference to the old genbki.sh script.
Per Tomas Vondra.
2016-08-28 17:44:29 -04:00
Tom Lane ea268cdc9a Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters.  While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea.  Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.

While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.

In passing, change TopMemoryContext to use the default allocation
parameters.  Formerly it could only be extended 8K at a time.  That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there.  There seems no good reason
not to let it use growing blocks like most other contexts.

Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain.  The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.

Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 17:50:38 -04:00
Tom Lane 26fa446da6 Add a nonlocalized version of the severity field to client error messages.
This has been requested a few times, but the use-case for it was never
entirely clear.  The reason for adding it now is that transmission of
error reports from parallel workers fails when NLS is active, because
pq_parse_errornotice() wrongly assumes that the existing severity field
is nonlocalized.  There are other ways we could have fixed that, but the
other options were basically kluges, whereas this way provides something
that's at least arguably a useful feature along with the bug fix.

Per report from Jakob Egger.  Back-patch into 9.6, because otherwise
parallel query is essentially unusable in non-English locales.  The
problem exists in 9.5 as well, but we don't want to risk changing
on-the-wire behavior in 9.5 (even though the possibility of new error
fields is specifically called out in the protocol document).  It may
be sufficient to leave the issue unfixed in 9.5, given the very limited
usefulness of pq_parse_errornotice in that version.

Discussion: <A88E0006-13CB-49C6-95CC-1A77D717213C@eggerapps.at>
2016-08-26 16:20:17 -04:00
Tom Lane 78dcd027e8 Fix potential memory leakage from HandleParallelMessages().
HandleParallelMessages leaked memory into the caller's context.  Since it's
called from ProcessInterrupts, there is basically zero certainty as to what
CurrentMemoryContext is, which means we could be leaking into long-lived
contexts.  Over the processing of many worker messages that would grow to
be a problem.  Things could be even worse than just a leak, if we happened
to service the interrupt while ErrorContext is current: elog.c thinks it
can reset that on its own whim, possibly yanking storage out from under
HandleParallelMessages.

Give HandleParallelMessages its own dedicated context instead, which we can
reset during each call to ensure there's no accumulation of wasted memory.

Discussion: <16610.1472222135@sss.pgh.pa.us>
2016-08-26 15:04:05 -04:00
Tom Lane 45a36e6853 Put static forward declarations in elog.c back into same order as code.
The guiding principle for the last few patches in this area apparently
involved throwing darts.

Cosmetic only, but back-patch to 9.6 because there is no reason for
9.6 and HEAD to diverge yet in this file.
2016-08-26 14:19:03 -04:00
Tom Lane 8529036b53 Fix assorted small bugs in ThrowErrorData().
Copy the palloc'd strings into the correct context, ie ErrorContext
not wherever the source ErrorData is.  This would be a large bug,
except that it appears that all catchers of thrown errors do either
EmitErrorReport or CopyErrorData before doing anything that would
cause transient memory contexts to be cleaned up.  Still, it's wrong
and it will bite somebody someday.

Fix failure to copy cursorpos and internalpos.

Utter the appropriate incantations involving recursion_depth, so that
we'll behave sanely if we get an error inside pstrdup.  (In general,
the body of this function ought to act like, eg, errdetail().)

Per code reading induced by Jakob Egger's report.
2016-08-26 14:15:47 -04:00
Tom Lane fbf28b6b52 Fix logic for adding "parallel worker" context line to worker errors.
The previous coding here was capable of adding a "parallel worker" context
line to errors that were not, in fact, returned from a parallel worker.
Instead of using an errcontext callback to add that annotation, just paste
it onto the message by hand; this looks uglier but is more reliable.

Discussion: <19757.1472151987@sss.pgh.pa.us>
2016-08-26 10:07:28 -04:00
Tom Lane 2533ff0aa5 Fix instability in parallel regression tests.
Commit f0c7b789a added a test case in case.sql that creates and then drops
both an '=' operator and the type it's for.  Given the right timing, that
can cause a "cache lookup failed for type" failure in concurrent sessions,
which see the '=' operator as a potential match for '=' in a query, but
then the type is gone by the time they inquire into its properties.
It might be nice to make that behavior more robust someday, but as a
back-patchable solution, adjust the new test case so that the operator
is never visible to other sessions.  Like the previous commit, back-patch
to all supported branches.

Discussion: <5983.1471371667@sss.pgh.pa.us>
2016-08-25 09:57:09 -04:00
Tom Lane ae4760d667 Fix small query-lifespan memory leak in bulk updates.
When there is an identifiable REPLICA IDENTITY index on the target table,
heap_update leaks the id_attrs bitmapset.  That's not many bytes, but it
adds up over enough rows, since the code typically runs in a query-lifespan
context.  Bug introduced in commit e55704d8b, which did a rather poor job
of cloning the existing use-pattern for RelationGetIndexAttrBitmap().

Per bug #14293 from Zhou Digoal.  Back-patch to 9.4 where the bug was
introduced.

Report: <20160824114320.15676.45171@wrigleys.postgresql.org>
2016-08-24 22:20:25 -04:00
Tom Lane 2c00fad286 Fix improper repetition of previous results from a hashed aggregate.
ExecReScanAgg's check for whether it could re-use a previously calculated
hashtable neglected the possibility that the Agg node might reference
PARAM_EXEC Params that are not referenced by its input plan node.  That's
okay if the Params are in upper tlist or qual expressions; but if one
appears in aggregate input expressions, then the hashtable contents need
to be recomputed when the Param's value changes.

To avoid unnecessary performance degradation in the case of a Param that
isn't within an aggregate input, add logic to the planner to determine
which Params are within aggregate inputs.  This requires a new field in
struct Agg, but fortunately we never write plans to disk, so this isn't
an initdb-forcing change.

Per report from Jeevan Chalke.  This has been broken since forever,
so back-patch to all supported branches.

Andrew Gierth, with minor adjustments by me

Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
2016-08-24 14:38:12 -04:00
Kevin Grittner 5cd3864075 Remove unnecessary #include.
Accidentally added in 8b65cf4c5e.

Pointed out by Álvaro Herrera
2016-08-24 13:17:21 -05:00
Noah Misch 0395198728 Build libpgfeutils before src/bin/pg_basebackup programs.
Oversight in commit 9132c01429.
2016-08-23 23:40:38 -04:00
Noah Misch b6418a0919 Build libpgfeutils before pg_isready.
Every program having -lpgfeutils in LDFLAGS must have this dependency,
whether or not the program uses a libpgfeutils symbol.  Back-patch to
9.6, where libpgfeutils was introduced.
2016-08-23 23:40:38 -04:00
Tom Lane 71e006f031 Suppress compiler warnings in non-cassert builds.
With Asserts off, these variables are set but never used, resulting
in warnings from pickier compilers.  Fix that with our standard solution.
Per report from Jeff Janes.
2016-08-23 23:21:10 -04:00
Tom Lane 32909a57f9 Fix network_spgist.c build failures from missing AF_INET definition.
AF_INET is apparently defined in something that's pulled in automatically
on Linux, but the buildfarm says that's not true everywhere.  Comparing
to network_gist.c suggests that including <sys/socket.h> ought to fix it,
and the POSIX standard concurs.
2016-08-23 16:25:35 -04:00
Tom Lane 77e2906821 Create an SP-GiST opclass for inet/cidr.
This seems to offer significantly better search performance than the
existing GiST opclass for inet/cidr, at least on data with a wide mix
of network mask lengths.  (That may suggest that the data splitting
heuristics in the GiST opclass could be improved.)

Emre Hasegeli, with mostly-cosmetic adjustments by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
2016-08-23 15:16:30 -04:00
Robert Haas 0fda682e54 Extend dsm API with a new function dsm_unpin_segment.
If you have previously pinned a segment and decide that you don't
actually want to keep it around until shutdown, this new API lets you
remove the pin.  This is pretty trivial except on Windows, where it
requires closing the duplicate handle that was used to implement the
pin.

Thomas Munro and Amit Kapila, reviewed by Amit Kapila and by me.
2016-08-23 14:32:23 -04:00
Robert Haas 19998730ae Remove duplicate function prototype.
Kyotaro Horiguchi
2016-08-23 13:44:18 -04:00
Tom Lane d2ddee63b4 Improve SP-GiST opclass API to better support unlabeled nodes.
Previously, the spgSplitTuple action could only create a new upper tuple
containing a single labeled node.  This made it useless for opclasses
that prefer to work with fixed sets of nodes (labeled or otherwise),
which meant that restrictive prefixes could not be used with such
node definitions.  Change the output field set for the choose() method
to allow it to specify any valid node set for the new upper tuple,
and to specify which of these nodes to place the modified lower tuple in.

In addition to its primary use for fixed node sets, this feature could
allow existing opclasses that use variable node sets to skip a separate
spgAddNode action when splitting a tuple, by setting up the node needed
for the incoming value as part of the spgSplitTuple action.  However, care
would have to be taken to add the extra node only when it would not make
the tuple bigger than before.  (spgAddNode can enlarge the tuple,
spgSplitTuple can't.)

This is a prerequisite for an upcoming SP-GiST inet opclass, but is
being committed separately to increase the visibility of the API change.

In passing, improve the documentation about the traverse-values feature
that was added by commit ccd6eb49a.

Emre Hasegeli, with cosmetic adjustments and documentation rework by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
2016-08-23 12:10:34 -04:00
Robert Haas 86f31695f3 Add txid_current_ifassigned().
Add a variant of txid_current() that returns NULL if no transaction ID
is assigned.  This version can be used even on a standby server,
although it will always return NULL since no transaction IDs can be
assigned during recovery.

Craig Ringer, per suggestion from Jim Nasby.  Reviewed by Petr Jelinek
and by me.
2016-08-23 10:30:52 -04:00
Robert Haas ff36700c3b Remove duplicate word from comment.
Erik Rijkers
2016-08-23 10:05:13 -04:00
Tom Lane 7b405b3e04 Refactor some network.c code to create cidr_set_masklen_internal().
Merge several copies of "copy an inet value and adjust the mask length"
code to create a single, conveniently C-callable function.  This function
is exported for future use by inet SPGiST support, but it's good cleanup
anyway since we had three slightly-different-for-no-good-reason copies.

(Extracted from a larger patch, to separate new code from refactoring
of old code)

Emre Hasegeli
2016-08-23 09:39:54 -04:00
Robert Haas 008c4135cc Fix possible sorting error when aborting use of abbreviated keys.
Due to an error in the abbreviated key abort logic, the most recently
processed SortTuple could be incorrectly marked NULL, resulting in an
incorrect final sort order.

In the worst case, this could result in a corrupt btree index, which
would need to be rebuild using REINDEX.  However, abbrevation doesn't
abort very often, not all data types use it, and only one tuple would
end up in the wrong place, so the practical impact of this mistake may
be somewhat limited.

Report and patch by Peter Geoghegan.
2016-08-22 15:22:11 -04:00
Robert Haas af5743851d Improve header comment for LockHasWaitersRelation.
Dimitry Ivanov spotted a typo, and I added a bit of wordsmithing.
2016-08-22 11:53:20 -04:00
Peter Eisentraut f9472d7256 Run select_parallel test by itself
Remove the plpgsql wrapping that hides the context.  So now the test
will fail if the work doesn't actually happen in a parallel worker.  Run
the test in its own test group to ensure it won't run out of resources
for that.
2016-08-22 12:00:00 -04:00
Tom Lane 234309fa87 initdb now needs submake-libpq and submake-libpgfeutils.
More fallout from commit a00c58314.  Pointed out by Michael Paquier.
2016-08-22 08:01:12 -04:00
Noah Misch 9132c01429 Retire escapeConnectionParameter().
It is redundant with appendConnStrVal(), which became an extern function
in commit 41f18f021a.  This changes the
handling of out-of-memory and of certain inputs for which quoting is
optional, but pg_basebackup has no need for unusual treatment thereof.
2016-08-21 22:05:57 -04:00
Tom Lane 04164deb7c initdb now needs to reference libpq include files in MSVC builds.
Fallout from commit a00c58314.  Per buildfarm.
2016-08-20 16:53:25 -04:00
Tom Lane a00c583147 Make initdb's suggested "pg_ctl start" command line more reliable.
The original coding here was not nearly careful enough about quoting
special characters, and it didn't get corner cases right for constructing
the pg_ctl path either.  Use join_path_components() and appendShellString()
to do it honestly, so that the string will more likely work if blindly
copied-and-pasted.

While at it, teach appendShellString() not to quote strings that clearly
don't need it, so that the output from initdb doesn't become uglier than
it was before in typical cases where quoting is not needed.

Ryan Murphy, reviewed by Michael Paquier and myself

Discussion: <CAHeEsBeAe1FeBypT3E8R1ZVZU0e8xv3A-7BHg6bEOi=jZny2Uw@mail.gmail.com>
2016-08-20 15:05:25 -04:00
Tom Lane 6471045230 Allow empty queries in pgbench.
This might have been too much of a foot-gun before 9.6, but with the
new commands-end-at-semicolons parsing rule, the only way to get an
empty query into a script is to explicitly write an extra ";".
So we may as well allow the case.

Fabien Coelho

Patch: <alpine.DEB.2.20.1607090922170.3412@sto>
2016-08-19 17:32:59 -04:00
Tom Lane 8299471c37 Use LEFT JOINs in some system views in case referenced row doesn't exist.
In particular, left join to pg_authid so that rows in pg_stat_activity
don't disappear if the session's owning user has been dropped.
Also convert a few joins to pg_database to left joins, in the same spirit,
though that case might be harder to hit.  We were doing this in other
views already, so it was a bit inconsistent that these views didn't.

Oskari Saarenmaa, with some further tweaking by me

Discussion: <56E87CD8.60007@ohmu.fi>
2016-08-19 17:13:47 -04:00
Tom Lane 65a603e903 Guard against parallel-restricted functions in VALUES expressions.
Obvious brain fade in set_rel_consider_parallel().  Noticed it while
adjusting the adjacent RTE_FUNCTION case.

In 9.6, also make the code look more like what I just did in HEAD
by removing the unnecessary function_rte_parallel_ok subroutine
(it does nothing that expression_tree_walker wouldn't do).
2016-08-19 14:35:32 -04:00
Tom Lane da1c91631e Speed up planner's scanning for parallel-query hazards.
We need to scan the whole parse tree for parallel-unsafe functions.
If there are none, we'll later need to determine whether particular
subtrees contain any parallel-restricted functions.  The previous coding
retained no knowledge from the first scan, even though this is very
wasteful in the common case where the query contains only parallel-safe
functions.  We can bypass all of the later scans by remembering that fact.
This provides a small but measurable speed improvement when the case
applies, and shouldn't cost anything when it doesn't.

Patch by me, reviewed by Robert Haas

Discussion: <3740.1471538387@sss.pgh.pa.us>
2016-08-19 14:03:13 -04:00
Alvaro Herrera 6f79ae7fe5 reorderbuffer: preserve errno while reporting error
Clobbering errno during cleanup after an error is an oft-repeated, easy
to make mistake.  Deal with it here as everywhere else, by saving it
aside and restoring after cleanup, before ereport'ing.

In passing, add a missing errcode declaration in another ereport() call
in the same file, which I noticed while skimming the file looking for
similar problems.

Backpatch to 9.4, where this code was introduced.
2016-08-19 14:38:55 -03:00
Tom Lane a859e64003 Clean up another pre-ANSI-C-ism in regex code: get rid of pcolor typedef.
pcolor was used to represent function arguments that are nominally of
type color, but when using a pre-ANSI C compiler would be passed as the
promoted integer type.  We really don't need that anymore.
2016-08-19 13:31:10 -04:00
Tom Lane 6eefd2422e Remove typedef celt from the regex library, along with macro NOCELT.
The regex library used to have a notion of a "collating element" that was
distinct from a "character", but Henry Spencer never actually implemented
his planned support for multi-character collating elements, and the Tcl
crew ripped out most of the stubs for that years ago.  The only thing left
that distinguished the "celt" typedef from the "chr" typedef was that
"celt" was supposed to also be able to hold the not-a-character "NOCELT"
value.  However, NOCELT was not used anywhere after the MCCE stub removal
changes, which means there's no need for celt to be different from chr.
Removing the separate typedef simplifies matters and also removes a trap
for the unwary, in that celt is signed while chr may not be, so comparisons
could mean different things.  There's no bug there today because we
restrict CHR_MAX to be less than INT_MAX, but I think there may have been
such bugs before we did that, and there could be again if anyone ever
decides to fool with the range of chr.

This patch also removes assorted unnecessary casts to "chr" of values
that are already chrs.  Many of these seem to be leftover from days when
the code was compatible with pre-ANSI C.
2016-08-19 12:51:02 -04:00
Andres Freund 9595383bc6 Add alternative output for ON CONFLICT toast isolation test.
On some buildfarm animals the isolationtest added in 07ef0351 failed, as
the order in which processes are run after unlocking is not
guaranteed. Add an alternative output for that.

Discussion: <7969.1471484738@sss.pgh.pa.us>
Backpatch: 9.6, like the test in the aforementioned commit
2016-08-18 17:34:05 -07:00
Peter Eisentraut 1d2e73a3df Remove obsolete replacement system() on darwin
Per comment in the file, this was fixed around OS X 10.2.
2016-08-18 12:00:00 -04:00
Tom Lane c5d4f40cb5 Update line count totals for psql help displays.
As usual, we've been pretty awful about maintaining these counts.
They're not all that critical, perhaps, but let's get them right
at release time.  Also fix 9.5, which I notice is just as bad.
It's probably wrong further back, but the lack of --help=foo
options before 9.5 makes it too painful to count.
2016-08-18 16:04:35 -04:00
Tom Lane 5697522d84 In plpgsql, don't try to convert int2vector or oidvector to expanded array.
These types are storage-compatible with real arrays, but they don't support
toasting, so of course they can't support expansion either.

Per bug #14289 from Michael Overmeyer.  Back-patch to 9.5 where expanded
arrays were introduced.

Report: <20160818174414.1529.37913@wrigleys.postgresql.org>
2016-08-18 14:49:08 -04:00
Tom Lane 8019b5a89c Improve psql's tab completion for \l.
Offer a list of database names; formerly no help was offered.

Ian Barwick, reviewed by Gerdan Santos

Patch: <5724132E.1030804@2ndquadrant.com>
2016-08-18 11:29:16 -04:00
Tom Lane 49917dbd76 Improve psql's tab completion for ALTER EXTENSION foo UPDATE ...
Offer a list of available versions for that extension.  Formerly, since
there was no special support for this, it triggered off the UPDATE
keyword and offered a list of table names --- not too helpful.

Jeff Janes, reviewed by Gerdan Santos

Patch: <CAMkU=1z0gxEOLg2BWa69P4X4Ot8xBxipGUiGkXe_tC+raj79-Q@mail.gmail.com>
2016-08-18 11:17:10 -04:00
Peter Eisentraut 9f31e45a6d Improve formatting of comments in plpgsql.h
This file had some unusual comment layout.  Most of the comments
introducing structs ended up to the right of the screen and following
the start of the struct.  Some comments for struct members ended up
after the member definition.

Fix that by moving comments consistently before what they are
describing.  Also add missing struct tags where missing so that it is
easier to tell what the struct is.
2016-08-18 12:00:00 -04:00
Heikki Linnakangas fa878703f4 Refactor RandomSalt to handle salts of different lengths.
All we need is 4 bytes at the moment, for MD5 authentication. But in
upcomint patches for SCRAM authentication, SCRAM will need a salt of
different length. It's less scary for the caller to pass the buffer
length anyway, than assume a certain-sized output buffer.

Author: Michael Paquier
Discussion: <CAB7nPqQvO4sxLFeS9D+NM3wpy08ieZdAj_6e117MQHZAfxBFsg@mail.gmail.com>
2016-08-18 13:41:17 +03:00
Magnus Hagander a79a685622 Update Windows timezone mapping from Windows 7 and 10
This adds a couple of new timezones that are present in the newer
versions of Windows. It also updates comments to reference UTC rather
than GMT, as this change has been made in Windows.

Michael Paquier
2016-08-18 12:32:42 +02:00
Heikki Linnakangas 8d3b9cce81 Refactor sendAuthRequest.
This way sendAuthRequest doesn't need to know the details of all the
different authentication methods. This is in preparation for adding SCRAM
authentication, which will add yet another authentication request message
type, with different payload.

Reviewed-By: Michael Paquier
Discussion: <CAB7nPqQvO4sxLFeS9D+NM3wpy08ieZdAj_6e117MQHZAfxBFsg@mail.gmail.com>
2016-08-18 13:25:31 +03:00
Andres Freund 07ef035129 Fix deletion of speculatively inserted TOAST on conflict
INSERT ..  ON CONFLICT runs a pre-check of the possible conflicting
constraints before performing the actual speculative insertion.  In case
the inserted tuple included TOASTed columns the ON CONFLICT condition
would be handled correctly in case the conflict was caught by the
pre-check, but if two transactions entered the speculative insertion
phase at the same time, one would have to re-try, and the code for
aborting a speculative insertion did not handle deleting the
speculatively inserted TOAST datums correctly.

TOAST deletion would fail with "ERROR: attempted to delete invisible
tuple" as we attempted to remove the TOAST tuples using
simple_heap_delete which reasoned that the given tuples should not be
visible to the command that wrote them.

This commit updates the heap_abort_speculative() function which aborts
the conflicting tuple to use itself, via toast_delete, for deleting
associated TOAST datums.  Like before, the inserted toast rows are not
marked as being speculative.

This commit also adds a isolationtester spec test, exercising the
relevant code path. Unfortunately 9.5 cannot handle two waiting
sessions, and thus cannot execute this test.

Reported-By: Viren Negi, Oskari Saarenmaa
Author: Oskari Saarenmaa, edited a bit by me
Bug: #14150
Discussion: <20160519123338.12513.20271@wrigleys.postgresql.org>
Backpatch: 9.5, where ON CONFLICT was introduced
2016-08-17 17:03:36 -07:00
Tom Lane cf9b0fea5f Implement regexp_match(), a simplified alternative to regexp_matches().
regexp_match() is like regexp_matches(), but it disallows the 'g' flag
and in consequence does not need to return a set.  Instead, it returns
a simple text array value, or NULL if there's no match.  Previously people
usually got that behavior with a sub-select, but this way is considerably
more efficient.

Documentation adjusted so that regexp_match() is presented first and then
regexp_matches() is introduced as a more complicated version.  This is
a bit historically revisionist but seems pedagogically better.

Still TODO: extend contrib/citext to support this function.

Emre Hasegeli, reviewed by David Johnston

Discussion: <CAE2gYzy42sna2ME_e3y1KLQ-4UBrB-eVF0SWn8QG39sQSeVhEw@mail.gmail.com>
2016-08-17 18:33:01 -04:00
Andres Freund 2d7e591007 Properly re-initialize replication slot shared memory upon creation.
Slot creation did not clear all fields upon creation. After start the
memory is zeroed, but when a physical replication slot was created in
the shared memory of a previously existing logical slot, catalog_xmin
would not be cleared. That in turn would prevent vacuum from doing its
duties.

To fix initialize all the fields. To make similar future bugs less
likely, zero all of ReplicationSlotPersistentData, and re-order the
rest of the initialization to be in struct member order.

Analysis: Andrew Gierth
Reported-By: md@chewy.com
Author: Michael Paquier
Discussion: <20160705173502.1398.70934@wrigleys.postgresql.org>
Backpatch: 9.4, where replication slots were introduced
2016-08-17 13:15:03 -07:00
Tom Lane bfaaacc805 Improve plpgsql's memory management to fix some function-lifespan leaks.
In some cases, exiting out of a plpgsql statement due to an error, then
catching the error in a surrounding exception block, led to leakage of
temporary data the statement was working with, because we kept all such
data in the function-lifespan SPI Proc context.  Iterating such behavior
many times within one function call thus led to noticeable memory bloat.

To fix, create an additional memory context meant to have statement
lifespan.  Since many plpgsql statements, particularly the simpler/more
common ones, don't need this, create it only on demand.  Reset this context
at the end of any statement that uses it, and arrange for exception cleanup
to reset it too, thereby fixing the memory-leak issue.  Allow a stack of
such contexts to exist to handle cases where a compound statement needs
statement-lifespan data that persists across calls of inner statements.

While at it, clean up code and improve comments referring to the existing
short-term memory context, which by plpgsql convention is the per-tuple
context of the eval_econtext ExprContext.  We now uniformly refer to that
as the eval_mcontext, whereas the new statement-lifespan memory contexts
are called stmt_mcontext.

This change adds some context-creation overhead, but on the other hand
it allows removal of some retail pfree's in favor of context resets.
On balance it seems to be about a wash performance-wise.

In principle this is a bug fix, but it seems too invasive for a back-patch,
and the infrequency of complaints weighs against taking the risk in the
back branches.  So we'll fix it only in HEAD, at least for now.

Tom Lane, reviewed by Pavel Stehule

Discussion: <17863.1469142152@sss.pgh.pa.us>
2016-08-17 14:51:10 -04:00
Magnus Hagander 0921554657 Disable update_process_title by default on Windows
The performance overhead of this can be significant on Windows, and most
people don't have the tools to view it anyway as Windows does not have
native support for process titles.

Discussion: <0A3221C70F24FB45833433255569204D1F5BE3E8@G01JPEXMBYT05>

Takayuki Tsunakawa
2016-08-17 10:43:16 +02:00
Tom Lane 0bb51aa967 Improve parsetree representation of special functions such as CURRENT_DATE.
We implement a dozen or so parameterless functions that the SQL standard
defines special syntax for.  Up to now, that was done by converting them
into more or less ad-hoc constructs such as "'now'::text::date".  That's
messy for multiple reasons: it exposes what should be implementation
details to users, and performance is worse than it needs to be in several
cases.  To improve matters, invent a new expression node type
SQLValueFunction that can represent any of these parameterless functions.

Bump catversion because this changes stored parsetrees for rules.

Discussion: <30058.1463091294@sss.pgh.pa.us>
2016-08-16 20:33:01 -04:00
Tom Lane 4bc4cfe3bd Suppress -Wunused-result warning for strtol().
I'm not sure which bozo thought it's a problem to use strtol() only
for its endptr result, but silence the warning using same method
used elsewhere.

Report: <f845d3a6-5328-3e2a-924f-f8e91aa2b6d2@2ndquadrant.com>
2016-08-16 16:14:16 -04:00
Tom Lane 7f61fd10ce Fix assorted places in psql to print version numbers >= 10 in new style.
This is somewhat cosmetic, since as long as you know what you are looking
at, "10.0" is a serviceable substitute for "10".  But there is a potential
for confusion between version numbers with minor numbers and those without
--- we don't want people asking "why is psql saying 10.0 when my server is
10.2".  Therefore, back-patch as far as practical, which turns out to be
9.3.  I could have redone the patch to use fprintf(stderr) in place of
psql_error(), but it seems more work than is warranted for branches that
will be EOL or nearly so by the time v10 comes out.

Although only psql seems to contain any code that needs this, I chose
to put the support function into fe_utils, since it seems likely we'll
need it in other client programs in future.  (In 9.3-9.5, use dumputils.c,
the predecessor of fe_utils/string_utils.c.)

In HEAD, also fix the backend code that whines about loadable-library
version mismatch.  I don't see much need to back-patch that.
2016-08-16 15:58:45 -04:00
Peter Eisentraut f0fe1c8f70 Fix typos
From: Alexander Law <exclusion@gmail.com>
2016-08-16 14:52:29 -04:00
Tom Lane a3bce17ef1 Automate the maintenance of SO_MINOR_VERSION for our shared libraries.
Up to now we've manually adjusted these numbers in several different
Makefiles at the start of each development cycle.  While that's not
much work, it's easily forgotten, so let's get rid of it by setting
the SO_MINOR_VERSION values directly from $(MAJORVERSION).

In the case of libpq, this dev cycle's value of SO_MINOR_VERSION happens
to be "10" anyway, so this switch is transparent.  For ecpg's shared
libraries, this will result in skipping one or two minor version numbers
between v9.6 and v10, which seems like no big problem; and it was a bit
inconsistent that they didn't have equal minor version numbers anyway.

Discussion: <21969.1471287988@sss.pgh.pa.us>
2016-08-16 13:58:54 -04:00
Robert Haas 41fb35fabf Fix possible crash due to incorrect allocation context.
Commit af33039317 aimed to reduce
leakage from tqueue.c, which is good.  Unfortunately, by changing the
memory context in which all of gather_readnext() executes, it also
changed the context in which ExecShutdownGatherWorkers executes, which
is not good, because that function eventually causes a call to
ExecParallelRetrieveInstrumentation, which proceeds to allocate
planstate->worker_instrument in a short-lived context, causing a
crash.

Rushabh Lathia, reviewed by Amit Kapila and by me.
2016-08-16 13:23:32 -04:00
Tom Lane a7b5573d66 Remove separate version numbering for ecpg preprocessor.
Once upon a time, it made sense for the ecpg preprocessor to have its
own version number, because it used a manually-maintained grammar that
wasn't always in sync with the core grammar.  But those days are
thankfully long gone, leaving only a maintenance nuisance behind.
Let's use the PG v10 version numbering changeover as an excuse to get
rid of the ecpg version number and just have ecpg identify itself by
PG_VERSION.  From the user's standpoint, ecpg will go from "4.12" in
the 9.6 branch to "10" in the 10 branch, so there's no failure of
monotonicity.

Discussion: <1471332659.4410.67.camel@postgresql.org>
2016-08-16 12:49:30 -04:00
Robert Haas b25b6c9701 Once again allow LWLocks to be used within DSM segments.
Prior to commit 7882c3b0b9, it was
possible to use LWLocks within DSM segments, but that commit broke
this use case by switching from a doubly linked list to a circular
linked list.  Switch back, using a new bit of general infrastructure
for maintaining lists of PGPROCs.

Thomas Munro, reviewed by me.
2016-08-15 18:09:55 -04:00
Tom Lane 3149a12166 Update git_changelog to know that there's a 9.6 branch.
Missed this in the main 10devel version stamping patch.
2016-08-15 15:24:54 -04:00
Tom Lane 0b9358d440 Stamp shared-library minor version numbers for v10. 2016-08-15 14:35:55 -04:00
Tom Lane ca9112a424 Stamp HEAD as 10devel.
This is a good bit more complicated than the average new-version stamping
commit, because it includes various adjustments in pursuit of changing
from three-part to two-part version numbers.  It's likely some further
work will be needed around that change; but this is enough to get through
the regression tests, at least in Unix builds.

Peter Eisentraut and Tom Lane
2016-08-15 13:49:49 -04:00
Tom Lane b5bce6c1ec Final pgindent + perltidy run for 9.6. 2016-08-15 13:42:51 -04:00