Commit Graph

26087 Commits

Author SHA1 Message Date
Robert Haas
5ea86e6e65 Use the sortsupport infrastructure in more cases.
This removes some fmgr overhead from cases such as btree index builds.

Peter Geoghegan, reviewed by Andreas Karlsson and me.
2014-11-07 15:50:55 -05:00
Alvaro Herrera
0e892e04ef Fix serial schedule
Test misc depends on brin, but it was earlier in the serial schedule
file.  I didn't notice this because I only run the parallel schedule,
but the buildfarm exposed my folly ...
2014-11-07 17:08:16 -03:00
Alvaro Herrera
7516f52594 BRIN: Block Range Indexes
BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes.  They work by maintaining "summary" data about
block ranges.  Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not.  Normal index scans are not supported
because these indexes do not store TIDs.

As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.

For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range.  This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results.  In this commit I only include minmax.

Catalog version bumped due to new builtin catalog entries.

There's more that could be done here, but this is a good step forwards.

Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.

Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.

PS:
  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633.
2014-11-07 16:38:14 -03:00
Heikki Linnakangas
1961b1c131 Fix generation of SP-GiST vacuum WAL records.
I broke these in 8776faa81c. Backpatch to
9.4, where that was done.
2014-11-07 21:17:46 +02:00
Heikki Linnakangas
2effb72e68 Remove obsolete cases from GiST update redo code.
The code that generated a record to clear the F_TUPLES_DELETED flag hasn't
existed since we got rid of old-style VACUUM FULL. I kept the code that sets
the flag, although it's not used for anything anymore, because it might
still be interesting information for debugging purposes that some tuples
have been deleted from a page.

Likewise, the code to turn the root page from non-leaf to leaf page was
removed when we got rid of old-style VACUUM FULL. Remove the code to replay
that action, too.
2014-11-07 15:13:02 +02:00
Tom Lane
d6e37b35cd Cope with more than 64K phrases in a thesaurus dictionary.
dict_thesaurus stored phrase IDs in uint16 fields, so it would get confused
and even crash if there were more than 64K entries in the configuration
file.  It turns out to be basically free to widen the phrase IDs to uint32,
so let's just do so.

This was complained of some time ago by David Boutin (in bug #7793);
he later submitted an informal patch but it was never acted on.
We now have another complaint (bug #11901 from Luc Ouellette) so it's
time to make something happen.

This is basically Boutin's patch, but for future-proofing I also added a
defense against too many words per phrase.  Note that we don't need any
explicit defense against overflow of the uint32 counters, since before that
happens we'd hit array allocation sizes that repalloc rejects.

Back-patch to all supported branches because of the crash risk.
2014-11-06 20:52:40 -05:00
Tom Lane
4875931938 Fix normalization of numeric values in JSONB GIN indexes.
The default JSONB GIN opclass (jsonb_ops) converts numeric data values
to strings for storage in the index.  It must ensure that numeric values
that would compare equal (such as 12 and 12.00) produce identical strings,
else index searches would have behavior different from regular JSONB
comparisons.  Unfortunately the function charged with doing this was
completely wrong: it could reduce distinct numeric values to the same
string, or reduce equivalent numeric values to different strings.  The
former type of error would only lead to search inefficiency, but the
latter type of error would cause index entries that should be found by
a search to not be found.

Repairing this bug therefore means that it will be necessary for 9.4 beta
testers to reindex GIN jsonb_ops indexes, if they care about getting
correct results from index searches involving numeric data values within
the comparison JSONB object.

Per report from Thomas Fanghaenel.
2014-11-06 11:41:06 -05:00
Fujii Masao
5332b8cec5 Prevent the unnecessary creation of .ready file for the timeline history file.
Previously .ready file was created for the timeline history file at the end
of an archive recovery even when WAL archiving was not enabled.
This creation is unnecessary and causes .ready file to remain infinitely.

This commit changes an archive recovery so that it creates .ready file for
the timeline history file only when WAL archiving is enabled.

Backpatch to all supported versions.
2014-11-06 21:24:40 +09:00
Heikki Linnakangas
2076db2aea Move the backup-block logic from XLogInsert to a new file, xloginsert.c.
xlog.c is huge, this makes it a little bit smaller, which is nice. Functions
related to putting together the WAL record are in xloginsert.c, and the
lower level stuff for managing WAL buffers and such are in xlog.c.

Also move the definition of XLogRecord to a separate header file. This
causes churn in the #includes of all the files that write WAL records, and
redo routines, but it avoids pulling in xlog.h into most places.

Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
2014-11-06 13:55:36 +02:00
Fujii Masao
d2b8a2c7ec Fix typo in comment.
Etsuro Fujita
2014-11-06 20:04:11 +09:00
Fujii Masao
08309aaf74 Implement IF NOT EXIST for CREATE INDEX.
Fabrízio de Royes Mello, reviewed by Marti Raudsepp, Adam Brightwell and me.
2014-11-06 18:48:33 +09:00
Bruce Momjian
171c377a0a C comment: mention why the Gregorian calendar is used pre-1582 2014-11-06 02:33:05 -05:00
Tom Lane
525a489915 Remove the last vestige of server-side autocommit.
Long ago we briefly had an "autocommit" GUC that turned server-side
autocommit on and off.  That behavior was removed in 7.4 after concluding
that it broke far too much client-side logic, and making clients cope with
both behaviors was impractical.  But the GUC variable was left behind, so
as not to break any client code that might be trying to read its value.
Enough time has now passed that we should remove the GUC completely.
Whatever vestigial backwards-compatibility benefit it had is outweighed by
the risk of confusion for newbies who assume it ought to do something,
as per a recent complaint from Wolfgang Wilhelm.

In passing, adjust what seemed to me a rather confusing documentation
reference to libpq's autocommit behavior.  libpq as such knows nothing
about autocommit, so psql is probably what was meant.
2014-11-05 19:35:23 -05:00
Robert Haas
c30be9787b Fix thinko in commit 2bd9e412f9.
Obviously, every translation unit should not be declaring this
separately.  It needs to be PGDLLIMPORT as well, to avoid breaking
third-party code that uses any of the functions that the commit
mentioned above changed to macros.
2014-11-05 17:12:23 -05:00
Tom Lane
465d7e1882 Make CREATE TYPE print warnings if a datatype's I/O functions are volatile.
This is a followup to commit 43ac12c6e6,
which added regression tests checking that I/O functions of built-in
types are not marked volatile.  Complaining in CREATE TYPE should push
developers of add-on types to fix any misdeclared functions in their
types.  It's just a warning not an error, to avoid creating upgrade
problems for what might be just cosmetic mis-markings.

Aside from adding the warning code, fix a number of types that were
sloppily created in the regression tests.
2014-11-05 11:44:06 -05:00
Tom Lane
33f80f8480 Drop no-longer-needed buffers during ALTER DATABASE SET TABLESPACE.
The previous coding assumed that we could just let buffers for the
database's old tablespace age out of the buffer arena naturally.
The folly of that is exposed by bug #11867 from Marc Munro: the user could
later move the database back to its original tablespace, after which any
still-surviving buffers would match lookups again and appear to contain
valid data.  But they'd be missing any changes applied while the database
was in the new tablespace.

This has been broken since ALTER SET TABLESPACE was introduced, so
back-patch to all supported branches.
2014-11-04 13:24:06 -05:00
Heikki Linnakangas
5028f22f6e Switch to CRC-32C in WAL and other places.
The old algorithm was found to not be the usual CRC-32 algorithm, used by
Ethernet et al. We were using a non-reflected lookup table with code meant
for a reflected lookup table. That's a strange combination that AFAICS does
not correspond to any bit-wise CRC calculation, which makes it difficult to
reason about its properties. Although it has worked well in practice, seems
safer to use a well-known algorithm.

Since we're changing the algorithm anyway, we might as well choose a
different polynomial. The Castagnoli polynomial has better error-correcting
properties than the traditional CRC-32 polynomial, even if we had
implemented it correctly. Another reason for picking that is that some new
CPUs have hardware support for calculating CRC-32C, but not CRC-32, let
alone our strange variant of it. This patch doesn't add any support for such
hardware, but a future patch could now do that.

The old algorithm is kept around for tsquery and pg_trgm, which use the
values in indexes that need to remain compatible so that pg_upgrade works.
While we're at it, share the old lookup table for CRC-32 calculation
between hstore, ltree and core. They all use the same table, so might as
well.
2014-11-04 11:39:48 +02:00
Heikki Linnakangas
404bc51cde Remove support for 64-bit CRC.
It hasn't been used for anything for a long time.
2014-11-04 11:33:08 +02:00
Robert Haas
585e0b9b27 pqmq.h needs to include something that defines StringInfo.
Reported by Peter Eisentraut.
2014-11-03 12:24:42 -05:00
Noah Misch
c40212baf6 Clarify .def file comments. 2014-11-02 21:43:33 -05:00
Noah Misch
00c07e497f Re-remove dependency on the DLL of pythonxx.def file.
The reasons behind commit 0d147e43ad still
stand, so this reverts the non-cosmetic portion of commit
a7983e989d.  Back-patch to 9.4, where the
latter commit first appeared.
2014-11-02 21:43:30 -05:00
Noah Misch
67a4120494 Make ECPG test programs depend on "ecpg$(X)", not "ecpg".
Cygwin builds require this of dependencies pertaining to pattern rules.
On Cygwin, stat("foo") in the absence of a file with that exact name can
locate foo.exe.  While GNU make uses stat() for dependencies of ordinary
rules, it uses readdir() to assess dependencies of pattern rules.
Therefore, a pattern rule dependency should match any underlying file
name exactly.  Back-patch to 9.4, where the dependency was introduced.
2014-11-02 21:43:25 -05:00
Noah Misch
8463195217 Fix win32setlocale.c const-related warnings.
Back-patch to 9.2, like commit db29620d4d.
2014-11-02 21:43:20 -05:00
Peter Eisentraut
a409b464f9 Add configure --enable-tap-tests option
Don't skip the TAP tests anymore when IPC::Run is not found.  This will
fail normally now.
2014-11-02 09:17:26 -05:00
Robert Haas
2bd9e412f9 Support frontend-backend protocol communication using a shm_mq.
A background worker can use pq_redirect_to_shm_mq() to direct protocol
that would normally be sent to the frontend to a shm_mq so that another
process may read them.

The receiving process may use pq_parse_errornotice() to parse an
ErrorResponse or NoticeResponse from the background worker and, if
it wishes, ThrowErrorData() to propagate the error (with or without
further modification).

Patch by me.  Review by Andres Freund.
2014-10-31 12:02:40 -04:00
Robert Haas
f7102b0463 Extend dsm API with a new function dsm_unpin_mapping.
This reassociates a dynamic shared memory handle previous passed to
dsm_pin_mapping with the current resource owner, so that it will be
cleaned up at the end of the current query.

Patch by me.  Review of the function name by Andres Freund, Amit
Kapila, Jim Nasby, Petr Jelinek, and Álvaro Herrera.
2014-10-30 14:55:23 -04:00
Tom Lane
fd0f651a86 Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats.
As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover
all cases where ANALYZE might be invoked in an unsafe context.  We need to
test the result of IsInTransactionChain not IsTransactionBlock; which is
notationally a pain because IsInTransactionChain requires an isTopLevel
flag, which would have to be passed down through several levels of callers.
I chose to pass in_outer_xact (ie, the result of IsInTransactionChain)
rather than isTopLevel per se, as that seemed marginally more apropos
for the intermediate functions to know about.
2014-10-30 13:04:06 -04:00
Robert Haas
6057c212f3 "Pin", rather than "keep", dynamic shared memory mappings and segments.
Nobody seemed concerned about this naming when it originally went in,
but there's a pending patch that implements the opposite of
dsm_keep_mapping, and the term "unkeep" was judged unpalatable.
"unpin" has existing precedent in the PostgreSQL code base, and the
English language, so use this terminology instead.

Per discussion, back-patch to 9.4.
2014-10-30 11:35:55 -04:00
Peter Eisentraut
7912f9b7dc Remove use of TAP subtests
They turned out to be too much of a portability headache, because they
need a fairly new version of Test::More to work properly.
2014-10-29 19:41:19 -04:00
Tom Lane
e0722d9cb5 Avoid corrupting tables when ANALYZE inside a transaction is rolled back.
VACUUM and ANALYZE update the target table's pg_class row in-place, that is
nontransactionally.  This is OK, more or less, for the statistical columns,
which are mostly nontransactional anyhow.  It's not so OK for the DDL hint
flags (relhasindex etc), which might get changed in response to
transactional changes that could still be rolled back.  This isn't a
problem for VACUUM, since it can't be run inside a transaction block nor
in parallel with DDL on the table.  However, we allow ANALYZE inside a
transaction block, so if the transaction had earlier removed the last
index, rule, or trigger from the table, and then we roll back the
transaction after ANALYZE, the table would be left in a corrupted state
with the hint flags not set though they should be.

To fix, suppress the hint-flag updates if we are InTransactionBlock().
This is safe enough because it's always OK to postpone hint maintenance
some more; the worst-case consequence is a few extra searches of pg_index
et al.  There was discussion of instead using a transactional update,
but that would change the behavior in ways that are not all desirable:
in most scenarios we're better off keeping ANALYZE's statistical values
even if the ANALYZE itself rolls back.  In any case we probably don't want
to change this behavior in back branches.

Per bug #11638 from Casey Shobe.  This has been broken for a good long
time, so back-patch to all supported branches.

Tom Lane and Michael Paquier, initial diagnosis by Andres Freund
2014-10-29 18:12:02 -04:00
Robert Haas
6cb4afff33 Avoid setup work for invalidation messages at start-of-(sub)xact.
Instead of initializing a new TransInvalidationInfo for every
transaction or subtransaction, we can just do it for those
transactions or subtransactions that actually need to queue
invalidation messages.  That also avoids needing to free those
entries at the end of a transaction or subtransaction that does
not generate any invalidation messages, which is by far the
common case.

Patch by me.  Review by Simon Riggs and Andres Freund.
2014-10-29 12:35:19 -04:00
Heikki Linnakangas
8f8314b560 Reset error message at PQreset()
If you call PQreset() repeatedly, and the connection cannot be
re-established, the error messages from the failed connection attempts
kept accumulating in the error string.

Fixes bug #11455 reported by Caleb Epstein. Backpatch to all supported
versions.
2014-10-29 14:34:43 +02:00
Tom Lane
a00d468e65 Remove obsolete commentary.
Since we got rid of non-MVCC catalog scans, the fourth reason given for
using a non-transactional update in index_update_stats() is obsolete.
The other three are still good, so we're not going to change the code,
but fix the comment.
2014-10-28 18:36:02 -04:00
Heikki Linnakangas
18f158ef69 Remove unnecessary assignment.
Reported by MauMau.
2014-10-28 20:26:20 +02:00
Noah Misch
c0e190365b MinGW: Include .dll extension in .def file LIBRARY commands.
Newer toolchains append the extension implicitly if missing, but
buildfarm member narwhal (gcc 3.4.2, ld 2.15.91 20040904) does not.
This affects most core libraries having an exports.txt file, namely
libpq and the ECPG support libraries.  On Windows Server 2003, Windows
API functions that load and unload DLLs internally will mistakenly
unload a libpq whose DLL header reports "LIBPQ" instead of "LIBPQ.dll".
When, subsequently, control would return to libpq, the backend crashes.
Back-patch to 9.4, like commit 846e91e022.
Before that commit, we used a different linking technique that yielded
"libpq.dll" in the DLL header.

Commit 53566fc094 worked around this by
eliminating a call to a function that loads and unloads DLLs internally.
That commit is no longer necessary for correctness, but its improving
consistency with the MSVC build remains valid.
2014-10-27 19:59:39 -04:00
Heikki Linnakangas
22926e00f7 Fix two bugs in tsquery @> operator.
1. The comparison for matching terms used only the CRC to decide if there's
a match. Two different terms with the same CRC gave a match.

2. It assumed that if the second operand has more terms than the first, it's
never a match. That assumption is bogus, because there can be duplicate
terms in either operand.

Rewrite the implementation in a way that doesn't have those bugs.

Backpatch to all supported versions.
2014-10-27 10:50:41 +02:00
Bruce Momjian
a4da35a0d2 Add variable names to two LWLock C prototypes
Previously only the variable types appeared.
2014-10-27 04:45:57 -04:00
Tom Lane
f455fcfdb8 Avoid unportable strftime() behavior in pg_dump/pg_dumpall.
Commit ad5d46a449 thought that we could
get around the known portability issues of strftime's %Z specifier by
using %z instead.  However, that idea seems to have been innocent of
any actual research, as it certainly missed the facts that
(1) %z is not portable to pre-C99 systems, and
(2) %z doesn't actually act differently from %Z on Windows anyway.

Per failures on buildfarm member hamerkop.

While at it, centralize the code defining what strftime format we
want to use in pg_dump; three copies of that string seems a bit much.
2014-10-26 20:59:21 -04:00
Tom Lane
9711fa0608 Fix undersized result buffer in pset_quoted_string().
The malloc request was 1 byte too small for the worst-case output.
This seems relatively unlikely to cause any problems in practice,
as the worst case only occurs if the input string contains no
characters other than single-quote or newline, and even then
malloc alignment padding would probably save the day.  But it's
definitely a bug.

David Rowley
2014-10-26 19:17:55 -04:00
Tom Lane
a4523c5aa5 Improve planning of btree index scans using ScalarArrayOpExpr quals.
Since we taught btree to handle ScalarArrayOpExpr quals natively (commit
9e8da0f757), the planner has always included
ScalarArrayOpExpr quals in index conditions if possible.  However, if the
qual is for a non-first index column, this could result in an inferior plan
because we can no longer take advantage of index ordering (cf. commit
807a40c551).  It can be better to omit the
ScalarArrayOpExpr qual from the index condition and let it be done as a
filter, so that the output doesn't need to get sorted.  Indeed, this is
true for the query introduced as a test case by the latter commit.

To fix, restructure get_index_paths and build_index_paths so that we
consider paths both with and without ScalarArrayOpExpr quals in non-first
index columns.  Redesign the API of build_index_paths so that it reports
what it found, saving useless second or third calls.

Report and patch by Andrew Gierth (though rather heavily modified by me).
Back-patch to 9.2 where this code was introduced, since the issue can
result in significant performance regressions compared to plans produced
by 9.1 and earlier.
2014-10-26 16:12:22 -04:00
Peter Eisentraut
17009fb9eb Fix TAP tests with Perl 5.12
Perl 5.12 ships with a somewhat broken version of Test::Simple, so skip
the tests if that is found.

The relevant fix is

    0.98  Wed, 23 Feb 2011 14:38:02 +1100
        Bug Fixes
        * subtest() should not fail if $? is non-zero. (Aaron Crane)
2014-10-26 10:26:36 -04:00
Peter Eisentraut
5c3d830e44 Fix TAP tests with Perl 5.8
The prove program included in Perl 5.8 does not support the --ext
option, so don't use that and use wildcards on the command line instead.

Note that the tests will still all be skipped, because, for instance,
the version of Test::More is too old, but at least the regular
mechanisms for handling that will apply, instead of failing to call
prove altogether.
2014-10-26 09:47:01 -04:00
Andres Freund
4a54b99e9c Add native compiler and memory barriers for solaris studio.
Discussion: 20140925133459.GB9633@alap3.anarazel.de
Author: Oskari Saarenmaa
2014-10-25 11:11:39 +02:00
Heikki Linnakangas
db29620d4d Work around Windows locale name with non-ASCII character.
Windows has one a locale whose name contains a non-ASCII character:
"Norwegian (Bokmål)" (that's an 'a' with a ring on top). That causes
trouble; when passing it setlocale(), it's not clear what encoding the
argument should be in. Another problem is that the locale name is stored in
pg_database catalog table, and the encoding used there depends on what
server encoding happens to be in use when the database is created. For
example, if you issue the CREATE DATABASE when connected to a UTF-8
database, the locale name is stored in pg_database in UTF-8. As long as all
locale names are pure ASCII, that's not a problem.

To work around that, map the troublesome locale name to a pure-ASCII alias
of the same locale, "norwegian-bokmal".

Now, this doesn't change the existing values that are already in
pg_database and in postgresql.conf. Old clusters will need to be fixed
manually. Instructions for that need to be put in the release notes.

This fixes bug #11431 reported by Alon Siman-Tov. Backpatch to 9.2;
backpatching further would require more work than seems worth it.
2014-10-24 21:10:13 +03:00
Heikki Linnakangas
c0c1f6fc97 Forgot #include "pg_getopt.h", now that pg_controldata uses getopt.
Needed at least on Windows.
2014-10-24 20:41:21 +03:00
Heikki Linnakangas
2d53003432 Complain if too many options are passed to pg_controldata or pg_resetxlog. 2014-10-24 19:15:57 +03:00
Heikki Linnakangas
22b743b2ca Oops, the commit accept pg_controldata -D datadir missed code changes.
I updated the docs and usage blurp, but forgot to commit the code changes
required.

Spotted by Michael Paquier.
2014-10-24 19:15:54 +03:00
Robert Haas
85bb81de53 Fix off-by-one error in 2781b4bea7.
Spotted by Tom Lane.
2014-10-24 08:18:28 -04:00
Alvaro Herrera
3c2aa0c6f2 psql: complain if pg_dump custom-format is detected
Apparently, this is a very common mistake for users to make; it is
better to have it fail reasonably rather than throw potentially a large
number of errors.  Since we have a magic string at the start of the
file, we can detect the case easily and there's no other possible useful
behavior anyway.

Author: Craig Ringer
2014-10-24 07:14:09 -03:00
Alvaro Herrera
b01a4f6838 Update README.tuplock
This file was documenting an older version of patch 0ac5ad5134; update
it to match what was really committed

Author: Florian Pflug
2014-10-23 20:51:58 -03:00
Tom Lane
43ac12c6e6 In type_sanity, check I/O functions of built-in types are not volatile.
We have a project policy that I/O functions must not be volatile, as per
commit aab353a60b, but we weren't doing
anything to enforce that.  In most usage the marking of the function
doesn't matter as long as its behavior is sane --- but I/O casts can
expose the marking as user-visible behavior, as per today's complaint
from Joe Van Dyk about contrib/ltree.

This test as such will only protect us against future errors in built-in
data types.  To catch the same error in contrib or third-party types,
perhaps we should make CREATE TYPE complain?  But that's a separate
issue from enforcing the policy for built-in types.
2014-10-23 15:59:40 -04:00
Tom Lane
b34d6f03db Improve ispell dictionary's defenses against bad affix files.
Don't crash if an ispell dictionary definition contains flags but not
any compound affixes.  (This isn't a security issue since only superusers
can install affix files, but still it's a bad thing.)

Also, be more careful about detecting whether an affix-file FLAG command
is old-format (ispell) or new-format (myspell/hunspell).  And change the
error message about mixed old-format and new-format commands into something
intelligible.

Per bug #11770 from Emre Hasegeli.  Back-patch to all supported branches.
2014-10-23 13:12:00 -04:00
Robert Haas
2781b4bea7 Perform less setup work for AFTER triggers at transaction start.
Testing reveals that the memory allocation we do at transaction start
has small but measurable overhead on simple transactions.  To cut
down on that overhead, defer some of that work to the point when
AFTER triggers are first used, thus avoiding it altogether if they
never are.

Patch by me.  Review by Andres Freund.
2014-10-23 12:33:02 -04:00
Fujii Masao
efbbbbc8b5 Remove the unused argument of PSQLexec().
This commit simply removes the second argument of PSQLexec that was
set to the same value everywhere. Comments and code blocks related
to this parameter are removed.

Noticed by Heikki Linnakangas, reviewed by Michael Paquier
2014-10-23 22:33:56 +09:00
Robert Haas
5ac372fc1a Add a function to get the authenticated user ID.
Previously, this was not exposed outside of miscinit.c.  It is needed
for the pending pg_background patch, and will also be needed for
parallelism.  Without it, there's no way for a background worker to
re-create the exact authentication environment that was present in the
process that started it, which could lead to security exposures.
2014-10-23 08:18:45 -04:00
Fujii Masao
c7371c4a60 Prevent the already-archived WAL file from being archived again.
Previously the archive recovery always created .ready file for
the last WAL file of the old timeline at the end of recovery even when
it's restored from the archive and has .done file. That is, there was
the case where the WAL file had both .ready and .done files.
This caused the already-archived WAL file to be archived again.

This commit prevents the archive recovery from creating .ready file
for the last WAL file if it has .done file, in order to prevent it from
being archived again.

This bug was added when cascading replication feature was introduced,
i.e., the commit 5286105800.
So, back-patch to 9.2, where cascading replication was added.

Reviewed by Michael Paquier
2014-10-23 16:21:27 +09:00
Peter Eisentraut
e64d3c5635 Minimize calls of pg_class_aclcheck to minimum necessary
In a couple of code paths, pg_class_aclcheck is called in succession
with multiple different modes set.  This patch combines those modes to
have a single call of this function and reduce a bit process overhead
for permission checking.

Author: Michael Paquier <michael@otacoo.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2014-10-22 21:41:43 -04:00
Peter Eisentraut
a5f7d58194 Add tests for sequence privileges 2014-10-22 21:39:07 -04:00
Tom Lane
69fed5b26f Ensure libpq reports a suitable error message on unexpected socket EOF.
The EOF-detection logic in pqReadData was a bit confused about who should
set up the error message in case the kernel gives us read-ready-but-no-data
rather than ECONNRESET or some other explicit error condition.  Since the
whole point of this situation is that the lower-level functions don't know
there's anything wrong, pqReadData itself must set up the message.  But
keep the assumption that if an errno was reported, a message was set up at
lower levels.

Per bug #11712 from Marko Tiikkaja.  It's been like this for a very long
time, so back-patch to all supported branches.
2014-10-22 18:41:44 -04:00
Michael Meskes
2ae7811db8 Small code cleanup.
Declare static variable as static and external as extern.
2014-10-22 16:46:55 +02:00
Heikki Linnakangas
98b3743779 Update comment.
The _bt_tuplecompare() function mentioned in comment hasn't existed for a
long time.

Peter Geoghegan
2014-10-22 15:44:07 +03:00
Noah Misch
284590e416 MinGW: Use -static-libgcc when linking a DLL.
When commit 846e91e022 switched the linker
driver from dlltool/dllwrap to gcc, it became possible for linking to
choose shared libgcc.  Backends having loaded a module dynamically
linked to libgcc can exit abnormally, which the postmaster treats like a
crash.  Resume use of static libgcc exclusively, like 9.3 and earlier.
Back-patch to 9.4.
2014-10-21 22:55:47 -04:00
Noah Misch
53566fc094 MinGW: Link with shell32.dll instead of shfolder.dll.
This improves consistency with the MSVC build.  On buildfarm member
narwhal, since commit 846e91e022,
shfolder.dll:SHGetFolderPath() crashes when dblink calls it by way of
pqGetHomeDirectory().  Back-patch to 9.4, where that commit first
appeared.  How it caused this regression remains a mystery.  This is a
partial revert of commit 889f038129, which
adopted shfolder.dll for Windows NT 4.0 compatibility.  PostgreSQL 8.2
dropped support for that operating system.
2014-10-21 22:55:43 -04:00
Tom Lane
31dd7fcd03 Update expected/sequence_1.out.
The last three updates to the sequence regression test have all forgotten
to touch the alternate expected-output file.  Sigh.

Michael Paquier
2014-10-21 18:25:58 -04:00
Peter Eisentraut
6f04368cfc Allow input format xxxx-xxxx-xxxx for macaddr type
Author: Herwin Weststrate <herwin@quarantainenet.nl>
Reviewed-by: Ali Akbar <the.apaan@gmail.com>
2014-10-21 16:16:39 -04:00
Peter Eisentraut
5d93ce2d0c doc: Check DocBook XML validity during the build
Building the documentation with XSLT does not check the DTD, like a
DSSSL build would.  One can often get away with having invalid XML, but
the stylesheets might then create incorrect output, as they are not
designed to handle that.  Therefore, check the validity of the XML
against the DTD, using xmllint, during the build.

Add xmllint detection to configure, and add some documentation.

xmllint comes with libxml2, which is already in use, but it might be in
a separate package, such as libxml2-utils on Debian.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2014-10-21 14:46:38 -04:00
Andres Freund
5e5b65f359 Don't duplicate log_checkpoint messages for both of restart and checkpoints.
The duplication originated in cdd46c765, where restartpoints were
introduced.

In LogCheckpointStart's case the duplication actually lead to the
compiler's format string checking not to be effective because the
format string wasn't constant.

Arguably these messages shouldn't be elog(), but ereport() style
messages. That'd even allow to translate the messages... But as
there's more mistakes of that kind in surrounding code, it seems
better to change that separately.
2014-10-21 01:01:56 +02:00
Andres Freund
11abd6c90f Renumber CHECKPOINT_* flags.
Commit 7dbb606938 added a new CHECKPOINT_FLUSH_ALL flag. As that
commit needed to be backpatched I didn't change the numeric values of
the existing flags as that could lead to nastly problems if any
external code issued checkpoints. That's not a concern on master, so
renumber them there.

Also add a comment about CHECKPOINT_FLUSH_ALL above
CreateCheckPoint().
2014-10-21 00:20:08 +02:00
Andres Freund
7dbb606938 Flush unlogged table's buffers when copying or moving databases.
CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE copy the source
database directory on the filesystem level. To ensure the on disk
state is consistent they block out users of the affected database and
force a checkpoint to flush out all data to disk. Unfortunately, up to
now, that checkpoint didn't flush out dirty buffers from unlogged
relations.

That bug means there could be leftover dirty buffers in either the
template database, or the database in its old location. Leading to
problems when accessing relations in an inconsistent state; and to
possible problems during shutdown in the SET TABLESPACE case because
buffers belonging files that don't exist anymore are flushed.

This was reported in bug #10675 by Maxim Boguk.

Fix by Pavan Deolasee, modified somewhat by me. Reviewed by MauMau and
Fujii Masao.

Backpatch to 9.1 where unlogged tables were introduced.
2014-10-20 23:43:46 +02:00
Andrew Dunstan
af2b8fd057 Correct volatility markings of a few json functions.
json_agg and json_object_agg and their associated transition functions
should have been marked as stable rather than immutable, as they call IO
functions indirectly. Changing this probably isn't going to make much
difference, as you can't use an aggregate function in an index
expression, but we should be correct nevertheless.

json_object, on the other hand, should be marked immutable rather than
stable, as it does not call IO functions.

As discussed on -hackers, this change is being made without bumping the
catalog version, as we don't want to do that at this stage of the  cycle,
and  the changes are very unlikely to affect anyone.
2014-10-20 15:31:05 -04:00
Tom Lane
f330a6d140 Fix mishandling of FieldSelect-on-whole-row-Var in nested lateral queries.
If an inline-able SQL function taking a composite argument is used in a
LATERAL subselect, and the composite argument is a lateral reference,
the planner could fail with "variable not found in subplan target list",
as seen in bug #11703 from Karl Bartel.  (The outer function call used in
the bug report and in the committed regression test is not really necessary
to provoke the bug --- you can get it if you manually expand the outer
function into "LATERAL (SELECT inner_function(outer_relation))", too.)

The cause of this is that we generate the reltargetlist for the referenced
relation before doing eval_const_expressions() on the lateral sub-select's
expressions (cf find_lateral_references()), so what's scheduled to be
emitted by the referenced relation is a whole-row Var, not the simplified
single-column Var produced by optimizing the function's FieldSelect on the
whole-row Var.  Then setrefs.c fails to match up that lateral reference to
what's available from the outer scan.

Preserving the FieldSelect optimization in such cases would require either
major planner restructuring (to recursively do expression simplification
on sub-selects much earlier) or some amazingly ugly kluge to change the
reltargetlist of a possibly-already-planned relation.  It seems better
just to skip the optimization when the Var is from an upper query level;
the case is not so common that it's likely anyone will notice a few
wasted cycles.

AFAICT this problem only occurs for uplevel LATERAL references, so
back-patch to 9.3 where LATERAL was added.
2014-10-20 12:23:42 -04:00
Robert Haas
bc279c92f0 Fix typos.
David Rowley
2014-10-20 10:33:16 -04:00
Robert Haas
0f565c0743 Fix typos.
Etsuro Fujita
2014-10-20 10:23:40 -04:00
Peter Eisentraut
49d182e61e initdb: Fix compiler error in USE_PREFETCH case 2014-10-19 00:45:40 -04:00
Peter Eisentraut
6895866510 psql: Improve \pset without arguments
Revert the output of the individual backslash commands that change print
settings back to the 9.3 way (not showing the command name in
parentheses).  Implement \pset without arguments separately, showing all
settings with values in a table form.
2014-10-18 22:48:15 -04:00
Peter Eisentraut
7feaccc217 Allow setting effective_io_concurrency even on unsupported systems
This matches the behavior of other parameters that are unsupported on
some systems (e.g., ssl).

Also document the default value.
2014-10-18 21:35:46 -04:00
Bruce Momjian
b87671f1b6 Shorten warning about hash creation
Also document that PITR is also affected.
2014-10-18 10:36:09 -04:00
Bruce Momjian
417f92484d interval: tighten precision specification
interval precision can only be specified after the "interval" keyword if
no units are specified.

Previously we incorrectly checked the units to see if the precision was
legal, causing confusion.

Report by Alvaro Herrera
2014-10-18 10:31:00 -04:00
Tom Lane
60f8133dc9 Declare mkdtemp() only if we're providing it.
Follow our usual style of providing an "extern" for a standard library
function only when we're also providing the implementation.  This avoids
issues when the system headers declare the function slightly differently
than we do, as noted by Caleb Welton.

We might have to go to the extent of probing to see if the system headers
declare the function, but let's not do that until it's demonstrated to be
necessary.

Oversight in commit 9e6b1bf258.  Back-patch
to all supported branches, as that was.
2014-10-17 22:55:20 -04:00
Tom Lane
5ba062ee44 Avoid core dump in _outPathInfo() for Path without a parent RelOptInfo.
Nearly all Paths have parents, but a ResultPath representing an empty FROM
clause does not.  Avoid a core dump in such cases.  I believe this is only
a hazard for debugging usage, not for production, else we'd have heard
about it before.  Nonetheless, back-patch to 9.1 where the troublesome code
was introduced.  Noted while poking at bug #11703.
2014-10-17 22:33:52 -04:00
Fujii Masao
504c717599 Fix bug in handling of connections that pg_receivexlog creates.
Previously pg_receivexlog created new connection for WAL streaming
even though another connection which had been established to create
or delete the replication slot was being left. This caused the unused
connection to be left uselessly until pg_receivexlog exited.
This bug was introduced by the commit d9f38c7.

This patch changes pg_receivexlog so that the connection for
the replication slot is reused for WAL streaming.

Andres Freund, slightly modified by me, reviewed by Michael Paquier
2014-10-18 03:06:34 +09:00
Tom Lane
5c38a1d4ec Fix core dump in pg_dump --binary-upgrade on zero-column composite type.
This reverts nearly all of commit 28f6cab61a
in favor of just using the typrelid we already have in pg_dump's TypeInfo
struct for the composite type.  As coded, it'd crash if the composite type
had no attributes, since then the query would return no rows.

Back-patch to all supported versions.  It seems to not really be a problem
in 9.0 because that version rejects the syntax "create type t as ()", but
we might as well keep the logic similar in all affected branches.

Report and fix by Rushabh Lathia.
2014-10-17 12:49:00 -04:00
Tom Lane
7584649a1c Re-pgindent src/bin/pg_dump/*.
Seems to have gotten rather messy lately, as a consequence of a couple
of large recent commits.
2014-10-17 12:19:05 -04:00
Stephen Frost
389573fd19 Fix pg_dump for UPDATE policies
pg_dump had the wrong character for update and so was failing when
attempts were made to pg_dump databases with UPDATE policies.

Pointed out by Fujii Masao (thanks!)
2014-10-17 08:07:46 -04:00
Tom Lane
b2cbced9ee Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time.  But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation.  Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.

To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone.  For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time.  However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.

The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970.  The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.

While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.

This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib.  Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.

This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time.  We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.

In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
2014-10-16 15:22:10 -04:00
Tom Lane
90063a7612 Print planning time only in EXPLAIN ANALYZE, not plain EXPLAIN.
We've gotten enough push-back on that change to make it clear that it
wasn't an especially good idea to do it like that.  Revert plain EXPLAIN
to its previous behavior, but keep the extra output in EXPLAIN ANALYZE.
Per discussion.

Internally, I set this up as a separate flag ExplainState.summary that
controls printing of planning time and execution time.  For now it's
just copied from the ANALYZE option, but we could consider exposing it
to users.
2014-10-15 18:50:13 -04:00
Alvaro Herrera
076d29a1ee Blind attempt at fixing Win32 pg_dump issues
Per buildfarm failures
2014-10-14 17:33:36 -03:00
Alvaro Herrera
0eea8047bf pg_dump: Reduce use of global variables
Most pg_dump.c global variables, which were passed down individually to
dumping routines, are now grouped as members of the new DumpOptions
struct, which is used as a local variable and passed down into routines
that need it.  This helps future development efforts; in particular it
is said to enable a mode in which a parallel pg_dump run can output
multiple streams, and have them restored in parallel.

Also take the opportunity to clean up the pg_dump header files somewhat,
to avoid circularity.

Author: Joachim Wieland, revised by Álvaro Herrera
Reviewed by Peter Eisentraut
2014-10-14 15:00:55 -03:00
Heikki Linnakangas
e0d97d77bf Fix deadlock with LWLockAcquireWithVar and LWLockWaitForVar.
LWLockRelease should release all backends waiting with LWLockWaitForVar,
even when another backend has already been woken up to acquire the lock,
i.e. when releaseOK is false. LWLockWaitForVar can return as soon as the
protected value changes, even if the other backend will acquire the lock.
Fix that by resetting releaseOK to true in LWLockWaitForVar, whenever
adding itself to the wait queue.

This should fix the bug reported by MauMau, where the system occasionally
hangs when there is a lot of concurrent WAL activity and a checkpoint.
Backpatch to 9.4, where this code was added.
2014-10-14 10:06:47 +03:00
Peter Eisentraut
7ce09e6148 psql: Fix \? output alignment
This was inadvertently changed in commit c64e68fd.
2014-10-13 22:22:20 -04:00
Bruce Momjian
fe65280fea C comments: adjust execTuples.c for new structure
Report by Peter Geoghegan
2014-10-13 16:54:38 -04:00
Bruce Momjian
d6c8e8d7ac Consistently use NULL for invalid GUC unit strings
Patch by Euler Taveira
2014-10-13 16:11:43 -04:00
Kevin Grittner
30d7ae3c76 Increase number of hash join buckets for underestimate.
If we expect batching at the very beginning, we size nbuckets for
"full work_mem" (see how many tuples we can get into work_mem,
while not breaking NTUP_PER_BUCKET threshold).

If we expect to be fine without batching, we start with the 'right'
nbuckets and track the optimal nbuckets as we go (without actually
resizing the hash table). Once we hit work_mem (considering the
optimal nbuckets value), we keep the value.

At the end of the first batch, we check whether (nbuckets !=
nbuckets_optimal) and resize the hash table if needed. Also, we
keep this value for all batches (it's OK because it assumes full
work_mem, and it makes the batchno evaluation trivial). So the
resize happens only once.

There could be cases where it would improve performance to allow
the NTUP_PER_BUCKET threshold to be exceeded to keep everything in
one batch rather than spilling to a second batch, but attempts to
generate such a case have so far been unsuccessful; that issue may
be addressed with a follow-on patch after further investigation.

Tomas Vondra with minor format and comment cleanup by me
Reviewed by Robert Haas, Heikki Linnakangas, and Kevin Grittner
2014-10-13 10:16:36 -05:00
Noah Misch
494affbd90 Fix quoting in the add_to_path Makefile macro.
The previous quoting caused "make -C src/bin check" to ignore, rather
than add to, any LD_LIBRARY_PATH content from the environment.
Back-patch to 9.4, where the macro was introduced.
2014-10-12 23:33:37 -04:00
Noah Misch
27a23f9dfe pg_ctl: Cast DWORD values to avoid -Wformat warnings.
This affects pg_ctl alone, because pg_ctl takes the exceptional step of
calling Windows API functions in a Cygwin build.
2014-10-12 23:33:19 -04:00
Noah Misch
772945b4df Suppress dead, unportable src/port/crypt.c code.
This file used __int64, which is specific to native Windows, rather than
int64.  Suppress the long-unused union field of this type.  Noticed on
Cygwin x86_64 with -lcrypt not installed.  Back-patch to 9.0 (all
supported versions).
2014-10-12 23:27:06 -04:00
Peter Eisentraut
a0fb59d8bd pg_recvlogical: Improve --help output
List the actions first, as they are the most important options.  Group
the other options more sensibly, consistent with the man page.  Correct
a few typographical errors, clarify some things.

Also update the pg_receivexlog --help output to make it a bit more
consistent with that of pg_recvlogical.
2014-10-12 01:54:25 -04:00
Peter Eisentraut
b7a08c8028 Message improvements 2014-10-12 01:06:35 -04:00
Bruce Momjian
4f2e5a8a84 regression: adjust polygon diagrams to not use tabs
Also, small diagram adjustments

Patch by Emre Hasegeli
2014-10-11 17:14:16 -04:00
Tom Lane
4a50de1312 Fix bogus optimization in JSONB containment tests.
When determining whether one JSONB object contains another, it's okay to
make a quick exit if the first object has fewer pairs than the second:
because we de-duplicate keys within objects, it is impossible that the
first object has all the keys the second does.  However, the code was
applying this rule to JSONB arrays as well, where it does *not* hold
because arrays can contain duplicate entries.  The test was really in
the wrong place anyway; we should do it within JsonbDeepContains, where
it can be applied to nested objects not only top-level ones.

Report and test cases by Alexander Korotkov; fix by Peter Geoghegan and
Tom Lane.
2014-10-11 14:13:51 -04:00
Alvaro Herrera
7b1c2a0f20 Split builtins.h to a new header ruleutils.h
The new header contains many prototypes for functions in ruleutils.c
that are not exposed to the SQL level.

Reviewed by Andres Freund and Michael Paquier.
2014-10-08 18:10:47 -03:00
Robert Haas
7bb0e97407 Extend shm_mq API with new functions shm_mq_sendv, shm_mq_set_handle.
shm_mq_sendv sends a message to the queue assembled from multiple
locations.  This is expected to be used by forthcoming patches to
allow frontend/backend protocol messages to be sent via shm_mq, but
might be useful for other purposes as well.

shm_mq_set_handle associates a BackgroundWorkerHandle with an
already-existing shm_mq_handle.  This solves a timing problem when
creating a shm_mq to communicate with a newly-launched background
worker: if you attach to the queue first, and the background worker
fails to start, you might block forever trying to do I/O on the queue;
but if you start the background worker first, but then die before
attaching to the queue, the background worrker might block forever
trying to do I/O on the queue.  This lets you attach before starting
the worker (so that the worker is protected) and then associate the
BackgroundWorkerHandle later (so that you are also protected).

Patch by me, reviewed by Stephen Frost.
2014-10-08 14:38:31 -04:00
Alvaro Herrera
df630b0dd5 Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
2014-10-07 17:23:34 -03:00
Robert Haas
c421efd213 Fix typo in elog message. 2014-10-07 00:08:59 -04:00
Tom Lane
55bfdd1cfd Fix array overrun in ecpg's version of ParseDateTime().
The code wrote a value into the caller's field[] array before checking
to see if there was room, which of course is backwards.  Per report from
Michael Paquier.

I fixed the equivalent bug in the backend's version of this code way back
in 630684d3a1, but failed to think about ecpg's copy.  Fortunately
this doesn't look like it would be exploitable for anything worse than a
core dump: an external attacker would have no control over the single word
that gets written.
2014-10-06 21:23:20 -04:00
Stephen Frost
273b29dbe9 Clean up Create/DropReplicationSlot query buffer
CreateReplicationSlot() and DropReplicationSlot() were not cleaning up
the query buffer in some cases (mostly error conditions) which meant a
small leak.  Not generally an issue as the error case would result in an
immediate exit, but not difficult to fix either and reduces the number
of false positives from code analyzers.

In passing, also add appropriate PQclear() calls to RunIdentifySystem().

Pointed out by Coverity.
2014-10-06 11:18:13 -04:00
Andres Freund
d9f38c7a55 Add support for managing physical replication slots to pg_receivexlog.
pg_receivexlog already has the capability to use a replication slot to
reserve WAL on the upstream node. But the used slot currently has to
be created via SQL.

To allow using slots directly, without involving SQL, add
--create-slot and --drop-slot actions, analogous to the logical slot
manipulation support in pg_recvlogical.

Author: Michael Paquier
Discussion: CABUevEx+zrOHZOQg+dPapNPFRJdsk59b=TSVf30Z71GnFXhQaw@mail.gmail.com
2014-10-06 12:51:37 +02:00
Andres Freund
c8b6cba84a Rename pg_recvlogical's --create/--drop to --create-slot/--drop-slot.
A future patch (9.5 only) adds slot management to pg_receivexlog. The
verbs create/drop don't seem descriptive enough there. It seems better
to rename pg_recvlogical's commands now, in beta, than live with the
inconsistency forever.

The old form (e.g. --drop) will still be accepted by virtue of most
getopt_long() options accepting abbreviations for long commands.

Backpatch to 9.4 where pg_recvlogical was introduced.

Author: Michael Paquier and Andres Freund
Discussion: CAB7nPqQtt79U6FmhwvgqJmNyWcVCbbV-nS72j_jyPEopERg9rg@mail.gmail.com
2014-10-06 12:11:52 +02:00
Peter Eisentraut
1ec4a970fe Translation updates 2014-10-05 23:23:50 -04:00
Robert Haas
d0410d6603 Eliminate one background-worker-related flag variable.
Teach sigusr1_handler() to use the same test for whether a worker
might need to be started as ServerLoop().  Aside from being perhaps
a bit simpler, this prevents a potentially-unbounded delay when
starting a background worker.  On some platforms, select() doesn't
return when interrupted by a signal, but is instead restarted,
including a reset of the timeout to the originally-requested value.
If signals arrive often enough, but no connection requests arrive,
sigusr1_handler() will be executed repeatedly, but the body of
ServerLoop() won't be reached.  This change ensures that, even in
that case, background workers will eventually get launched.

This is far from a perfect fix; really, we need select() to return
control to ServerLoop() after an interrupt, either via the self-pipe
trick or some other mechanism.  But that's going to require more
work and discussion, so let's do this for now to at least mitigate
the damage.

Per investigation of test_shm_mq failures on buildfarm member anole.
2014-10-04 21:25:41 -04:00
Tom Lane
513d06ded1 Update time zone data files to tzdata release 2014h.
Most zones in the Russian Federation are subtracting one or two hours
as of 2014-10-26.  Update the meanings of the abbreviations IRKT, KRAT,
MAGT, MSK, NOVT, OMST, SAKT, VLAT, YAKT, YEKT to match.

The IANA timezone database has adopted abbreviations of the form AxST/AxDT
for all Australian time zones, reflecting what they believe to be current
majority practice Down Under.  These names do not conflict with usage
elsewhere (other than ACST for Acre Summer Time, which has been in disuse
since 1994).  Accordingly, adopt these names into our "Default" timezone
abbreviation set.  The "Australia" abbreviation set now contains only
CST,EAST,EST,SAST,SAT,WST, all of which are thought to be mostly historical
usage.  Note that SAST has also been changed to be South Africa Standard
Time in the "Default" abbreviation set.

Add zone abbreviations SRET (Asia/Srednekolymsk) and XJT (Asia/Urumqi),
and use WSST/WSDT for western Samoa.

Also a DST law change in the Turks & Caicos Islands (America/Grand_Turk),
and numerous corrections for historical time zone data.
2014-10-04 14:18:19 -04:00
Tom Lane
4f499eee9d Update time zone abbreviations lists.
This updates known_abbrevs.txt to be what it should have been already,
were my -P patch not broken; and updates some tznames/ entries that
missed getting any love in previous timezone data updates because zic
failed to flag the change of abbreviation.

The non-cosmetic updates:

* Remove references to "ADT" as "Arabia Daylight Time", an abbreviation
that's been out of use since 2007; therefore, claiming there is a conflict
with "Atlantic Daylight Time" doesn't seem especially helpful.  (We have
left obsolete entries in the files when they didn't conflict with anything,
but that seems like a different situation.)

* Fix entirely incorrect GMT offsets for CKT (Cook Islands), FJT, FJST
(Fiji); we didn't even have them on the proper side of the date line.
(Seems to have been aboriginal errors in our tznames data; there's no
evidence anything actually changed recently.)

* FKST (Falkland Islands Summer Time) is now used all year round, so
don't mark it as a DST abbreviation.

* Update SAKT (Sakhalin) to mean GMT+11 not GMT+10.

In cosmetic changes, I fixed a bunch of wrong (or at least obsolete)
claims about abbreviations not being present in the zic files, and
tried to be consistent about how obsolete abbreviations are labeled.

Note the underlying timezone/data files are still at release 2014e;
this is just trying to get us in sync with what those files actually
say before we go to the next update.
2014-10-03 17:46:18 -04:00
Stephen Frost
78d72563ef Fix CreatePolicy, pg_dump -v; psql and doc updates
Peter G pointed out that valgrind was, rightfully, complaining about
CreatePolicy() ending up copying beyond the end of the parsed policy
name.  Name is a fixed-size type and we need to use namein (through
DirectFunctionCall1()) to flush out the entire array before we pass
it down to heap_form_tuple.

Michael Paquier pointed out that pg_dump --verbose was missing a
newline and Fabrízio de Royes Mello further pointed out that the
schema was also missing from the messages, so fix those also.

Also, based on an off-list comment from Kevin, rework the psql \d
output to facilitate copy/pasting into a new CREATE or ALTER POLICY
command.

Lastly, improve the pg_policies view and update the documentation for
it, along with a few other minor doc corrections based on an off-list
discussion with Adam Brightwell.
2014-10-03 16:31:53 -04:00
Tom Lane
5968570430 Fix bogus logic for zic -P option.
The quick hack I added to zic to dump out currently-in-use timezone
abbreviations turns out to have a nasty bug: within each zone, it was
printing the last "struct ttinfo" to be *defined*, not necessarily the
last one in use.  This was mainly a problem in zones that had changed the
meaning of their zone abbreviation (to another GMT offset value) and later
changed it back.

As a result of this error, we'd missed out updating the tznames/ files
for some jurisdictions that have changed their zone abbreviations since
the tznames/ files were originally created.  I'll address the missing data
updates in a separate commit.
2014-10-03 14:48:11 -04:00
Alvaro Herrera
1021bd6a89 Don't balance vacuum cost delay when per-table settings are in effect
When there are cost-delay-related storage options set for a table,
trying to make that table participate in the autovacuum cost-limit
balancing algorithm produces undesirable results: instead of using the
configured values, the global values are always used,
as illustrated by Mark Kirkwood in
http://www.postgresql.org/message-id/52FACF15.8020507@catalyst.net.nz

Since the mechanism is already complicated, just disable it for those
cases rather than trying to make it cope.  There are undesirable
side-effects from this too, namely that the total I/O impact on the
system will be higher whenever such tables are vacuumed.  However, this
is seen as less harmful than slowing down vacuum, because that would
cause bloat to accumulate.  Anyway, in the new system it is possible to
tweak options to get the precise behavior one wants, whereas with the
previous system one was simply hosed.

This has been broken forever, so backpatch to all supported branches.
This might affect systems where cost_limit and cost_delay have been set
for individual tables.
2014-10-03 13:01:27 -03:00
Robert Haas
017b2e9822 Fix typos in comments.
Etsuro Fujita
2014-10-03 11:47:27 -04:00
Robert Haas
9019264f2e Still another typo fix for 0709b7ee72.
Buildfarm member anole caught this one.
2014-10-03 11:39:38 -04:00
Heikki Linnakangas
7690ddea0c Check for GiST index tuples that don't fit on a page.
The page splitting code would go into infinite recursion if you try to
insert an index tuple that doesn't fit even on an empty page.

Per analysis and suggested fix by Andrew Gierth. Fixes bug #11555, reported
by Bryan Seitz (analysis happened over IRC). Backpatch to all supported
versions.
2014-10-03 14:49:52 +03:00
Robert Haas
3acc10c997 Increase the number of buffer mapping partitions to 128.
Testing by Amit Kapila, Andres Freund, and myself, with and without
other patches that also aim to improve scalability, seems to indicate
that this change is a significant win over the current value and over
smaller values such as 64.  It's not clear how high we can push this
value before it starts to have negative side-effects elsewhere, but
going this far looks OK.
2014-10-02 13:58:50 -04:00
Andres Freund
952872698d Install all headers for the new atomics API.
Previously, by mistake, only atomics.h was installed.

Kohei KaiGai
2014-10-02 16:52:21 +02:00
Tom Lane
5a6c168c78 Fix some more problems with nested append relations.
As of commit a87c72915 (which later got backpatched as far as 9.1),
we're explicitly supporting the notion that append relations can be
nested; this can occur when UNION ALL constructs are nested, or when
a UNION ALL contains a table with inheritance children.

Bug #11457 from Nelson Page, as well as an earlier report from Elvis
Pranskevichus, showed that there were still nasty bugs associated with such
cases: in particular the EquivalenceClass mechanism could try to generate
"join" clauses connecting an appendrel child to some grandparent appendrel,
which would result in assertion failures or bogus plans.

Upon investigation I concluded that all current callers of
find_childrel_appendrelinfo() need to be fixed to explicitly consider
multiple levels of parent appendrels.  The most complex fix was in
processing of "broken" EquivalenceClasses, which are ECs for which we have
been unable to generate all the derived equality clauses we would like to
because of missing cross-type equality operators in the underlying btree
operator family.  That code path is more or less entirely untested by
the regression tests to date, because no standard opfamilies have such
holes in them.  So I wrote a new regression test script to try to exercise
it a bit, which turned out to be quite a worthwhile activity as it exposed
existing bugs in all supported branches.

The present patch is essentially the same as far back as 9.2, which is
where parameterized paths were introduced.  In 9.0 and 9.1, we only need
to back-patch a small fragment of commit 5b7b5518d, which fixes failure to
propagate out the original WHERE clauses when a broken EC contains constant
members.  (The regression test case results show that these older branches
are noticeably stupider than 9.2+ in terms of the quality of the plans
generated; but we don't really care about plan quality in such cases,
only that the plan not be outright wrong.  A more invasive fix in the
older branches would not be a good idea anyway from a plan-stability
standpoint.)
2014-10-01 19:31:12 -04:00
Andres Freund
0c013e08cf Refactor replication connection code of various pg_basebackup utilities.
Move some more code to manage replication connection command to
streamutil.c. A later patch will introduce replication slot via
pg_receivexlog and this avoid duplicating relevant code between
pg_receivexlog and pg_recvlogical.

Author: Michael Paquier, with some editing by me.
2014-10-01 17:35:56 +02:00
Andres Freund
fdf81c9a6c pg_recvlogical.c code review.
Several comments still referred to 'initiating', 'freeing', 'stopping'
replication slots. These were terms used during different phases of
the development of logical decoding, but are no long accurate.

Also rename StreamLog() to StreamLogicalLog() and add 'void' to the
prototype.

Author: Michael Paquier, with some editing by me.

Backpatch to 9.4 where pg_recvlogical was introduced.
2014-10-01 17:35:07 +02:00
Heikki Linnakangas
5fa6c81a43 Remove num_xloginsert_locks GUC, replace with a #define
I left the GUC in place for the beta period, so that people could experiment
with different values. No-one's come up with any data that a different value
would be better under some circumstances, so rather than try to document to
users what the GUC, let's just hard-code the current value, 8.
2014-10-01 16:42:26 +03:00
Andres Freund
a39e78b710 Block signals while computing the sleep time in postmaster's main loop.
DetermineSleepTime() was previously called without blocked
signals. That's not good, because it allows signal handlers to
interrupt its workings.

DetermineSleepTime() was added in 9.3 with the addition of background
workers (da07a1e856), where it only read from
BackgroundWorkerList.

Since 9.4, where dynamic background workers were added (7f7485a0cd),
the list is also manipulated in DetermineSleepTime(). That's bad
because the list now can be persistently corrupted if modified by both
a signal handler and DetermineSleepTime().

This was discovered during the investigation of hangs on buildfarm
member anole. It's unclear whether this bug is the source of these
hangs or not, but it's worth fixing either way. I have confirmed that
it can cause crashes.

It luckily looks like this only can cause problems when bgworkers are
actively used.

Discussion: 20140929193733.GB14400@awork2.anarazel.de

Backpatch to 9.3 where background workers were introduced.
2014-10-01 15:19:40 +02:00
Andres Freund
0ef3c29a4b Improve documentation about binary/textual output mode for output plugins.
Also improve related error message as it contributed to the confusion.

Discussion: CAB7nPqQrqFzjqCjxu4GZzTrD9kpj6HMn9G5aOOMwt1WZ8NfqeA@mail.gmail.com,
    CAB7nPqQXc_+g95zWnqaa=mVQ4d3BVRs6T41frcEYi2ocUrR3+A@mail.gmail.com

Per discussion between Michael Paquier, Robert Haas and Andres Freund

Backpatch to 9.4 where logical decoding was introduced.
2014-10-01 13:22:17 +02:00
Andres Freund
ef8863844b Rename CACHE_LINE_SIZE to PG_CACHE_LINE_SIZE.
As noted in http://bugs.debian.org/763098 there is a conflict between
postgres' definition of CACHE_LINE_SIZE and the definition by various
*bsd platforms. It's debatable who has the right to define such a
name, but postgres' use was only introduced in 375d8526f2 (9.4), so
it seems like a good idea to rename it.

Discussion: 20140930195756.GC27407@msg.df7cb.de

Per complaint of Christoph Berg in the above email, although he's not
the original bug reporter.

Backpatch to 9.4 where the define was introduced.
2014-10-01 12:17:03 +02:00
Alvaro Herrera
fd02931a6c Fix pg_dump's --if-exists for large objects
This was born broken in 9067310cc5.

Per trouble report from Joachim Wieland.

Pavel Stěhule and Álvaro Herrera
2014-09-30 12:06:37 -03:00
Stephen Frost
08da8947f4 Also revert e3ec0728, JSON regression tests
Managed to forget to update the other JSON regression test output,
again.  Revert the commit which fixed it before.

Per buildfarm.
2014-09-29 13:59:32 -04:00
Stephen Frost
c8a026e4f1 Revert 95d737ff to add 'ignore_nulls'
Per discussion, revert the commit which added 'ignore_nulls' to
row_to_json.  This capability would be better added as an independent
function rather than being bolted on to row_to_json.  Additionally,
the implementation didn't address complex JSON objects, and so was
incomplete anyway.

Pointed out by Tom and discussed with Andrew and Robert.
2014-09-29 13:32:22 -04:00
Tom Lane
def4c28cf9 Change JSONB's on-disk format for improved performance.
The original design used an array of offsets into the variable-length
portion of a JSONB container.  However, such an array is basically
uncompressible by simple compression techniques such as TOAST's LZ
compressor.  That's bad enough, but because the offset array is at the
front, it tended to trigger the give-up-after-1KB heuristic in the TOAST
code, so that the entire JSONB object was stored uncompressed; which was
the root cause of bug #11109 from Larry White.

To fix without losing the ability to extract a random array element in O(1)
time, change this scheme so that most of the JEntry array elements hold
lengths rather than offsets.  With data that's compressible at all, there
tend to be fewer distinct element lengths, so that there is scope for
compression of the JEntry array.  Every N'th entry is still an offset.
To determine the length or offset of any specific element, we might have
to examine up to N preceding JEntrys, but that's still O(1) so far as the
total container size is concerned.  Testing shows that this cost is
negligible compared to other costs of accessing a JSONB field, and that
the method does largely fix the incompressible-data problem.

While at it, rearrange the order of elements in a JSONB object so that
it's "all the keys, then all the values" not alternating keys and values.
This doesn't really make much difference right at the moment, but it will
allow providing a fast path for extracting individual object fields from
large JSONB values stored EXTERNAL (ie, uncompressed), analogously to the
existing optimization for substring extraction from large EXTERNAL text
values.

Bump catversion to denote the incompatibility in on-disk format.
We will need to fix pg_upgrade to disallow upgrading jsonb data stored
with 9.4 betas 1 and 2.

Heikki Linnakangas and Tom Lane
2014-09-29 12:29:21 -04:00
Stephen Frost
ff27fcfa0a Fix relcache for policies, and doc updates
Andres pointed out that there was an extra ';' in equalPolicies, which
made me realize that my prior testing with CLOBBER_CACHE_ALWAYS was
insufficient (it didn't always catch the issue, just most of the time).
Thanks to that, a different issue was discovered, specifically in
equalRSDescs.  This change corrects eqaulRSDescs to return 'true' once
all policies have been confirmed logically identical.  After stepping
through both functions to ensure correct behavior, I ran this for
about 12 hours of CLOBBER_CACHE_ALWAYS runs of the regression tests
with no failures.

In addition, correct a few typos in the documentation which were pointed
out by Thom Brown (thanks!) and improve the policy documentation further
by adding a flushed out usage example based on a unix passwd file.

Lastly, clean up a few comments in the regression tests and pg_dump.h.
2014-09-26 12:46:26 -04:00
Robert Haas
07d46a8963 Fix identify_locking_dependencies for schema-only dumps.
Without this fix, parallel restore of a schema-only dump can deadlock,
because when the dump is schema-only, the dependency will still be
pointing at the TABLE item rather than the TABLE DATA item.

Robert Haas and Tom Lane
2014-09-26 11:21:35 -04:00
Andres Freund
f9f07411a5 Further atomic ops portability improvements and bug fixes.
* Don't play tricks for a more efficient pg_atomic_clear_flag() in the
  generic gcc implementation. The old version was broken on gcc < 4.7
  on !x86 platforms. Per buildfarm member chipmunk.
* Make usage of __atomic() fences depend on HAVE_GCC__ATOMIC_INT32_CAS
  instead of HAVE_GCC__ATOMIC_INT64_CAS - there's platforms with 32bit
  support that don't support 64bit atomics.
* Blindly fix two superflous #endif in generic-xlc.h
* Check for --disable-atomics in platforms but x86.
2014-09-26 15:55:44 +02:00
Andres Freund
a30199b01b Fix a couple occurrences of 'the the' in the new atomics API.
Author: Erik Rijkers
2014-09-26 09:37:20 +02:00
Peter Eisentraut
d11339c099 Fix whitespace 2014-09-26 02:43:46 -04:00
Andres Freund
f18cad9449 Fix atomic ops inline x86 inline assembly for older 32bit gccs.
Some x86 32bit versions of gcc apparently generate references to the
nonexistant %sil register when using when using the r input
constraint, but not with the =q constraint. The latter restricts
allocations to a/b/c/d which should all work.
2014-09-26 02:44:44 +02:00
Andres Freund
88bcdf9da5 Fix atomic ops for x86 gcc compilers that don't understand atomic intrinsics.
Per buildfarm animal locust.
2014-09-26 02:28:52 +02:00
Andres Freund
b64d92f1a5 Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.

For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.

The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.

To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.

The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.

As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.

Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
    20131015123303.GH5300@awork2.anarazel.de,
    20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
Andrew Dunstan
9111d46351 Remove ill-conceived ban on zero length json object keys.
We removed a similar ban on this in json_object recently, but the ban in
datum_to_json was left, which generate4d sprutious errors in othee json
generators, notable json_build_object.

Along the way, add an assertion that datum_to_json is not passed a null
key. All current callers comply with this rule, but the assertion will
catch any possible future misbehaviour.
2014-09-25 15:08:42 -04:00
Robert Haas
5d7962c679 Change locking regimen around buffer replacement.
Previously, we used an lwlock that was held from the time we began
seeking a candidate buffer until the time when we found and pinned
one, which is disastrous for concurrency.  Instead, use a spinlock
which is held just long enough to pop the freelist or advance the
clock sweep hand, and then released.  If we need to advance the clock
sweep further, we reacquire the spinlock once per buffer.

This represents a significant increase in atomic operations around
buffer eviction, but it still wins on many workloads.  On others, it
may result in no gain, or even cause a regression, unless the number
of buffer mapping locks is also increased.  However, that seems like
material for a separate commit.  We may also need to consider other
methods of mitigating contention on this spinlock, such as splitting
it into multiple locks or jumping the clock sweep hand more than one
buffer at a time, but those, too, seem like separate improvements.

Patch by me, inspired by a much larger patch from Amit Kapila.
Reviewed by Andres Freund.
2014-09-25 10:43:24 -04:00
Andres Freund
56a312aac8 Fix VPATH builds of the replication parser from git for some !gcc compilers.
Some compilers don't automatically search the current directory for
included files. 9cc2c182fc fixed that for builds from tarballs by
adding an include to the source directory. But that doesn't work when
the scanner is generated in the VPATH directory. Use the same search
path as the other parsers in the tree.

One compiler that definitely was affected is solaris' sun cc.

Backpatch to 9.1 which introduced using an actual parser for
replication commands.
2014-09-25 15:22:26 +02:00
Andrew Dunstan
ecacbdbcee Return NULL from json_object_agg if it gets no rows.
This makes it consistent with the docs and with all other builtin
aggregates apart from count().
2014-09-25 08:18:18 -04:00
Heikki Linnakangas
b0d81adea6 Add -D option to specify data directory to pg_controldata and pg_resetxlog.
It was confusing that to other commands, like initdb and postgres, you would
pass the data directory with "-D datadir", but pg_controldata and
pg_resetxlog would take just plain path, without the "-D". With this patch,
pg_controldata and pg_resetxlog also accept "-D datadir".

Abhijit Menon-Sen, with minor kibitzing by me
2014-09-25 13:37:19 +03:00
Stephen Frost
afd1d95f5b Copy-editing of row security
Address a few typos in the row security update, pointed out
off-list by Adam Brightwell.  Also include 'ALL' in the list
of commands supported, for completeness.
2014-09-24 17:45:11 -04:00
Stephen Frost
6550b901fe Code review for row security.
Buildfarm member tick identified an issue where the policies in the
relcache for a relation were were being replaced underneath a running
query, leading to segfaults while processing the policies to be added
to a query.  Similar to how TupleDesc RuleLocks are handled, add in a
equalRSDesc() function to check if the policies have actually changed
and, if not, swap back the rsdesc field (using the original instead of
the temporairly built one; the whole structure is swapped and then
specific fields swapped back).  This now passes a CLOBBER_CACHE_ALWAYS
for me and should resolve the buildfarm error.

In addition to addressing this, add a new chapter in Data Definition
under Privileges which explains row security and provides examples of
its usage, change \d to always list policies (even if row security is
disabled- but note that it is disabled, or enabled with no policies),
rework check_role_for_policy (it really didn't need the entire policy,
but it did need to be using has_privs_of_role()), and change the field
in pg_class to relrowsecurity from relhasrowsecurity, based on
Heikki's suggestion.  Also from Heikki, only issue SET ROW_SECURITY in
pg_restore when talking to a 9.5+ server, list Bypass RLS in \du, and
document --enable-row-security options for pg_dump and pg_restore.

Lastly, fix a number of minor whitespace and typo issues from Heikki,
Dimitri, add a missing #include, per Peter E, fix a few minor
variable-assigned-but-not-used and resource leak issues from Coverity
and add tab completion for role attribute bypassrls as well.
2014-09-24 16:32:22 -04:00
Tom Lane
3f6f9260e3 Fix bogus variable-mangling in security_barrier_replace_vars().
This function created new Vars with varno different from varnoold, which
is a condition that should never prevail before setrefs.c does the final
variable-renumbering pass.  The created Vars could not be seen as equal()
to normal Vars, which among other things broke equivalence-class processing
for them.  The consequences of this were indeed visible in the regression
tests, in the form of failure to propagate constants as one would expect.
I stumbled across it while poking at bug #11457 --- after intentionally
disabling join equivalence processing, the security-barrier regression
tests started falling over with fun errors like "could not find pathkey
item to sort", because of failure to match the corrupted Vars to normal
ones.
2014-09-24 15:59:34 -04:00
Andrew Dunstan
b1a52872ae Fix typos in descriptions of json_object functions. 2014-09-24 11:24:42 -04:00
Tom Lane
3694b4d7e1 Fix incorrect search for "x?" style matches in creviterdissect().
When the number of allowed iterations is limited (either a "?" quantifier
or a bound expression), the last sub-match has to reach to the end of the
target string.  The previous coding here first tried the shortest possible
match (one character, usually) and then gave up and back-tracked if that
didn't work, typically leading to failure to match overall, as shown in
bug #11478 from Christoph Berg.  The minimum change to fix that would be to
not decrement k before "goto backtrack"; but that would be a pretty stupid
solution, because we'd laboriously try each possible sub-match length
before finally discovering that only ending at the end can work.  Instead,
force the sub-match endpoint limit up to the end for even the first
shortest() call if we cannot have any more sub-matches after this one.

Bug introduced in my rewrite that added the iterdissect logic, commit
173e29aa5d.  The shortest-first search code
was too closely modeled on the longest-first code, which hasn't got this
issue since it tries a match reaching to the end to start with anyway.
Back-patch to all affected branches.
2014-09-23 20:26:14 -04:00
Stephen Frost
a564307373 Add unicode_*_linestyle to \? variables
In a2dabf0 we added the ability to have single or double unicode
linestyle for the border, column, or header.  Unfortunately, the
\? variables output was not updated for these new psql variables.

This corrects that oversight.

Patch by Pavel Stehule.
2014-09-22 21:51:25 -04:00
Stephen Frost
43bed84c32 Log ALTER SYSTEM statements as DDL
Per discussion in bug #11350, log ALTER SYSTEM commands at the
log_statement=ddl level, rather than at the log_statement=all level.

Pointed out by Tomonari Katsumata.

Back-patch to 9.4 where ALTER SYSTEM was introduced.
2014-09-22 20:50:17 -04:00
Stephen Frost
6ef8c658af Process withCheckOption exprs in setrefs.c
While withCheckOption exprs had been handled in many cases by
happenstance, they need to be handled during set_plan_references and
more specifically down in set_plan_refs for ModifyTable plan nodes.
This is to ensure that the opfuncid's are set for operators referenced
in the withCheckOption exprs.

Identified as an issue by Thom Brown

Patch by Dean Rasheed

Back-patch to 9.4, where withCheckOption was introduced.
2014-09-22 20:12:51 -04:00
Andres Freund
6ba4ecbf47 Remove most volatile qualifiers from xlog.c
For the reason outlined in df4077cda2 also remove volatile qualifiers
from xlog.c. Some of these uses of volatile have been added after
noticing problems back when spinlocks didn't imply compiler
barriers. So they are a good test - in fact removing the volatiles
breaks when done without the barriers in spinlocks present.

Several uses of volatile remain where they are explicitly used to
access shared memory without locks. These locations are ok with
slightly out of date data, but removing the volatile might lead to the
variables never being reread from memory. These uses could also be
replaced by barriers, but that's a separate change of doubtful value.
2014-09-22 23:35:08 +02:00
Robert Haas
df4077cda2 Remove volatile qualifiers from lwlock.c.
Now that spinlocks (hopefully!) act as compiler barriers, as of commit
0709b7ee72, this should be safe.  This
serves as a demonstration of the new coding style, and may be optimized
better on some machines as well.
2014-09-22 16:42:14 -04:00
Robert Haas
e38da8d6b1 Fix compiler warning.
It is meaningless to declare a pass-by-value return type const.
2014-09-22 16:32:35 -04:00
Robert Haas
763ba1b0f2 Fix mishandling of CreateEventTrigStmt's eventname field.
It's a string, not a scalar.

Petr Jelinek
2014-09-22 16:05:51 -04:00
Andres Freund
0926ef43c1 Remove postgres --help blurb about the removed -A option.
I missed this in 3bdcf6a5a7.

Noticed by Merlin Moncure
Discussion: CAHyXU0yC7uPeeVzQROwtnrOP9dxTEUPYjB0og4qUnbipMEV57w@mail.gmail.com
2014-09-22 17:54:34 +02:00
Andres Freund
604f7956b9 Improve code around the recently added rm_identify rmgr callback.
There are four weaknesses in728f152e07f998d2cb4fe5f24ec8da2c3bda98f2:

* append_init() in heapdesc.c was ugly and required that rm_identify
  return values are only valid till the next call. Instead just add a
  couple more switch() cases for the INIT_PAGE cases. Now the returned
  value will always be valid.
* a couple rm_identify() callbacks missed masking xl_info with
  ~XLR_INFO_MASK.
* pg_xlogdump didn't map a NULL rm_identify to UNKNOWN or a similar
  string.
* append_init() was called when id=NULL - which should never actually
  happen. But it's better to be careful.
2014-09-22 17:49:34 +02:00
Robert Haas
e246b3d6ea Add a fast pre-check for equality of equal-length strings.
Testing reveals that that doing a memcmp() before the strcoll() costs
practically nothing, at least on the systems we tested, and it speeds
up sorts containing many equal strings significatly.

Peter Geoghegan.  Review by myself and Heikki Linnakangas.  Comments
rewritten by me.
2014-09-19 12:39:00 -04:00
Stephen Frost
491c029dbc Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table.  Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.

New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner.  Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.

Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used.  If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.

By default, row security is applied at all times except for the
table owner and the superuser.  A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE.  When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.

Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.

A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.

Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.

Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.

Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 11:18:35 -04:00
Andres Freund
e5603a2f35 Mark x86's memory barrier inline assembly as clobbering the cpu flags.
x86's memory barrier assembly was marked as clobbering "memory" but
not "cc" even though 'addl' sets various flags. As it turns out gcc on
x86 implicitly assumes "cc" on every inline assembler statement, so
it's not a bug. But as that's poorly documented and might get copied
to architectures or compilers where that's not the case, it seems
better to be precise.

Discussion: 20140919100016.GH4277@alap3.anarazel.de

To keep the code common, backpatch to 9.2 where explicit memory
barriers were introduced.
2014-09-19 17:04:00 +02:00
Andres Freund
afaefa1b31 Avoid 'clobbered by longjmp' warning in psql/copy.c.
This was introduced in 51bb79569f.
2014-09-19 16:47:27 +02:00
Andres Freund
728f152e07 Add rmgr callback to name xlog record types for display purposes.
This is primarily useful for the upcoming pg_xlogdump --stats feature,
but also allows to remove some duplicated code in the rmgr_desc
routines.

Due to the separation and harmonization, the output of dipsplayed
records changes somewhat. But since this isn't enduser oriented
content that's ok.

It's potentially desirable to further change pg_xlogdump's display of
records. It previously wasn't possible to show the record type
separately from the description forcing it to be in the last
column. But that's better done in a separate commit.

Author: Abhijit Menon-Sen, slightly editorialized by me
Reviewed-By: Álvaro Herrera, Andres Freund, and Heikki Linnakangas
Discussion: 20140604104716.GA3989@toroid.org
2014-09-19 16:20:29 +02:00
Peter Eisentraut
f7d6759ec2 Fix TAP checks when current directory name contains spaces
Add some quotes in the makefile snippet that creates the temporary
installation, so that it can handle spaces in the directory name and
possibly some other oddities.
2014-09-17 00:54:12 -04:00
Heikki Linnakangas
77e65bf369 Fix the return type of GIN triConsistent support functions to "char".
They were marked to return a boolean, but they actually return a
GinTernaryValue, which is more like a "char". It makes no practical
difference, as the triConsistent functions cannot be called directly from
SQL because they have "internal" arguments, but this nevertheless seems
more correct.

Also fix the GinTernaryValue name in the documentation. I renamed the enum
earlier, but neglected the docs.

Alexander Korotkov. This is new in 9.4, so backpatch there.
2014-09-16 09:22:33 +03:00
Heikki Linnakangas
58e70cf9fb Follow the RFCs more closely in libpq server certificate hostname check.
The RFCs say that the CN must not be checked if a subjectAltName extension
of type dNSName is present. IOW, if subjectAltName extension is present,
but there are no dNSNames, we can still check the CN.

Alexey Klyukin
2014-09-15 16:16:06 +03:00
Heikki Linnakangas
2df465e696 Fix pointer type in size passed to memset.
Pointers are all the same size, so it makes no practical difference, but
let's be tidy.

Found by Coverity, noted off-list by Tom Lane.
2014-09-14 16:48:57 +03:00
Tom Lane
fe550b2ac2 Invent PGC_SU_BACKEND and mark log_connections/log_disconnections that way.
This new GUC context option allows GUC parameters to have the combined
properties of PGC_BACKEND and PGC_SUSET, ie, they don't change after
session start and non-superusers can't change them.  This is a more
appropriate choice for log_connections and log_disconnections than their
previous context of PGC_BACKEND, because we don't want non-superusers
to be able to affect whether their sessions get logged.

Note: the behavior for log_connections is still a bit odd, in that when
a superuser attempts to set it from PGOPTIONS, the setting takes effect
but it's too late to enable or suppress connection startup logging.
It's debatable whether that's worth fixing, and in any case there is
a reasonable argument for PGC_SU_BACKEND to exist.

In passing, re-pgindent the files touched by this commit.

Fujii Masao, reviewed by Joe Conway and Amit Kapila
2014-09-13 21:01:57 -04:00
Peter Eisentraut
c2a01439c0 Run missing documentation tools through "missing"
Instead of just erroring out when a tool is missing, wrap the call with
the "missing" script that we are already using for bison, flex, and
perl, so that the users get a useful error message.
2014-09-13 20:22:21 -04:00
Peter Eisentraut
839acf9461 pg_ctl: Add tests for behavior with nonexistent data directory
This behavior was made more precise in commit
11d205e2bd.
2014-09-13 15:18:49 -04:00
Bruce Momjian
95c38a9895 Revert f68dc5d86b
Renaming will have to be more comprehensive, so I need approval.
2014-09-12 20:42:19 -04:00
Bruce Momjian
f68dc5d86b More formatting.c variable renaming, for clarity 2014-09-12 20:35:07 -04:00
Robert Haas
8cce08f168 Change NTUP_PER_BUCKET to 1 to improve hash join lookup speed.
Since this makes the bucket headers use ~10x as much memory, properly
account for that memory when we figure out whether everything fits
in work_mem.  This might result in some cases that previously used
only a single batch getting split into multiple batches, but it's
unclear as yet whether we need defenses against that case, and if so,
what the shape of those defenses should be.

It's worth noting that even in these edge cases, users should still be
no worse off than they would have been last week, because commit
45f6240a8f saved a big pile of memory
on exactly the same workloads.

Tomas Vondra, reviewed and somewhat revised by me.
2014-09-12 16:18:09 -04:00
Fujii Masao
4ad2a54805 Add GUC to enable logging of replication commands.
Previously replication commands like IDENTIFY_COMMAND were not logged
even when log_statements is set to all. Some users who want to audit
all types of statements were not satisfied with this situation. To
address the problem, this commit adds new GUC log_replication_commands.
If it's enabled, all replication commands are logged in the server log.

There are many ways to allow us to enable that logging. For example,
we can extend log_statement so that replication commands are logged
when it's set to all. But per discussion in the community, we reached
the consensus to add separate GUC for that.

Reviewed by Ian Barwick, Robert Haas and Heikki Linnakangas.
2014-09-13 02:55:45 +09:00
Stephen Frost
a2dabf0e1d Add unicode_{column|header|border}_style to psql
With the unicode linestyle, this adds support to control if the
column, header, or border style should be single or double line
unicode characters.  The default remains 'single'.

In passing, clean up the border documentation and address some
minor formatting/spelling issues.

Pavel Stehule, with some additional changes by me.
2014-09-12 12:04:37 -04:00
Stephen Frost
82962838d4 Handle border = 3 in expanded mode
In psql, expanded mode was not being displayed correctly when using
the normal ascii or unicode linestyles and border set to '3'.  Now,
per the documentation, border '3' is really only sensible for HTML
and LaTeX formats, however, that's no excuse for ascii/unicode to
break in that case, and provisions had been made for psql to cleanly
handle this case (and it did, in non-expanded mode).

This was broken when ascii/unicode was initially added a good five
years ago because print_aligned_vertical_line wasn't passed in the
border setting being used by print_aligned_vertical but instead was
given the whole printTableContent.  There really isn't a good reason
for vertical_line to have the entire printTableContent structure, so
just pass in the printTextFormat and border setting (similar to how
this is handled in horizontal_line).

Pointed out by Pavel Stehule, fix by me.

Back-patch to all currently-supported versions.
2014-09-12 11:11:53 -04:00
Heikki Linnakangas
acd08d764a Support Subject Alternative Names in SSL server certificates.
This patch makes libpq check the server's hostname against DNS names listed
in the X509 subjectAltName extension field in the server certificate. This
allows the same certificate to be used for multiple domain names. If there
are no SANs in the certificate, the Common Name field is used, like before
this patch. If both are given, the Common Name is ignored. That is a bit
surprising, but that's the behavior mandated by the relevant RFCs, and it's
also what the common web browsers do.

This also adds a libpq_ngettext helper macro to allow plural messages to be
translated in libpq. Apparently this happened to be the first plural message
in libpq, so it was not needed before.

Alexey Klyukin, with some kibitzing by me.
2014-09-12 17:17:05 +03:00
Heikki Linnakangas
774a78ffe4 Fix GIN data page split ratio calculation.
The code that tried to split a page at 75/25 ratio, when appending to the
end of an index, was buggy in two ways. First, there was a silly typo that
caused it to just fill the left page as full as possible. But the logic as
it was intended wasn't correct either, and would actually have given a ratio
closer to 60/40 than 75/25.

Gaetano Mendola spotted the typo. Backpatch to 9.4, where this code was added.
2014-09-12 11:27:56 +03:00
Tom Lane
1d352325b8 Fix power_var_int() for large integer exponents.
The code for raising a NUMERIC value to an integer power wasn't very
careful about large powers.  It got an outright wrong answer for an
exponent of INT_MIN, due to failure to consider overflow of the Abs(exp)
operation; which is fixable by using an unsigned rather than signed
exponent value after that point.  Also, even though the number of
iterations of the power-computation loop is pretty limited, it's easy for
the repeated squarings to result in ridiculously enormous intermediate
values, which can take unreasonable amounts of time/memory to process,
or even overflow the internal "weight" field and so produce a wrong answer.
We can forestall misbehaviors of that sort by bailing out as soon as the
weight value exceeds what will fit in int16, since then the final answer
must overflow (if exp > 0) or underflow (if exp < 0) the packed numeric
format.

Per off-list report from Pavel Stehule.  Back-patch to all supported
branches.
2014-09-11 23:30:51 -04:00
Tom Lane
e3ec07280c Fix JSON regression tests.
Commit 95d737ff45 neglected to update
expected/json_1.out.  Per buildfarm.
2014-09-11 22:34:32 -04:00
Peter Eisentraut
da24813c20 Fix vacuumdb --analyze-in-stages --all order
When running vacuumdb --analyze-in-stages --all, it needs to run the
first stage across all databases before the second one, instead of
running all stages in a database before processing the next one.

Also respect the --quiet option with --analyze-in-stages.
2014-09-11 21:40:46 -04:00
Stephen Frost
95d737ff45 Add 'ignore_nulls' option to row_to_json
Provide an option to skip NULL values in a row when generating a JSON
object from that row with row_to_json.  This can reduce the size of the
JSON object in cases where columns are NULL without really reducing the
information in the JSON object.

This also makes row_to_json into a single function with default values,
rather than having multiple functions.  In passing, change array_to_json
to also be a single function with default values (we don't add an
'ignore_nulls' option yet- it's not clear that there is a sensible
use-case there, and it hasn't been asked for in any case).

Pavel Stehule
2014-09-11 21:23:51 -04:00
Heikki Linnakangas
aae7af3df8 Remove dead InRecovery check.
With the new B-tree incomplete split handling in 9.4, _bt_insert_parent is
never called in recovery.
2014-09-11 22:43:56 +03:00
Bruce Momjian
849462a9fa improve hash creation warning message
This improves the wording of commit 84aa8ba128.

Report by Kevin Grittner
2014-09-11 13:40:06 -04:00
Robert Haas
68e66923ff Add missing volatile qualifier.
Yet another silly mistake in 0709b7ee72,
again found by buildfarm member castoroides.
2014-09-11 09:07:32 -04:00
Heikki Linnakangas
0ed41529f6 Silence compiler warning on Windows.
David Rowley.
2014-09-11 13:50:14 +03:00
Peter Eisentraut
75717ce8f0 Handle old versions of Test::More
Really old versions of Test::More don't support subplans, so skip the
tests in that case.
2014-09-10 20:52:35 -04:00
Peter Eisentraut
8632ba6de4 Support older versions of "prove"
Apparently, older versions of "prove" (couldn't identify the exact
version from the changelog) don't look into the t/ directory for tests
by default, so specify it explicitly.
2014-09-10 20:52:34 -04:00
Bruce Momjian
36ad1a87a3 Implement mxid_age() to compute multi-xid age
Report by Josh Berkus
2014-09-10 17:13:04 -04:00
Bruce Momjian
84aa8ba128 Issue a warning during the creation of hash indexes 2014-09-10 16:54:47 -04:00
Robert Haas
5b26278822 Fix thinko in 0709b7ee72.
Buildfarm member castoroides is unhappy with this, for entirely
understandable reasons.
2014-09-10 14:40:21 -04:00
Heikki Linnakangas
45f6240a8f Pack tuples in a hash join batch densely, to save memory.
Instead of palloc'ing each HashJoinTuple individually, allocate 32kB chunks
and pack the tuples densely in the chunks. This avoids the AllocChunk
header overhead, and the space wasted by standard allocator's habit of
rounding sizes up to the nearest power of two.

This doesn't contain any planner changes, because the planner's estimate of
memory usage ignores the palloc overhead. Now that the overhead is smaller,
the planner's estimates are in fact more accurate.

Tomas Vondra, reviewed by Robert Haas.
2014-09-10 21:24:52 +03:00
Andres Freund
311da16439 Add support for optional_argument to our own getopt_long() implementation.
07c8651dd9 currently causes compilation errors on mscv (and
probably some other) compilers because our getopt_long()
implementation doesn't have support for optional_argument.

Thus implement optional_argument in our fallback implemenation. It's
quite possibly also useful in other cases.

Arguably this needs a configure check for optional_argument, but it
has existed pretty much since getopt_long() was introduced and thus
doesn't seem worth the configure runtime.

Normally I'd would not push a patch this fast, but this allows msvc to
build again and has low risk as only optional_argument behaviour has
changed.

Author: Michael Paquier and Andres Freund

Discussion: CAB7nPqS5VeedSCxrK=QouokbawgGKLpyc1Q++RRFCa_sjcSVrg@mail.gmail.com
2014-09-10 17:21:50 +02:00
Andres Freund
b4c28d1b92 Fix typo in 07c8651dd9 causing WIN32_ONLY_COMPILER builds to fail. 2014-09-10 02:44:39 +02:00
Tom Lane
1b4cc493d2 Preserve AND/OR flatness while extracting restriction OR clauses.
The code I added in commit f343a880d5 was
careless about preserving AND/OR flatness: it could create a structure with
an OR node directly underneath another one.  That breaks an assumption
that's fairly important for planning efficiency, not to mention triggering
various Asserts (as reported by Benjamin Smith).  Add a trifle more logic
to handle the case properly.
2014-09-09 18:35:31 -04:00
Andres Freund
07c8651dd9 Add new psql help topics, accessible to both --help and \?.
Add --help=<topic> for the commandline, and \? <topic> as a backslash
command, to show more help than the invocations without parameters
do. "commands", "variables" and "options" currently exist as help
topics describing, respectively, backslash commands, psql variables,
and commandline switches. Without parameters the help commands show
their previous topic.

Some further wordsmithing or extending of the added help content might
be needed; but there seems little benefit delaying the overall feature
further.

Author: Pavel Stehule, editorialized by many

Reviewed-By: Andres Freund, Petr Jelinek, Fujii Masao, MauMau, Abhijit
    Menon-Sen and Erik Rijkers.

Discussion: CAFj8pRDVGuC-nXBfe2CK8vpyzd2Dsr9GVpbrATAnZO=2YQ0s2Q@mail.gmail.com,
    CAFj8pRA54AbTv2RXDTRxiAd8hy8wxmoVLqhJDRCwEnhdd7OUkw@mail.gmail.com
2014-09-10 00:08:56 +02:00
Robert Haas
0709b7ee72 Change the spinlock primitives to function as compiler barriers.
Previously, they functioned as barriers against CPU reordering but not
compiler reordering, an odd API that required extensive use of volatile
everywhere that spinlocks are used.  That's error-prone and has negative
implications for performance, so change it.

In theory, this makes it safe to remove many of the uses of volatile
that we currently have in our code base, but we may find that there are
some bugs in this effort when we do.  In the long run, though, this
should make for much more maintainable code.

Patch by me.  Review by Andres Freund.
2014-09-09 17:48:50 -04:00
Tom Lane
e80252d424 Add width_bucket(anyelement, anyarray).
This provides a convenient method of classifying input values into buckets
that are not necessarily equal-width.  It works on any sortable data type.

The choice of function name is a bit debatable, perhaps, but showing that
there's a relationship to the SQL standard's width_bucket() function seems
more attractive than the other proposals.

Petr Jelinek, reviewed by Pavel Stehule
2014-09-09 15:34:14 -04:00
Peter Eisentraut
57b1085df5 Allow empty content in xml type
The xml type previously rejected "content" that is empty or consists
only of spaces.  But the SQL/XML standard allows that, so change that.
The accepted values for XML "documents" are not changed.

Reviewed-by: Ali Akbar <the.apaan@gmail.com>
2014-09-09 11:34:52 -04:00
Stephen Frost
f0051c1a14 Move ALTER ... ALL IN to ProcessUtilitySlow
Now that ALTER TABLE .. ALL IN TABLESPACE has replaced the previous
ALTER TABLESPACE approach, it makes sense to move the calls down in
to ProcessUtilitySlow where the rest of ALTER TABLE is handled.

This also means that event triggers will support ALTER TABLE .. ALL
(which was the impetus for the original change, though it has other
good qualities also).

Álvaro Herrera

Back-patch to 9.4 as the original rework was.
2014-09-09 10:58:48 -04:00
Andres Freund
50881036b1 Fix typo in solaris spinlock fix.
07968dbfaa missed part of the S_UNLOCK define when building for
sparcv8+.
2014-09-09 13:57:38 +02:00
Andres Freund
07968dbfaa Fix spinlock implementation for some !solaris sparc platforms.
Some Sparc CPUs can be run in various coherence models, ranging from
RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always
run CPUs in TSO mode while in userland, but linux didn't use to and
the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK
were only correct under TSO. Fix that by adding the necessary memory
barrier instructions. On sparcv8+, which should be all relevant CPUs,
these are treated as NOPs if the current consistency model doesn't
require the barriers.

Discussion: 20140630222854.GW26930@awork2.anarazel.de

Will be backpatched to all released branches once a few buildfarm
cycles haven't shown up problems. As I've no access to sparc, this is
blindly written.
2014-09-09 00:47:32 +02:00
Tom Lane
750c5ee6ce Fix psql \s to work with recent libedit, and add pager support.
psql's \s (print command history) doesn't work at all with recent libedit
versions when printing to the terminal, because libedit tries to do an
fchmod() on the target file which will fail if the target is /dev/tty.
(We'd already noted this in the context of the target being /dev/null.)
Even before that, it didn't work pleasantly, because libedit likes to
encode the command history file (to ensure successful reloading), which
renders it nigh unreadable, not to mention significantly different-looking
depending on exactly which libedit version you have.  So let's forget using
write_history() for this purpose, and instead print the data ourselves,
using logic similar to that used to iterate over the history for newline
encoding/decoding purposes.

While we're at it, insert the ability to use the pager when \s is printing
to the terminal.  This has been an acknowledged shortcoming of \s for many
years, so while you could argue it's not exactly a back-patchable bug fix
it still seems like a good improvement.  Anyone who's seriously annoyed
at this can use "\s /dev/tty" or local equivalent to get the old behavior.

Experimentation with this showed that the history iteration logic was
actually rather broken when used with libedit.  It turns out that with
libedit you have to use previous_history() not next_history() to advance
to more recent history entries.  The easiest and most robust fix for this
seems to be to make a run-time test to verify which function to call.
We had not noticed this because libedit doesn't really need the newline
encoding logic: its own encoding ensures that command entries containing
newlines are reloaded correctly (unlike libreadline).  So the effective
behavior with recent libedits was that only the oldest history entry got
newline-encoded or newline-decoded.  However, because of yet other bugs in
history_set_pos(), some old versions of libedit allowed the existing loop
logic to reach entries besides the oldest, which means there may be libedit
~/.psql_history files out there containing encoded newlines in more than
just the oldest entry.  To ensure we can reload such files, it seems
appropriate to back-patch this fix, even though that will result in some
incompatibility with older psql versions (ie, multiline history entries
written by a psql with this fix will look corrupted to a psql without it,
if its libedit is reasonably up to date).

Stepan Rutz and Tom Lane
2014-09-08 16:09:45 -04:00
Stephen Frost
b2de2a1172 Tab completion for ALTER .. ALL IN TABLESPACE
Update the tab completion for the changes made in
3c4cf08087, which rework 'MOVE ALL' to be
'ALTER .. ALL IN TABLESPACE'.

Fujii Masao

Back-patch to 9.4, as the original change was.
2014-09-07 08:04:35 -04:00
Bruce Momjian
ad5d46a449 Report timezone offset in pg_dump/pg_dumpall
Use consistent format for all such displays.

Report by Gavin Flower
2014-09-05 19:22:31 -04:00
Bruce Momjian
a9c22d1480 Rename C variables in formatting.c, for clarity
Also add C comments.  This should help future debugging of this
notorious file.
2014-09-05 09:52:31 -04:00
Peter Eisentraut
303f4d1012 Assorted message fixes and improvements 2014-09-05 01:25:27 -04:00
Fujii Masao
d85e7fac41 Add tab-completion for reloptions like user_catalog_table.
Back-patch to 9.4 where user_catalog_table was added.

Review by Michael Paquier
2014-09-05 11:40:08 +09:00
Fujii Masao
a73c9dbab0 Fix segmentation fault that an empty prepared statement could cause.
Back-patch to all supported branches.

Per bug #11335 from Haruka Takatsuka
2014-09-05 02:17:57 +09:00
Robert Haas
d8d4965dc2 Update comment to reflect commit 1d41739e5a.
Peter Geoghegan
2014-09-04 12:17:10 -04:00
Fujii Masao
f6f654ff12 Allow \watch to display query execution time if \timing is enabled.
Previously \watch could not display the query execution time even
when \timing was enabled because it used PSQLexec instead of
SendQuery and that function didn't support \timing. This patch
introduces PSQLexecWatch and changes \watch so as to use it, instead.
PSQLexecWatch is the function to run the query, print its results and
display how long it took (only when \timing is enabled).

This patch also changes --echo-hidden so that it doesn't print
the query that \watch executes. Since \watch cannot execute
backslash command queries, they should not be printed even
when --echo-hidden is set.

Patch by me, review by Heikki Linnakangas and Michael Paquier
2014-09-04 12:31:48 +09:00
Bruce Momjian
4011f8c98b Issue clearer notice when inherited merged columns are moved
CREATE TABLE INHERIT moves user-specified columns to the location of the
inherited column.

Report by Fatal Majid
2014-09-03 11:54:31 -04:00
Heikki Linnakangas
c1008f0037 Check number of parameters in RAISE statement at compile time.
The number of % parameter markers in RAISE statement should match the number
of parameters given. We used to check that at execution time, but we have
all the information needed at compile time, so let's check it at compile
time instead. It's generally better to find mistakes earlier.

Marko Tiikkaja, reviewed by Fabien Coelho
2014-09-02 15:56:50 +03:00
Heikki Linnakangas
f8f4227976 Refactor per-page logic common to all redo routines to a new function.
Every redo routine uses the same idiom to determine what to do to a page:
check if there's a backup block for it, and if not read, the buffer if the
block exists, and check its LSN. Refactor that into a common function,
XLogReadBufferForRedo, making all the redo routines shorter and more
readable.

This has no user-visible effect, and makes no changes to the WAL format.

Reviewed by Andres Freund, Alvaro Herrera, Michael Paquier.
2014-09-02 15:10:28 +03:00
Heikki Linnakangas
26f8b99b24 Silence warning on new versions of clang.
Andres Freund
2014-09-02 14:26:06 +03:00
Andres Freund
51bb79569f Add psql PROMPT variable showing which line of a statement is being edited.
The new %l substitution shows the line number inside a (potentially
multi-line) statement starting from one.

Author: Sawada Masahiko, heavily editorialized by me.
Reviewed-By: Jeevan Chalke, Alvaro Herrera
2014-09-02 13:06:11 +02:00
Fujii Masao
bd3b7a9eef Support ALTER SYSTEM RESET command.
This patch allows us to execute ALTER SYSTEM RESET command to
remove the configuration entry from postgresql.auto.conf.

Vik Fearing, reviewed by Amit Kapila and me.
2014-09-02 16:06:58 +09:00
Tom Lane
01b6976c13 Fix unportable use of isspace().
Introduced in commit 11a020eb6.
2014-09-01 18:37:45 -04:00
Andres Freund
9a0a12f683 Add valgrind suppression for padding bytes in twophase records. 2014-09-01 15:59:44 +02:00
Andres Freund
5a64cb740d Fix s/pluggins/plugins/ typo in two comments.
Michael Paquier
2014-09-01 12:01:29 +02:00
Andres Freund
9c4b55db1d Declare lwlock.c's LWLockAcquireCommon() as a static inline.
68a2e52bba has introduced LWLockAcquireCommon() containing the
previous contents of LWLockAcquire() plus added functionality. The
latter then calls it, just like LWLockAcquireWithVar(). Because the
majority of callers don't need the added functionality, declare the
common code as inline. The compiler then can optimize away the unused
code. Doing so is also useful when looking at profiles, to
differentiate the users.

Backpatch to 9.4, the first branch to contain LWLockAcquireCommon().
2014-09-01 00:16:55 +02:00
Andres Freund
5c1faa7ba7 Protect definition of SpinlockSemaArray, just like its declaration.
Found via clang's -Wmissing-variable-declarations.
2014-09-01 00:03:53 +02:00
Andres Freund
8fff977e29 Declare two variables in snapbuild.c as static.
Neither is accessed externally, I just seem to have missed the static
when writing the code.
2014-08-31 23:53:12 +02:00
Bruce Momjian
d5d7d07765 Again update C comments for pg_attribute.attislocal 2014-08-30 10:25:11 -04:00
Andres Freund
4b4b680c3d Make backend local tracking of buffer pins memory efficient.
Since the dawn of time (aka Postgres95) multiple pins of the same
buffer by one backend have been optimized not to modify the shared
refcount more than once. This optimization has always used a NBuffer
sized array in each backend keeping track of a backend's pins.

That array (PrivateRefCount) was one of the biggest per-backend memory
allocations, depending on the shared_buffers setting. Besides the
waste of memory it also has proven to be a performance bottleneck when
assertions are enabled as we make sure that there's no remaining pins
left at the end of transactions. Also, on servers with lots of memory
and a correspondingly high shared_buffers setting the amount of random
memory accesses can also lead to poor cpu cache efficiency.

Because of these reasons a backend's buffers pins are now kept track
of in a small statically sized array that overflows into a hash table
when necessary. Benchmarks have shown neutral to positive performance
results with considerably lower memory usage.

Patch by me, review by Robert Haas.

Discussion: 20140321182231.GA17111@alap3.anarazel.de
2014-08-30 14:03:21 +02:00
Bruce Momjian
c6eaa880ee Update C comment for pg_attribute.attislocal
Indicates if column has ever been local/non-inherited
2014-08-29 19:01:04 -04:00
Bruce Momjian
01363beae5 pg_is_xlog_replay_paused(): remove super-user-only restriction
Also update docs to mention which function are super-user-only.

Report by sys-milan@statpro.com

Backpatch through 9.4
2014-08-29 09:06:05 -04:00
Heikki Linnakangas
88231ec578 Fix bug in compressed GIN data leaf page splitting code.
The list of posting lists it's dealing with can contain placeholders for
deleted posting lists. The placeholders are kept around so that they can
be WAL-logged, but we must be careful to not try to access them.

This fixes bug #11280, reported by Mårten Svantesson. Backpatch to 9.4,
where the compressed data leaf page code was added.
2014-08-29 14:22:25 +03:00
Peter Eisentraut
65c9dc231a Assorted message improvements 2014-08-29 00:26:17 -04:00
Tom Lane
6c40f8316e Add min and max aggregates for inet/cidr data types.
Haribabu Kommi, reviewed by Muhammad Asif Naeem
2014-08-28 22:37:58 -04:00
Fujii Masao
9df492664a Revert "Allow units to be specified in relation option setting value."
This reverts commit e23014f3d4.

As the side effect of the reverted commit, when the unit is
specified, the reloption was stored in the catalog with the unit.
This broke pg_dump (specifically, it prevented pg_dump from
outputting restorable backup regarding the reloption) and
turned the buildfarm red. Revert the commit until the fixed
version is ready.
2014-08-29 05:10:47 +09:00
Andres Freund
11a020eb6e Allow escaping of option values for options passed at connection start.
This is useful to allow to set GUCs to values that include spaces;
something that wasn't previously possible. The primary case motivating
this is the desire to set default_transaction_isolation to 'repeatable
read' on a per connection basis, but other usecases like seach_path do
also exist.

This introduces a slight backward incompatibility: Previously a \ in
an option value would have been passed on literally, now it'll be
taken as an escape.

The relevant mailing list discussion starts with
20140204125823.GJ12016@awork2.anarazel.de.
2014-08-28 13:59:29 +02:00
Fujii Masao
e23014f3d4 Allow units to be specified in relation option setting value.
This introduces an infrastructure which allows us to specify the units
like ms (milliseconds) in integer relation option, like GUC parameter.
Currently only autovacuum_vacuum_cost_delay reloption can accept
the units.

Reviewed by Michael Paquier
2014-08-28 15:55:50 +09:00
Jeff Davis
8167a3883a Allow multibyte characters as escape in SIMILAR TO and SUBSTRING.
Previously, only a single-byte character was allowed as an
escape. This patch allows it to be a multi-byte character, though it
still must be a single character.

Reviewed by Heikki Linnakangas and Tom Lane.
2014-08-27 21:07:36 -07:00
Alvaro Herrera
1c9701cfe5 Fix FOR UPDATE NOWAIT on updated tuple chains
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.

Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.

Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.

Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back).  To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0 .
2014-08-27 19:15:18 -04:00
Fujii Masao
9a2d94892f Add header comments to receivelog.h and streamutil.h.
This commit also adds the include guards to those header files.

Michael Paquier
2014-08-27 19:31:48 +09:00
Stephen Frost
e414ba93ad Fix Var handling for security barrier views
In some cases, not all Vars were being correctly marked as having been
modified for updatable security barrier views, which resulted in invalid
plans (eg: when security barrier views were created over top of
inheiritance structures).

In passing, be sure to update both varattno and varonattno, as _equalVar
won't consider the Vars identical otherwise.  This isn't known to cause
any issues with updatable security barrier views, but was noticed as
missing while working on RLS and makes sense to get fixed.

Back-patch to 9.4 where updatable security barrier views were
introduced.
2014-08-26 23:08:41 -04:00
Robert Haas
9522ec3e70 Fix typo in b34e37bfef.
Spotted by Peter Geoghegan.
2014-08-26 15:58:50 -04:00
Kevin Grittner
a9d0f1cff3 Fix superuser concurrent refresh of matview owned by another.
Use SECURITY_LOCAL_USERID_CHANGE while building temporary tables;
only escalate to SECURITY_RESTRICTED_OPERATION while potentially
running user-supplied code.  The more secure mode was preventing
temp table creation.  Add regression tests to cover this problem.

This fixes Bug #11208 reported by Bruno Emanuel de Andrade Silva.

Backpatch to 9.4, where the bug was introduced.
2014-08-26 09:56:26 -05:00
Andres Freund
5569d75d6a Mark IsBinaryUpgrade as PGDLLIMPORT to fix windows builds after a7ae1dc.
Author: David Rowley
2014-08-26 15:25:18 +02:00
Heikki Linnakangas
0076f264b6 Implement IF NOT EXISTS for CREATE SEQUENCE.
Fabrízio de Royes Mello
2014-08-26 16:18:17 +03:00
Heikki Linnakangas
2bde29739d Show schema names in pg_dump verbose output.
Fabrízio de Royes Mello, reviewed by Michael Paquier
2014-08-26 11:50:48 +03:00
Bruce Momjian
a7ae1dcf49 pg_upgrade: prevent automatic oid assignment
Prevent automatic oid assignment when in binary upgrade mode.  Also
throw an error when contrib/pg_upgrade_support functions are called when
not in binary upgrade mode.

This prevent automatically-assigned oids from conflicting with later
pre-assigned oids coming from the old cluster.  It also makes sure oids
are preserved in call important cases.
2014-08-25 22:19:05 -04:00
Bruce Momjian
73fe87503f rename macro isTempOrToastNamespace to isTempOrTempToastNamespace
Done for clarity
2014-08-25 21:28:19 -04:00
Bruce Momjian
6cb74a67e2 revert "Throw error for ALTER TABLE RESET of an invalid option"
Reverts commits 73d78e11a0 and
b0488e5c4f.  Also reverts pg_upgrade
changes.
2014-08-25 20:07:37 -04:00
Bruce Momjian
73d78e11a0 Throw error for ALTER TABLE RESET of an invalid option
Also adjust pg_upgrade to not use this method for optional TOAST table
creation.

Patch by Fabrízio de Royes Mello
2014-08-25 17:06:40 -04:00
Bruce Momjian
ebe30ad59b pg_ctl, pg_upgrade: allow multiple -o/-O options, append them
Report by Pavel Raiskup
2014-08-25 16:30:26 -04:00
Alvaro Herrera
3adba73662 Revert XactLockTableWait context setup in conditional multixact wait
There's no point in setting up a context error callback when doing
conditional lock acquisition, because we never actually wait and so the
user wouldn't be able to see the context message anywhere.  In fact,
this is more in line with what ConditionalXactLockTableWait is doing.

Backpatch to 9.4, where this was added.
2014-08-25 15:33:17 -04:00
Alvaro Herrera
6f822952ee Use newly added InvalidCommandId instead of 0
The symbol was added by 71901ab6d; the original code was introduced by
6868ed749.  Development of both overlapped which is why we apparently
failed to notice.

This is a (very slight) behavior change, so I'm not backpatching this to
9.4 for now, even though the symbol does exist there.
2014-08-25 15:32:30 -04:00
Alvaro Herrera
832a12f65e DefineType: return base type OID, not its array
Event triggers want to know the OID of the interesting object created,
which is the main type.  The array created as part of the operation is
just a subsidiary object which is not of much interest.
2014-08-25 15:32:26 -04:00