Commit Graph

40174 Commits

Author SHA1 Message Date
Andrew Dunstan 76a1c97bf2 Silence warning from modern perl about unescaped braces 2016-04-08 12:50:30 -04:00
Teodor Sigaev 386e3d7609 CREATE INDEX ... INCLUDING (column[, ...])
Now indexes (but only B-tree for now) can contain "extra" column(s) which
doesn't participate in index structure, they are just stored in leaf
tuples. It allows to use index only scan by using single index instead
of two or more indexes.

Author: Anastasia Lubennikova with minor editorializing by me
Reviewers: David Rowley, Peter Geoghegan, Jeff Janes
2016-04-08 19:45:59 +03:00
Peter Eisentraut 339025c68f Replace printf format %i by %d
see also ce8d7bb644
2016-04-08 12:42:58 -04:00
Andrew Dunstan 01a07e6c11 Turn down MSVC compiler verbosity
Most of what is produced by the detailed verbosity level is of no
interest at all, so switch to the normal level for more usable output.

Christian Ullrich

Backpatch to all live branches
2016-04-08 12:37:20 -04:00
Peter Eisentraut 8b737f9084 Fix printf format 2016-04-08 12:34:33 -04:00
Tom Lane 93c301fc4f Fix multiple bugs in tablespace symlink removal.
Don't try to examine S_ISLNK(st.st_mode) after a failed lstat().
It's undefined.

Also, if the lstat() reported ENOENT, we do not wish that to be a hard
error, but the code might nonetheless treat it as one (giving an entirely
misleading error message, too) depending on luck-of-the-draw as to what
S_ISLNK() returned.

Don't throw error for ENOENT from rmdir(), either.  (We're not really
expecting ENOENT because we just stat'd the file successfully; but
if we're going to allow ENOENT in the symlink code path, surely the
directory code path should too.)

Generate an appropriate errcode for its-the-wrong-type-of-file complaints.
(ERRCODE_SYSTEM_ERROR doesn't seem appropriate, and failing to write
errcode() around it certainly doesn't work, and not writing an errcode
at all is not per project policy.)

Valgrind noticed the undefined S_ISLNK result; the other problems emerged
while reading the code in the area.

All of this appears to have been introduced in 8f15f74a44.
Back-patch to 9.5 where that commit appeared.
2016-04-08 12:31:53 -04:00
Robert Haas 752b948dfc Document which aggregates support partial mode.
David Rowley, reviewed by Tomas Vondra
2016-04-08 12:13:07 -04:00
Teodor Sigaev 5c3c3cd0a3 Enhanced custom error in PLPythonu
Patch adds a new, more rich,  way to emit error message or exception from
PL/Pythonu code.

Author: Pavel Stehule
Reviewers: Catalin Iacob, Peter Eisentraut, Jim Nasby
2016-04-08 18:33:06 +03:00
Andres Freund 5364b357fb Increase maximum number of clog buffers.
Benchmarking has shown that the current number of clog buffers limits
scalability. We've previously increased the number in 33aaa139, but
that's not sufficient with a large number of clients.

We've benchmarked the cost of increasing the limit by benchmarking worst
case scenarios; testing showed that 128 buffers don't cause a
regression, even in contrived scenarios, whereas 256 does

There are a number of more complex patches flying around to address
various clog scalability problems, but this is simple enough that we can
get it into 9.6; and is beneficial even after those patches have been
applied.

It is a bit unsatisfactory to increase this in small steps every few
releases, but a better solution seems to require a rewrite of slru.c;
not something done quickly.

Author: Amit Kapila and Andres Freund
Discussion: CAA4eK1+-=18HOrdqtLXqOMwZDbC_15WTyHiFruz7BvVArZPaAw@mail.gmail.com
2016-04-08 08:25:59 -07:00
Robert Haas 25fe8b5f1a Add a 'parallel_degree' reloption.
The code that estimates what parallel degree should be uesd for the
scan of a relation is currently rather stupid, so add a parallel_degree
reloption that can be used to override the planner's rather limited
judgement.

Julien Rouhaud, reviewed by David Rowley, James Sewell, Amit Kapila,
and me.  Some further hacking by me.
2016-04-08 11:14:56 -04:00
Robert Haas b0b64f6505 Attempt to fix breakage due to declaration following code.
Per Tom Lane and the buildfarm.
2016-04-08 10:52:56 -04:00
Peter Eisentraut 2f1d2b7a75 Set PAM_RHOST item for PAM authentication
The PAM_RHOST item is set to the remote IP address or host name and can
be used by PAM modules.  A pg_hba.conf option is provided to choose
between IP address and resolved host name.

From: Grzegorz Sampolski <grzsmp@gmail.com>
Reviewed-by: Haribabu Kommi <kommi.haribabu@gmail.com>
2016-04-08 10:48:44 -04:00
Teodor Sigaev 4e55b3f033 Rename comparePos() to compareWordEntryPos()
Rename comparePos() to compareWordEntryPos() to prevent export of too
generic name.

Per gripe from Tom Lane.
2016-04-08 12:04:15 +03:00
Fujii Masao 196b72fb9a Add regression tests for multiple synchronous standbys.
Authors: Suraj Kharage, Michael Paquier, Masahiko Sawada, refactored by me
Reviewed-By: Kyotaro Horiguchi
2016-04-08 16:48:53 +09:00
Robert Haas 0711803775 Use quicksort, not replacement selection, for external sorting.
We still use replacement selection for the first run of the sort only
and only when the number of tuples is relatively small.  Otherwise,
the first run, and subsequent runs in all cases, are produced using
quicksort.  This tends to be faster except perhaps for very small
amounts of working memory.

Peter Geoghegan, reviewed by Tomas Vondra, Jeff Janes, Mithun Cy,
Greg Stark, and me.
2016-04-08 02:36:26 -04:00
Robert Haas 719c84c1be Extend relations multiple blocks at a time to improve scalability.
Contention on the relation extension lock can become quite fierce when
multiple processes are inserting data into the same relation at the same
time at a high rate.  Experimentation shows the extending the relation
multiple blocks at a time improves scalability.

Dilip Kumar, reviewed by Petr Jelinek, Amit Kapila, and me.
2016-04-08 02:04:46 -04:00
Fujii Masao 8643b91ecf Fix a couple of places in doc that implied there was only one sync standby.
Thomas Munro
2016-04-08 13:24:50 +09:00
Simon Riggs 137805f89a Use Foreign Key relationships to infer multi-column join selectivity
In cases where joins use multiple columns we currently assess each join
separately causing gross mis-estimates for join cardinality.

This patch adds use of FK information for the first time into the
planner. When FKs are present and we have multi-column join information,
plan estimates will be drastically improved. Cases with multiple FKs
are handled, though partial matches are ignored currently.

Net effect is substantial performance improvements for joins in many
common cases. Additional planning time is isolated to cases that are
currently performing poorly, measured at 0.08 - 0.15 ms.

Please watch for planner performance regressions; circumstances seem
unlikely but the law of unintended consequences may apply somewhen.
Additional complex tests welcome to prove this before release.

Tests can be performed using SET enable_fkey_estimates = on | off
using scripts provided during Hackers discussions, message id:
552335D9.3090707@2ndquadrant.com

Authors: Tomas Vondra and David Rowley
Reviewed and tested by Simon Riggs, adding comments only
2016-04-08 02:51:09 +01:00
Stephen Frost 6928484bda GRANT rights to CURRENT_USER instead of adding roles
We shouldn't be adding roles during the regression tests as that can
cause back-to-back installcheck runs to fail and users running the
regression tests likley don't want those extra roles.

Pointed out by Tom
2016-04-07 14:40:23 -04:00
Teodor Sigaev 3308467905 Zeroing unused parts ducring tsquery construction.
Per investigation failure skink buildfarm member and
RANDOMIZE_ALLOCATED_MEMORY help
2016-04-07 20:45:24 +03:00
Tom Lane f338dd7585 Refactor join_is_removable() to separate out distinctness-proving logic.
Extracted from pending unique-join patch, since this is a rather large
delta but it's simply moving code out into separately-accessible
subroutines.

I (tgl) did choose to add a bit more logic to rel_supports_distinctness,
so that it verifies that there's at least one potentially usable unique
index rather than just checking indexlist != NIL.  Otherwise there's
no functional change here.

David Rowley
2016-04-07 13:12:31 -04:00
Teodor Sigaev a7ace3b6d9 Make testing of phraseto_tsquery independ from value of
default_text_search_config variable.

Per skink buldfarm member
2016-04-07 19:33:23 +03:00
Kevin Grittner fcff8a5751 Detect SSI conflicts before reporting constraint violations
While prior to this patch the user-visible effect on the database
of any set of successfully committed serializable transactions was
always consistent with some one-at-a-time order of execution of
those transactions, the presence of declarative constraints could
allow errors to occur which were not possible in any such ordering,
and developers had no good workarounds to prevent user-facing
errors where they were not necessary or desired.  This patch adds
a check for serialization failure ahead of duplicate key checking
so that if a developer explicitly (redundantly) checks for the
pre-existing value they will get the desired serialization failure
where the problem is caused by a concurrent serializable
transaction; otherwise they will get a duplicate key error.

While it would be better if the reads performed by the constraints
could count as part of the work of the transaction for
serialization failure checking, and we will hopefully get there
some day, this patch allows a clean and reliable way for developers
to work around the issue.  In many cases existing code will already
be doing the right thing for this to "just work".

Author: Thomas Munro, with minor editing of docs by me
Reviewed-by: Marko Tiikkaja, Kevin Grittner
2016-04-07 11:12:35 -05:00
Teodor Sigaev bb140506df Phrase full text search.
Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery.
On-disk and binary in/out format of tsquery are backward compatible.
It has two side effect:
- change order for tsquery, so, users, who has a btree index over tsquery,
  should reindex it
- less number of parenthesis in tsquery output, and tsquery becomes more
  readable

Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov
Reviewers: Alexander Korotkov, Artur Zakirov
2016-04-07 18:44:18 +03:00
Simon Riggs 015e88942a Load FK defs into relcache for use by planner
Fastpath ignores this if no triggers defined.

Author: Tomas Vondra, with fastpath and comments added by me
Reviewers: David Rowley, Simon Riggs
2016-04-07 12:08:33 +01:00
Noah Misch f2b1b3079c Standardize GetTokenInformation() error reporting.
Commit c22650cd64 sparked a discussion
about diverse interpretations of "token user" in error messages.  Expel
old and new specimens of that phrase by making all GetTokenInformation()
callers report errors the way GetTokenUser() has been reporting them.
These error conditions almost can't happen, so users are unlikely to
observe this change.

Reviewed by Tom Lane and Stephen Frost.
2016-04-06 23:41:43 -04:00
Noah Misch 33d3fc5e2a Remove redundant message in AddUserToTokenDacl().
GetTokenUser() will have reported an adequate error message.  These
error conditions almost can't happen, so users are unlikely to observe
this change.

Reviewed by Tom Lane and Stephen Frost.
2016-04-06 23:40:51 -04:00
Stephen Frost 29dd1504a1 Bump catversion for pg_dump dump catalog ACL patches
Pointed out by Tom.
2016-04-06 23:04:48 -04:00
Stephen Frost 1574783b4c Use GRANT system to manage access to sensitive functions
Now that pg_dump will properly dump out any ACL changes made to
functions which exist in pg_catalog, switch to using the GRANT system
to manage access to those functions.

This means removing 'if (!superuser()) ereport()' checks from the
functions themselves and then REVOKEing EXECUTE right from 'public' for
these functions in system_views.sql.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost 23f34fa4ba In pg_dump, include pg_catalog and extension ACLs, if changed
Now that all of the infrastructure exists, add in the ability to
dump out the ACLs of the objects inside of pg_catalog or the ACLs
for objects which are members of extensions, but only if they have
been changed from their original values.

The original values are tracked in pg_init_privs.  When pg_dump'ing
9.6-and-above databases, we will dump out the ACLs for all objects
in pg_catalog and the ACLs for all extension members, where the ACL
has been changed from the original value which was set during either
initdb or CREATE EXTENSION.

This should not change dumps against pre-9.6 databases.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost d217b2c360 In pg_dump, split "dump" into "dump" and "dump_contains"
Historically, the "dump" component of the namespace has been used
to decide if the objects inside of the namespace should be dumped
also.  Given that "dump" is now a bitmask and may be partial, and
we may want to dump out all components of the namespace object but
only some of the components of objects contained in the namespace,
create a "dump_contains" bitmask which will represent what components
of the objects inside of a namespace should be dumped out.

No behavior change here, but in preparation for a change where we
will dump out just the ACLs of objects in pg_catalog, but we might
not dump out the ACL of the pg_catalog namespace itself (for instance,
when it hasn't been changed from the value set at initdb time).

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost a9f0e8e5a2 In pg_dump, use a bitmap to represent what to include
pg_dump has historically used a simple boolean 'dump' value to indicate
if a given object should be included in the dump or not.  Instead, use
a bitmap which breaks down the components of an object into their
distinct pieces and use that bitmap to only include the components
requested.

This does not include any behavioral change, but is in preperation for
the change to dump out just ACLs for objects in pg_catalog.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost 6c268df127 Add new catalog called pg_init_privs
This new catalog holds the privileges which the system was
initialized with at initdb time, along with any permissions set
by extensions at CREATE EXTENSION time.  This allows pg_dump
(and any other similar use-cases) to detect when the privileges
set on initdb-created or extension-created objects have been
changed from what they were set to at initdb/extension-creation
time and handle those changes appropriately.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Teodor Sigaev 0b62fd036e Add jsonb_insert
It inserts a new value into an jsonb array at arbitrary position or
a new key to jsonb object.

Author: Dmitry Dolgov
Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan
2016-04-06 19:25:00 +03:00
Peter Eisentraut 3b3fcc4eea pg_dump: Add table qualifications to some tags
Some object types have names that are only unique for one table.  But
for those we generally didn't put the table name into the dump TOC tag.
So it was impossible to identify these objects if the same name was used
for multiple tables.  This affects policies, column defaults,
constraints, triggers, and rules.

Fix by adding the table name to the TOC tag, so that it now reads
"$schema $table $object".

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-04-06 12:13:11 -04:00
Tom Lane de94e2af18 Run pgindent on a batch of (mostly-planner-related) source files.
Getting annoyed at the amount of unrelated chatter I get from pgindent'ing
Rowley's unique-joins patch.  Re-indent all the files it touches.
2016-04-06 11:34:02 -04:00
Simon Riggs d25379eb23 Modify test_decoding/messages to remove non-ascii chars 2016-04-06 14:55:11 +01:00
Fujii Masao ead9963c47 Use proper format specifier %X/%X for LSN, again.
Commit cee31f5 fixed this problem, but commit 989be08 accidentally
reverted the fix.

Thomas Munro
2016-04-06 22:20:52 +09:00
Simon Riggs cac0e36682 Revert bf08f2292f
Remove recent changes to logging XLOG_RUNNING_XACTS by request.
2016-04-06 14:03:46 +01:00
Simon Riggs 3fe3511d05 Generic Messages for Logical Decoding
API and mechanism to allow generic messages to be inserted into WAL that are
intended to be read by logical decoding plugins. This commit adds an optional
new callback to the logical decoding API.

Messages are either text or bytea. Messages can be transactional, or not, and
are identified by a prefix to allow multiple concurrent decoding plugins.

(Not to be confused with Generic WAL records, which are intended to allow crash
recovery of extensible objects.)

Author: Petr Jelinek and Andres Freund
Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs
Discussion: 5685F999.6010202@2ndquadrant.com
2016-04-06 10:05:41 +01:00
Fujii Masao 989be0810d Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.

This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.

Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.

The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included
in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi

Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 17:18:25 +09:00
Alvaro Herrera 2143f5e127 Fix broken ALTER INDEX documentation
Commit b8a91d9d1c put the description of the new IF EXISTS clause in the
wrong place -- move it where it belongs.

Backpatch to 9.2.
2016-04-05 19:03:42 -03:00
Alvaro Herrera f2fcad27d5 Support ALTER THING .. DEPENDS ON EXTENSION
This introduces a new dependency type which marks an object as depending
on an extension, such that if the extension is dropped, the object
automatically goes away; and also, if the database is dumped, the object
is included in the dump output.  Currently the grammar supports this for
indexes, triggers, materialized views and functions only, although the
utility code is generic so adding support for more object types is a
matter of touching the parser rules only.

Author: Abhijit Menon-Sen
Reviewed-by: Alexander Korotkov, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
2016-04-05 18:38:54 -03:00
Robert Haas 41ea0c2376 Fix parallel-safety code for parallel aggregation.
has_parallel_hazard() was ignoring the proparallel markings for
aggregates, which is no good.  Fix that.  There was no way to mark
an aggregate as actually being parallel-safe, either, so add a
PARALLEL option to CREATE AGGREGATE.

Patch by me, reviewed by David Rowley.
2016-04-05 16:06:15 -04:00
Robert Haas 09adc9a8c0 Align all shared memory allocations to cache line boundaries.
Experimentation shows this only costs about 6kB, which seems well
worth it given the major performance effects that can be caused
by insufficient alignment, especially on larger systems.

Discussion: 14166.1458924422@sss.pgh.pa.us
2016-04-05 15:47:49 -04:00
Tom Lane 1d2fe56e42 Fix PL/Python for recursion and interleaved set-returning functions.
PL/Python failed if a PL/Python function was invoked recursively via SPI,
since arguments are passed to the function in its global dictionary
(a horrible decision that's far too ancient to undo) and it would delete
those dictionary entries on function exit, leaving the outer recursion
level(s) without any arguments.  Not deleting them would be little better,
since the outer levels would then see the innermost level's arguments.

Since PL/Python uses ValuePerCall mode for evaluating set-returning
functions, it's possible for multiple executions of the same SRF to be
interleaved within a query.  PL/Python failed in such a case, because
it stored only one iterator per function, directly in the function's
PLyProcedure struct.  Moreover, one interleaved instance of the SRF
would see argument values that should belong to another.

Hence, invent code for saving and restoring the argument entries.  To fix
the recursion case, we only need to save at recursive entry and restore
at recursive exit, so the overhead in non-recursive cases is negligible.
To fix the SRF case, we have to save when suspending a SRF and restore
when resuming it, which is potentially not negligible; but fortunately
this is mostly a matter of manipulating Python object refcounts and
should not involve much physical data copying.

Also, store the Python iterator and saved argument values in a structure
associated with the SRF call site rather than the function itself.  This
requires adding a memory context deletion callback to ensure that the SRF
state is cleaned up if the calling query exits before running the SRF to
completion.  Without that we'd leak a refcount to the iterator object in
such a case, resulting in session-lifespan memory leakage.  (In the
pre-existing code, there was no memory leak because there was only one
iterator pointer, but what would happen is that the previous iterator
would be resumed by the next query attempting to use the SRF.  Hardly the
semantics we want.)

We can buy back some of whatever overhead we've added by getting rid of
PLy_function_delete_args(), which seems a useless activity: there is no
need to delete argument entries from the global dictionary on exit,
since the next time anyone would see the global dict is on the next
fresh call of the PL/Python function, at which time we'd overwrite those
entries with new arg values anyway.

Also clean up some really ugly coding in the SRF implementation, including
such gems as returning directly out of a PG_TRY block.  (The only reason
that failed to crash hard was that all existing call sites immediately
exited their own PG_TRY blocks, popping the dangling longjmp pointer before
there was any chance of it being used.)

In principle this is a bug fix; but it seems a bit too invasive relative to
its value for a back-patch, and besides the fix depends on memory context
callbacks so it could not go back further than 9.5 anyway.

Alexey Grishchenko and Tom Lane
2016-04-05 14:51:19 -04:00
Robert Haas 11c8669c0c Add parallel query support functions for assorted aggregates.
This lets us use parallel aggregate for a variety of useful cases
that didn't work before, like sum(int8), sum(numeric), several
versions of avg(), and various other functions.

Add some regression tests, as well, testing the general sanity of
these and future catalog entries.

David Rowley, reviewed by Tomas Vondra, with a few further changes
by me.
2016-04-05 14:32:53 -04:00
Magnus Hagander 7117685461 Implement backup API functions for non-exclusive backups
Previously non-exclusive backups had to be done using the replication protocol
and pg_basebackup. With this commit it's now possible to make them using
pg_start_backup/pg_stop_backup as well, as long as the backup program can
maintain a persistent connection to the database.

Doing this, backup_label and tablespace_map are returned as results from
pg_stop_backup() instead of being written to the data directory. This makes
the server safe from a crash during an ongoing backup, which can be a problem
with exclusive backups.

The old syntax of the functions remain and work exactly as before, but since the
new syntax is safer this should eventually be deprecated and removed.

Only reference documentation is included. The main section on backup still needs
to be rewritten to cover this, but since that is already scheduled for a separate
large rewrite, it's not included in this patch.

Reviewed by David Steele and Amit Kapila
2016-04-05 20:03:49 +02:00
Magnus Hagander 9457b591b9 Fix typo
Etsuro Fujita
2016-04-05 11:05:01 +02:00
Peter Eisentraut 4dcd4da98c Fix error message from wal_level value renaming
found by Ian Barwick
2016-04-04 21:17:54 -04:00