Commit Graph

12808 Commits

Author SHA1 Message Date
Tom Lane 17d5db352c Remove warning about num_sync being too large in synchronous_standby_names.
If we're not going to reject such setups entirely, throwing a WARNING in
check_synchronous_standby_names() is unhelpful, because it will cause the
warning to be logged again every time the postmaster receives SIGHUP.
Per discussion, just remove the warning.

In passing, improve the documentation for synchronous_commit, which had not
gotten the word that now there can be more than one synchronous standby.
2016-04-30 10:54:45 -04:00
Peter Eisentraut 82881b2b43 doc: Minor wording changes
From: Dmitry Igrishin <dmitigr@gmail.com>
2016-04-29 13:03:58 -04:00
Andrew Dunstan 0fb54de9aa Support building with Visual Studio 2015
Adjust the way we detect the locale. As a result the minumum Windows
version supported by VS2015 and later is Windows Vista. Add some tweaks
to remove new compiler warnings. Remove documentation references to the
now obsolete msysGit.

Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich.

Backpatch to 9.5
2016-04-29 08:09:07 -04:00
Tom Lane 4c804fbdfb Clean up parsing of synchronous_standby_names GUC variable.
Commit 989be0810d added a flex/bison lexer/parser to interpret
synchronous_standby_names.  It was done in a pretty crufty way, though,
making assorted end-use sites responsible for calling the parser at the
right times.  That was not only vulnerable to errors of omission, but made
it possible that lexer/parser errors occur at very undesirable times,
and created memory leakages even if there was no error.

Instead, perform the parsing once during check_synchronous_standby_names
and let guc.c manage the resulting data.  To do that, we have to flatten
the parsed representation into a single hunk of malloc'd memory, but that
is not very hard.

While at it, work a little harder on making useful error reports for
parsing problems; the previous code felt that "synchronous_standby_names
parser returned 1" was an appropriate user-facing error message.  (To
be fair, it did also log a syntax error message, but separately from the
GUC problem report, which is at best confusing.)  It had some outright
bugs in the face of invalid input, too.

I (tgl) also concluded that we need to restrict unquoted names in
synchronous_standby_names to be just SQL identifiers.  The previous coding
would accept darn near anything, which (1) makes the quoting convention
both nearly-unnecessary and formally ambiguous, (2) makes it very hard to
understand what is a syntax error and what is a creative interpretation of
the input as a standby name, and (3) makes it impossible to further extend
the syntax in future without a compatibility break.  I presume that we're
intending future extensions of the syntax, else this parsing infrastructure
is massive overkill, so (3) is an important objection.  Since we've taken
a compatibility hit for non-identifier names with this change anyway, we
might as well lock things down now and insist that users use double quotes
for standby names that aren't identifiers.

Kyotaro Horiguchi and Tom Lane
2016-04-27 17:55:25 -04:00
Teodor Sigaev f1e3c76066 Fix tsearch docs
Remove mention of setweight(tsquery) which wasn't included in 9.6. Also
replace old forgotten phrase operator to new one.

Dmitry Ivanov
2016-04-26 20:26:26 +03:00
Robert Haas 77cd477c4b Enable parallel query by default.
Change max_parallel_degree default from 0 to 2.  It is possible that
this is not a good idea, or that we should go with 1 worker rather
than 2, but we won't find out without trying it.  Along the way,
reword the documentation for max_parallel_degree a little bit to
hopefully make it more clear.

Discussion: 20160420174631.3qjjhpwsvvx5bau5@alap3.anarazel.de
2016-04-26 08:35:58 -04:00
Peter Eisentraut 96687497b6 doc: Fix typo
From: Andreas Seltenreich <andreas.seltenreich@credativ.de>
2016-04-24 20:45:59 -04:00
Andres Freund 8f91d87d43 Fix documentation & config inconsistencies around 428b1d6b2.
Several issues:
1) checkpoint_flush_after doc and code disagreed about the default
2) new GUCs were missing from postgresql.conf.sample
3) Outdated source-code comment about bgwriter_flush_after's default
4) Sub-optimal categories assigned to new GUCs
5) Docs suggested backend_flush_after is PGC_SIGHUP, but it's PGC_USERSET.
6) Spell out int as integer in the docs, as done elsewhere

Reported-By: Magnus Hagander, Fujii Masao
Discussion: CAHGQGwETyTG5VYQQ5C_srwxWX7RXvFcD3dKROhvAWWhoSBdmZw@mail.gmail.com
2016-04-24 12:26:55 -07:00
Peter Eisentraut b87b2f4bda doc: Fix typos
From: Erik Rijkers <er@xs4all.nl>
2016-04-23 14:48:02 -04:00
Magnus Hagander cfb863f20a Update backup documentation for new APIs
This includes the rest of the documentation that was not included
in 7117685. A larger restructure would still be wanted, but with
this commit the documentation of the new features is complete.
2016-04-20 14:40:04 -04:00
Fujii Masao 8ce8307bd4 Fix typo in docs.
Artur Zakirov
2016-04-18 13:35:21 +09:00
Peter Eisentraut d460c7cc0f doc: Document that sequences can also be extension configuration tables
From: Michael Paquier <michael.paquier@gmail.com>
2016-04-17 23:13:45 -04:00
Peter Eisentraut 5fdda1ceab doc: Change some "user" to "role" for consistency in the section
suggested by Johannes Choo
2016-04-16 12:54:56 -04:00
Peter Eisentraut efb25e56d8 doc: Markup improvement 2016-04-16 12:49:36 -04:00
Peter Eisentraut d2de44c2ce doc: Add missing parentheses
From: Alexander Law <exclusion@gmail.com>
2016-04-15 20:44:10 -04:00
Tom Lane 6f0d6a5078 Rethink \crosstabview's argument parsing logic.
\crosstabview interpreted its arguments in an unusual way, including
doing case-insensitive matching of unquoted column names, which is
surely not the right thing.  Rip that out in favor of doing something
equivalent to the dequoting/case-folding rules used by other psql
commands.  To keep it simple, change the syntax so that the optional
sort column is specified as a separate argument, instead of the
also-quite-unusual syntax that attached it to the colH argument with
a colon.

Also, rework the error messages to be closer to project style.
2016-04-14 22:54:31 -04:00
Tom Lane fda21aa05b Docs: clarify description of LIMIT/OFFSET behavior.
Section 7.6 was a tad confusing because it specified what LIMIT NULL
does, but neglected to do the same for OFFSET NULL, making this look
like perhaps a special case or a wrong restatement of the bit about
LIMIT ALL.  Wordsmith a bit while at it.  Per bug #14084.
2016-04-14 10:57:29 -04:00
Fujii Masao c8cb745323 Fix duplicated index entry in doc.
Commit cfe96ae corrected the name of pg_logical_emit_message()
in its index entry. But this typo fix caused duplicated index
entry because there was another index entry for the function.

Spotted by Tom Lane.
2016-04-14 11:17:41 +09:00
Tom Lane 85e0047077 Improve documentation for \crosstabview.
Fix misleading syntax summary (there cannot be a space between colH and
scolH).  Provide a link from the existing crosstab() function's
documentation to \crosstabview.  Copy-edit the command's description.

Christoph Berg and Tom Lane
2016-04-13 11:49:47 -04:00
Tom Lane 5713f03973 Improve API of GenericXLogRegister().
Rename this function to GenericXLogRegisterBuffer() to make it clearer
what it does, and leave room for other sorts of "register" actions in
future.  Also, replace its "bool isNew" argument with an integer flags
argument, so as to allow adding more flags in future without an API
break.

Alexander Korotkov, adjusted slightly by me
2016-04-12 11:42:06 -04:00
Fujii Masao cfe96ae24c Fix documented return type of pg_logical_emit_message() in func.sgml. 2016-04-11 21:28:17 +09:00
Tom Lane 08e785436f Get rid of GenericXLogUnregister().
This routine is unsafe as implemented, because it invalidates the page
image pointers returned by previous GenericXLogRegister() calls.

Rather than complicate the API or the implementation to avoid that,
let's just get rid of it; the use-case for having it seems much
too thin to justify a lot of work here.

While at it, do some wordsmithing on the SGML docs for generic WAL.
2016-04-09 16:39:30 -04:00
Alvaro Herrera 1ff3f420d4 Move \crosstabview regression tests to a separate file
It cannot run in the same parallel group as misc, because it creates a
table which is unpredictably visible in that test.

Per buildfarm member crake.
2016-04-08 23:42:24 -03:00
Alvaro Herrera c09b18f21c Support \crosstabview in psql
\crosstabview is a completely different way to display results from a
query: instead of a vertical display of rows, the data values are placed
in a grid where the column and row headers come from the data itself,
similar to a spreadsheet.

The sort order of the horizontal header can be specified by using
another column in the query, and the vertical header determines its
ordering from the order in which they appear in the query.

This only allows displaying a single value in each cell.  If more than
one value correspond to the same cell, an error is thrown.  Merging of
values can be done in the query itself, if necessary.  This may be
revisited in the future.

Author: Daniel Verité
Reviewed-by: Pavel Stehule, Dean Rasheed
2016-04-08 20:23:18 -03:00
Stephen Frost 7a542700df Create default roles
This creates an initial set of default roles which administrators may
use to grant access to, historically, superuser-only functions.  Using
these roles instead of granting superuser access reduces the number of
superuser roles required for a system.  Documention for each of the
default roles has been added to user-manag.sgml.

Bump catversion to 201604082, as we had a commit that bumped it to
201604081 and another that set it back to 201604071...

Reviews by José Luis Tallón and Robert Haas
2016-04-08 16:56:27 -04:00
Stephen Frost 293007898d Reserve the "pg_" namespace for roles
This will prevent users from creating roles which begin with "pg_" and
will check for those roles before allowing an upgrade using pg_upgrade.

This will allow for default roles to be provided at initdb time.

Reviews by José Luis Tallón and Robert Haas
2016-04-08 16:56:27 -04:00
Kevin Grittner 848ef42bb8 Add the "snapshot too old" feature
This feature is controlled by a new old_snapshot_threshold GUC.  A
value of -1 disables the feature, and that is the default.  The
value of 0 is just intended for testing.  Above that it is the
number of minutes a snapshot can reach before pruning and vacuum
are allowed to remove dead tuples which the snapshot would
otherwise protect.  The xmin associated with a transaction ID does
still protect dead tuples.  A connection which is using an "old"
snapshot does not get an error unless it accesses a page modified
recently enough that it might not be able to produce accurate
results.

This is similar to the Oracle feature, and we use the same SQLSTATE
and error message for compatibility.
2016-04-08 14:36:30 -05:00
Teodor Sigaev 8b99edefca Revert CREATE INDEX ... INCLUDING ...
It's not ready yet, revert two commits
690c543550 - unstable test output
386e3d7609 - patch itself
2016-04-08 21:52:13 +03:00
Magnus Hagander 35e2e357cb Add authentication parameters compat_realm and upn_usename for SSPI
These parameters are available for SSPI authentication only, to make
it possible to make it behave more like "normal gssapi", while
making it possible to maintain compatibility.

compat_realm is on by default, but can be turned off to make the
authentication use the full Kerberos realm instead of the NetBIOS name.

upn_username is off by default, and can be turned on to return the users
Kerberos UPN rather than the SAM-compatible name (a user in Active
Directory can have both a legacy SAM-compatible username and a new
Kerberos one. Normally they are the same, but not always)

Author: Christian Ullrich
Reviewed by: Robbie Harwood, Alvaro Herrera, me
2016-04-08 20:28:38 +02:00
Tom Lane 34c33a1f00 Add BSD authentication method.
Create a "bsd" auth method that works the same as "password" so far as
clients are concerned, but calls the BSD Authentication service to
check the password.  This is currently only available on OpenBSD.

Marisa Emerson, reviewed by Thomas Munro
2016-04-08 13:52:06 -04:00
Robert Haas af025eed53 Add combine functions for various floating-point aggregates.
This allows parallel aggregation to use them.  It may seem surprising
that we use float8_combine for both float4_accum and float8_accum
transition functions, but that's because those functions differ only
in the type of the non-transition-state argument.

Haribabu Kommi, reviewed by David Rowley and Tomas Vondra
2016-04-08 13:47:06 -04:00
Teodor Sigaev 386e3d7609 CREATE INDEX ... INCLUDING (column[, ...])
Now indexes (but only B-tree for now) can contain "extra" column(s) which
doesn't participate in index structure, they are just stored in leaf
tuples. It allows to use index only scan by using single index instead
of two or more indexes.

Author: Anastasia Lubennikova with minor editorializing by me
Reviewers: David Rowley, Peter Geoghegan, Jeff Janes
2016-04-08 19:45:59 +03:00
Robert Haas 752b948dfc Document which aggregates support partial mode.
David Rowley, reviewed by Tomas Vondra
2016-04-08 12:13:07 -04:00
Teodor Sigaev 5c3c3cd0a3 Enhanced custom error in PLPythonu
Patch adds a new, more rich,  way to emit error message or exception from
PL/Pythonu code.

Author: Pavel Stehule
Reviewers: Catalin Iacob, Peter Eisentraut, Jim Nasby
2016-04-08 18:33:06 +03:00
Robert Haas 25fe8b5f1a Add a 'parallel_degree' reloption.
The code that estimates what parallel degree should be uesd for the
scan of a relation is currently rather stupid, so add a parallel_degree
reloption that can be used to override the planner's rather limited
judgement.

Julien Rouhaud, reviewed by David Rowley, James Sewell, Amit Kapila,
and me.  Some further hacking by me.
2016-04-08 11:14:56 -04:00
Peter Eisentraut 2f1d2b7a75 Set PAM_RHOST item for PAM authentication
The PAM_RHOST item is set to the remote IP address or host name and can
be used by PAM modules.  A pg_hba.conf option is provided to choose
between IP address and resolved host name.

From: Grzegorz Sampolski <grzsmp@gmail.com>
Reviewed-by: Haribabu Kommi <kommi.haribabu@gmail.com>
2016-04-08 10:48:44 -04:00
Robert Haas 0711803775 Use quicksort, not replacement selection, for external sorting.
We still use replacement selection for the first run of the sort only
and only when the number of tuples is relatively small.  Otherwise,
the first run, and subsequent runs in all cases, are produced using
quicksort.  This tends to be faster except perhaps for very small
amounts of working memory.

Peter Geoghegan, reviewed by Tomas Vondra, Jeff Janes, Mithun Cy,
Greg Stark, and me.
2016-04-08 02:36:26 -04:00
Fujii Masao 8643b91ecf Fix a couple of places in doc that implied there was only one sync standby.
Thomas Munro
2016-04-08 13:24:50 +09:00
Kevin Grittner fcff8a5751 Detect SSI conflicts before reporting constraint violations
While prior to this patch the user-visible effect on the database
of any set of successfully committed serializable transactions was
always consistent with some one-at-a-time order of execution of
those transactions, the presence of declarative constraints could
allow errors to occur which were not possible in any such ordering,
and developers had no good workarounds to prevent user-facing
errors where they were not necessary or desired.  This patch adds
a check for serialization failure ahead of duplicate key checking
so that if a developer explicitly (redundantly) checks for the
pre-existing value they will get the desired serialization failure
where the problem is caused by a concurrent serializable
transaction; otherwise they will get a duplicate key error.

While it would be better if the reads performed by the constraints
could count as part of the work of the transaction for
serialization failure checking, and we will hopefully get there
some day, this patch allows a clean and reliable way for developers
to work around the issue.  In many cases existing code will already
be doing the right thing for this to "just work".

Author: Thomas Munro, with minor editing of docs by me
Reviewed-by: Marko Tiikkaja, Kevin Grittner
2016-04-07 11:12:35 -05:00
Teodor Sigaev bb140506df Phrase full text search.
Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery.
On-disk and binary in/out format of tsquery are backward compatible.
It has two side effect:
- change order for tsquery, so, users, who has a btree index over tsquery,
  should reindex it
- less number of parenthesis in tsquery output, and tsquery becomes more
  readable

Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov
Reviewers: Alexander Korotkov, Artur Zakirov
2016-04-07 18:44:18 +03:00
Stephen Frost 1574783b4c Use GRANT system to manage access to sensitive functions
Now that pg_dump will properly dump out any ACL changes made to
functions which exist in pg_catalog, switch to using the GRANT system
to manage access to those functions.

This means removing 'if (!superuser()) ereport()' checks from the
functions themselves and then REVOKEing EXECUTE right from 'public' for
these functions in system_views.sql.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost 23f34fa4ba In pg_dump, include pg_catalog and extension ACLs, if changed
Now that all of the infrastructure exists, add in the ability to
dump out the ACLs of the objects inside of pg_catalog or the ACLs
for objects which are members of extensions, but only if they have
been changed from their original values.

The original values are tracked in pg_init_privs.  When pg_dump'ing
9.6-and-above databases, we will dump out the ACLs for all objects
in pg_catalog and the ACLs for all extension members, where the ACL
has been changed from the original value which was set during either
initdb or CREATE EXTENSION.

This should not change dumps against pre-9.6 databases.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Stephen Frost 6c268df127 Add new catalog called pg_init_privs
This new catalog holds the privileges which the system was
initialized with at initdb time, along with any permissions set
by extensions at CREATE EXTENSION time.  This allows pg_dump
(and any other similar use-cases) to detect when the privileges
set on initdb-created or extension-created objects have been
changed from what they were set to at initdb/extension-creation
time and handle those changes appropriately.

Reviews by Alexander Korotkov, Jose Luis Tallon
2016-04-06 21:45:32 -04:00
Teodor Sigaev 0b62fd036e Add jsonb_insert
It inserts a new value into an jsonb array at arbitrary position or
a new key to jsonb object.

Author: Dmitry Dolgov
Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan
2016-04-06 19:25:00 +03:00
Simon Riggs 3fe3511d05 Generic Messages for Logical Decoding
API and mechanism to allow generic messages to be inserted into WAL that are
intended to be read by logical decoding plugins. This commit adds an optional
new callback to the logical decoding API.

Messages are either text or bytea. Messages can be transactional, or not, and
are identified by a prefix to allow multiple concurrent decoding plugins.

(Not to be confused with Generic WAL records, which are intended to allow crash
recovery of extensible objects.)

Author: Petr Jelinek and Andres Freund
Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs
Discussion: 5685F999.6010202@2ndquadrant.com
2016-04-06 10:05:41 +01:00
Fujii Masao 989be0810d Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.

This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.

Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.

The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included
in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi

Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 17:18:25 +09:00
Alvaro Herrera 2143f5e127 Fix broken ALTER INDEX documentation
Commit b8a91d9d1c put the description of the new IF EXISTS clause in the
wrong place -- move it where it belongs.

Backpatch to 9.2.
2016-04-05 19:03:42 -03:00
Alvaro Herrera f2fcad27d5 Support ALTER THING .. DEPENDS ON EXTENSION
This introduces a new dependency type which marks an object as depending
on an extension, such that if the extension is dropped, the object
automatically goes away; and also, if the database is dumped, the object
is included in the dump output.  Currently the grammar supports this for
indexes, triggers, materialized views and functions only, although the
utility code is generic so adding support for more object types is a
matter of touching the parser rules only.

Author: Abhijit Menon-Sen
Reviewed-by: Alexander Korotkov, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
2016-04-05 18:38:54 -03:00
Robert Haas 41ea0c2376 Fix parallel-safety code for parallel aggregation.
has_parallel_hazard() was ignoring the proparallel markings for
aggregates, which is no good.  Fix that.  There was no way to mark
an aggregate as actually being parallel-safe, either, so add a
PARALLEL option to CREATE AGGREGATE.

Patch by me, reviewed by David Rowley.
2016-04-05 16:06:15 -04:00
Magnus Hagander 7117685461 Implement backup API functions for non-exclusive backups
Previously non-exclusive backups had to be done using the replication protocol
and pg_basebackup. With this commit it's now possible to make them using
pg_start_backup/pg_stop_backup as well, as long as the backup program can
maintain a persistent connection to the database.

Doing this, backup_label and tablespace_map are returned as results from
pg_stop_backup() instead of being written to the data directory. This makes
the server safe from a crash during an ongoing backup, which can be a problem
with exclusive backups.

The old syntax of the functions remain and work exactly as before, but since the
new syntax is safer this should eventually be deprecated and removed.

Only reference documentation is included. The main section on backup still needs
to be rewritten to cover this, but since that is already scheduled for a separate
large rewrite, it's not included in this patch.

Reviewed by David Steele and Amit Kapila
2016-04-05 20:03:49 +02:00
Tom Lane 2bbe9112ae Add a \gexec command to psql for evaluation of computed queries.
\gexec executes the just-entered query, like \g, but instead of printing
the results it takes each field as a SQL command to send to the server.
Computing a series of queries to be executed is a fairly common thing,
but up to now you always had to resort to kluges like writing the queries
to a file and then inputting the file.  Now it can be done with no
intermediate step.

The implementation is fairly straightforward except for its interaction
with FETCH_COUNT.  ExecQueryUsingCursor isn't capable of being called
recursively, and even if it were, its need to create a transaction
block interferes unpleasantly with the desired behavior of \gexec after
a failure of a generated query (i.e., that it can continue).  Therefore,
disable use of ExecQueryUsingCursor when doing the master \gexec query.
We can still apply it to individual generated queries, however, and there
might be some value in doing so.

While testing this feature's interaction with single-step mode, I (tgl) was
led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed)
as a negative response to the single-step prompt.  Perhaps that's a
back-patchable bug fix, but for now I just included it here.

Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself
2016-04-04 15:25:16 -04:00
Teodor Sigaev 9b27aebe71 fix typo
Andreas Ulbrich
2016-04-04 14:55:04 +03:00
Tom Lane 3cc38ca7d2 Add psql \errverbose command to see last server error at full verbosity.
Often, upon getting an unexpected error in psql, one's first wish is that
the verbosity setting had been higher; for example, to be able to see the
schema-name field or the server code location info.  Up to now the only way
has been to adjust the VERBOSITY variable and repeat the failing query.
That's a pain, and it doesn't work if the error isn't reproducible.

This commit adds a psql feature that redisplays the most recent server
error at full verbosity, without needing to make any variable changes or
re-execute the failed command.  We just need to hang onto the latest error
PGresult in case the user executes \errverbose, and then apply libpq's
new PQresultVerboseErrorMessage() function to it.  This will consume
some trivial amount of psql memory, but otherwise the cost when the
feature isn't used should be negligible.

Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
2016-04-03 12:29:55 -04:00
Tom Lane e3161b231c Add libpq support for recreating an error message with different verbosity.
Often, upon getting an unexpected error in psql, one's first wish is that
the verbosity setting had been higher; for example, to be able to see the
schema-name field or the server code location info.  Up to now the only way
has been to adjust the VERBOSITY variable and repeat the failing query.
That's a pain, and it doesn't work if the error isn't reproducible.

This commit adds support in libpq for regenerating the error message for
an existing error PGresult at any desired verbosity level.  This is almost
just a matter of refactoring the existing code into a subroutine, but there
is one bit of possibly-needed information that was not getting put into
PGresults: the text of the last query sent to the server.  We must add that
string to the contents of an error PGresult.  But we only need to save it
if it might be used, which with the existing error-formatting code only
happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is
probably pretty rare for errors in production situations.  So really the
overhead when the feature isn't used should be negligible.

Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
2016-04-03 12:24:54 -04:00
Noah Misch 4ad6f13500 Copyedit comments and documentation. 2016-04-01 21:53:10 -04:00
Teodor Sigaev a361c22ebf Fix English in bloom module documentation
Author: Erik Rijkers
2016-04-01 18:47:44 +03:00
Teodor Sigaev 9ee014fc89 Bloom index contrib module
Module provides new access method. It is actually a simple Bloom filter
implemented as pgsql's index. It could give some benefits on search
with large number of columns.

Module is a single way to test generic WAL interface committed earlier.

Author: Teodor Sigaev, Alexander Korotkov
Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby
2016-04-01 16:42:24 +03:00
Teodor Sigaev 4e56e5a6de Fix typo in generic wal docs
Markus Nullmeier
2016-04-01 16:37:42 +03:00
Teodor Sigaev 65578341af Add Generic WAL interface
This interface is designed to give an access to WAL for extensions which
could implement new access method, for example. Previously it was
impossible because restoring from custom WAL would need to access system
catalog to find a redo custom function. This patch suggests generic way
to describe changes on page with standart layout.

Bump XLOG_PAGE_MAGIC because of new record type.

Author: Alexander Korotkov with a help of Petr Jelinek, Markus Nullmeier and
	minor editorization by my
Reviewers: Petr Jelinek, Alvaro Herrera, Teodor Sigaev, Jim Nasby,
	Michael Paquier
2016-04-01 12:21:48 +03:00
Teodor Sigaev acdf2a8b37 Introduce SP-GiST operator class over box.
Patch implements quad-tree over boxes, naive approach of 2D quad tree will not
work for any non-point objects because splitting space on node is not
efficient. The idea of pathc is treating 2D boxes as 4D points, so,
object will not overlap (in 4D space).

The performance tests reveal that this technique especially beneficial
with too much overlapping objects, so called "spaghetti data".

Author: Alexander Lebedev with editorization by Emre Hasegeli and me
2016-03-30 18:42:36 +03:00
Teodor Sigaev ccd6eb49a4 Introduce traversalValue for SP-GiST scan
During scan sometimes it would be very helpful to know some information about
parent node or all 	ancestor nodes. Right now reconstructedValue could be used
but it's not a right usage of it (range opclass uses that).

traversalValue is arbitrary piece of memory in separate MemoryContext while
reconstructedVale should have the same type as indexed column.

Subsequent patches for range opclass and quad4d tree will use it.

Author: Alexander Lebedev, Teodor Sigaev
2016-03-30 18:29:28 +03:00
Tom Lane c3834ef9e8 Remove TZ environment-variable entry from postgres reference page.
The server hasn't paid attention to the TZ environment variable since
commit ca4af308c3, but that commit missed removing this documentation
reference, as did commit d883b916a9 which added the reference where
it now belongs (initdb).

Back-patch to 9.2 where the behavior changed.  Also back-patch
d883b916a9 as needed.

Matthew Somerville
2016-03-29 21:38:41 -04:00
Robert Haas 314cbfc5da Add new replication mode synchronous_commit = 'remote_apply'.
In this mode, the master waits for the transaction to be applied on
the remote side, not just written to disk.  That means that you can
count on a transaction started on the standby to see all commits
previously acknowledged by the master.

To make this work, the standby sends a reply after replaying each
commit record generated with synchronous_commit >= 'remote_apply'.
This introduces a small inefficiency: the extra replies will be sent
even by standbys that aren't the current synchronous standby.  But
previously-existing synchronous_commit levels make no attempt at all
to optimize which replies are sent based on what the primary cares
about, so this is no worse, and at least avoids any extra replies for
people not using the feature at all.

Thomas Munro, reviewed by Michael Paquier and by me.  Some additional
tweaks by me.
2016-03-29 21:29:49 -04:00
Tom Lane e511d878f3 Allow to_timestamp(float8) to convert float infinity to timestamp infinity.
With the original SQL-function implementation, such cases failed because
we don't support infinite intervals.  Converting the function to C lets
us bypass the interval representation, which should be a bit faster as
well as more flexible.

Vitaly Burovoy, reviewed by Anastasia Lubennikova
2016-03-29 17:09:29 -04:00
Robert Haas 5fe5a2cee9 Allow aggregate transition states to be serialized and deserialized.
This is necessary infrastructure for supporting parallel aggregation
for aggregates whose transition type is "internal".  Such values
can't be passed between cooperating processes, because they are
just pointers.

David Rowley, reviewed by Tomas Vondra and by me.
2016-03-29 15:04:05 -04:00
Robert Haas 7f0a2c85fb Improve pgbench docs regarding per-transaction logging.
The old documentation didn't know about the new -b flag, only about -f.

Fabien Coelho
2016-03-29 14:07:55 -04:00
Robert Haas d797bf7da2 Fix pgbench documentation error.
The description of what the per-transaction log file says for skipped
transactions is just plain wrong.

Report and patch by Tomas Vondra, reviewed by Fabien Coelho and
modified by me.
2016-03-29 13:50:10 -04:00
Alvaro Herrera a1c935d3b7 pgbench: allow a script weight of zero
This refines the previous weight range and allows a script to be "turned
off" by passing a zero weight, which is useful when scripting multiple
pgbench runs.

I did not apply the suggested warning when a script uses zero weight; we
use the principle elsewhere that if there's nothing to be done, do
nothing quietly.

Adjust docs accordingly.

Author: Jeff Janes, Fabien Coelho
2016-03-29 14:47:10 -03:00
Robert Haas ad9566470b pgbench: Remove \setrandom.
You can now do the same thing via \set using the appropriate function,
either random(), random_gaussian(), or random_exponential(), depending
on the desired distribution.  This is not backward-compatible, but per
discussion, it's worth it to avoid having the old syntax hang around
forever.

Fabien Coelho, reviewed by Michael Paquier, and adjusted by me.
2016-03-29 12:08:49 -04:00
Robert Haas 86c43f4e22 pgbench: Support double constants and functions.
The new functions are pi(), random(), random_exponential(),
random_gaussian(), and sqrt().  I was worried that this would be
slower than before, but, if anything, it actually turns out to be
slightly faster, because we now express the built-in pgbench scripts
using fewer lines; each \setrandom can be merged into a subsequent
\set.

Fabien Coelho
2016-03-28 20:45:57 -04:00
Alvaro Herrera 80b986cf52 Mention BRIN as able to do multi-column indexes
Documentation mentioned B-tree, GiST and GIN as able to do multicolumn
indexes; I failed to add BRIN to the list.

Author: Petr Jediný
Reviewed-By: Fujii Masao, Emre Hasegeli
2016-03-28 19:11:12 -03:00
Tom Lane e5a4dea80f Document errhidecontext() where it ought to be documented.
Seems to have been missed when this function was added.  Noted while
looking at David Steele's proposal to add another similar function.
2016-03-28 14:18:14 -04:00
Tom Lane 4c46f83386 Last-minute updates for release notes.
Security: CVE-2016-2193, CVE-2016-3065
2016-03-28 11:32:17 -04:00
Tom Lane d12e5bb79b Code and docs review for commit 3187d6de0e.
Fix up check for high-bit-set characters, which provoked "comparison is
always true due to limited range of data type" warnings on some compilers,
and was unlike the way we do it elsewhere anyway.  Fix omission of "$"
from the set of valid identifier continuation characters.  Get rid of
sanitize_text(), which was utterly inconsistent with any other error report
anywhere in the system, and wasn't even well designed on its own terms
(double-quoting the result string without escaping contained double quotes
doesn't seem very well thought out).  Fix up error messages, which didn't
follow the message style guidelines very well, and were overly specific in
situations where the actual mistake might not be what they said.  Improve
documentation.

(I started out just intending to fix the compiler warning, but the more
I looked at the patch the less I liked it.)
2016-03-28 01:00:30 -04:00
Tom Lane 499a50571c Release notes for 9.5.2, 9.4.7, 9.3.12, 9.2.16, 9.1.21. 2016-03-27 19:26:26 -04:00
Tom Lane 29b6123ecb First-draft release notes for 9.5.2.
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first.
2016-03-26 19:27:58 -04:00
Tom Lane cd37bb7859 Improve PL/Tcl errorCode facility by providing decoded name for SQLSTATE.
We don't really want to encourage people to write numeric SQLSTATEs in
programs; that's unreadable and error-prone.  Copy plpgsql's infrastructure
for converting between SQLSTATEs and exception names shown in Appendix A,
and modify examples in tests and documentation to do it that way.
2016-03-25 16:54:52 -04:00
Tom Lane fb8d2a7f57 In PL/Tcl, make database errors return additional info in the errorCode.
Tcl has a convention for returning additional info about an error in a
global variable named errorCode.  Up to now PL/Tcl has ignored that,
but this patch causes database errors caught by PL/Tcl to fill in
errorCode with useful information from the ErrorData struct.

Jim Nasby, reviewed by Pavel Stehule and myself
2016-03-25 15:52:53 -04:00
Robert Haas a596db332b Improve documentation for combine functions.
David Rowley
2016-03-24 12:59:18 -04:00
Alvaro Herrera 473b932870 Support CREATE ACCESS METHOD
This enables external code to create access methods.  This is useful so
that extensions can add their own access methods which can be formally
tracked for dependencies, so that DROP operates correctly.  Also, having
explicit support makes pg_dump work correctly.

Currently only index AMs are supported, but we expect different types to
be added in the future.

Authors: Alexander Korotkov, Petr Jelínek
Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby
Commitfest-URL: https://commitfest.postgresql.org/9/353/
Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
2016-03-23 23:01:35 -03:00
Teodor Sigaev f6bd0da63b Improve docs of pg_trgm changes
Artur Zakirov, per gripe from Jeff Janes
2016-03-22 17:08:10 +03:00
Fujii Masao 112a2d0615 Fix typo in docs.
Jeff Janes
2016-03-22 22:05:17 +09:00
Tom Lane dea2b5960a Improve header output from psql's \watch command.
Include the \pset title string if there is one, and shorten the prefab
part of the header to be "timestamp (every Ns)".  Per suggestion by
David Johnston.

Michael Paquier and Tom Lane
2016-03-21 18:18:13 -04:00
Tom Lane 68ab8e8ba4 SQL commands in pgbench scripts are now ended by semicolons, not newlines.
To allow multiline SQL commands in scripts, adopt the same rules psql uses
to decide what is the end of a SQL command, to wit, an unquoted semicolon
not encased in parentheses.  Do this by importing the same flex lexer that
psql uses, since coping with stuff like dollar-quoted literals is hard to
get right without going the full nine yards.

This makes use of the infrastructure added in commit 0ea9efbe9e to
support independently-written flex lexers scanning the same PsqlScanState
input-buffer data structure.  Since that infrastructure isn't very
friendly to ad-hoc parsing code such as strtok(), improve exprscan.l
so that it can parse either whitespace-separated words or expression
tokens, on demand, and rewrite pgbench.c's backslash-command parsing
code to always use the lexer to fetch tokens.

It's still the case that pgbench backslash commands extend to the end
of the line, no more and no less.  That could be changed in a fairly
localized way now, and there was some interest in doing so, but it
seems like material for a separate patch.

In passing, make some marginal cleanups in syntax error reporting,
const-ify a few data structures that could use it, and run some of
this code through pgindent.

I can't tell whether the MSVC build scripts need to be taught explicitly
about the changes here or not, but the buildfarm will soon tell us.

Kyotaro Horiguchi and Tom Lane
2016-03-20 12:58:51 -04:00
Alvaro Herrera 7bafffea64 pgbench: Allow changing weights for scripts
Previously, all scripts had the same probability of being chosen when
multiple of them were specified via -b, -f, -N, -S.  With this commit,
-b and -f now search for an "@" in the script name and use the integer
found after it as the drawing probability for that script.

(One disadvantage is that if you have script whose names contain @, you
are now forced to specify "@1" at the end; otherwise the name's @ is
confused with a weight separator.  We don't expect many pgbench script
with @ in their names in the wild, so this shouldn't be too serious a
problem.)

While at it, rework the interface between addScript, process_file,
process_builtin, and findBuiltin.  It had gotten a bit out of hand with
recent commits.

Author: Fabien Coelho
Reviewed-By: Andres Freund, Robert Haas, Álvaro Herrera, Michaël Paquier
Discussion: http://www.postgresql.org/message-id/alpine.DEB.2.10.1603160721240.1666@sto
2016-03-19 12:32:42 -03:00
Peter Eisentraut 9a83564c58 Allow SSL server key file to have group read access if owned by root
We used to require the server key file to have permissions 0600 or less
for best security.  But some systems (such as Debian) have certificate
and key files managed by the operating system that can be shared with
other services.  In those cases, the "postgres" user is made a member of
a special group that has access to those files, and the server key file
has permissions 0640.  To accommodate that kind of setup, also allow the
key file to have permissions 0640 but only if owned by root.

From: Christoph Berg <myon@debian.org>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
2016-03-19 11:03:22 +01:00
Peter Eisentraut b555ed8102 Merge wal_level "archive" and "hot_standby" into new name "replica"
The distinction between "archive" and "hot_standby" existed only because
at the time "hot_standby" was added, there was some uncertainty about
stability.  This is now a long time ago.  We would like to move forward
with simplifying the replication configuration, but this distinction is
in the way, because a primary server cannot tell (without asking a
standby or predicting the future) which one of these would be the
appropriate level.

Pick a new name for the combined setting to make it clearer that it
covers all (non-logical) backup and replication uses.  The old values
are still accepted but are converted internally.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
2016-03-18 23:56:03 +01:00
Robert Haas 0bf3ae88af Directly modify foreign tables.
postgres_fdw can now sent an UPDATE or DELETE statement directly to
the foreign server in simple cases, rather than sending a SELECT FOR
UPDATE statement and then updating or deleting rows one-by-one.

Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro
Horiguchi, Albe Laurenz, Thom Brown, and me.
2016-03-18 13:55:52 -04:00
Teodor Sigaev 61d2ebdbf9 Fix a typo
Erik Rijkers
2016-03-18 18:49:24 +03:00
Teodor Sigaev 3187d6de0e Introduce parse_ident()
SQL-layer function to split qualified identifier into array parts.

Author: Pavel Stehule with minor editorization by me and Jim Nasby
2016-03-18 18:16:14 +03:00
Alvaro Herrera 696684d878 docs: Fix typo'd brin_summarize_new_values
I wrote "brin_summarize_new_pages" instead, in docs as well as in the
commit message of commit ac443d1034.

Bug: #14030
Reported-By: Chris Pacejo
2016-03-17 20:17:04 -03:00
Peter Eisentraut fc201dfd95 Add syslog_split_messages parameter
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2016-03-16 23:21:44 -04:00
Peter Eisentraut f4c454e9ba Add syslog_sequence_numbers parameter
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2016-03-16 23:21:44 -04:00
Teodor Sigaev f576b17cd6 Add word_similarity to pg_trgm contrib module.
Patch introduces a concept of similarity over string and just a word from
another string.

Version of extension is not changed because 1.2 was already introduced in 9.6
release cycle, so, there wasn't a public version.

Author: Alexander Korotkov, Artur Zakirov
2016-03-16 18:59:21 +03:00
Robert Haas 1c4f001b79 Fix typo.
Amit Langote
2016-03-16 11:38:30 -04:00
Robert Haas c6dda1f48e Add idle_in_transaction_session_timeout.
Vik Fearing, reviewed by Stéphane Schildknecht and me, and revised
slightly by me.
2016-03-16 11:30:45 -04:00
Teodor Sigaev 5871b88487 GUC variable pg_trgm.similarity_threshold insead of set_limit()
Use GUC variable pg_trgm.similarity_threshold insead of
set_limit()/show_limit() which was introduced when defining GUC varuables
by modules was absent.

Author: Artur Zakirov
2016-03-16 17:44:58 +03:00
Robert Haas 3aff33aa68 Fix typos.
Oskari Saarenmaa
2016-03-15 18:06:11 -04:00
Robert Haas 2a90cb69e1 Fix typos.
Thomas Reiss
2016-03-15 16:28:17 -04:00
Robert Haas c16dc1aca5 Add simple VACUUM progress reporting.
There's a lot more that could be done here yet - in particular, this
reports only very coarse-grained information about the index vacuuming
phase - but even as it stands, the new pg_stat_progress_vacuum can
tell you quite a bit about what a long-running vacuum is actually
doing.

Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale
and Rahila Syed.
2016-03-15 13:32:56 -04:00
Tom Lane 101fd9349e Add a GetForeignUpperPaths callback function for FDWs.
This is basically like the just-added create_upper_paths_hook, but
control is funneled only to the FDW responsible for all the baserels
of the current query; so providing such a callback is much less likely
to add useless overhead than using the hook function is.

The documentation is a bit sketchy.  We'll likely want to improve it,
and/or adjust the call conventions, when we get some experience with
actually using this callback.  Hopefully somebody will find time to
experiment with it before 9.6 feature freeze.
2016-03-14 20:04:48 -04:00
Tom Lane 307c78852f Rethink representation of PathTargets.
In commit 19a541143a I did not make PathTarget a subtype of Node,
and embedded a RelOptInfo's reltarget directly into it rather than having
a separately-allocated Node.  In hindsight that was misguided
micro-optimization, enabled by the fact that at that point we didn't have
any Paths with custom PathTargets.  Now that PathTarget processing has
been fleshed out some more, it's easier to see that it's better to have
PathTarget as an indepedent Node type, even if it does cost us one more
palloc to create a RelOptInfo.  So change it while we still can.

This commit just changes the representation, without doing anything more
interesting than that.
2016-03-14 16:59:59 -04:00
Tom Lane 9da70efcbe Mop-up for setting minimum Tcl version to 8.4.
Commit e2609323e set the minimum Tcl version we support to 8.4, but
I forgot to adjust the documentation to say the same.  Some nosing
around for other consequences found that the configure script could
be simplified slightly as well.
2016-03-13 17:14:49 -04:00
Magnus Hagander 7a8d874836 Rename auto_explain.sample_ratio to sample_rate
Per suggestion from Tomas Vondra

Author: Julien Rouhaud
2016-03-13 13:18:03 +01:00
Tom Lane 23a27b039d Widen query numbers-of-tuples-processed counters to uint64.
This patch widens SPI_processed, EState's es_processed field, PortalData's
portalPos field, FuncCallContext's call_cntr and max_calls fields,
ExecutorRun's count argument, PortalRunFetch's result, and the max number
of rows in a SPITupleTable to uint64, and deals with (I hope) all the
ensuing fallout.  Some of these values were declared uint32 before, and
others "long".

I also removed PortalData's posOverflow field, since that logic seems
pretty useless given that portalPos is now always 64 bits.

The user-visible results are that command tags for SELECT etc will
correctly report tuple counts larger than 4G, as will plpgsql's GET
GET DIAGNOSTICS ... ROW_COUNT command.  Queries processing more tuples
than that are still not exactly the norm, but they're becoming more
common.

Most values associated with FETCH/MOVE distances, such as PortalRun's count
argument and the count argument of most SPI functions that have one, remain
declared as "long".  It's not clear whether it would be worth promoting
those to int64; but it would definitely be a large dollop of additional
API churn on top of this, and it would only help 32-bit platforms which
seem relatively less likely to see any benefit.

Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
2016-03-12 16:05:29 -05:00
Tom Lane 9118d03a8c When appropriate, postpone SELECT output expressions till after ORDER BY.
It is frequently useful for volatile, set-returning, or expensive functions
in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are
done.  Otherwise, the functions might be executed for every row of the
table despite the presence of LIMIT, and/or be executed in an unexpected
order.  For example, in
	SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10;
it's probably desirable that the nextval() values are ordered the same
as x, and that nextval() is not run more than 10 times.

In the past, Postgres was inconsistent in this area: you would get the
desirable behavior if the ordering were performed via an indexscan, but
not if it had to be done by an explicit sort step.  Getting the desired
behavior reliably required contortions like
	SELECT x, nextval('seq')
	  FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10;

This patch conditionally postpones evaluation of pure-output target
expressions (that is, those that are not used as DISTINCT, ORDER BY, or
GROUP BY columns) so that they effectively occur after sorting, even if an
explicit sort step is necessary.  Volatile expressions and set-returning
expressions are always postponed, so as to provide consistent semantics.
Expensive expressions (costing more than 10 times typical operator cost,
which by default would include any user-defined function) are postponed
if there is a LIMIT or if there are expressions that must be postponed.

We could be more aggressive and postpone any nontrivial expression, but
there are costs associated with doing so: it requires an extra Result plan
node which adds some overhead, and postponement changes the volume of data
going through the sort step, perhaps for the worse.  Since we tend not to
have very good estimates of the output width of nontrivial expressions,
it's hard to have much confidence in our ability to predict whether
postponement would increase or decrease the cost of the sort; therefore
this patch doesn't attempt to make decisions conditionally on that.
Between these factors and a general desire not to change query behavior
when there's not a demonstrable benefit, it seems best to be conservative
about applying postponement.  We might tweak the decision rules in the
future, though.

Konstantin Knizhnik, heavily rewritten by me
2016-03-11 12:27:50 -05:00
Teodor Sigaev 6943a946c7 Tsvector editing functions
Adds several tsvector editting function: convert tsvector to/from text array,
set weight for given lexemes, delete lexeme(s), unnest, filter lexemes
with given weights

Author: Stas Kelvich with some editorization by me
Reviewers: Tomas Vondram, Teodor Sigaev
2016-03-11 19:22:36 +03:00
Magnus Hagander 92f03fe76f Allow setting sample ratio for auto_explain
New configuration parameter auto_explain.sample_ratio makes it
possible to log just a fraction of the queries meeting the configured
threshold, to reduce the amount of logging.

Author: Craig Ringer and Julien Rouhaud
Review: Petr Jelinek
2016-03-11 15:08:34 +01:00
Robert Haas 69ab7b9d6c psql: Don't automatically use expanded format when there's 1 column.
Andreas Karlsson and Robert Haas
2016-03-11 08:04:01 -05:00
Andres Freund 428b1d6b29 Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches.  When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively.  This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.

On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.

Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache.  Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.

While desirable and likely possible this patch does not contain an
implementation for windows.

With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.

A later patch will add checkpoint sorting - after that flushes from the
ckeckpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences.  This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.

Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-03-10 17:04:34 -08:00
Robert Haas fd31cd2651 Don't vacuum all-frozen pages.
Commit a892234f83 gave us enough
infrastructure to avoid vacuuming pages where every tuple on the
page is already frozen.  So, replace the notion of a scan_all or
whole-table vacuum with the less onerous notion of an "aggressive"
vacuum, which will pages that are all-visible, but still skip those
that are all-frozen.

This should greatly reduce the cost of anti-wraparound vacuuming
on large clusters where the majority of data is never touched
between one cycle and the next, because we'll no longer have to
read all of those pages only to find out that we don't need to
do anything with them.

Patch by me, reviewed by Masahiko Sawada.
2016-03-10 16:14:42 -05:00
Robert Haas 53be0b1add Provide much better wait information in pg_stat_activity.
When a process is waiting for a heavyweight lock, we will now indicate
the type of heavyweight lock for which it is waiting.  Also, you can
now see when a process is waiting for a lightweight lock - in which
case we will indicate the individual lock name or the tranche, as
appropriate - or for a buffer pin.

Amit Kapila, Ildus Kurbangaliev, reviewed by me.  Lots of helpful
discussion and suggestions by many others, including Alexander
Korotkov, Vladimir Borodin, and many others.
2016-03-10 12:44:09 -05:00
Alvaro Herrera a3a8309d45 Document BRIN a bit more thoroughly
The chapter "Interfacing Extensions To Indexes" and CREATE OPERATOR
CLASS reference page were missed when BRIN was added.  We document
all our other index access methods there, so make sure BRIN complies.

Author: Álvaro Herrera
Reported-By: Julien Rouhaud, Tom Lane
Reviewed-By: Emre Hasegeli
Discussion: https://www.postgresql.org/message-id/56CF604E.9000303%40dalibo.com
Backpatch: 9.5, where BRIN was introduced
2016-03-10 13:15:08 -03:00
Simon Riggs fcb4bfddb6 Reduce lock level for altering fillfactor
Fabrízio de Royes Mello and Simon Riggs
2016-03-10 12:07:33 +00:00
Peter Eisentraut e19e4cf0be doc: Reorganize pg_resetxlog reference page
The pg_resetxlog reference page didn't have a proper options list, only
running text listing the options and some explanations of them.  This
might have worked when there were only a few options, but the list has
grown over the releases, and now it's hard to find an option and its
associated explanation.  So write out the options list as on other
reference pages.
2016-03-09 21:26:06 -05:00
Alvaro Herrera 188f359d39 pgcrypto: support changing S2K iteration count
pgcrypto already supports key-stretching during symmetric encryption,
including the salted-and-iterated method; but the number of iterations
was not configurable.  This commit implements a new s2k-count parameter
to pgp_sym_encrypt() which permits selecting a larger number of
iterations.

Author: Jeff Janes
2016-03-09 14:31:07 -03:00
Robert Haas dff7ad3c61 Update GetForeignPlan documentation.
Commit 385f337c9f added a new argument
to the FDW GetForeignPlan method, but failed to update the documentation
to match.

Etsuro Fujita
2016-03-08 14:30:12 -05:00
Robert Haas 272baaa538 Fix typo.
Masahiko Sawada
2016-03-08 13:28:00 -05:00
Robert Haas ba0a198fb1 Add pg_visibility contrib module.
This lets you examine the visibility map as well as page-level
visibility information.  I initially wrote it as a debugging aid,
but was encouraged to polish it for commit.

Patch by me, reviewed by Masahiko Sawada.

Discussion: 56D77803.6080503@BlueTreble.com
2016-03-08 08:42:01 -05:00
Tom Lane a93aec4e0f Fix minor typo in logical-decoding docs.
David Rowley
2016-03-07 21:52:30 -05:00
Tom Lane 3fc6e2d7f5 Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is.  This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps.  Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step.  We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.

In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan.  It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation.  (A couple of regression test outputs change in consequence of
that.  However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)

There is a great deal left to do here.  This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations.  (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.)  I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.

Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 15:58:22 -05:00
Magnus Hagander 2b46259b46 Fix typos
Author: Guillaume Lelarge
2016-03-06 12:25:47 +01:00
Joe Conway dc7d70ea05 Expose control file data via SQL accessible functions.
Add four new SQL accessible functions: pg_control_system(),
pg_control_checkpoint(), pg_control_recovery(), and pg_control_init()
which expose a subset of the control file data.

Along the way move the code to read and validate the control file to
src/common, where it can be shared by the new backend functions
and the original pg_controldata frontend program.

Patch by me, significant input, testing, and review by Michael Paquier.
2016-03-05 11:10:19 -08:00
Fujii Masao d34794f7d5 Ignore recovery_min_apply_delay until recovery has reached consistent state
Previously recovery_min_apply_delay was applied even before recovery
had reached consistency. This could cause us to wait a long time
unexpectedly for read-only connections to be allowed. It's problematic
because the standby was useless during that wait time.

This patch changes recovery_min_apply_delay so that it's applied once
the database has reached the consistent state. That is, even if the delay
is set, the standby tries to replay WAL records as fast as possible until
it has reached consistency.

Author: Michael Paquier
Reviewed-By: Julien Rouhaud
Reported-By: Greg Clough
Backpatch: 9.4, where recovery_min_apply_delay was added
Bug: #13770
Discussion: http://www.postgresql.org/message-id/20151111155006.2644.84564@wrigleys.postgresql.org
2016-03-06 02:29:04 +09:00
Teodor Sigaev d78a7d9c7f Improve support of Hunspell in ispell dictionary.
Now it's possible to load recent version of Hunspell for several languages.
To handle these dictionaries Hunspell patch adds support for:
* FLAG long - sets the double extended ASCII character flag type
* FLAG num - sets the decimal number flag type (from 1 to 65535)
* AF parameter - alias for flag's set

Also it moves test dictionaries into separate directory.

Author: Artur Zakirov with editorization by me
2016-03-04 20:08:47 +03:00
Robert Haas 33b5eab7ab Fix the way GetExistingLocalJoinPath is documented.
The old approach made it look like it was an FDW callback, which it
is not.

Per a gripe from Stephen Frost.  Patch by me, reviewed by Ashutosh
Bapat.
2016-03-04 11:44:17 -05:00
Alvaro Herrera d561f1caec pgbench: accept unambiguous builtin prefixes for -b
This makes it easier to use "-b se" instead of typing the full "-b
select-only".

Author: Fabien Coelho
Reviewed-by: Michaël Paquier
2016-03-03 19:37:13 -03:00
Robert Haas a892234f83 Change the format of the VM fork to add a second bit per page.
The new bit indicates whether every tuple on the page is already frozen.
It is cleared only when the all-visible bit is cleared, and it can be
set only when we vacuum a page and find that every tuple on that page is
both visible to every transaction and in no need of any future
vacuuming.

A future commit will use this new bit to optimize away full-table scans
that would otherwise be triggered by XID wraparound considerations.  A
page which is merely all-visible must still be scanned in that case, but
a page which is all-frozen need not be.  This commit does not attempt
that optimization, although that optimization is the goal here.  It
seems better to get the basic infrastructure in place first.

Per discussion, it's very desirable for pg_upgrade to automatically
migrate existing VM forks from the old format to the new format.  That,
too, will be handled in a follow-on patch.

Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit
Kapila, Simon Riggs, Andres Freund, and others, and substantially
revised by me.
2016-03-01 21:49:41 -05:00
Robert Haas 7e137f846d Extend pgbench's expression syntax to support a few built-in functions.
Fabien Coelho, reviewed mostly by Michael Paquier and me, but also by
Heikki Linnakangas, BeomYong Lee, Kyotaro Horiguchi, Oleksander
Shulgin, and Álvaro Herrera.
2016-03-01 13:08:30 -05:00
Alvaro Herrera 5847397dec Minor tweaks for new src/test/recovery
Author: Michael Paquier
2016-02-29 18:16:59 -03:00
Alvaro Herrera 1c7c189fc1 doc: document MANPATH as /usr/local/pgsql/share/man
The docs were advising to use /usr/local/pgsql/man instead, but that's
wrong.

Reported-By: Slawomir Sudnik
Backpatch-To: 9.1
Bug: #13894
2016-02-29 17:53:55 -03:00
Alvaro Herrera 49148645f7 Add a test framework for recovery
This long-awaited framework is an expansion of the existing PostgresNode
stuff to support additional features for recovery testing; the recovery
tests included in this commit are a starting point that cover some of
the recovery features we have.  More scripts are expected to be added
later.

Author: Michaël Paquier, a bit of help from Amir Rohan
Reviewed by: Amir Rohan, Stas Kelvich, Kyotaro Horiguchi, Victor Wagner,
Craig Ringer, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/CAB7nPqTf7V6rswrFa=q_rrWeETUWagP=h8LX8XAov2Jcxw0DRg@mail.gmail.com
Discussion: http://www.postgresql.org/message-id/trinity-b4a8035d-59af-4c42-a37e-258f0f28e44a-1443795007012@3capp-mailcom-lxa08
2016-02-26 16:13:30 -03:00
Robert Haas 35746bc348 Add new FDW API to test for parallel-safety.
This is basically a bug fix; the old code assumes that a ForeignScan
is always parallel-safe, but for postgres_fdw, for example, this is
definitely false.  It should be true for file_fdw, though, since a
worker can read a file from the filesystem just as well as any other
backend process.

Original patch by Thomas Munro.  Documentation, and changes to the
comments, by me.
2016-02-26 16:14:46 +05:30
Tom Lane 52f5d578d6 Create a function to reliably identify which sessions block which others.
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether.  (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)

The new function has the following behaviors that are painful or impossible
to get right via pg_locks:

1. Correctly understands which lock modes block which other ones.

2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.

3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.

The motivation for doing this right now is mostly to fix the isolation
tests.  Commit 38f8bdcac4 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly.  But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions.  Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds.  That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
2016-02-22 14:31:43 -05:00
Tom Lane 64a169d131 Docs: make prose discussion match the ordering of Table 9-58.
The "Session Information Functions" table seems to be sorted mostly
alphabetically (although it's not perfect), which would be all right
if it didn't lead to some related functions being described in a
pretty nonintuitive order.  Also, the prose discussions after the table
were in an order that hardly matched the table at all.  Rearrange to
make things a bit easier to follow.
2016-02-21 15:23:17 -05:00
Tatsuo Ishii 902fd1f4e2 Fix wording in the Tutorial document.
With suggentions from Tom Lane.
2016-02-21 09:04:59 +09:00
Dean Rasheed 53874c5228 Add pg_size_bytes() to parse human-readable size strings.
This will parse strings in the format produced by pg_size_pretty() and
return sizes in bytes. This allows queries to be written with clauses
like "pg_total_relation_size(oid) > pg_size_bytes('10 GB')".

Author: Pavel Stehule with various improvements by Vitaly Burovoy
Discussion: http://www.postgresql.org/message-id/CAFj8pRD-tGoDKnxdYgECzA4On01_uRqPrwF-8LdkSE-6bDHp0w@mail.gmail.com
Reviewed-by: Vitaly Burovoy, Oleksandr Shulgin, Kyotaro Horiguchi,
    Michael Paquier and Robert Haas
2016-02-20 09:57:27 +00:00
Peter Eisentraut 091b6055e3 doc: Improve CSS style of option element
Prevent wrapping of the element content to avoid confusing line breaks
between hyphens.
2016-02-19 23:01:54 -05:00
Tom Lane 19a541143a Add an explicit representation of the output targetlist to Paths.
Up to now, there's been an assumption that all Paths for a given relation
compute the same output column set (targetlist).  However, there are good
reasons to remove that assumption.  For example, an indexscan on an
expression index might be able to return the value of an expensive function
"for free".  While we have the ability to generate such a plan today in
simple cases, we don't have a way to model that it's cheaper than a plan
that computes the function from scratch, nor a way to create such a plan
in join cases (where the function computation would normally happen at
the topmost join node).  Also, we need this so that we can have Paths
representing post-scan/join steps, where the targetlist may well change
from one step to the next.  Therefore, invent a "struct PathTarget"
representing the columns we expect a plan step to emit.  It's convenient
to include the output tuple width and tlist evaluation cost in this struct,
and there will likely be additional fields in future.

While Path nodes that actually do have custom outputs will need their own
PathTargets, it will still be true that most Paths for a given relation
will compute the same tlist.  To reduce the overhead added by this patch,
keep a "default PathTarget" in RelOptInfo, and allow Paths that compute
that column set to just point to their parent RelOptInfo's reltarget.
(In the patch as committed, actually every Path is like that, since we
do not yet have any cases of custom PathTargets.)

I took this opportunity to provide some more-honest costing of
PlaceHolderVar evaluation.  Up to now, the assumption that "scan/join
reltargetlists have cost zero" was applied not only to Vars, where it's
reasonable, but also PlaceHolderVars where it isn't.  Now, we add the eval
cost of a PlaceHolderVar's expression to the first plan level where it can
be computed, by including it in the PathTarget cost field and adding that
to the cost estimates for Paths.  This isn't perfect yet but it's much
better than before, and there is a way forward to improve it more.  This
costing change affects the join order chosen for a couple of the regression
tests, changing expected row ordering.
2016-02-18 20:02:03 -05:00
Tom Lane 48e6c943e5 Fix multiple bugs in contrib/pgstattuple's pgstatindex() function.
Dead or half-dead index leaf pages were incorrectly reported as live, as a
consequence of a code rearrangement I made (during a moment of severe brain
fade, evidently) in commit d287818eb5.

The index metapage was not counted in index_size, causing that result to
not agree with the actual index size on-disk.

Index root pages were not counted in internal_pages, which is inconsistent
compared to the case of a root that's also a leaf (one-page index), where
the root would be counted in leaf_pages.  Aside from that inconsistency,
this could lead to additional transient discrepancies between the reported
page counts and index_size, since it's possible for pgstatindex's scan to
see zero or multiple pages marked as BTP_ROOT, if the root moves due to
a split during the scan.  With these fixes, index_size will always be
exactly one page more than the sum of the displayed page counts.

Also, the index_size result was incorrectly documented as being measured in
pages; it's always been measured in bytes.  (While fixing that, I couldn't
resist doing some small additional wordsmithing on the pgstattuple docs.)

Including the metapage causes the reported index_size to not be zero for
an empty index.  To preserve the desired property that the pgstattuple
regression test results are platform-independent (ie, BLCKSZ configuration
independent), scale the index_size result in the regression tests.

The documentation issue was reported by Otsuka Kenji, and the inconsistent
root page counting by Peter Geoghegan; the other problems noted by me.
Back-patch to all supported branches, because this has been broken for
a long time.
2016-02-18 15:40:35 -05:00
Joe Conway a5c43b8869 Add new system view, pg_config
Move and refactor the underlying code for the pg_config client
application to src/common in support of sharing it with a new
system information SRF called pg_config() which makes the same
information available via SQL. Additionally wrap the SRF with a
new system view, as called pg_config.

Patch by me with extensive input and review by Michael Paquier
and additional review by Alvaro Herrera.
2016-02-17 09:12:06 -08:00
Tom Lane a65313f28b Improve documentation about CREATE INDEX CONCURRENTLY.
Clarify the description of which transactions will block a CREATE INDEX
CONCURRENTLY command from proceeding, and mention that the index might
still not be usable after CREATE INDEX completes.  (This happens if the
index build detected broken HOT chains, so that pg_index.indcheckxmin gets
set, and there are open old transactions preventing the xmin horizon from
advancing past the index's initial creation.  I didn't want to explain what
broken HOT chains are, though, so I omitted an explanation of exactly when
old transactions prevent the index from being used.)

Per discussion with Chris Travers.  Back-patch to all supported branches,
since the same text appears in all of them.
2016-02-16 13:43:25 -05:00
Bruce Momjian ab0757c1f1 release notes: fix 9.5 SGML comment about commit
Reported-by: Tatsuo Ishii

Backpatch-through: 9.5
2016-02-16 12:43:00 -05:00
Tatsuo Ishii bdc309c7dc Improve wording in the planner doc
Change "In this case" to "In the example above" to clarify what it
actually refers to.
2016-02-16 15:49:00 +09:00
Fujii Masao 597f7e3a6e Correct the formulas for System V IPC parameters SEMMNI and SEMMNS in docs.
In runtime.sgml, the old formulas for calculating the reasonable
values of SEMMNI and SEMMNS were incorrect. They have forgotten to
count the number of semaphores which both the checkpointer process
(introduced in 9.2) and the background worker processes (introduced
in 9.3) need.

This commit fixes those formulas so that they count the number of
semaphores which the checkpointer process and the background worker
processes need.

Report and patch by Kyotaro Horiguchi. Only the patch for 9.3 was
modified by me. Back-patch to 9.2 where the checkpointer process was
added and the number of needed semaphores was increased.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Backpatch: 9.2
Discussion: http://www.postgresql.org/message-id/20160203.125119.66820697.horiguchi.kyotaro@lab.ntt.co.jp
2016-02-16 14:49:47 +09:00
Andres Freund 7975c5e0a9 Allow the WAL writer to flush WAL at a reduced rate.
Commit 4de82f7d7 increased the WAL flush rate, mainly to increase the
likelihood that hint bits can be set quickly. More quickly set hint bits
can reduce contention around the clog et al.  But unfortunately the
increased flush rate can have a significant negative performance impact,
I have measured up to a factor of ~4.  The reason for this slowdown is
that if there are independent writes to the underlying devices, for
example because shared buffers is a lot smaller than the hot data set,
or because a checkpoint is ongoing, the fdatasync() calls force cache
flushes to be emitted to the storage.

This is achieved by flushing WAL only if the last flush was longer than
wal_writer_delay ago, or if more than wal_writer_flush_after (new GUC)
unflushed blocks are pending. Based on some tests the default for
wal_writer_delay is 1MB, which seems to work well both on SSD and
rotational media.

To avoid negative performance impact due to 4de82f7d7 an earlier
commit (db76b1e) made SetHintBits() more likely to succeed; preventing
performance regressions in the pgbench tests I performed.

Discussion: 20160118163908.GW10941@awork2.anarazel.de
2016-02-16 00:56:34 +01:00
Magnus Hagander 57c9324755 Fix typo 2016-02-15 11:41:34 +01:00
Noah Misch 2ffa869620 Accept pg_ctl timeout from the PGCTLTIMEOUT environment variable.
Many automated test suites call pg_ctl.  Buildfarm members axolotl,
hornet, mandrill, shearwater, sungazer and tern have failed when server
shutdown took longer than the pg_ctl default 60s timeout.  This addition
permits slow hosts to easily raise the timeout without us editing a
--timeout argument into every test suite pg_ctl call.  Back-patch to 9.1
(all supported versions) for the sake of automated testing.

Reviewed by Tom Lane.
2016-02-10 20:34:02 -05:00
Robert Haas e4106b2528 postgres_fdw: Push down joins to remote servers.
If we've got a relatively straightforward join between two tables,
this pushes that join down to the remote server instead of fetching
the rows for each table and performing the join locally.  Some cases
are not handled yet, such as SEMI and ANTI joins.  Also, we don't
yet attempt to create presorted join paths or parameterized join
paths even though these options do get tried for a base relation
scan.  Nevertheless, this seems likely to be a very significant win
in many practical cases.

Shigeru Hanada and Ashutosh Bapat, reviewed by Robert Haas, with
additional review at various points by Tom Lane, Etsuro Fujita,
KaiGai Kohei, and Jeevan Chalke.
2016-02-09 14:00:50 -05:00
Tom Lane 02292845ac Last-minute updates for release notes.
Security: CVE-2016-0773
2016-02-08 10:49:37 -05:00
Tom Lane c477e84fe2 Improve documentation about PRIMARY KEY constraints.
Get rid of the false implication that PRIMARY KEY is exactly equivalent to
UNIQUE + NOT NULL.  That was more-or-less true at one time in our
implementation, but the standard doesn't say that, and we've grown various
features (many of them required by spec) that treat a pkey differently from
less-formal constraints.  Per recent discussion on pgsql-general.

I failed to resist the temptation to do some other wordsmithing in the
same area.
2016-02-07 16:02:44 -05:00
Tom Lane 1d76c97250 Release notes for 9.5.1, 9.4.6, 9.3.11, 9.2.15, 9.1.20. 2016-02-07 14:16:31 -05:00
Robert Haas 7c944bd903 Introduce a new GUC force_parallel_mode for testing purposes.
When force_parallel_mode = true, we enable the parallel mode restrictions
for all queries for which this is believed to be safe.  For the subset of
those queries believed to be safe to run entirely within a worker, we spin
up a worker and run the query there instead of running it in the
original process.  When force_parallel_mode = regress, make additional
changes to allow the regression tests to run cleanly even though parallel
workers have been injected under the hood.

Taken together, this facilitates both better user testing and better
regression testing of the parallelism code.

Robert Haas, with help from Amit Kapila and Rushabh Lathia.
2016-02-07 11:41:33 -05:00
Tom Lane 7008e70d10 First-draft release notes for 9.4.6.
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first.
2016-02-05 17:06:23 -05:00
Robert Haas e0e7b8fa22 Remove parallel-safety check from GetExistingLocalJoinPath.
Commit a104a017fc has this check because
I added it to the submitted patch before commit, but that was entirely
wrongheaded, as explained to me by Ashutosh Bapat, who also wrote this
patch.
2016-02-05 08:07:38 -05:00
Tom Lane 6819514fca Add num_nulls() and num_nonnulls() to count NULL arguments.
An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that
exactly one of a,b,c isn't NULL.  The functions are variadic, so they
can also be pressed into service to count the number of null or nonnull
elements in an array.

Marko Tiikkaja, reviewed by Pavel Stehule
2016-02-04 23:03:37 -05:00
Robert Haas a104a017fc Add some additional core functions to support join pushdown for FDWs.
GetExistingLocalJoinPath() is useful for handling EvalPlanQual rechecks
properly, and GetUserMappingById() is needed to make sure you're using
the right credentials.

Shigeru Hanada, Etsuro Fujita, Ashutosh Bapat, Robert Haas
2016-02-04 17:05:09 -05:00
Robert Haas c1772ad922 Change the way that LWLocks for extensions are allocated.
The previous RequestAddinLWLocks() method had several disadvantages.
First, the locks would be in the main tranche; we've recently decided
that it's useful for LWLocks used for separate purposes to have
separate tranche IDs.  Second, there wasn't any correlation between
what code called RequestAddinLWLocks() and what code called
LWLockAssign(); when multiple modules are in use, it could become
quite difficult to troubleshoot problems where LWLockAssign() ran out
of locks.  To fix, create a concept of named LWLock tranches which
can be used either by extension or by core code.

Amit Kapila and Robert Haas
2016-02-04 16:43:04 -05:00
Tom Lane 5ef244a282 Simplify syntax diagram for REINDEX.
Since there currently is only one possible parenthesized option, namely
VERBOSE, it's a bit pointless to show it with "{ } [, ... ]".  The curly
braces are useless and therefore confusing, as seen in a recent question
from Karsten Hilbert.  Remove the extra decoration for the time being;
we can put it back when and if REINDEX grows some more options.
2016-02-04 13:58:40 -05:00
Robert Haas b47b4dbf68 Extend sortsupport for text to more opclasses.
Have varlena.c expose an interface that allows the char(n), bytea, and
bpchar types to piggyback on a now-generalized SortSupport for text.
This pushes a little more knowledge of the bpchar/char(n) type into
varlena.c than might be preferred, but that seems like the approach
that creates least friction.  Also speed things up for index builds
that use text_pattern_ops or varchar_pattern_ops.

This patch does quite a bit of renaming, but it seems likely to be
worth it, so as to avoid future confusion about the fact that this code
is now more generally used than the old names might have suggested.

Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson,
with small tweaks by me.
2016-02-03 14:29:53 -05:00
Tom Lane 24a26c9f54 Add hstore_to_jsonb() and hstore_to_jsonb_loose() to hstore documentation.
These were never documented anywhere user-visible.  Tut tut.
2016-02-03 12:57:13 -05:00
Robert Haas 69d34408e5 Allow parallel custom and foreign scans.
This patch doesn't put the new infrastructure to use anywhere, and
indeed it's not clear how it could ever be used for something like
postgres_fdw which has to send an SQL query and wait for a reply,
but there might be FDWs or custom scan providers that are CPU-bound,
so let's give them a way to join club parallel.

KaiGai Kohei, reviewed by me.
2016-02-03 12:49:46 -05:00
Peter Eisentraut 25e44518c1 doc: Fix stand-alone INSTALL file build
Commit 7d17e683fc introduced an external
link.
2016-02-03 12:32:35 -05:00
Robert Haas f2305d40ec Remove CustomPath's TextOutCustomPath method.
You can't really do anything useful with this in the form it currently
exists; among other problems, there's no way to reread whatever
information might be produced when the path is output.  Work is
underway to replace this with a more useful and more general system of
extensible nodes, but let's start by getting rid of this bit.

Extracted from a larger patch by KaiGai Kohei.
2016-02-03 10:38:50 -05:00
Robert Haas dc203dc3ac postgres_fdw: Allow fetch_size to be set per-table or per-server.
The default fetch size of 100 rows might not be right in every
environment, so allow users to configure it.

Corey Huinker, reviewed by Kyotaro Horiguchi, Andres Freund, and me.
2016-02-03 09:07:35 -05:00
Peter Eisentraut 7d17e683fc Add support for systemd service notifications
Insert sd_notify() calls at server start and stop for integration with
systemd.  This allows the use of systemd service units of type "notify",
which greatly simplifies the systemd configuration.

Reviewed-by: Pavel Stěhule <pavel.stehule@gmail.com>
2016-02-02 21:04:29 -05:00
Alvaro Herrera 1d0c3b3f8a pgbench: allow per-script statistics
Provide per-script statistical info (count of transactions executed
under that script, average latency for the whole script) after a
multi-script run, adding an intermediate level of detail to existing
global stats and per-command stats.

Author: Fabien Coelho
Reviewer: Michaël Paquier, Álvaro Herrera
2016-02-01 15:55:33 +01:00
Andrew Dunstan 7dc09c1384 Fix error in documentated use of mingw-w64 compilers
Error reported by Igal Sapir.
2016-01-30 19:28:44 -05:00
Fujii Masao c35c4ec454 Fix syntax descriptions for replication commands in logicaldecoding.sgml
Patch-by: Oleksandr Shulgin
Reviewed-by: Craig Ringer and Fujii Masao
Backpatch-through: 9.4 where logical decoding was introduced
2016-01-29 12:14:56 +09:00
Robert Haas 80db1ca2d7 Add [NO]BYPASSRLS options to CREATE USER and ALTER USER docs.
Patch-by: Filip Rembiałkowski
Reviewed-by: Robert Haas
Backpatch-through: 9.5
2016-01-28 09:33:09 -05:00
Alvaro Herrera e37483857d Fix spi_worker mention in bgworker documentation
The documentation mentioned contrib/ but the module was moved to
src/test/modules/ by commit 22dfd116a1 of 9.5 era.

Problem pointed out by Dickson Guedes in bug #13896
Backpatch-to: 9.5.
2016-01-28 14:08:21 +01:00
Fujii Masao 62e2ddd4ca Fix typos in comments and doc
overriden -> overridden

The misspelling in create_extension.sgml was introduced in b67aaf2,
so no need to backpatch.
2016-01-28 16:47:36 +09:00
Fujii Masao 7f46eaf035 Add gin_clean_pending_list function to clean up GIN pending list
This function cleans up the pending list of the GIN index by
moving entries in it to the main GIN data structure in bulk.
It returns the number of pages cleaned up from the pending list.

This function is useful, for example, when the pending list
needs to be cleaned up *quickly* to improve the performance of
the search using GIN index. VACUUM can do the same thing, too,
but it may take days to run on a large table.

Jeff Janes,
reviewed by Julien Rouhaud, Jaime Casanova, Alvaro Herrera and me.

Discussion: CAMkU=1x8zFkpfnozXyt40zmR3Ub_kHu58LtRmwHUKRgQss7=iQ@mail.gmail.com
2016-01-28 12:57:52 +09:00
Alvaro Herrera 8bea3d2219 pgbench: improve multi-script support
Previously, it was possible to specify one or several custom scripts to
run, or only one of the builtin scripts.  With this patch it is also
possible to specify to run the builtin scripts multiple times, using the
new -b option.  Also, unify the code for both cases; this eases future
pgbench improvements.

Author: Fabien Coelho
Review: Michaël Paquier, Álvaro Herrera
2016-01-27 02:54:22 +01:00
Tom Lane e1bd684a34 Add trigonometric functions that work in degrees.
The implementations go to some lengths to deliver exact results for values
where an exact result can be expected, such as sind(30) = 0.5 exactly.

Dean Rasheed, reviewed by Michael Paquier
2016-01-22 15:46:22 -05:00
Tom Lane 80aa219146 Improve levenshtein() docs.
Fix chars-vs-bytes confusion here too.  Improve poor grammar and
markup.
2016-01-22 12:29:07 -05:00
Tom Lane 647d87c56a Make extract() do something more reasonable with infinite datetimes.
Historically, extract() just returned zero for any case involving an
infinite timestamp[tz] input; even cases in which the unit name was
invalid.  This is not very sensible.  Instead, return infinity or
-infinity as appropriate when the requested field is one that is
monotonically increasing (e.g, year, epoch), or NULL when it is not
(e.g., day, hour).  Also, throw the expected errors for bad unit names.

BACKWARDS INCOMPATIBLE CHANGE

Vitaly Burovoy, reviewed by Vik Fearing
2016-01-21 22:26:20 -05:00
Robert Haas a7de3dc5c3 Support multi-stage aggregation.
Aggregate nodes now have two new modes: a "partial" mode where they
output the unfinalized transition state, and a "finalize" mode where
they accept unfinalized transition states rather than individual
values as input.

These new modes are not used anywhere yet, but they will be necessary
for parallel aggregation.  The infrastructure also figures to be
useful for cases where we want to aggregate local data and remote
data via the FDW interface, and want to bring back partial aggregates
from the remote side that can then be combined with locally generated
partial aggregates to produce the final value.  It may also be useful
even when neither FDWs nor parallelism are in play, as explained in
the comments in nodeAgg.c.

David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
Linnakangas, Haribabu Kommi, and me.
2016-01-20 13:46:50 -05:00
Tom Lane dbe2328959 Fix assorted inconsistencies in GIN opclass support function declarations.
GIN had some minor issues too, mostly using "internal" where something
else would be more appropriate.  I went with the same approach as in
9ff60273e3, namely preferring the opclass' indexed datatype for
arguments that receive an operator RHS value, even if that's not
necessarily what they really are.

Again, this is with an eye to having a uniform rule for ginvalidate()
to check support function signatures.
2016-01-19 22:32:22 -05:00
Tom Lane 9ff60273e3 Fix assorted inconsistencies in GiST opclass support function declarations.
The conventions specified by the GiST SGML documentation were widely
ignored.  For example, the strategy-number argument for "consistent" and
"distance" functions is specified to be a smallint, but most of the
built-in support functions declared it as an integer, and for that matter
the core code passed it using Int32GetDatum not Int16GetDatum.  None of
that makes any real difference at runtime, but it's quite confusing for
newcomers to the code, and it makes it very hard to write an amvalidate()
function that checks support function signatures.  So let's try to instill
some consistency here.

Another similar issue is that the "query" argument is not of a single
well-defined type, but could have different types depending on the strategy
(corresponding to search operators with different righthand-side argument
types).  Some of the functions threw up their hands and declared the query
argument as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be to
declare it as being of the indexed data type, corresponding to a search
operator with both input types the same.  So I've specified a convention
that that's what to do always.

Also, the result of the "union" support function actually must be of the
index's storage type, but the documentation suggested declaring it to
return "internal", and some of the functions followed that.  Standardize
on telling the truth, instead.

Similarly, standardize on declaring the "same" function's inputs as
being of the storage type, not "internal".

Also, somebody had forgotten to add the "recheck" argument to both
the documentation of the "distance" support function and all of their
SQL declarations, even though the C code was happily using that argument.
Clean that up too.

Fix up some other omissions in the docs too, such as documenting that
union's second input argument is vestigial.

So far as the errors in core function declarations go, we can just fix
pg_proc.h and bump catversion.  Adjusting the erroneous declarations in
contrib modules is more debatable: in principle any change in those
scripts should involve an extension version bump, which is a pain.
However, since these changes are purely cosmetic and make no functional
difference, I think we can get away without doing that.
2016-01-19 12:04:36 -05:00
Tatsuo Ishii 85f22281a1 Fix typo.
Reported by KOIZUMI Satoru.
2016-01-18 21:26:30 +09:00
Tom Lane 65c5fcd353 Restructure index access method API to hide most of it at the C level.
This patch reduces pg_am to just two columns, a name and a handler
function.  All the data formerly obtained from pg_am is now provided
in a C struct returned by the handler function.  This is similar to
the designs we've adopted for FDWs and tablesample methods.  There
are multiple advantages.  For one, the index AM's support functions
are now simple C functions, making them faster to call and much less
error-prone, since the C compiler can now check function signatures.
For another, this will make it far more practical to define index access
methods in installable extensions.

A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the opr_sanity
regression test are no longer possible from SQL.  We've addressed that
by adding a facility for the index AM to perform such checks instead.
(Much more could be done in that line, but for now we're content if the
amvalidate functions more or less replace what opr_sanity used to do.)
We might also want to expose some sort of reporting functionality, but
this patch doesn't do that.

Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily
editorialized on by me.
2016-01-17 19:36:59 -05:00
Simon Riggs e63bb4549a Add new user fn pg_current_xlog_flush_location()
Tomas Vondra, reviewed by Michael Paquier and Amit Kapila
Minor edits by me
2016-01-12 07:54:52 +00:00
Peter Eisentraut c618e1b506 doc: Fix typo in logical decoding documentation
From: Petr Jelinek <petr@2ndquadrant.com>
2016-01-10 20:12:27 -05:00
Alvaro Herrera b1a9bad9e7 pgstat: add WAL receiver status view & SRF
This new view provides insight into the state of a running WAL receiver
in a HOT standby node.
The information returned includes the PID of the WAL receiver process,
its status (stopped, starting, streaming, etc), start LSN and TLI, last
received LSN and TLI, timestamp of last message send and receipt, latest
end-of-WAL LSN and time, and the name of the slot (if any).

Access to the detailed data is only granted to superusers; others only
get the PID.

Author: Michael Paquier
Reviewer: Haribabu Kommi
2016-01-07 16:21:19 -03:00
Tatsuo Ishii 65681d08b4 Fix typo in create_transform.sgml. 2016-01-06 08:03:50 +09:00
Alvaro Herrera abb1733922 Add scale(numeric)
Author: Marko Tiikkaja
2016-01-05 19:02:13 -03:00
Tom Lane ea0d494dae Make the to_reg*() functions accept text not cstring.
Using cstring as the input type was a poor decision, because that's not
really a full-fledged type.  In particular, it lacks implicit coercions
from text or varchar, meaning that usages like to_regproc('foo'||'bar')
wouldn't work; basically the only case that did work without explicit
casting was a simple literal constant argument.

The lack of field complaints about this suggests that hardly anyone
is using these functions, so hopefully fixing it won't cause much of
a compatibility problem.  They've only been there since 9.4, anyway.

Petr Korobeinikov
2016-01-05 13:02:43 -05:00
Tom Lane 83be1844ac Add to_regnamespace() and to_regrole() to the documentation.
Commits cb9fa802b3 and 0c90f6769d added these functions,
but did not bother with documentation.
2016-01-05 12:35:18 -05:00
Tom Lane 7debf36072 Docs: provide a concrete discussion and example for RLS race conditions.
Commit 43cd468cf0 added some wording to create_policy.sgml purporting
to warn users against a race condition of the sort that had been noted some
time ago by Peter Geoghegan.  However, that warning was far too vague to be
useful (or at least, I completely failed to grasp what it was on about).
Since the problem case occurs with a security design pattern that lots of
people are likely to try to use, we need to be as clear as possible about
it.  Provide a concrete example in the main-line docs in place of the
original warning.
2016-01-04 15:11:43 -05:00
Tom Lane 5d35438273 Adjust behavior of row_security GUC to match the docs.
Some time back we agreed that row_security=off should not be a way to
bypass RLS entirely, but only a way to get an error if it was being
applied.  However, the code failed to act that way for table owners.
Per discussion, this is a must-fix bug for 9.5.0.

Adjust the logic in rls.c to behave as expected; also, modify the
error message to be more consistent with the new interpretation.
The regression tests need minor corrections as well.  Also update
the comments about row_security in ddl.sgml to be correct.  (The
official description of the GUC in config.sgml is already correct.)

I failed to resist the temptation to do some other very minor
cleanup as well, such as getting rid of a duplicate extern declaration.
2016-01-04 12:21:41 -05:00
Tom Lane c1611db01f Do some copy-editing on the docs for row-level security.
Clarifications, markup improvements, corrections of misleading or
outright wrong statements.
2016-01-03 20:04:11 -05:00
Tom Lane c6aeba353a Do some copy-editing on the docs for replication origins.
Minor grammar and markup improvements.
2016-01-03 16:03:42 -05:00
Tom Lane 027989197a Do a final round of copy-editing on the 9.5 release notes. 2016-01-03 15:33:12 -05:00
Tom Lane df35af2ca7 Adjust back-branch release note description of commits a2a718b22 et al.
As pointed out by Michael Paquier, recovery_min_apply_delay didn't exist
in 9.0-9.3, making the release note text not very useful.  Instead make it
talk about recovery_target_xid, which did exist then.

9.0 is already out of support, but we can fix the text in the newer
branches' copies of its release notes.
2016-01-02 15:29:02 -05:00
Bruce Momjian ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Peter Eisentraut 253de19b84 doc: Remove redundant duplicate URLs from ulink elements
Empty ulink elements default to displaying the URL, so there is no need
to specify the URL again.  This was already done for most occurrences,
but some cases didn't follow this convention.
2015-12-31 22:26:57 -05:00
Peter Eisentraut 805ac78aab doc: Add index entries and better documentation link for Linux OOM 2015-12-31 22:03:13 -05:00
Tom Lane e5e5267a91 Minor hacking on contrib/cube documentation.
Improve markup, particularly of the table of functions; add or improve
examples for some of the functions; wordsmith some of the function
descriptions.
2015-12-29 21:21:04 -05:00
Tom Lane 81ee726d87 Code and docs review for cube kNN support.
Commit 33bd250f6c could have done with
some more review:

Adjust coding so that compilers unfamiliar with elog/ereport don't complain
about uninitialized values.

Fix misuse of PG_GETARG_INT16 to retrieve arguments declared as "integer"
at the SQL level.  (This was evidently copied from cube_ll_coord and
cube_ur_coord, but those were wrong too.)

Fix non-style-guide-conforming error messages.

Fix underparenthesized if statements, which pgindent would have made a
hash of, and remove some unnecessary parens elsewhere.

Run pgindent over new code.

Revise documentation: repeated accretion of more operators without any
rethinking of the text already there had left things in a bit of a mess.
Merge all the cube operators into one table and adjust surrounding text
appropriately.

David Rowley and Tom Lane
2015-12-28 14:39:12 -05:00
Alvaro Herrera ac443d1034 Document brin_summarize_new_pages
Pointer out by Jeff Janes
2015-12-28 15:28:19 -03:00
Tom Lane 54aaafe95f Document the exponentiation operator as associating left to right.
Common mathematical convention is that exponentiation associates right to
left.  We aren't going to change the parser for this, but we could note
it in the operator's description.  (It's already noted in the operator
precedence/associativity table, but users might not look there.)
Per bug #13829 from Henrik Pauli.
2015-12-28 12:09:00 -05:00
Alvaro Herrera 151c4ffe41 doc: pg_committs -> pg_commit_ts
Reported by: Alain Laporte (#13836)
2015-12-28 13:45:03 -03:00
Tom Lane 731dfc7d5f Update documentation about pseudo-types.
Tone down an overly strong statement about which pseudo-types PLs are
likely to allow.  Add "event_trigger" to the list, as well as
"pg_ddl_command" in 9.5/HEAD.  Back-patch to 9.3 where event_trigger
was added.
2015-12-28 11:04:42 -05:00
Tom Lane 71dd092c01 Docs: fix erroneously-given function name.
pg_replication_session_is_setup() exists nowhere; apparently this is
meant to refer to pg_replication_origin_session_is_setup().

Adrien Nayrat
2015-12-24 10:50:03 -05:00
Tom Lane bee172fcd5 Docs typo fix.
Michael Paquier
2015-12-24 10:23:44 -05:00
Tom Lane 6efbded6e4 Allow omitting one or both boundaries in an array slice specifier.
Omitted boundaries represent the upper or lower limit of the corresponding
array subscript.  This allows simpler specification of many common
use-cases.

(Revised version of commit 9246af6799)

YUriy Zhuravlev
2015-12-22 21:05:29 -05:00
Teodor Sigaev bbbd807097 Revert 9246af6799 because
I miss too much. Patch is returned to commitfest process.
2015-12-18 21:35:22 +03:00
Robert Haas 3c7042a7d7 pgbench: Change terminology from "threshold" to "parameter".
Per a recommendation from Tomas Vondra, it's more helpful to refer to
the value that determines how skewed a Gaussian or exponential
distribution is as a parameter rather than a threshold.

Since it's not quite too late to get this right in 9.5, where it was
introduced, back-patch this.  Most of the patch changes only comments
and documentation, but a few pgbench messages are altered to match.

Fabien Coelho, reviewed by Michael Paquier and by me.
2015-12-18 13:24:51 -05:00
Teodor Sigaev 9246af6799 Allow to omit boundaries in array subscript
Allow to omiy lower or upper or both boundaries in array subscript
for selecting slice of array.

Author: YUriy Zhuravlev
2015-12-18 15:18:58 +03:00
Teodor Sigaev 33bd250f6c Cube extension kNN support
Introduce distance operators over cubes:
<#> taxicab distance
<->  euclidean distance
<=> chebyshev distance

Also add kNN support of those distances in GiST opclass.

Author: Stas Kelvich
2015-12-18 14:38:27 +03:00
Tom Lane 66d947b9d3 Adjust behavior of single-user -j mode for better initdb error reporting.
Previously, -j caused the entire input file to be read in and executed as
a single command string.  That's undesirable, not least because any error
causes the entire file to be regurgitated as the "failing query".  Some
experimentation suggests a better rule: end the command string when we see
a semicolon immediately followed by two newlines, ie, an empty line after
a query.  This serves nicely to break up the existing examples such as
information_schema.sql and system_views.sql.  A limitation is that it's
no longer possible to write such a sequence within a string literal or
multiline comment in a file meant to be read with -j; but there are no
instances of such a problem within the data currently used by initdb.
(If someone does make such a mistake in future, it'll be obvious because
they'll get an unterminated-literal or unterminated-comment syntax error.)
Other than that, there shouldn't be any negative consequences; you're not
forced to end statements that way, it's just a better idea in most cases.

In passing, remove src/include/tcop/tcopdebug.h, which is dead code
because it's not included anywhere, and hasn't been for more than
ten years.  One of the debug-support symbols it purported to describe
has been unreferenced for at least the same amount of time, and the
other is removed by this commit on the grounds that it was useless:
forcing -j mode all the time would have broken initdb.  The lack of
complaints about that, or about the missing inclusion, shows that
no one has tried to use TCOP_DONTUSENEWLINE in many years.
2015-12-17 19:34:15 -05:00
Tom Lane 0625dbb0b9 Document use of Subject Alternative Names in SSL server certificates.
Commit acd08d764 did not bother with updating the documentation.
2015-12-15 16:57:23 -05:00
Tom Lane bfc7f5dd5d Update 9.5 release notes through today.
Also do another round of copy-editing, and fix up remaining FIXME items.
2015-12-15 16:42:27 -05:00
Stephen Frost 43cd468cf0 Improve CREATE POLICY documentation
Clarify that SELECT policies are now applied when SELECT rights
are required for a given query, even if the query is an UPDATE or
DELETE query.  Pointed out by Noah.

Additionally, note the risk regarding concurrently open transactions
where a relation which controls access to the rows of another relation
are updated and the rows of the primary relation are also being
modified.  Pointed out by Peter Geoghegan.

Back-patch to 9.5.
2015-12-15 10:08:09 -05:00
Tom Lane 7bd149ce2a Docs: document that psql's "\i -" means read from stdin.
This has worked that way for a long time, maybe always, but you would
not have known it from the documentation.  Also back-patch the notes
I added to HEAD earlier today about behavior of the "-f -" switch,
which likewise have been valid for many releases.
2015-12-13 23:42:54 -05:00
Tom Lane fcbbf82d2b Code and docs review for multiple -c and -f options in psql.
Commit d5563d7df9 drew complaints from Coverity, which quite
correctly complained that one copy of each -c or -f string was being
leaked.  What's more, simple_action_list_append was allocating enough space
for still a third copy of each string as part of the SimpleActionListCell,
even though that coding method had been superseded by a separate strdup
operation.  There were some other minor coding infelicities too.  The
documentation needed more work as well, eg it forgot to explain that -c
causes psql not to accept any interactive input.
2015-12-13 14:52:07 -05:00
Tom Lane 6d96cd077b Doc: update external URLs for PostGIS project.
Paul Ramsey
2015-12-12 20:02:09 -05:00
Peter Eisentraut 19e7ca8938 doc: Add some markup 2015-12-12 11:31:28 -05:00
Robert Haas 348bcd8645 Fix typo.
Etsuro Fujita
2015-12-10 11:14:09 -05:00
Robert Haas c00239ea6a Remove redundant sentence.
Peter Geoghegan
2015-12-09 14:11:58 -05:00
Robert Haas d5563d7df9 psql: Support multiple -c and -f options, and allow mixing them.
To support this, we must reconcile some historical anomalies in the
behavior of -c.  In particular, as a backward-incompatibility, -c no
longer implies --no-psqlrc.

Pavel Stehule (code) and Catalin Iacob (documentation).  Review by
Michael Paquier and myself.  Proposed behavior per Tom Lane.
2015-12-08 14:04:08 -05:00
Robert Haas 385f337c9f Allow foreign and custom joins to handle EvalPlanQual rechecks.
Commit e7cb7ee145 provided basic
infrastructure for allowing a foreign data wrapper or custom scan
provider to replace a join of one or more tables with a scan.
However, this infrastructure failed to take into account the need
for possible EvalPlanQual rechecks, and ExecScanFetch would fail
an assertion (or just overwrite memory) if such a check was attempted
for a plan containing a pushed-down join.  To fix, adjust the EPQ
machinery to skip some processing steps when scanrelid == 0, making
those the responsibility of scan's recheck method, which also has
the responsibility in this case of correctly populating the relevant
slot.

To allow foreign scans to gain control in the right place to make
use of this new facility, add a new, optional RecheckForeignScan
method.  Also, allow a foreign scan to have a child plan, which can
be used to correctly populate the slot (or perhaps for something
else, but this is the only use currently envisioned).

KaiGai Kohei, reviewed by Robert Haas, Etsuro Fujita, and Kyotaro
Horiguchi.
2015-12-08 12:31:03 -05:00
Tom Lane b0cfb02cec Update xindex.sgml for recent additions to GIST opclass API.
Commit d04c8ed904 added another support function to the GIST API,
but overlooked mentioning it in xindex.sgml's summary of index support
functions.

Anastasia Lubennikova
2015-12-06 12:42:32 -05:00
Tom Lane 63acfb79ab Further improve documentation of the role-dropping process.
In commit 1ea0c73c2 I added a section to user-manag.sgml about how to drop
roles that own objects; but as pointed out by Stephen Frost, I neglected
that shared objects (databases or tablespaces) may need special treatment.
Fix that.  Back-patch to supported versions, like the previous patch.
2015-12-04 14:44:13 -05:00
Peter Eisentraut f15b820a5c doc: Add serial comma 2015-12-03 10:24:16 -05:00
Peter Eisentraut 9ff1a11a2d doc: Fix markup and improve placeholder names 2015-12-03 10:20:54 -05:00
Teodor Sigaev e50cda7840 Use pg_rewind when target timeline was switched
Allow pg_rewind to work when target timeline was switched. Now
user can return promoted standby to old master.

Target timeline history becomes a global variable. Index
in target timeline history is used in function interfaces instead of
specifying TLI directly. Thus, SimpleXLogPageRead() can easily start
reading XLOGs from next timeline when current timeline ends.

Author: Alexander Korotkov
Review: Michael Paquier
2015-12-01 18:56:44 +03:00
Tom Lane 40cb21f70b Improve PQhost() to return useful data for default Unix-socket connections.
Previously, if no host information had been specified at connection time,
PQhost() would return NULL (unless you are on Windows, in which case you
got "localhost").  This is an unhelpful definition for a couple of reasons:
it can cause corner-case crashes in applications (cf commit c5ef8ce53d),
and there's no well-defined way for applications to find out the socket
directory path that's actually in use.  As an example of the latter
problem, psql substituted DEFAULT_PGSOCKET_DIR for NULL in a couple of
places, but this is subtly wrong because it's conceivable that psql is
using a libpq shared library that was built with a different setting.

Hence, change PQhost() to return DEFAULT_PGSOCKET_DIR when appropriate,
and strip out the now-dead substitutions in psql.  (There is still one
remaining reference to DEFAULT_PGSOCKET_DIR in psql, in prompt.c, which
I don't see a nice way to get rid of.  But it only controls a prompt
abbreviation decision, so it seems noncritical.)

Also update the docs for PQhost, which had never previously mentioned
the possibility of a socket directory path being returned.  In passing
fix the outright-incorrect code comment about PGconn.pgunixsocket.
2015-11-27 14:13:53 -05:00
Teodor Sigaev 92e38182d7 COPY (INSERT/UPDATE/DELETE .. RETURNING ..)
Attached is a patch for being able to do COPY (query) without a CTE.

Author: Marko Tiikkaja
Review: Michael Paquier
2015-11-27 19:11:22 +03:00
Teodor Sigaev d6061f83a1 Improve pageinspect module
Now pageinspect can show data stored in the heap tuple.

Nikolay Shaplov
2015-11-25 16:31:55 +03:00
Peter Eisentraut cbd96eff25 doc: Some improvements on CREATE POLICY and ALTER POLICY documentation 2015-11-23 21:36:57 -05:00
Teodor Sigaev d00352573a Clarify pg_rewind connection requirements.
Per http://www.postgresql.org/message-id/flat/564C4CE6.9000509@postgrespro.ru
Pavel Luzanov <p.luzanov@postgrespro.ru>
2015-11-23 19:27:01 +03:00
Peter Eisentraut 2ef7a985fb doc: Add more documentation about wal_retrieve_retry_interval
from Michael Paquier
2015-11-23 09:14:35 -05:00
Tom Lane 00cdd83521 Adopt the GNU convention for handling tar-archive members exceeding 8GB.
The POSIX standard for tar headers requires archive member sizes to be
printed in octal with at most 11 digits, limiting the representable file
size to 8GB.  However, GNU tar and apparently most other modern tars
support a convention in which oversized values can be stored in base-256,
allowing any practical file to be a tar member.  Adopt this convention
to remove two limitations:
* pg_dump with -Ft output format failed if the contents of any one table
exceeded 8GB.
* pg_basebackup failed if the data directory contained any file exceeding
8GB.  (This would be a fatal problem for installations configured with a
table segment size of 8GB or more, and it has also been seen to fail when
large core dump files exist in the data directory.)

File sizes under 8GB are still printed in octal, so that no compatibility
issues are created except in cases that would have failed entirely before.

In addition, this patch fixes several bugs in the same area:

* In 9.3 and later, we'd defined tarCreateHeader's file-size argument as
size_t, which meant that on 32-bit machines it would write a corrupt tar
header for file sizes between 4GB and 8GB, even though no error was raised.
This broke both "pg_dump -Ft" and pg_basebackup for such cases.

* pg_restore from a tar archive would fail on tables of size between 4GB
and 8GB, on machines where either "size_t" or "unsigned long" is 32 bits.
This happened even with an archive file not affected by the previous bug.

* pg_basebackup would fail if there were files of size between 4GB and 8GB,
even on 64-bit machines.

* In 9.3 and later, "pg_basebackup -Ft" failed entirely, for any file size,
on 64-bit big-endian machines.

In view of these potential data-loss bugs, back-patch to all supported
branches, even though removal of the documented 8GB limit might otherwise
be considered a new feature rather than a bug fix.
2015-11-21 20:21:31 -05:00
Peter Eisentraut db135e834a doc: Clarify some things on pg_receivexlog reference page 2015-11-19 14:26:44 -05:00
Andrew Dunstan c2d5657c0f Update docs for vcregress.pl bincheck changes 2015-11-18 23:32:16 -05:00
Andres Freund edf68b2ed5 Improve ON CONFLICT documentation.
Author: Peter Geoghegan and Andres Freund
Discussion: CAM3SWZScpWzQ-7EJC77vwqzZ1GO8GNmURQ1QqDQ3wRn7AbW1Cg@mail.gmail.com
Backpatch: 9.5, where ON CONFLICT was introduced
2015-11-19 01:37:58 +01:00
Peter Eisentraut 53264c7b1e doc: Fix commas and improve spacing 2015-11-16 19:02:38 -05:00
Stephen Frost 42aa1c032e Correct sepgsql docs with regard to RLS
The sepgsql docs included a comment that PG doesn't support RLS.  That
is only true for versions prior to 9.5.

Update the docs for 9.5 and master to say that PG supports RLS but that
sepgsql does not yet.

Pointed out by Heikki.

Back-patch to 9.5
2015-11-13 11:06:38 -05:00
Robert Haas a05dc4d7fd Provide readfuncs support for custom scans.
Commit a0d9f6e434 added this support for
all other plan node types; this fills in the gap.

Since TextOutCustomScan complicates this and is pretty well useless,
remove it.

KaiGai Kohei, with some modifications by me.
2015-11-12 07:40:31 -05:00
Tom Lane 39b9978d9c Do a round of copy-editing on the 9.5 release notes.
Also fill in the previously empty "major enhancements" list.  YMMV as to
which items should make the cut, but it's past time we had something more
than a placeholder here.

(I meant to get this done before beta2 was wrapped, but got distracted by
PDF build problems.  Better late than never.)
2015-11-11 19:19:14 -05:00
Tom Lane 6404751ce9 Improve documentation around autovacuum-related storage parameters.
These were discussed in three different sections of the manual, which
unsurprisingly had diverged over time; and the descriptions of individual
variables lacked stylistic consistency even within each section (and
frequently weren't in very good English anyway).  Clean up the mess, and
remove some of the redundant information in hopes that future additions
will be less likely to re-introduce inconsistency.  For instance I see
no need for maintenance.sgml to include its very own list of all the
autovacuum storage parameters, especially since that list was already
incomplete.
2015-11-11 17:13:38 -05:00
Tom Lane 7b6fb76349 Docs: fix misleading example.
Commit 8457d0beca introduced an example which, while not incorrect,
failed to exhibit the behavior it meant to describe, as a result of omitting
an E'' prefix that needed to be there.  Noticed and fixed by Peter Geoghegan.

I (tgl) failed to resist the temptation to wordsmith nearby text a bit
while at it.
2015-11-10 22:11:39 -05:00
Tom Lane 944b41fc00 Improve our workaround for 'TeX capacity exceeded' in building PDF files.
In commit a5ec86a7c7 I wrote a quick hack
that reduced the number of TeX string pool entries created while converting
our documentation to PDF form.  That held the fort for awhile, but as of
HEAD we're back up against the same limitation.  It turns out that the
original coding of \FlowObjectSetup actually results in *three* string pool
entries being generated for every "flow object" (that is, potential
cross-reference target) in the documentation, and my previous hack only got
rid of one of them.  With a little more care, we can reduce the string
count to one per flow object plus one per actually-cross-referenced flow
object (about 115000 + 5000 as of current HEAD); that should work until
the documentation volume roughly doubles from where it is today.

As a not-incidental side benefit, this change also causes pdfjadetex to
stop emitting unreferenced hyperlink anchors (bookmarks) into the PDF file.
It had been making one willy-nilly for every flow object; now it's just one
per actually-cross-referenced object.  This results in close to a 2X
savings in PDF file size.  We will still want to run the output through
"jpdftweak" to get it to be compressed; but we no longer need removal of
unreferenced bookmarks, so we might be able to find a quicker tool for
that step.

Although the failure only affects HEAD and US-format output at the moment,
9.5 cannot be more than a few pages short of failing likewise, so it
will inevitably fail after a few rounds of minor-version release notes.
I don't have a lot of faith that we'll never hit the limit in the older
branches; and anyway it would be nice to get rid of jpdftweak across the
board.  Therefore, back-patch to all supported branches.
2015-11-10 15:59:59 -05:00
Andres Freund c31f1dc559 Add paragraph about ON CONFLICT interaction with partitioning.
Author: Peter Geoghegan and Andres Freund
Discussion: CAM3SWZScpWzQ-7EJC77vwqzZ1GO8GNmURQ1QqDQ3wRn7AbW1Cg@mail.gmail.com,
    CAHGQGwFUCWwSU7dtc2aRdRk73ztyr_jY5cPOyts+K8xKJ92X4Q@mail.gmail.com
Backpatch: 9.5, where UPSERT was introduced
2015-11-09 06:57:21 +01:00
Tom Lane ad9fad7b68 Update 9.5 release notes through today. 2015-11-07 17:09:04 -05:00
Tom Lane 9042f58342 Rename PQsslAttributes() to PQsslAttributeNames(), and const-ify fully.
Per discussion, the original name was a bit misleading, and
PQsslAttributeNames() seems more apropos.  It's not quite too late to
change this in 9.5, so let's change it while we can.

Also, make sure that the pointer array is const, not only the pointed-to
strings.

Minor documentation wordsmithing while at it.

Lars Kanis, slight adjustments by me
2015-11-07 16:13:49 -05:00
Robert Haas dde5f09fad Document interaction of bgworkers with LISTEN/NOTIFY.
Thomas Munro and Robert Haas, reviewed by Haribabu Kommi
2015-11-06 00:31:46 -05:00
Robert Haas 64b2e7ad91 Pass extra data to bgworkers, and use this to fix parallel contexts.
Up until now, the total amount of data that could be passed to a
background worker at startup was one datum, which can be a small as
4 bytes on some systems.  That's enough to pass a dsm_handle or an
array index, but not much else.  Add a bgw_extra flag to the
BackgroundWorker struct, allowing up to 128 bytes to be passed to
a new worker on any platform.

Use this to fix a problem I recently discovered with the parallel
context machinery added in 9.5: the master assigns each worker an
array index, and each worker subsequently assigns itself an array
index, and there's nothing to guarantee that the two sets of indexes
match, leading to chaos.

Normally, I would not back-patch the change to add bgw_extra, since it
is basically a feature addition.  However, since 9.5 is still in beta
and there seems to be no other sensible way to repair the broken
parallel context machinery, back-patch to 9.5.  Existing background
worker code can ignore the bgw_extra field without a problem, but
might need to be recompiled since the structure size has changed.

Report and patch by me.  Review by Amit Kapila.
2015-11-05 12:13:56 -05:00