Commit Graph

12953 Commits

Author SHA1 Message Date
Tom Lane 1e7c4bb004 Change unknown-type literals to type text in SELECT and RETURNING lists.
Previously, we left such literals alone if the query or subquery had
no properties forcing a type decision to be made (such as an ORDER BY or
DISTINCT clause using that output column).  This meant that "unknown" could
be an exposed output column type, which has never been a great idea because
it could result in strange failures later on.  For example, an outer query
that tried to do any operations on an unknown-type subquery output would
generally fail with some weird error like "failed to find conversion
function from unknown to text" or "could not determine which collation to
use for string comparison".  Also, if the case occurred in a CREATE VIEW's
query then the view would have an unknown-type column, causing similar
failures in queries trying to use the view.

To fix, at the tail end of parse analysis of a query, forcibly convert any
remaining "unknown" literals in its SELECT or RETURNING list to type text.
However, provide a switch to suppress that, and use it in the cases of
SELECT inside a set operation or INSERT command.  In those cases we already
had type resolution rules that make use of context information from outside
the subquery proper, and we don't want to change that behavior.

Also, change creation of an unknown-type column in a relation from a
warning to a hard error.  The error should be unreachable now in CREATE
VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown"
in CREATE TABLE or CREATE (composite) TYPE.  We want to forbid that because
it's nothing but a foot-gun.

This change creates a pg_upgrade failure case: a matview that contains an
unknown-type column can't be pg_upgraded, because reparsing the matview's
defining query will now decide that the column is of type text, which
doesn't match the cstring-like storage that the old materialized column
would actually have.  Add a checking pass to detect that.  While at it,
we can detect tables or composite types that would fail, essentially
for free.  Those would fail safely anyway later on, but we might as
well fail earlier.

This patch is by me, but it owes something to previous investigations
by Rahila Syed.  Also thanks to Ashutosh Bapat and Michael Paquier for
review.

Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
2017-01-25 09:17:24 -05:00
Peter Eisentraut 123f03ba2c doc: Update ALTER SEQUENCE documentation to match
Update documentation to match change in
0bc1207aeb.
2017-01-25 08:59:24 -05:00
Tom Lane ba005f193d Allow password file name to be specified as a libpq connection parameter.
Formerly an alternate password file could only be selected via the
environment variable PGPASSFILE; now it can also be selected via a
new connection parameter "passfile", corresponding to the conventions
for most other connection parameters.  There was some concern about
this creating a security weakness, but it was agreed that that argument
was pretty thin, and there are clear use-cases for handling password
files this way.

Julian Markwort, reviewed by Fabien Coelho, some adjustments by me

Discussion: https://postgr.es/m/a4b4f4f1-7b58-a0e8-5268-5f7db8e8ccaa@uni-muenster.de
2017-01-24 17:06:34 -05:00
Robert Haas d1ecd53947 Add a SHOW command to the replication command language.
This is useful infrastructure for an upcoming proposed patch to
allow the WAL segment size to be changed at initdb time; tools like
pg_basebackup need the ability to interrogate the server setting.
But it also doesn't seem like a bad thing to have independently of
that; it may find other uses in the future.

Robert Haas and Beena Emerson.  (The original patch here was by
Beena, but I rewrote it to such a degree that most of the code
being committed here is mine.)

Discussion: http://postgr.es/m/CA+TgmobNo4qz06wHEmy9DszAre3dYx-WNhHSCbU9SAwf+9Ft6g@mail.gmail.com
2017-01-24 17:04:12 -05:00
Robert Haas 7b4ac19982 Extend index AM API for parallel index scans.
This patch doesn't actually make any index AM parallel-aware, but it
provides the necessary functions at the AM layer to do so.

Rahila Syed, Amit Kapila, Robert Haas
2017-01-24 16:42:58 -05:00
Peter Eisentraut 0bc1207aeb Fix default minimum value for descending sequences
For some reason that is lost in history, a descending sequence would
default its minimum value to -2^63+1 (-PG_INT64_MAX) instead of
-2^63 (PG_INT64_MIN), even though explicitly specifying a minimum value
of -2^63 would work.  Fix this inconsistency by using the full range by
default.

Reported-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-01-23 14:00:58 -05:00
Tom Lane cdc2a70470 Allow backslash line continuations in pgbench's meta commands.
A pgbench meta command can now be continued onto additional line(s) of a
script file by writing backslash-return.  The continuation marker is
equivalent to white space in that it separates tokens.

Eventually it'd be nice to have the same thing in psql, but that will
be a much larger project.

Fabien Coelho, reviewed by Rafia Sabih

Discussion: https://postgr.es/m/alpine.DEB.2.20.1610031049310.19411@lancre
2017-01-20 11:10:22 -05:00
Fujii Masao 9547370950 Add description of temporary column into pg_replication_slots doc.
Ayumi Ishii
2017-01-21 00:55:36 +09:00
Peter Eisentraut 665d1fad99 Logical replication
- Add PUBLICATION catalogs and DDL
- Add SUBSCRIPTION catalog and DDL
- Define logical replication protocol and output plugin
- Add logical replication workers

From: Petr Jelinek <petr@2ndquadrant.com>
Reviewed-by: Steve Singer <steve@ssinger.info>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
2017-01-20 09:04:49 -05:00
Tom Lane f13a1277aa Doc: improve documentation of new SRF-in-tlist behavior.
Correct a misstatement about how things used to work: we did allow nested
SRFs before, as long as no function had more than one set-returning input.

Also, attempt to document the fact that the new implementation changes the
behavior for SRFs within conditional constructs (eg CASE): the conditional
construct no longer gates whether the SRF is run, and thus cannot affect
the number of rows emitted.  We might want to change this behavior, but
first it behooves us to see if we can explain it.

Minor other wordsmithing on what I wrote yesterday, too.

Discussion: https://postgr.es/m/20170118214702.54b2mdbxce5piwv5@alap3.anarazel.de
2017-01-18 18:10:23 -05:00
Andres Freund 69f4b9c85f Move targetlist SRF handling from expression evaluation to new executor node.
Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT
generate_series(1,5)) so far was done in the expression evaluation (i.e.
ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code.

This meant that most executor nodes performing projection, and most
expression evaluation functions, had to deal with the possibility that an
evaluated expression could return a set of return values.

That's bad because it leads to repeated code in a lot of places. It also,
and that's my (Andres's) motivation, made it a lot harder to implement a
more efficient way of doing expression evaluation.

To fix this, introduce a new executor node (ProjectSet) that can evaluate
targetlists containing one or more SRFs. To avoid the complexity of the old
way of handling nested expressions returning sets (e.g. having to pass up
ExprDoneCond, and dealing with arguments to functions returning sets etc.),
those SRFs can only be at the top level of the node's targetlist.  The
planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is
only necessary in ProjectSet nodes and that SRFs are only present at the
top level of the node's targetlist. If there are nested SRFs the planner
creates multiple stacked ProjectSet nodes.  The ProjectSet nodes always get
input from an underlying node.

We also discussed and prototyped evaluating targetlist SRFs using ROWS
FROM(), but that turned out to be more complicated than we'd hoped.

While moving SRF evaluation to ProjectSet would allow to retain the old
"least common multiple" behavior when multiple SRFs are present in one
targetlist (i.e.  continue returning rows until all SRFs are at the end of
their input at the same time), we decided to instead only return rows till
all SRFs are exhausted, returning NULL for already exhausted ones.  We
deemed the previous behavior to be too confusing, unexpected and actually
not particularly useful.

As a side effect, the previously prohibited case of multiple set returning
arguments to a function, is now allowed. Not because it's particularly
desirable, but because it ends up working and there seems to be no argument
for adding code to prohibit it.

Currently the behavior for COALESCE and CASE containing SRFs has changed,
returning multiple rows from the expression, even when the SRF containing
"arm" of the expression is not evaluated. That's because the SRFs are
evaluated in a separate ProjectSet node.  As that's quite confusing, we're
likely to instead prohibit SRFs in those places.  But that's still being
discussed, and the code would reside in places not touched here, so that's
a task for later.

There's a lot of, now superfluous, code dealing with set return expressions
around. But as the changes to get rid of those are verbose largely boring,
it seems better for readability to keep the cleanup as a separate commit.

Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 13:40:27 -08:00
Magnus Hagander d00ca333c3 Implement array version of jsonb_delete and operator
This makes it possible to delete multiple keys from a jsonb value by
passing in an array of text values, which makes the operaiton much
faster than individually deleting the keys (which would require copying
the jsonb structure over and over again.

Reviewed by Dmitry Dolgov and Michael Paquier
2017-01-18 21:37:59 +01:00
Peter Eisentraut aa17c06fb5 Add function to import operating system collations
Move this logic out of initdb into a user-callable function.  This
simplifies the code and makes it possible to update the standard
collations later on if additional operating system collations appear.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Euler Taveira <euler@timbira.com.br>
2017-01-18 09:35:56 -05:00
Peter Eisentraut 6181c34da8 doc: Update URL for Microsoft download site 2017-01-17 10:05:01 -05:00
Magnus Hagander cada1af31d Add compression support to pg_receivexlog
Author: Michael Paquier, review and small changes by me
2017-01-17 12:10:26 +01:00
Magnus Hagander e7b020f786 Make pg_basebackup use temporary replication slots
Temporary replication slots will be used by default when wal streaming
is used and no slot name is specified with -S. If a slot name is
specified, then a permanent slot with that name is used. If --no-slot is
specified, then no permanent or temporary slot will be used.

Temporary slots are only used on 10.0 and newer, of course.
2017-01-16 13:56:43 +01:00
Magnus Hagander f6d6d2920d Change default values for backup and replication parameters
This changes the default values of the following parameters:

wal_level = replica
max_wal_senders = 10
max_replication_slots = 10

in order to make it possible to make a backup and set up simple
replication on the default settings, without requiring a system restart.

Discussion: https://postgr.es/m/CABUevEy4PR_EAvZEzsbF5s+V0eEvw7shJ2t-AUwbHOjT+yRb3A@mail.gmail.com

Reviewed by Peter Eisentraut. Benchmark help from Tomas Vondra.
2017-01-14 17:14:56 +01:00
Peter Eisentraut 05cd12ed5b pg_ctl: Change default to wait for all actions
The different actions in pg_ctl had different defaults for -w and -W,
mostly for historical reasons.  Most users will want the -w behavior, so
make that the default.

Remove the -w option in most example and test code, so avoid confusion
and reduce verbosity.  pg_upgrade is not touched, so it can continue to
work with older installations.

Reviewed-by: Beena Emerson <memissemerson@gmail.com>
Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-01-14 09:15:08 -05:00
Peter Eisentraut e574f15d62 Updates to reflect that pg_ctl stop -m fast is the default
Various example and test code used -m fast explicitly, but since it's
the default, this can be omitted now or should be replaced by a better
example.

pg_upgrade is not touched, so it can continue to operate with older
installations.
2017-01-13 21:25:36 -05:00
Bruce Momjian 73f8d73313 pg_xlogdump: document --path behavior
The previous --path documentation and --help output were wrong in both
its meaning and the defaults.

Reviewed-by: Michael Paquier

Backpatch-through: 9.6
2017-01-10 22:38:14 -05:00
Peter Eisentraut 933b46644c Use 'use strict' in all Perl programs 2017-01-05 12:34:48 -05:00
Robert Haas 44f7afba79 Improve documentation of timestamp internal representation.
Be more clear that we represent timestamps in microseconds when
integer timestamps are used, and in fractional seconds when
floating-point timestamps are used.

Discussion: http://postgr.es/m/20161212135045.GB15488@e733.localdomain

Report by Alexander Alekseev.  Wording by me with a suggestion
from Tom Lane.
2017-01-04 16:30:16 -05:00
Simon Riggs 7c030783a5 Add pg_recvlogical —-endpos=LSN
Allow pg_recvlogical to specify an ending LSN, complementing
the existing -—startpos=LSN option.

Craig Ringer, reviewed by Euler Taveira and Naoki Okano
2017-01-04 19:02:07 +00:00
Tom Lane 6667d9a6d7 Re-allow SSL passphrase prompt at server start, but not thereafter.
Leave OpenSSL's default passphrase collection callback in place during
the first call of secure_initialize() in server startup.  Although that
doesn't work terribly well in daemon contexts, some people feel we should
not break it for anyone who was successfully using it before.  We still
block passphrase demands during SIGHUP, meaning that you can't adjust SSL
configuration on-the-fly if you used a passphrase, but this is no worse
than what it was before commit de41869b6.  And we block passphrase demands
during EXEC_BACKEND reloads; that behavior wasn't useful either, but at
least now it's documented.

Tweak some related log messages for more readability, and avoid issuing
essentially duplicate messages about reload failure caused by a passphrase.

Discussion: https://postgr.es/m/29982.1483412575@sss.pgh.pa.us
2017-01-04 12:44:03 -05:00
Magnus Hagander 9a4d51077c Make wal streaming the default mode for pg_basebackup
Since streaming is now supported for all output formats, make this the
default as this is what most people want.

To get the old behavior, the parameter -X none can be specified to turn
it off.

This also removes the parameter -x for fetch, now requiring -X fetch to
be specified to use that.

Reviewed by Vladimir Rusinov, Michael Paquier and Simon Riggs
2017-01-04 10:40:38 +01:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Bruce Momjian 60f1e514ad Update manual set of copyright files for 2017 2017-01-03 13:45:17 -05:00
Bruce Momjian 7090eed3b3 Update copyright for 2017
Backpatch-through: certain files through 9.2
2017-01-03 12:37:53 -05:00
Tom Lane 1e942c7474 Disable prompting for passphrase while (re)loading SSL config files.
OpenSSL's default behavior when loading a passphrase-protected key file
is to open /dev/tty and demand the password from there.  It was kinda
sorta okay to allow that to happen at server start, but really that was
never workable in standard daemon environments.  And it was a complete
fail on Windows, where the same thing would happen at every backend launch.
Yesterday's commit de41869b6 put the final nail in the coffin by causing
that to happen at every SIGHUP; even if you've still got a terminal acting
as the server's TTY, having the postmaster freeze until you enter the
passphrase again isn't acceptable.

Hence, override the default behavior with a callback that returns an empty
string, ensuring failure.  Change the documentation to say that you can't
have a passphrase-protected server key, period.

If we can think of a production-grade way of collecting a passphrase from
somewhere, we might do that once at server startup and use this callback
to feed it to OpenSSL, but it's far from clear that anyone cares enough
to invest that much work in the feature.  The lack of complaints about
the existing fractionally-baked behavior suggests nobody's using it anyway.

Discussion: https://postgr.es/m/29982.1483412575@sss.pgh.pa.us
2017-01-03 12:33:29 -05:00
Heikki Linnakangas 31c54096a1 Remove bogus notice that older clients might not work with MD5 passwords.
That was written when we still had "crypt" authentication, and it was
referring to the fact that an older client might support "crypt"
authentication but not "md5". But we haven't supported "crypt" for years.
(As soon as we add a new authentication mechanism that doesn't work with
MD5 hashes, we'll need a similar notice again. But this text as it's worded
now is just wrong.)

Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/9a7263eb-0980-2072-4424-440bb2513dc7@iki.fi
2017-01-03 14:09:01 +02:00
Tom Lane de41869b64 Allow SSL configuration to be updated at SIGHUP.
It is no longer necessary to restart the server to enable, disable,
or reconfigure SSL.  Instead, we just create a new SSL_CTX struct
(by re-reading all relevant files) whenever we get SIGHUP.  Testing
shows that this is fast enough that it shouldn't be a problem.

In conjunction with that, downgrade the logic that complains about
pg_hba.conf "hostssl" lines when SSL isn't active: now that's just
a warning condition not an error.

An issue that still needs to be addressed is what shall we do with
passphrase-protected server keys?  As this stands, the server would
demand the passphrase again on every SIGHUP, which is certainly
impractical.  But the case was only barely supported before, so that
does not seem a sufficient reason to hold up committing this patch.

Andreas Karlsson, reviewed by Michael Banck and Michael Paquier

Discussion: https://postgr.es/m/556A6E8A.9030400@proxel.se
2017-01-02 21:37:12 -05:00
Tom Lane 67a875355e In pgbench logging, avoid assuming that instr_times match Unix timestamps.
For aggregated logging, pg_bench supposed that printing the integer part of
INSTR_TIME_GET_DOUBLE() would produce a Unix timestamp.  That was already
broken on Windows, and it's about to get broken on most other platforms as
well.  As in commit 74baa1e3b, we can remove the entanglement at the price
of one extra syscall per transaction; though here it seems more convenient
to use time(NULL) instead of gettimeofday(), since we only need
integral-second precision.

I took the time to do some wordsmithing on the documentation about
pgbench's logging features, too.

Discussion: https://postgr.es/m/8837.1483216839@sss.pgh.pa.us
2017-01-02 12:26:03 -05:00
Andrew Dunstan 71f996d221 Explain unaccounted for space in pgstattuple.
In addition to space accounted for by tuple_len, dead_tuple_len and
free_space, the table_len includes page overhead, the item pointers
table and padding bytes.

Backpatch to live branches.
2016-12-27 11:23:46 -05:00
Tom Lane 3c9d398484 Doc: improve index entry for "median".
We had an index entry for "median" attached to the percentile_cont function
entry, which was pretty useless because a person following the link would
never realize that that function was the one they were being hinted to use.

Instead, make the index entry point at the example in syntax-aggregates,
and add a <seealso> link to "percentile".

Also, since that example explicitly claims to be calculating the median,
make it use percentile_cont not percentile_disc.  This makes no difference
in terms of the larger goals of that section, but so far as I can find,
nearly everyone thinks that "median" means the continuous not discrete
calculation.

Per gripe from Steven Winfield.  Back-patch to 9.4 where we introduced
percentile_cont.

Discussion: https://postgr.es/m/20161223102056.25614.1166@wrigleys.postgresql.org
2016-12-23 12:53:09 -05:00
Tom Lane ff33d1456e Spellcheck: s/descendent/descendant/g
I got a little annoyed by reading documentation paragraphs containing
both spellings within a few lines of each other.  My dictionary says
"descendant" is the preferred spelling, and it's certainly the majority
usage in our tree, so standardize on that.

For one usage in parallel.sgml, I thought it better to rewrite to avoid
the term altogether.
2016-12-23 11:53:35 -05:00
Robert Haas e13486eba0 Remove sql_inheritance GUC.
This backward-compatibility GUC is long overdue for removal.

Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com
2016-12-23 07:35:01 -05:00
Joe Conway 0a85c10225 Improve RLS documentation with respect to COPY
Documentation for pg_restore said COPY TO does not support row security
when in fact it should say COPY FROM. Fix that.

While at it, make it clear that "COPY FROM" does not allow RLS to be
enabled and INSERT should be used instead. Also that SELECT policies
will apply to COPY TO statements.

Back-patch to 9.5 where RLS first appeared.

Author: Joe Conway
Reviewed-By: Dean Rasheed and Robert Haas
Discussion: https://postgr.es/m/5744FA24.3030008%40joeconway.com
2016-12-22 17:56:50 -08:00
Peter Eisentraut 909cb78a8c doc: Further speed improvements for HTML XSLT build 2016-12-22 15:41:44 -05:00
Andres Freund 6ef2eba3f5 Skip checkpoints, archiving on idle systems.
Some background activity (like checkpoints, archive timeout, standby
snapshots) is not supposed to happen on an idle system. Unfortunately
so far it was not easy to determine when a system is idle, which
defeated some of the attempts to avoid redundant activity on an idle
system.

To make that easier, allow to make individual WAL insertions as not
being "important". By checking whether any important activity happened
since the last time an activity was performed, it now is easy to check
whether some action needs to be repeated.

Use the new facility for checkpoints, archive timeout and standby
snapshots.

The lack of a facility causes some issues in older releases, but in my
opinion the consequences (superflous checkpoints / archived segments)
aren't grave enough to warrant backpatching.

Author: Michael Paquier, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Amit Kapila, Kyotaro HORIGUCHI
Bug: #13685
Discussion:
    https://www.postgresql.org/message-id/20151016203031.3019.72930@wrigleys.postgresql.org
    https://www.postgresql.org/message-id/CAB7nPqQcPqxEM3S735Bd2RzApNqSNJVietAC=6kfkYv_45dKwA@mail.gmail.com
Backpatch: -
2016-12-22 11:31:50 -08:00
Michael Meskes 4032ef18d0 Fix buffer overflow on particularly named files and clarify documentation about
output file naming.

Patch by Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com>
2016-12-22 08:28:13 +01:00
Tom Lane 89fcea1ace Fix strange behavior (and possible crashes) in full text phrase search.
In an attempt to simplify the tsquery matching engine, the original
phrase search patch invented rewrite rules that would rearrange a
tsquery so that no AND/OR/NOT operator appeared below a PHRASE operator.
But this approach had numerous problems.  The rearrangement step was
missed by ts_rewrite (and perhaps other places), allowing tsqueries
to be created that would cause Assert failures or perhaps crashes at
execution, as reported by Andreas Seltenreich.  The rewrite rules
effectively defined semantics for operators underneath PHRASE that were
buggy, or at least unintuitive.  And because rewriting was done in
tsqueryin() rather than at execution, the rearrangement was user-visible,
which is not very desirable --- for example, it might cause unexpected
matches or failures to match in ts_rewrite.

As a somewhat independent problem, the behavior of nested PHRASE operators
was only sane for left-deep trees; queries like "x <-> (y <-> z)" did not
behave intuitively at all.

To fix, get rid of the rewrite logic altogether, and instead teach the
tsquery execution engine to manage AND/OR/NOT below a PHRASE operator
by explicitly computing the match location(s) and match widths for these
operators.

This requires introducing some additional fields into the publicly visible
ExecPhraseData struct; but since there's no way for third-party code to
pass such a struct to TS_phrase_execute, it shouldn't create an ABI problem
as long as we don't move the offsets of the existing fields.

Another related problem was that index searches supposed that "!x <-> y"
could be lossily approximated as "!x & y", which isn't correct because
the latter will reject, say, "x q y" which the query itself accepts.
This required some tweaking in TS_execute_ternary along with the main
tsquery engine.

Back-patch to 9.6 where phrase operators were introduced.  While this
could be argued to change behavior more than we'd like in a stable branch,
we have to do something about the crash hazards and index-vs-seqscan
inconsistency, and it doesn't seem desirable to let the unintuitive
behaviors induced by the rewriting implementation stand as precedent.

Discussion: https://postgr.es/m/28215.1481999808@sss.pgh.pa.us
Discussion: https://postgr.es/m/26706.1482087250@sss.pgh.pa.us
2016-12-21 15:18:39 -05:00
Stephen Frost 2d1018ca56 Improve ALTER TABLE documentation
The ALTER TABLE documentation wasn't terribly clear when it came to
which commands could be combined together and what it meant when they
were.

In particular, SET TABLESPACE *can* be combined with other commands,
when it's operating against a single table, but not when multiple tables
are being moved with ALL IN TABLESPACE.  Further, the actions are
applied together but not really in 'parallel', at least today.

Pointed out by: Amit Langote

Improved wording from Tom.

Back-patch to 9.4, where the ALL IN TABLESPACE option was added.

Discussion: https://www.postgresql.org/message-id/14c535b4-13ef-0590-1b98-76af355a0763%40lab.ntt.co.jp
2016-12-21 15:03:32 -05:00
Peter Eisentraut f3b421da5f Reorder pg_sequence columns to avoid alignment issue
On AIX, doubles are aligned at 4 bytes, but int64 is aligned at 8 bytes.
Our code assumes that doubles have alignment that can also be applied to
int64, but that fails in this case.  One effect is that
heap_form_tuple() writes tuples in a different layout than
Form_pg_sequence expects.

Rather than rewrite the whole alignment code, work around the issue by
reordering the columns in pg_sequence so that the first int64 column
naturally comes out at an 8-byte boundary.
2016-12-21 09:06:49 -05:00
Peter Eisentraut 1753b1b027 Add pg_sequence system catalog
Move sequence metadata (start, increment, etc.) into a proper system
catalog instead of storing it in the sequence heap object.  This
separates the metadata from the sequence data.  Sequence metadata is now
operated on transactionally by DDL commands, whereas previously
rollbacks of sequence-related DDL commands would be ignored.

Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2016-12-20 08:28:18 -05:00
Robert Haas e13029a5ce Provide a DSA area for all parallel queries.
This will allow future parallel query code to dynamically allocate
storage shared by all participants.

Thomas Munro, with assorted changes by me.
2016-12-19 17:11:46 -05:00
Fujii Masao 3901fd70cc Support quorum-based synchronous replication.
This feature is also known as "quorum commit" especially in discussion
on pgsql-hackers.

This commit adds the following new syntaxes into synchronous_standby_names
GUC. By using FIRST and ANY keywords, users can specify the method to
choose synchronous standbys from the listed servers.

  FIRST num_sync (standby_name [, ...])
  ANY num_sync (standby_name [, ...])

The keyword FIRST specifies a priority-based synchronous replication
which was available also in 9.6 or before. This method makes transaction
commits wait until their WAL records are replicated to num_sync
synchronous standbys chosen based on their priorities.

The keyword ANY specifies a quorum-based synchronous replication
and makes transaction commits wait until their WAL records are
replicated to *at least* num_sync listed standbys. In this method,
the values of sync_state.pg_stat_replication for the listed standbys
are reported as "quorum". The priority is still assigned to each standby,
but not used in this method.

The existing syntaxes having neither FIRST nor ANY keyword are still
supported. They are the same as new syntax with FIRST keyword, i.e.,
a priorirty-based synchronous replication.

Author: Masahiko Sawada
Reviewed-By: Michael Paquier, Amit Kapila and me
Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com>

Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-12-19 21:15:30 +09:00
Peter Eisentraut b645a05fc6 doc: Remove some trailing whitespace
Per discussion, we will not at this time remove trailing whitespace in
psql output displays where it is part of the actual psql output.

From: Vladimir Rusinov <vrusinov@google.com>
2016-12-17 09:35:31 -05:00
Robert Haas 3761fe3c20 Simplify LWLock tranche machinery by removing array_base/array_stride.
array_base and array_stride were added so that we could identify the
offset of an LWLock within a tranche, but this facility is only very
marginally used apart from the main tranche.  So, give every lock in
the main tranche its own tranche ID and get rid of array_base,
array_stride, and all that's attached.  For debugging facilities
(Trace_lwlocks and LWLOCK_STATS) print the pointer address of the
LWLock using %p instead of the offset.  This is arguably more useful,
and certainly a lot cheaper.  Drop the offset-within-tranche from
the information reported to dtrace and from one can't-happen message
inside lwlock.c.

The main user-visible impact of this change is that pg_stat_activity
will now report all waits for LWLocks as "LWLock" rather than
reporting some as "LWLockTranche" and others as "LWLockNamed".

The main motivation for this change is that the need to specify an
array_base and an array_stride is awkward for parallel query.  There
is only a very limited supply of tranche IDs so we can't just keep
allocating new ones, and if we try to use the same tranche IDs every
time then we run into trouble when multiple parallel contexts are
use simultaneously.  So if we didn't get rid of this mechanism we'd
have to make it even more complicated.  By simplifying it in this
way, we instead reduce the size of the generated code for lwlock.c
by about 5%.

Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
2016-12-16 11:29:23 -05:00
Fujii Masao 4e344c2cf4 Add missing documentation for effective_io_concurrency tablespace option.
The description of effective_io_concurrency option was missing in ALTER
TABLESPACE docs though it's included in CREATE TABLESPACE one.

Back-patch to 9.6 where effective_io_concurrency tablespace option was added.

Michael Paquier, reported by Marc-Olaf Jaschke
2016-12-17 01:25:29 +09:00
Robert Haas a1a4459c29 doc: Improve documentation related to table partitioning feature.
Commit f0e44751d7 implemented table
partitioning, but failed to mention the "no row movement"
restriction in the documentation.  Fix that and a few other issues.

Amit Langote, with some additional wordsmithing by me.
2016-12-13 08:18:00 -05:00