The ALTER TABLE documentation wasn't terribly clear when it came to
which commands could be combined together and what it meant when they
were.
In particular, SET TABLESPACE *can* be combined with other commands,
when it's operating against a single table, but not when multiple tables
are being moved with ALL IN TABLESPACE. Further, the actions are
applied together but not really in 'parallel', at least today.
Pointed out by: Amit Langote
Improved wording from Tom.
Back-patch to 9.4, where the ALL IN TABLESPACE option was added.
Discussion: https://www.postgresql.org/message-id/14c535b4-13ef-0590-1b98-76af355a0763%40lab.ntt.co.jp
Per discussion, we will not at this time remove trailing whitespace in
psql output displays where it is part of the actual psql output.
From: Vladimir Rusinov <vrusinov@google.com>
The description of effective_io_concurrency option was missing in ALTER
TABLESPACE docs though it's included in CREATE TABLESPACE one.
Back-patch to 9.6 where effective_io_concurrency tablespace option was added.
Michael Paquier, reported by Marc-Olaf Jaschke
Commit f0e44751d7 implemented table
partitioning, but failed to mention the "no row movement"
restriction in the documentation. Fix that and a few other issues.
Amit Langote, with some additional wordsmithing by me.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
If the PAGER environment variable is set but contains an empty string,
psql would pass it to "sh" which would silently exit, causing whatever
query output we were printing to vanish entirely. This is quite
mystifying; it took a long time for us to figure out that this was the
cause of Joseph Brenner's trouble report. Rather than allowing that
to happen, we should treat this as another way to specify "no pager".
(We could alternatively treat it as selecting the default pager, but
it seems more likely that the former is what the user meant to achieve
by setting PAGER this way.)
Nonempty, but all-white-space, PAGER values have the same behavior, and
it's pretty easy to test for that, so let's handle that case the same way.
Most other cases of faulty PAGER values will result in the shell printing
some kind of complaint to stderr, which should be enough to diagnose the
problem, so we don't need to work harder than this. (Note that there's
been an intentional decision not to be very chatty about apparent failure
returns from the pager process, since that may happen if, eg, the user
quits the pager with control-C or some such. I'd just as soon not start
splitting hairs about which exit codes might merit making our own report.)
libpq's old PQprint() function was already on board with ignoring empty
PAGER values, but for consistency, make it ignore all-white-space values
as well.
It's been like this a long time, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAFfgvXWLOE2novHzYjmQK8-J6TmHz42G8f3X0SORM44+stUGmw@mail.gmail.com
We have had support for restrictive RLS policies since 9.5, but they
were only available through extensions which use the appropriate hooks.
This adds support into the grammer, catalog, psql and pg_dump for
restrictive RLS policies, thus reducing the cases where an extension is
necessary.
In passing, also move away from using "AND"d and "OR"d in comments.
As pointed out by Alvaro, it's not really appropriate to attempt
to make verbs out of "AND" and "OR", so reword those comments which
attempted to.
Reviewed By: Jeevan Chalke, Dean Rasheed
Discussion: https://postgr.es/m/20160901063404.GY4028@tamriel.snowman.net
Add an option to exclude blobs when running pg_dump. By default, blobs
are included but this option can be used to exclude them while keeping
the rest of the dump.
Commment updates and regression tests from me.
Author: Guillaume Lelarge
Reviewed-by: Amul Sul
Discussion: https://postgr.es/m/VisenaEmail.48.49926ea6f91dceb6.15355a48249@tc7-visena
The documentation around the -b/--blobs option to pg_dump seemed to
imply that it might be possible to add blobs to a "schema-only" dump or
similar. Clarify that blobs are data and therefore will only be
included in dumps where data is being included, even when -b is used to
request blobs be included.
The -b option has been around since before 9.2, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/20161119173316.GA13284@tamriel.snowman.net
An example in the psql documentation had an incorrect field name from
what the command actually produced.
Pointed out by Fabien COELHO
Back-patch to 9.6 where the example was added.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1611291349400.19314@lancre
Previously, the right-hand side of a multiple-column assignment, if it
wasn't a sub-SELECT, had to be a simple parenthesized expression list,
because gram.y was responsible for "bursting" the construct into
independent column assignments. This had the minor defect that you
couldn't write ROW (though you should be able to, since the standard says
this is a row constructor), and the rather larger defect that unlike other
uses of row constructors, we would not expand a "foo.*" item into multiple
columns.
Fix that by changing the RHS to be just "a_expr" in the grammar, leaving
it to transformMultiAssignRef to separate the elements of a RowExpr;
which it will do only after performing standard transformation of the
RowExpr, so that "foo.*" behaves as expected.
The key reason we didn't do that before was the hard-wired handling of
DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by
allowing DEFAULT in any a_expr and having parse analysis throw an error
if SetToDefault is found in an unexpected place. That's an improvement
anyway since the error can be more specific than just "syntax error".
The SQL standard suggests that the RHS could be any a_expr yielding a
suitable row value. This patch doesn't really move the goal posts in that
respect --- you're still limited to RowExpr or a sub-SELECT --- but it does
fix the grammar restriction, so it provides some tangible progress towards
a full implementation. And the limitation is now documented by an explicit
error message rather than an unhelpful "syntax error".
Discussion: <8542.1479742008@sss.pgh.pa.us>
Commit 3c4cf08087 should have removed SET TABLESPACE from the synopsis
of ALTER MATERIALIZE VIEW as a possible "action" when it added a
separate line for it in the main command listing, but failed to.
Repair.
Backpatch to 9.4, like the aforementioned commit.
We just pass the data to the INSTEAD trigger.
Haribabu Kommi, reviewed by Dilip Kumar
Patch: <CAJrrPGcSQkrNkO+4PhLm4B8UQQQmU9YVUuqmtgM=pmzMfxWaWQ@mail.gmail.com>
This is infrastructure for the complete SQL standard feature. No
support is included at this point for execution nodes or PLs. The
intent is to add that soon.
As this patch leaves things, standard syntax can create tuplestores
to contain old and/or new versions of rows affected by a statement.
References to these tuplestores are in the TriggerData structure.
C triggers can access the tuplestores directly, so they are usable,
but they cannot yet be referenced within a SQL statement.
This will write the received transaction log into a file called
pg_wal.tar(.gz) next to the other tarfiles instead of writing it to
base.tar. When using fetch mode, the transaction log is still written to
base.tar like before, and when used against a pre-10 server, the file
is named pg_xlog.tar.
To do this, implement a new concept of a "walmethod", which is
responsible for writing the WAL. Two implementations exist, one that
writes to a plain directory (which is also used by pg_receivexlog) and
one that writes to a tar file with optional compression.
Reviewed by Michael Paquier
"xlog" is not a particularly clear abbreviation for "write-ahead log",
and it sometimes confuses users into believe that the contents of the
"pg_xlog" directory are not critical data, leading to unpleasant
consequences. So, rename the directory to "pg_wal".
This patch modifies pg_upgrade and pg_basebackup to understand both
the old and new directory layouts; the former is necessary given the
purpose of the tool, while the latter merely avoids an unnecessary
backward-compatibility break.
We may wish to consider renaming other programs, switches, and
functions which still use the old "xlog" naming to also refer to
"wal". However, that's still under discussion, so let's do just this
much for now.
Discussion: CAB7nPqTeC-8+zux8_-4ZD46V7YPwooeFxgndfsq5Rg8ibLVm1A@mail.gmail.com
Michael Paquier
--noclean and --nosync were the only options spelled without a hyphen,
so change this for consistency with other options. The options in
pg_basebackup have not been in a release, so we just rename them. For
initdb, we retain the old variants.
Vik Fearing and me
The need for dumping from such ancient servers has decreased to about nil
in the field, so let's remove all the code that catered to it. Aside
from removing a lot of boilerplate variant queries, this allows us to not
have to cope with servers that don't have (a) schemas or (b) pg_depend.
That means we can get rid of assorted squishy code around that. There
may be some nonobvious additional simplifications possible, but this patch
already removes about 1500 lines of code.
I did not remove the ability for pg_restore to read custom-format archives
generated by these old versions (and light testing says that that does
still work). If you have an old server, you probably also have a pg_dump
that will work with it; but you have an old custom-format backup file,
that might be all you have.
It'd be possible at this point to remove fmtQualifiedId()'s version
argument, but I refrained since that would affect code outside pg_dump.
Discussion: <2661.1475849167@sss.pgh.pa.us>
It was perhaps not entirely clear that internal self-references shouldn't
be schema-qualified even if the view name is written with a schema.
Spell it out.
Discussion: <871sznz69m.fsf@metapensiero.it>
Without this, an extension containing an access method is not properly
dumped/restored during pg_upgrade --- the AM ends up not being a member
of the extension after upgrading.
Another oversight in commit 473b93287, reported by Andrew Dunstan.
Report: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
The list of files and directories that pg_basebackup excludes from the
backup was somewhat incomplete and unorganized. Change that with having
the exclusion driven from tables. Clean up some code around it. Also
document the exclusions in more detail so that users of pg_start_backup
can make use of it as well.
The contents of these directories are now excluded from the backup:
pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans
Also fix a bug that a pg_repl_slot or pg_stat_tmp being a symlink would
cause a corrupt tar header to be created. Now such symlinks are
included in the backup as empty directories. Bug found by Ashutosh
Sharma <ashu.coek88@gmail.com>.
From: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
We had thirty different GIN array opclasses sharing the same operators and
support functions. That still didn't cover all the built-in types, nor
did it cover arrays of extension-added types. What we want is a single
polymorphic opclass for "anyarray". There were two missing features needed
to make this possible:
1. We have to be able to declare the index storage type as ANYELEMENT
when the opclass is declared to index ANYARRAY. This just takes a few
more lines in index_create(). Although this currently seems of use only
for GIN, there's no reason to make index_create() restrict it to that.
2. We have to be able to identify the proper GIN compare function for
the index storage type. This patch proceeds by making the compare function
optional in GIN opclass definitions, and specifying that the default btree
comparison function for the index storage type will be looked up when the
opclass omits it. Again, that seems pretty generically useful.
Since the comparison function lookup is done in initGinState(), making
use of the second feature adds an additional cache lookup to GIN index
access setup. It seems unlikely that that would be very noticeable given
the other costs involved, but maybe at some point we should consider
making GinState data persist longer than it now does --- we could keep it
in the index relcache entry, perhaps.
Rather fortuitously, we don't seem to need to do anything to get this
change to play nice with dump/reload or pg_upgrade scenarios: the new
opclass definition is automatically selected to replace existing index
definitions, and the on-disk data remains compatible. Also, if a user has
created a custom opclass definition for a non-builtin type, this doesn't
break that, since CREATE INDEX will prefer an exact match to opcintype
over a match to ANYARRAY. However, if there's anyone out there with
handwritten DDL that explicitly specifies _bool_ops or one of the other
replaced opclass names, they'll need to adjust that.
Tom Lane, reviewed by Enrique Meneses
Discussion: <14436.1470940379@sss.pgh.pa.us>
When waiting is selected for the promote action, look into pg_control
until the state changes, then use the PQping-based waiting until the
server is reachable.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
This is similar to the -N option in pg_dump, except that it doesn't take
a pattern, just like the existing -n option in pg_restore.
From: Michael Banck <michael.banck@credativ.de>
Like initdb, clean up created data and xlog directories, unless the new
-n/--noclean option is specified.
Tablespace directories are not cleaned up, but a message is written
about that.
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Not much to be said about this patch: it does what it says on the tin.
In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists
to clarify what it actually does. In the discussion of this patch
we considered supporting other similar options, such as IF EXISTS
on the type as a whole or IF NOT EXISTS on the target name. This
patch doesn't actually add any such feature, but it might happen later.
Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli
Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>
Document the formerly-undocumented behavior that schema and comment
control-file entries for an extension are honored only during initial
installation, whereas other properties are also honored during updates.
While at it, do some copy-editing on the recently-added docs for CREATE
EXTENSION ... CASCADE, use links for some formerly vague cross references,
and make a couple other minor improvements.
Back-patch to 9.6 where CASCADE was added. The other parts of this
could go further back, but they're probably not important enough to
bother.
To prevent possibly breaking indexes on enum columns, we must keep
uncommitted enum values from getting stored in tables, unless we
can be sure that any such column is new in the current transaction.
Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE
from being executed at all in a transaction block, unless the target
enum type had been created in the current transaction. This patch
removes that restriction, and instead insists that an uncommitted enum
value can't be referenced unless it belongs to an enum type created
in the same transaction as the value. Per discussion, this should be
a bit less onerous. It does require each function that could possibly
return a new enum value to SQL operations to check this restriction,
but there aren't so many of those that this seems unmaintainable.
Andrew Dunstan and Tom Lane
Discussion: <4075.1459088427@sss.pgh.pa.us>
In addition to the existing decimal-milliseconds output value,
display the same value in mm:ss.fff format if it exceeds one second.
Tack on hours and even days fields if the interval is large enough.
This avoids needing mental arithmetic to convert the values into
customary time units.
Corey Huinker, reviewed by Gerdan Santos; bikeshedding by many
Discussion: <CADkLM=dbC4R8sbbuFXQVBFWoJGQkTEW8RWnC0PbW9nZsovZpJQ@mail.gmail.com>
Apparently that's not obvious to everybody, so let's belabor the point.
In passing, document that DROP POLICY has CASCADE/RESTRICT options (which
it does, per gram.y) but they do nothing (I assume, anyway). Also update
some long-obsolete commentary in gram.y.
Discussion: <20160805104837.1412.84915@wrigleys.postgresql.org>
The decision to reuse values of parameters from a previous connection
has been based on whether the new target is a conninfo string. Add this
means of overriding that default. This feature arose as one component
of a fix for security vulnerabilities in pg_dump, pg_dumpall, and
pg_upgrade, so back-patch to 9.1 (all supported versions). In 9.3 and
later, comment paragraphs that required update had already-incorrect
claims about behavior when no connection is open; fix those problems.
Security: CVE-2016-5424
The help message for pg_basebackup specifies that the numbers 0 through 9
are accepted as valid values of -Z option. But, previously -Z 0 was rejected
as an invalid compression level.
Per discussion, it's better to make pg_basebackup treat 0 as valid
compression level meaning no compression, like pg_dump.
Back-patch to all supported versions.
Reported-By: Jeff Janes
Reviewed-By: Amit Kapila
Discussion: CAMkU=1x+GwjSayc57v6w87ij6iRGFWt=hVfM0B64b1_bPVKRqg@mail.gmail.com
This text was added by commit ff213239c, and not long thereafter obsoleted
by commit 4adc2f72a (which made the test depend on NBuffers instead); but
nobody noticed the need for an update. Commit 9563d5b5e adds some further
dependency on maintenance_work_mem, but the existing verbiage seems to
cover that with about as much precision as we really want here. Let's
just take it all out rather than leaving ourselves open to more errors of
omission in future. (That solution makes this change back-patchable, too.)
Noted by Peter Geoghegan.
Discussion: <CAM3SWZRVANbj9GA9j40fAwheQCZQtSwqTN1GBTVwRrRbmSf7cg@mail.gmail.com>
The docs failed to explain that LIKE INCLUDING INDEXES would not preserve
the names of indexes and associated constraints. Also, it wasn't mentioned
that EXCLUDE constraints would be copied by this option. The latter
oversight seems enough of a documentation bug to justify back-patching.
In passing, do some minor copy-editing in the same area, and add an entry
for LIKE under "Compatibility", since it's not exactly a faithful
implementation of the standard's feature.
Discussion: <20160728151154.AABE64016B@smtp.hushmail.com>
Clarify that the reason for recommending that pg_temp be put last is to
prevent temporary tables from capturing unqualified table names. Per
discussion with Albe Laurenz.
Discussion: <A737B7A37273E048B164557ADEF4A58B5386C6E1@ntex2010i.host.magwien.gv.at>
Add display of proparallel (parallel-safety) when the server is >= 9.6,
and display of proacl (access privileges) for all server versions.
Minor tweak of column ordering to keep related columns together.
Michael Paquier
Discussion: <CAB7nPqTR3Vu3xKOZOYqSm-+bSZV0kqgeGAXD6w5GLbkbfd5Q6w@mail.gmail.com>
Add a section to xaggr.sgml, as we have done in the past for other
extensions to the aggregation functionality. Assorted wordsmithing
and other minor improvements.
David Rowley and Tom Lane
The original specification for this called for the deserialization function
to have signature "deserialize(serialtype) returns transtype", which is a
security violation if transtype is INTERNAL (which it always would be in
practice) and serialtype is not (which ditto). The patch blithely overrode
the opr_sanity check for that, which was sloppy-enough work in itself,
but the indisputable reason this cannot be allowed to stand is that CREATE
FUNCTION will reject such a signature and thus it'd be impossible for
extensions to create parallelizable aggregates.
The minimum fix to make the signature type-safe is to add a second, dummy
argument of type INTERNAL. But to lock it down a bit more and make misuse
of INTERNAL-accepting functions less likely, let's get rid of the ability
to specify a "serialtype" for an aggregate and just say that the only
useful serialtype is BYTEA --- which, in practice, is the only interesting
value anyway, due to the usefulness of the send/recv infrastructure for
this purpose. That means we only have to allow "serialize(internal)
returns bytea" and "deserialize(bytea, internal) returns internal" as
the signatures for these support functions.
In passing fix bogus signature of int4_avg_combine, which I found thanks
to adding an opr_sanity check on combinefunc signatures.
catversion bump due to removing pg_aggregate.aggserialtype and adjusting
signatures of assorted built-in functions.
David Rowley and Tom Lane
Discussion: <27247.1466185504@sss.pgh.pa.us>
Dilian Palauzov pointed out in bug #14201 that the docs failed to mention
the possibility of %R producing '(' due to an unmatched parenthesis.
He proposed just adding that in the same style as the other options were
listed; but it seemed to me that the sentence was already nearly
unintelligible, so I rewrote it a bit more extensively.
Report: <20160619121113.5789.68274@wrigleys.postgresql.org>
If you really want to vacuum every single page in the relation,
regardless of apparent visibility status or anything else, you can use
this option. In previous releases, this behavior could be achieved
using VACUUM (FREEZE), but because we can now recognize all-frozen
pages as not needing to be frozen again, that no longer works. There
should be no need for routine use of this option, but maybe bugs or
disaster recovery will necessitate its use.
Patch by me, reviewed by Andres Freund.
This terminology provoked widespread complaints. So, instead, rename
the GUC max_parallel_degree to max_parallel_workers_per_gather
(leaving room for a possible future GUC max_parallel_workers that acts
as a system-wide limit), and rename the parallel_degree reloption to
parallel_workers. Rename structure members to match.
These changes create a dump/restore hazard for users of PostgreSQL
9.6beta1 who have set the reloption (or applied the GUC using ALTER
USER or ALTER DATABASE).
Commit 6820094d1 mixed up types of parent object (table) with type of
sub-object being commented on. Noticed while fixing docs for
COMMENT ON ACCESS METHOD.
Backpatch to 9.5, like that commit.
Mostly these are just comments but there are a few in documentation
and a handful in code and tests. Hopefully this doesn't cause too much
unnecessary pain for backpatching. I relented from some of the most
common like "thru" for that reason. The rest don't seem numerous
enough to cause problems.
Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
In the documentation for nextval(), point out explicitly that INSERT ...
ON CONFLICT will call nextval() if needed for the insertion case, whether
or not it ends up following the ON CONFLICT path. This seems to be a
matter of some confusion, cf bug #14126, so let's be clear about it.
Also mention the issue in the CREATE SEQUENCE reference page, since that
is another place where people might expect such things to be covered.
Minor wording improvements nearby, as well.
Back-patch to 9.5 where ON CONFLICT was introduced.
These functions behave like the backend's least/greatest functions,
not like min/max, so the originally-chosen names invite confusion.
Per discussion, rename to least/greatest.
I also took it upon myself to make them return double if any input is
double. The previous behavior of silently coercing all inputs to int
surely does not meet the principle of least astonishment.
Copy-edit some of the other new functions' documentation, too.
\crosstabview interpreted its arguments in an unusual way, including
doing case-insensitive matching of unquoted column names, which is
surely not the right thing. Rip that out in favor of doing something
equivalent to the dequoting/case-folding rules used by other psql
commands. To keep it simple, change the syntax so that the optional
sort column is specified as a separate argument, instead of the
also-quite-unusual syntax that attached it to the colH argument with
a colon.
Also, rework the error messages to be closer to project style.
Fix misleading syntax summary (there cannot be a space between colH and
scolH). Provide a link from the existing crosstab() function's
documentation to \crosstabview. Copy-edit the command's description.
Christoph Berg and Tom Lane
\crosstabview is a completely different way to display results from a
query: instead of a vertical display of rows, the data values are placed
in a grid where the column and row headers come from the data itself,
similar to a spreadsheet.
The sort order of the horizontal header can be specified by using
another column in the query, and the vertical header determines its
ordering from the order in which they appear in the query.
This only allows displaying a single value in each cell. If more than
one value correspond to the same cell, an error is thrown. Merging of
values can be done in the query itself, if necessary. This may be
revisited in the future.
Author: Daniel Verité
Reviewed-by: Pavel Stehule, Dean Rasheed
This will prevent users from creating roles which begin with "pg_" and
will check for those roles before allowing an upgrade using pg_upgrade.
This will allow for default roles to be provided at initdb time.
Reviews by José Luis Tallón and Robert Haas
Now indexes (but only B-tree for now) can contain "extra" column(s) which
doesn't participate in index structure, they are just stored in leaf
tuples. It allows to use index only scan by using single index instead
of two or more indexes.
Author: Anastasia Lubennikova with minor editorializing by me
Reviewers: David Rowley, Peter Geoghegan, Jeff Janes
The code that estimates what parallel degree should be uesd for the
scan of a relation is currently rather stupid, so add a parallel_degree
reloption that can be used to override the planner's rather limited
judgement.
Julien Rouhaud, reviewed by David Rowley, James Sewell, Amit Kapila,
and me. Some further hacking by me.
This introduces a new dependency type which marks an object as depending
on an extension, such that if the extension is dropped, the object
automatically goes away; and also, if the database is dumped, the object
is included in the dump output. Currently the grammar supports this for
indexes, triggers, materialized views and functions only, although the
utility code is generic so adding support for more object types is a
matter of touching the parser rules only.
Author: Abhijit Menon-Sen
Reviewed-by: Alexander Korotkov, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
has_parallel_hazard() was ignoring the proparallel markings for
aggregates, which is no good. Fix that. There was no way to mark
an aggregate as actually being parallel-safe, either, so add a
PARALLEL option to CREATE AGGREGATE.
Patch by me, reviewed by David Rowley.
\gexec executes the just-entered query, like \g, but instead of printing
the results it takes each field as a SQL command to send to the server.
Computing a series of queries to be executed is a fairly common thing,
but up to now you always had to resort to kluges like writing the queries
to a file and then inputting the file. Now it can be done with no
intermediate step.
The implementation is fairly straightforward except for its interaction
with FETCH_COUNT. ExecQueryUsingCursor isn't capable of being called
recursively, and even if it were, its need to create a transaction
block interferes unpleasantly with the desired behavior of \gexec after
a failure of a generated query (i.e., that it can continue). Therefore,
disable use of ExecQueryUsingCursor when doing the master \gexec query.
We can still apply it to individual generated queries, however, and there
might be some value in doing so.
While testing this feature's interaction with single-step mode, I (tgl) was
led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed)
as a negative response to the single-step prompt. Perhaps that's a
back-patchable bug fix, but for now I just included it here.
Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself
Often, upon getting an unexpected error in psql, one's first wish is that
the verbosity setting had been higher; for example, to be able to see the
schema-name field or the server code location info. Up to now the only way
has been to adjust the VERBOSITY variable and repeat the failing query.
That's a pain, and it doesn't work if the error isn't reproducible.
This commit adds a psql feature that redisplays the most recent server
error at full verbosity, without needing to make any variable changes or
re-execute the failed command. We just need to hang onto the latest error
PGresult in case the user executes \errverbose, and then apply libpq's
new PQresultVerboseErrorMessage() function to it. This will consume
some trivial amount of psql memory, but otherwise the cost when the
feature isn't used should be negligible.
Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
The server hasn't paid attention to the TZ environment variable since
commit ca4af308c3, but that commit missed removing this documentation
reference, as did commit d883b916a9 which added the reference where
it now belongs (initdb).
Back-patch to 9.2 where the behavior changed. Also back-patch
d883b916a9 as needed.
Matthew Somerville
This is necessary infrastructure for supporting parallel aggregation
for aggregates whose transition type is "internal". Such values
can't be passed between cooperating processes, because they are
just pointers.
David Rowley, reviewed by Tomas Vondra and by me.
The description of what the per-transaction log file says for skipped
transactions is just plain wrong.
Report and patch by Tomas Vondra, reviewed by Fabien Coelho and
modified by me.
This refines the previous weight range and allows a script to be "turned
off" by passing a zero weight, which is useful when scripting multiple
pgbench runs.
I did not apply the suggested warning when a script uses zero weight; we
use the principle elsewhere that if there's nothing to be done, do
nothing quietly.
Adjust docs accordingly.
Author: Jeff Janes, Fabien Coelho
You can now do the same thing via \set using the appropriate function,
either random(), random_gaussian(), or random_exponential(), depending
on the desired distribution. This is not backward-compatible, but per
discussion, it's worth it to avoid having the old syntax hang around
forever.
Fabien Coelho, reviewed by Michael Paquier, and adjusted by me.
The new functions are pi(), random(), random_exponential(),
random_gaussian(), and sqrt(). I was worried that this would be
slower than before, but, if anything, it actually turns out to be
slightly faster, because we now express the built-in pgbench scripts
using fewer lines; each \setrandom can be merged into a subsequent
\set.
Fabien Coelho
This enables external code to create access methods. This is useful so
that extensions can add their own access methods which can be formally
tracked for dependencies, so that DROP operates correctly. Also, having
explicit support makes pg_dump work correctly.
Currently only index AMs are supported, but we expect different types to
be added in the future.
Authors: Alexander Korotkov, Petr Jelínek
Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby
Commitfest-URL: https://commitfest.postgresql.org/9/353/
Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
Include the \pset title string if there is one, and shorten the prefab
part of the header to be "timestamp (every Ns)". Per suggestion by
David Johnston.
Michael Paquier and Tom Lane
To allow multiline SQL commands in scripts, adopt the same rules psql uses
to decide what is the end of a SQL command, to wit, an unquoted semicolon
not encased in parentheses. Do this by importing the same flex lexer that
psql uses, since coping with stuff like dollar-quoted literals is hard to
get right without going the full nine yards.
This makes use of the infrastructure added in commit 0ea9efbe9e to
support independently-written flex lexers scanning the same PsqlScanState
input-buffer data structure. Since that infrastructure isn't very
friendly to ad-hoc parsing code such as strtok(), improve exprscan.l
so that it can parse either whitespace-separated words or expression
tokens, on demand, and rewrite pgbench.c's backslash-command parsing
code to always use the lexer to fetch tokens.
It's still the case that pgbench backslash commands extend to the end
of the line, no more and no less. That could be changed in a fairly
localized way now, and there was some interest in doing so, but it
seems like material for a separate patch.
In passing, make some marginal cleanups in syntax error reporting,
const-ify a few data structures that could use it, and run some of
this code through pgindent.
I can't tell whether the MSVC build scripts need to be taught explicitly
about the changes here or not, but the buildfarm will soon tell us.
Kyotaro Horiguchi and Tom Lane
Previously, all scripts had the same probability of being chosen when
multiple of them were specified via -b, -f, -N, -S. With this commit,
-b and -f now search for an "@" in the script name and use the integer
found after it as the drawing probability for that script.
(One disadvantage is that if you have script whose names contain @, you
are now forced to specify "@1" at the end; otherwise the name's @ is
confused with a weight separator. We don't expect many pgbench script
with @ in their names in the wild, so this shouldn't be too serious a
problem.)
While at it, rework the interface between addScript, process_file,
process_builtin, and findBuiltin. It had gotten a bit out of hand with
recent commits.
Author: Fabien Coelho
Reviewed-By: Andres Freund, Robert Haas, Álvaro Herrera, Michaël Paquier
Discussion: http://www.postgresql.org/message-id/alpine.DEB.2.10.1603160721240.1666@sto
The distinction between "archive" and "hot_standby" existed only because
at the time "hot_standby" was added, there was some uncertainty about
stability. This is now a long time ago. We would like to move forward
with simplifying the replication configuration, but this distinction is
in the way, because a primary server cannot tell (without asking a
standby or predicting the future) which one of these would be the
appropriate level.
Pick a new name for the combined setting to make it clearer that it
covers all (non-logical) backup and replication uses. The old values
are still accepted but are converted internally.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
It is frequently useful for volatile, set-returning, or expensive functions
in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are
done. Otherwise, the functions might be executed for every row of the
table despite the presence of LIMIT, and/or be executed in an unexpected
order. For example, in
SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10;
it's probably desirable that the nextval() values are ordered the same
as x, and that nextval() is not run more than 10 times.
In the past, Postgres was inconsistent in this area: you would get the
desirable behavior if the ordering were performed via an indexscan, but
not if it had to be done by an explicit sort step. Getting the desired
behavior reliably required contortions like
SELECT x, nextval('seq')
FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10;
This patch conditionally postpones evaluation of pure-output target
expressions (that is, those that are not used as DISTINCT, ORDER BY, or
GROUP BY columns) so that they effectively occur after sorting, even if an
explicit sort step is necessary. Volatile expressions and set-returning
expressions are always postponed, so as to provide consistent semantics.
Expensive expressions (costing more than 10 times typical operator cost,
which by default would include any user-defined function) are postponed
if there is a LIMIT or if there are expressions that must be postponed.
We could be more aggressive and postpone any nontrivial expression, but
there are costs associated with doing so: it requires an extra Result plan
node which adds some overhead, and postponement changes the volume of data
going through the sort step, perhaps for the worse. Since we tend not to
have very good estimates of the output width of nontrivial expressions,
it's hard to have much confidence in our ability to predict whether
postponement would increase or decrease the cost of the sort; therefore
this patch doesn't attempt to make decisions conditionally on that.
Between these factors and a general desire not to change query behavior
when there's not a demonstrable benefit, it seems best to be conservative
about applying postponement. We might tweak the decision rules in the
future, though.
Konstantin Knizhnik, heavily rewritten by me
The chapter "Interfacing Extensions To Indexes" and CREATE OPERATOR
CLASS reference page were missed when BRIN was added. We document
all our other index access methods there, so make sure BRIN complies.
Author: Álvaro Herrera
Reported-By: Julien Rouhaud, Tom Lane
Reviewed-By: Emre Hasegeli
Discussion: https://www.postgresql.org/message-id/56CF604E.9000303%40dalibo.com
Backpatch: 9.5, where BRIN was introduced
The pg_resetxlog reference page didn't have a proper options list, only
running text listing the options and some explanations of them. This
might have worked when there were only a few options, but the list has
grown over the releases, and now it's hard to find an option and its
associated explanation. So write out the options list as on other
reference pages.
Fabien Coelho, reviewed mostly by Michael Paquier and me, but also by
Heikki Linnakangas, BeomYong Lee, Kyotaro Horiguchi, Oleksander
Shulgin, and Álvaro Herrera.
Clarify the description of which transactions will block a CREATE INDEX
CONCURRENTLY command from proceeding, and mention that the index might
still not be usable after CREATE INDEX completes. (This happens if the
index build detected broken HOT chains, so that pg_index.indcheckxmin gets
set, and there are open old transactions preventing the xmin horizon from
advancing past the index's initial creation. I didn't want to explain what
broken HOT chains are, though, so I omitted an explanation of exactly when
old transactions prevent the index from being used.)
Per discussion with Chris Travers. Back-patch to all supported branches,
since the same text appears in all of them.
Many automated test suites call pg_ctl. Buildfarm members axolotl,
hornet, mandrill, shearwater, sungazer and tern have failed when server
shutdown took longer than the pg_ctl default 60s timeout. This addition
permits slow hosts to easily raise the timeout without us editing a
--timeout argument into every test suite pg_ctl call. Back-patch to 9.1
(all supported versions) for the sake of automated testing.
Reviewed by Tom Lane.
Get rid of the false implication that PRIMARY KEY is exactly equivalent to
UNIQUE + NOT NULL. That was more-or-less true at one time in our
implementation, but the standard doesn't say that, and we've grown various
features (many of them required by spec) that treat a pkey differently from
less-formal constraints. Per recent discussion on pgsql-general.
I failed to resist the temptation to do some other wordsmithing in the
same area.
Since there currently is only one possible parenthesized option, namely
VERBOSE, it's a bit pointless to show it with "{ } [, ... ]". The curly
braces are useless and therefore confusing, as seen in a recent question
from Karsten Hilbert. Remove the extra decoration for the time being;
we can put it back when and if REINDEX grows some more options.
Provide per-script statistical info (count of transactions executed
under that script, average latency for the whole script) after a
multi-script run, adding an intermediate level of detail to existing
global stats and per-command stats.
Author: Fabien Coelho
Reviewer: Michaël Paquier, Álvaro Herrera
This function cleans up the pending list of the GIN index by
moving entries in it to the main GIN data structure in bulk.
It returns the number of pages cleaned up from the pending list.
This function is useful, for example, when the pending list
needs to be cleaned up *quickly* to improve the performance of
the search using GIN index. VACUUM can do the same thing, too,
but it may take days to run on a large table.
Jeff Janes,
reviewed by Julien Rouhaud, Jaime Casanova, Alvaro Herrera and me.
Discussion: CAMkU=1x8zFkpfnozXyt40zmR3Ub_kHu58LtRmwHUKRgQss7=iQ@mail.gmail.com
Previously, it was possible to specify one or several custom scripts to
run, or only one of the builtin scripts. With this patch it is also
possible to specify to run the builtin scripts multiple times, using the
new -b option. Also, unify the code for both cases; this eases future
pgbench improvements.
Author: Fabien Coelho
Review: Michaël Paquier, Álvaro Herrera
Aggregate nodes now have two new modes: a "partial" mode where they
output the unfinalized transition state, and a "finalize" mode where
they accept unfinalized transition states rather than individual
values as input.
These new modes are not used anywhere yet, but they will be necessary
for parallel aggregation. The infrastructure also figures to be
useful for cases where we want to aggregate local data and remote
data via the FDW interface, and want to bring back partial aggregates
from the remote side that can then be combined with locally generated
partial aggregates to produce the final value. It may also be useful
even when neither FDWs nor parallelism are in play, as explained in
the comments in nodeAgg.c.
David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
Linnakangas, Haribabu Kommi, and me.
The conventions specified by the GiST SGML documentation were widely
ignored. For example, the strategy-number argument for "consistent" and
"distance" functions is specified to be a smallint, but most of the
built-in support functions declared it as an integer, and for that matter
the core code passed it using Int32GetDatum not Int16GetDatum. None of
that makes any real difference at runtime, but it's quite confusing for
newcomers to the code, and it makes it very hard to write an amvalidate()
function that checks support function signatures. So let's try to instill
some consistency here.
Another similar issue is that the "query" argument is not of a single
well-defined type, but could have different types depending on the strategy
(corresponding to search operators with different righthand-side argument
types). Some of the functions threw up their hands and declared the query
argument as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be to
declare it as being of the indexed data type, corresponding to a search
operator with both input types the same. So I've specified a convention
that that's what to do always.
Also, the result of the "union" support function actually must be of the
index's storage type, but the documentation suggested declaring it to
return "internal", and some of the functions followed that. Standardize
on telling the truth, instead.
Similarly, standardize on declaring the "same" function's inputs as
being of the storage type, not "internal".
Also, somebody had forgotten to add the "recheck" argument to both
the documentation of the "distance" support function and all of their
SQL declarations, even though the C code was happily using that argument.
Clean that up too.
Fix up some other omissions in the docs too, such as documenting that
union's second input argument is vestigial.
So far as the errors in core function declarations go, we can just fix
pg_proc.h and bump catversion. Adjusting the erroneous declarations in
contrib modules is more debatable: in principle any change in those
scripts should involve an extension version bump, which is a pain.
However, since these changes are purely cosmetic and make no functional
difference, I think we can get away without doing that.
Commit 43cd468cf0 added some wording to create_policy.sgml purporting
to warn users against a race condition of the sort that had been noted some
time ago by Peter Geoghegan. However, that warning was far too vague to be
useful (or at least, I completely failed to grasp what it was on about).
Since the problem case occurs with a security design pattern that lots of
people are likely to try to use, we need to be as clear as possible about
it. Provide a concrete example in the main-line docs in place of the
original warning.
Per a recommendation from Tomas Vondra, it's more helpful to refer to
the value that determines how skewed a Gaussian or exponential
distribution is as a parameter rather than a threshold.
Since it's not quite too late to get this right in 9.5, where it was
introduced, back-patch this. Most of the patch changes only comments
and documentation, but a few pgbench messages are altered to match.
Fabien Coelho, reviewed by Michael Paquier and by me.
Previously, -j caused the entire input file to be read in and executed as
a single command string. That's undesirable, not least because any error
causes the entire file to be regurgitated as the "failing query". Some
experimentation suggests a better rule: end the command string when we see
a semicolon immediately followed by two newlines, ie, an empty line after
a query. This serves nicely to break up the existing examples such as
information_schema.sql and system_views.sql. A limitation is that it's
no longer possible to write such a sequence within a string literal or
multiline comment in a file meant to be read with -j; but there are no
instances of such a problem within the data currently used by initdb.
(If someone does make such a mistake in future, it'll be obvious because
they'll get an unterminated-literal or unterminated-comment syntax error.)
Other than that, there shouldn't be any negative consequences; you're not
forced to end statements that way, it's just a better idea in most cases.
In passing, remove src/include/tcop/tcopdebug.h, which is dead code
because it's not included anywhere, and hasn't been for more than
ten years. One of the debug-support symbols it purported to describe
has been unreferenced for at least the same amount of time, and the
other is removed by this commit on the grounds that it was useless:
forcing -j mode all the time would have broken initdb. The lack of
complaints about that, or about the missing inclusion, shows that
no one has tried to use TCOP_DONTUSENEWLINE in many years.
Clarify that SELECT policies are now applied when SELECT rights
are required for a given query, even if the query is an UPDATE or
DELETE query. Pointed out by Noah.
Additionally, note the risk regarding concurrently open transactions
where a relation which controls access to the rows of another relation
are updated and the rows of the primary relation are also being
modified. Pointed out by Peter Geoghegan.
Back-patch to 9.5.
This has worked that way for a long time, maybe always, but you would
not have known it from the documentation. Also back-patch the notes
I added to HEAD earlier today about behavior of the "-f -" switch,
which likewise have been valid for many releases.
Commit d5563d7df9 drew complaints from Coverity, which quite
correctly complained that one copy of each -c or -f string was being
leaked. What's more, simple_action_list_append was allocating enough space
for still a third copy of each string as part of the SimpleActionListCell,
even though that coding method had been superseded by a separate strdup
operation. There were some other minor coding infelicities too. The
documentation needed more work as well, eg it forgot to explain that -c
causes psql not to accept any interactive input.
To support this, we must reconcile some historical anomalies in the
behavior of -c. In particular, as a backward-incompatibility, -c no
longer implies --no-psqlrc.
Pavel Stehule (code) and Catalin Iacob (documentation). Review by
Michael Paquier and myself. Proposed behavior per Tom Lane.
Allow pg_rewind to work when target timeline was switched. Now
user can return promoted standby to old master.
Target timeline history becomes a global variable. Index
in target timeline history is used in function interfaces instead of
specifying TLI directly. Thus, SimpleXLogPageRead() can easily start
reading XLOGs from next timeline when current timeline ends.
Author: Alexander Korotkov
Review: Michael Paquier
The POSIX standard for tar headers requires archive member sizes to be
printed in octal with at most 11 digits, limiting the representable file
size to 8GB. However, GNU tar and apparently most other modern tars
support a convention in which oversized values can be stored in base-256,
allowing any practical file to be a tar member. Adopt this convention
to remove two limitations:
* pg_dump with -Ft output format failed if the contents of any one table
exceeded 8GB.
* pg_basebackup failed if the data directory contained any file exceeding
8GB. (This would be a fatal problem for installations configured with a
table segment size of 8GB or more, and it has also been seen to fail when
large core dump files exist in the data directory.)
File sizes under 8GB are still printed in octal, so that no compatibility
issues are created except in cases that would have failed entirely before.
In addition, this patch fixes several bugs in the same area:
* In 9.3 and later, we'd defined tarCreateHeader's file-size argument as
size_t, which meant that on 32-bit machines it would write a corrupt tar
header for file sizes between 4GB and 8GB, even though no error was raised.
This broke both "pg_dump -Ft" and pg_basebackup for such cases.
* pg_restore from a tar archive would fail on tables of size between 4GB
and 8GB, on machines where either "size_t" or "unsigned long" is 32 bits.
This happened even with an archive file not affected by the previous bug.
* pg_basebackup would fail if there were files of size between 4GB and 8GB,
even on 64-bit machines.
* In 9.3 and later, "pg_basebackup -Ft" failed entirely, for any file size,
on 64-bit big-endian machines.
In view of these potential data-loss bugs, back-patch to all supported
branches, even though removal of the documented 8GB limit might otherwise
be considered a new feature rather than a bug fix.
These were discussed in three different sections of the manual, which
unsurprisingly had diverged over time; and the descriptions of individual
variables lacked stylistic consistency even within each section (and
frequently weren't in very good English anyway). Clean up the mess, and
remove some of the redundant information in hopes that future additions
will be less likely to re-introduce inconsistency. For instance I see
no need for maintenance.sgml to include its very own list of all the
autovacuum storage parameters, especially since that list was already
incomplete.
Fix some brain fade in commit a2dabf0e1d: erroneous variable names
in docs, rearrangements that made sentences less clear not more so,
undocumented and poorly-chosen-anyway API behaviors of subroutines,
bad grammar in error messages, copy-and-paste faults.
Albe Laurenz and Tom Lane
Once upon a time we did not have a separate CREATEROLE privilege, and
CREATEUSER effectively meant SUPERUSER. When we invented CREATEROLE
(in 8.1) we also added SUPERUSER so as to have a less confusing keyword
for this role property. However, we left CREATEUSER in place as a
deprecated synonym for SUPERUSER, because of backwards-compatibility
concerns. It's still there and is still confusing people, as for example
in bug #13694 from Justin Catterson. 9.6 will be ten years or so later,
which surely ought to be long enough to end the deprecation and just
remove these old keywords. Hence, do so.
We hyphenate "fixed-length" earlier in the same sentence, and overall we
more often use "variable-length" rather than "variable length".
Nikolay Shaplov
In general one may have to run both REASSIGN OWNED and DROP OWNED to get
rid of all the dependencies of a role to be dropped. This was alluded to
in the REASSIGN OWNED man page, but not really spelled out in full; and in
any case the procedure ought to be documented in a more prominent place
than that. Add a section to the "Database Roles" chapter explaining this,
and do a bit of wordsmithing in the relevant commands' man pages.
The documentation for the autovacuum_multixact_freeze_max_age and
autovacuum_freeze_max_age relation level parameters contained:
"Note that while you can set autovacuum_multixact_freeze_max_age very
small, or even zero, this is usually unwise since it will force frequent
vacuuming."
which hasn't been true since these options were made relation options,
instead of residing in the pg_autovacuum table (834a6da4f7).
Remove the outdated sentence. Even the lowered limits from 2596d70 are
high enough that this doesn't warrant calling out the risk in the CREATE
TABLE docs.
Per discussion with Tom Lane and Alvaro Herrera
Discussion: 26377.1443105453@sss.pgh.pa.us
Backpatch: 9.0- (in parts)
To allow users to force RLS to always be applied, even for table owners,
add ALTER TABLE .. FORCE ROW LEVEL SECURITY.
row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump
output is complete (by default).
Also add SECURITY_NOFORCE_RLS context to avoid data corruption when
ALTER TABLE .. FORCE ROW SECURITY is being used. The
SECURITY_NOFORCE_RLS security context is used only during referential
integrity checks and is only considered in check_enable_rls() after we
have already checked that the current user is the owner of the relation
(which should always be the case during referential integrity checks).
Back-patch to 9.5 where RLS was added.
Specifically, make its effect independent from the row_security GUC, and
make it affect permission checks pertinent to views the BYPASSRLS role
owns. The row_security GUC thereby ceases to change successful-query
behavior; it can only make a query fail with an error. Back-patch to
9.5, where BYPASSRLS was introduced.
Without CASCADE, if an extension has an unfullfilled dependency on
another extension, CREATE EXTENSION ERRORs out with "required extension
... is not installed". That is annoying, especially when that dependency
is an implementation detail of the extension, rather than something the
extension's user can make sense of.
In addition to CASCADE this also includes a small set of regression
tests around CREATE EXTENSION.
Author: Petr Jelinek, editorialized by Michael Paquier, Andres Freund
Reviewed-By: Michael Paquier, Andres Freund, Jeff Janes
Discussion: 557E0520.3040800@2ndquadrant.com
Commit 924bcf4f16 introduced a framework
for parallel computation in PostgreSQL that makes most but not all
built-in functions safe to execute in parallel mode. In order to have
parallel query, we'll need to be able to determine whether that query
contains functions (either built-in or user-defined) that cannot be
safely executed in parallel mode. This requires those functions to be
labeled, so this patch introduces an infrastructure for that. Some
functions currently labeled as safe may need to be revised depending on
how pending issues related to heavyweight locking under paralllelism
are resolved.
Parallel plans can't be used except for the case where the query will
run to completion. If portal execution were suspended, the parallel
mode restrictions would need to remain in effect during that time, but
that might make other queries fail. Therefore, this patch introduces
a framework that enables consideration of parallel plans only when it
is known that the plan will be run to completion. This probably needs
some refinement; for example, at bind time, we do not know whether a
query run via the extended protocol will be execution to completion or
run with a limited fetch count. Having the client indicate its
intentions at bind time would constitute a wire protocol break. Some
contexts in which parallel mode would be safe are not adjusted by this
patch; the default is not to try parallel plans except from call sites
that have been updated to say that such plans are OK.
This commit doesn't introduce any parallel paths or plans; it just
provides a way to determine whether they could potentially be used.
I'm committing it on the theory that the remaining parallel sequential
scan patches will also get committed to this release, hopefully in the
not-too-distant future.
Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah
Misch.
This patch adds an option to replace the "time since pgbench run
started" with a Unix epoch timestamp in the progress report so that,
for instance, it is easier to compare timelines with pgsql log
Fabien COELHO <coelho@cri.ensmp.fr>
COMMENT supports POLICY but the documentation hadn't caught up with
that fact.
Patch by Charles Clavadetscher
Back-patch to 9.5 where POLICY was added.
Patch provides command line option --strict-names which requires that at
least one table/schema should present for each -t/-n option.
Pavel Stehule <pavel.stehule@gmail.com>
Per discussion, nowadays it is possible to have tablespaces that have
wildly different I/O characteristics from others. Setting different
effective_io_concurrency parameters for those has been measured to
improve performance.
Author: Julien Rouhaud
Reviewed by: Andres Freund
Remove the code in plpgsql that suppressed the innermost line of CONTEXT
for messages emitted by RAISE commands. That was never more than a quick
backwards-compatibility hack, and it's pretty silly in cases where the
RAISE is nested in several levels of function. What's more, it violated
our design theory that verbosity of error reports should be controlled
on the client side not the server side.
To alleviate the resulting noise increase, introduce a feature in libpq
and psql whereby the CONTEXT field of messages can be suppressed, either
always or only for non-error messages. Printing CONTEXT for errors only
is now their default behavior.
The actual code changes here are pretty small, but the effects on the
regression test outputs are widespread. I had to edit some of the
alternative expected outputs by hand; hopefully the buildfarm will soon
find anything I fat-fingered.
In passing, fix up (again) the output line counts in psql's various
help displays. Add some commentary about how to verify them.
Pavel Stehule, reviewed by Petr Jelínek, Jeevan Chalke, and others
The table-rewriting forms of ALTER TABLE are MVCC-unsafe, in much the same
way as TRUNCATE, because they replace all rows of the table with newly-made
rows with a new xmin. (Ideally, concurrent transactions with old snapshots
would continue to see the old table contents, but the data is not there
anymore --- and if it were there, it would be inconsistent with the table's
updated rowtype, so there would be serious implementation problems to fix.)
This was nowhere documented though, and the problem was only documented for
TRUNCATE in a note in the TRUNCATE reference page. Create a new "Caveats"
section in the MVCC chapter that can be home to this and other limitations
on serializable consistency.
In passing, fix a mistaken statement that VACUUM and CLUSTER would reclaim
space occupied by a dropped column. They don't reconstruct existing tuples
so they couldn't do that.
Back-patch to all supported branches.
Reduce lock levels down to ShareUpdateExclusiveLock for all autovacuum-related
relation options when setting them using ALTER TABLE.
Add infrastructure to allow varying lock levels for relation options in later
patches. Setting multiple options together uses the highest lock level required
for any option. Works for both main and toast tables.
Fabrízio Mello, reviewed by Michael Paquier, mild edit and additional regression
tests from myself
Immediately starting to stream after --create-slot is inconvenient in a
number of situations (e.g. when configuring a slot for use in
recovery.conf) and it's easy to just call pg_receivexlog twice in the
rest of the cases.
Author: Michael Paquier
Discussion: CAB7nPqQ9qEtuDiKY3OpNzHcz5iUA+DUX9FcN9K8GUkCZvG7+Ew@mail.gmail.com
Backpatch: 9.5, where the option was introduced
This option specifies a replication slot for WAL streaming (-X stream),
so that there can be continuous replication slot use between WAL
streaming during the base backup and the start of regular streaming
replication.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.
Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.
Back-patch to 9.5 so that we don't have to support the original API
in production.
This tells you what fraction of NOTIFY's queue is currently filled.
Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few
further tweaks by me.
Other options cannot be changed, as it's not totally clear if cached plans
would need to be invalidated if one of the other options change. Selectivity
estimator functions only change plan costs, not correctness of plans, so
those should be safe.
Original patch by Uriy Zhuravlev, heavily edited by me.
pg_receivexlog and pg_recvlogical error out when --create-slot is
specified and a slot with the same name already exists. In some cases,
especially with pg_receivexlog, that's rather annoying and requires
additional scripting.
Backpatch to 9.5 as slot control functions have newly been added to
pg_receivexlog, and there doesn't seem much point leaving it in a less
useful state.
Discussion: 20150619144755.GG29350@alap3.anarazel.de
Commit 31eae602 added new syntax to many DDL commands to use CURRENT_USER
or SESSION_USER instead of role name in ALTER ... OWNER TO, but because
of a misplaced '{', the syntax in the docs implied that the syntax was
"ALTER ... CURRENT_USER", instead of "ALTER ... OWNER TO CURRENT_USER".
Fix that, and also the funny indentation in some of the modified syntax
blurps.
The .backup file name can be passed to pg_archivecleanup even if
it includes the extension which is specified in -x option.
However, previously the document incorrectly warned a user
not to do that.
Back-patch to 9.2 where pg_archivecleanup's -x option and
the warning were added.