Commit Graph

12318 Commits

Author SHA1 Message Date
Tom Lane 62d16c7fc5 Improve design and implementation of pg_file_settings view.
As first committed, this view reported on the file contents as they were
at the last SIGHUP event.  That's not as useful as reporting on the current
contents, and what's more, it didn't work right on Windows unless the
current session had serviced at least one SIGHUP.  Therefore, arrange to
re-read the files when pg_show_all_settings() is called.  This requires
only minor refactoring so that we can pass changeVal = false to
set_config_option() so that it won't actually apply any changes locally.

In addition, add error reporting so that errors that would prevent the
configuration files from being loaded, or would prevent individual settings
from being applied, are visible directly in the view.  This makes the view
usable for pre-testing whether edits made in the config files will have the
desired effect, before one actually issues a SIGHUP.

I also added an "applied" column so that it's easy to identify entries that
are superseded by later entries; this was the main use-case for the original
design, but it seemed unnecessarily hard to use for that.

Also fix a 9.4.1 regression that allowed multiple entries for a
PGC_POSTMASTER variable to cause bogus complaints in the postmaster log.
(The issue here was that commit bf007a27ac unintentionally reverted
3e3f65973a, which suppressed any duplicate entries within
ParseConfigFp.  However, since the original coding of the pg_file_settings
view depended on such suppression *not* happening, we couldn't have fixed
this issue now without first doing something with pg_file_settings.
Now we suppress duplicates by marking them "ignored" within
ProcessConfigFileInternal, which doesn't hide them in the view.)

Lesser changes include:

Drive the view directly off the ConfigVariable list, instead of making a
basically-equivalent second copy of the data.  There's no longer any need
to hang onto the data permanently, anyway.

Convert show_all_file_settings() to do its work in one call and return a
tuplestore; this avoids risks associated with assuming that the GUC state
will hold still over the course of query execution.  (I think there were
probably latent bugs here, though you might need something like a cursor
on the view to expose them.)

Arrange to run SIGHUP processing in a short-lived memory context, to
forestall process-lifespan memory leaks.  (There is one known leak in this
code, in ProcessConfigDirectory; it seems minor enough to not be worth
back-patching a specific fix for.)

Remove mistaken assignment to ConfigFileLineno that caused line counting
after an include_dir directive to be completely wrong.

Add missed failure check in AlterSystemSetConfigFile().  We don't really
expect ParseConfigFp() to fail, but that's not an excuse for not checking.
2015-06-28 18:06:14 -04:00
Heikki Linnakangas d661532e27 Also trigger restartpoints based on max_wal_size on standby.
When archive recovery and restartpoints were initially introduced,
checkpoint_segments was ignored on the grounds that the files restored from
archive don't consume any space in the recovery server. That was changed in
later releases, but even then it was arguably a feature rather than a bug,
as performing restartpoints as often as checkpoints during normal operation
might be excessive, but you might nevertheless not want to waste a lot of
space for pre-allocated WAL by setting checkpoint_segments to a high value.
But now that we have separate min_wal_size and max_wal_size settings, you
can bound WAL usage with max_wal_size, and still avoid consuming excessive
space usage by setting min_wal_size to a lower value, so that argument is
moot.

There are still some issues with actually limiting the space usage to
max_wal_size: restartpoints in recovery can only start after seeing the
checkpoint record, while a checkpoint starts flushing buffers as soon as
the redo-pointer is set. Restartpoint is paced to happen at the same
leisurily speed, determined by checkpoint_completion_target, as checkpoints,
but because they are started later, max_wal_size can be exceeded by upto
one checkpoint cycle's worth of WAL, depending on
checkpoint_completion_target. But that seems better than not trying at all,
and max_wal_size is a soft limit anyway.

The documentation already claimed that max_wal_size is obeyed in recovery,
so this just fixes the behaviour to match the docs. However, add some
weasel-words there to mention that max_wal_size may well be exceeded by
some amount in recovery.
2015-06-29 00:09:10 +03:00
Heikki Linnakangas 6ab4d38ab0 Fix markup in docs.
Oops. I could swear I built the docs before pushing, but I guess not..
2015-06-29 00:01:26 +03:00
Heikki Linnakangas cb2acb1081 Add missing_ok option to the SQL functions for reading files.
This makes it possible to use the functions without getting errors, if there
is a chance that the file might be removed or renamed concurrently.
pg_rewind needs to do just that, although this could be useful for other
purposes too. (The changes to pg_rewind to use these functions will come in
a separate commit.)

The read_binary_file() function isn't very well-suited for extensions.c's
purposes anymore, if it ever was. So bite the bullet and make a copy of it
in extension.c, tailored for that use case. This seems better than the
accidental code reuse, even if it's a some more lines of code.

Michael Paquier, with plenty of kibitzing by me.
2015-06-28 21:35:46 +03:00
Robert Haas 7c02d48e69 Fix grammar.
Reported by Peter Geoghegan.
2015-06-26 16:04:46 -04:00
Robert Haas c66bc72e8a release notes: Add entry for commit 5ea86e6e6.
Peter Geoghegan and Robert Haas
2015-06-26 14:49:37 -04:00
Robert Haas 31c018ecda release notes: Combine items for pg_upgrade and pg_upgrade_support moves.
Per suggestions from Amit Langote and Álvaro Herrera.
2015-06-26 14:20:29 -04:00
Robert Haas 5ca611841b Improve handling of CustomPath/CustomPlan(State) children.
Allow CustomPath to have a list of paths, CustomPlan a list of plans,
and CustomPlanState a list of planstates known to the core system, so
that custom path/plan providers can more reasonably use this
infrastructure for nodes with multiple children.

KaiGai Kohei, per a design suggestion from Tom Lane, with some
further kibitzing by me.
2015-06-26 09:40:47 -04:00
Tom Lane d759b7eb6a Docs: fix claim that to_char('FM') removes trailing zeroes.
Of course, what it removes is leading zeroes.  Seems to have been a thinko
in commit ffe92d15d5.  Noted by Hubert Depesz
Lubaczewski.
2015-06-25 10:44:03 -04:00
Fujii Masao 0b157a0dad Add index terms for functions jsonb_set and jsonb_pretty. 2015-06-24 22:30:19 +09:00
Alvaro Herrera 1443a165db Fix BRIN supported operators table
Some of the entries in the inclusion opclasses where missing operators,
and we had an entry for inet_inclusion_ops instead of
network_inclusion_ops.  Sort the operators within each opclass by
strategy number, just to make it easier to spot mistakes.

Also sort the rows by data type name, rather than OID.
2015-06-20 12:26:36 -03:00
Tom Lane 48913db887 In immediate shutdown, postmaster should not exit till children are gone.
This adjusts commit 82233ce7ea so that the
postmaster does not exit until all its child processes have exited, even
if the 5-second timeout elapses and we have to send SIGKILL.  There is no
great value in having the postmaster process quit sooner, and doing so can
mislead onlookers into thinking that the cluster is fully terminated when
actually some child processes still survive.

This effect might explain recent test failures on buildfarm member hamster,
wherein we failed to restart a cluster just after shutting it down with
"pg_ctl stop -m immediate".

I also did a bit of code review/beautification, including fixing a faulty
use of the Max() macro on a volatile expression.

Back-patch to 9.4.  In older branches, the postmaster never waited for
children to exit during immediate shutdowns, and changing that would be
too much of a behavioral change.
2015-06-19 14:23:39 -04:00
Bruce Momjian 2bed1cd751 release notes: fix Petr's name typos
Report by Alvaro Herrera
2015-06-14 13:41:37 -04:00
Peter Eisentraut a85054181b doc: Add note to pg_dump man page about pg_dumpall
suggested by Joshua Drake
2015-06-13 21:45:56 -04:00
Peter Eisentraut 340c74dfdf Remove stray character 2015-06-13 21:41:34 -04:00
Bruce Momjian 74cb688525 release notes: consistently name "Alexander Shulgin"
Report by Alvaro Herrera
2015-06-13 21:10:48 -04:00
Bruce Momjian 62331ef3f6 release notes: move/remove/adjust items
Report by Alvaro Herrera
2015-06-13 21:07:24 -04:00
Bruce Momjian 305f815ccd release notes: add accent to Petr Jelínek last name
Report by Alvaro Herrera
2015-06-13 21:00:30 -04:00
Bruce Momjian 31cda8bf3c release notes: remove mention of pg_basebackup non-compat
Report by Amit Kapila
2015-06-13 20:56:30 -04:00
Bruce Momjian 29d80d5fa8 release notes: add Petr Jelinek to JSON function item
Report by Petr Jelinek
2015-06-12 22:34:31 -04:00
Bruce Momjian 230ff9383c release notes: fixes from Fujii Masao
Report by Fujii Masao
2015-06-12 22:31:17 -04:00
Bruce Momjian 644ac3e678 release notes: reorder hash performance authors, again
Report by Robert Haas
2015-06-12 22:25:30 -04:00
Bruce Momjian 51b47c5c09 release notes: reorder sort performance authors
Report by Peter Geoghegan
2015-06-12 22:23:40 -04:00
Bruce Momjian 8bf51ad0cc release notes: split apart hash items
Report by Tom Lane, Robert Haas
2015-06-12 22:16:08 -04:00
Bruce Momjian 89fe9bfc4e release notes: add two optimizer items
Report by Tom Lane
2015-06-12 21:47:49 -04:00
Fujii Masao 091c02a958 Fix alphabetization in catalogs.sgml.
System catalogs and views should be listed alphabetically
in catalog.sgml, but only pg_file_settings view not.

This patch also fixes typos in pg_file_settings comments.
2015-06-12 12:59:29 +09:00
Bruce Momjian 66447916f7 release notes: add links to doc sections 2015-06-11 23:04:46 -04:00
Bruce Momjian 778fed04cd release notes: update hash item
Report by Tomas Vondra
2015-06-11 11:32:32 -04:00
Bruce Momjian 7b7be78a12 release notes: move pg_buffercache item to the right section
Report by Amit Langote
2015-06-11 11:13:49 -04:00
Bruce Momjian bab74070b3 release notes: implement suggestions
Report by Michael Paquier
2015-06-11 11:11:43 -04:00
Bruce Momjian 1cc9f8ccd9 release notes: explain meaning of pg_stat_get_snapshot_timestamp()
Report by Michael Paquier
2015-06-11 10:58:38 -04:00
Bruce Momjian 5a4ea8e200 release notes: update for pg_basebackup in tar format
Report by Amit Kapila
2015-06-11 10:51:39 -04:00
Andrew Dunstan 908e234733 Rename jsonb - text[] operator to #- to avoid ambiguity.
Following recent discussion  on -hackers. The underlying function is
also renamed to jsonb_delete_path. The regression tests now don't need
ugly type casts to avoid the ambiguity, so they are also removed.

Catalog version bumped.
2015-06-11 10:06:58 -04:00
Bruce Momjian aacb8b9277 First draft of 9.5 release notes 2015-06-11 00:09:32 -04:00
Peter Eisentraut e80f619acf doc: Use "connections" instead of "slots" to avoid confusion
The text was written before replication slots existed, but now "slot" is
best not used for anything else in the space of replication.
2015-06-10 21:34:03 -04:00
Peter Eisentraut 28d17269a1 doc: Fix typo 2015-06-10 21:33:35 -04:00
Peter Eisentraut 75a49ba550 doc: Call xmllint for validity also in the fop build
This was somehow missed in commit
5d93ce2d0c.
2015-06-10 19:54:28 -04:00
Bruce Momjian 1e87d4d068 docs: update release note regex suggestions 2015-06-10 16:34:01 -04:00
Tom Lane b94085920b Release notes for 9.4.4, 9.3.9, 9.2.13, 9.1.18, 9.0.22. 2015-06-09 14:33:43 -04:00
Tom Lane 21187cfc7d First-draft release notes for 9.4.4, 9.3.9, 9.2.13, 9.1.18, 9.0.22. 2015-06-09 13:07:15 -04:00
Alvaro Herrera 94232c909d Fix typos
tablesapce -> tablespace
there -> their

These were introduced in 72d422a52, so no need to backpatch.
2015-06-08 15:37:42 -03:00
Andrew Dunstan 94d6727dbe Clarify documentation of jsonb - text
Peter Geoghegan
2015-06-07 21:31:52 -04:00
Andrew Dunstan b81c7b4098 Desupport jsonb subscript deletion on objects
Supporting deletion of JSON pairs within jsonb objects using an
array-style integer subscript allowed for surprising outcomes.  This was
mostly due to the implementation-defined ordering of pairs within
objects for jsonb.

It also seems desirable to make jsonb integer subscript deletion
consistent with the 9.4 era general purpose integer subscripting
operator for jsonb (although that operator returns NULL when an object
is encountered, while we prefer here to throw an error).

Peter Geoghegan, following discussion on -hackers.
2015-06-07 20:46:00 -04:00
Peter Eisentraut d23a3a603b doc: Fix broken links in FOP build
FOP doesn't handle links to table rows, so put the link to a cell
instead.
2015-06-07 20:27:27 -04:00
Robert Haas 99cfd5e136 doc: Session identifiers truncate, not round, the backend start time.
Joel Jacobson
2015-06-04 17:57:39 -04:00
Robert Haas 1c645da8eb docs: Fix list of object types pg_table_is_visible() can handle.
Materialized views and foreign tables were missing from the list,
probably because they are newer than the other object types that were
mentioned.

Etsuro Fujita
2015-06-04 17:48:00 -04:00
Fujii Masao 232cd63b1f Remove -i/--ignore-version option from pg_dump, pg_dumpall and pg_restore.
The commit c22ed3d523 turned
the -i/--ignore-version options into no-ops and marked as deprecated.
Considering we shipped that in 8.4, it's time to remove all trace of
those switches, per discussion. We'd still have to wait a couple releases
before it'd be safe to use -i for something else, but it'd be a start.
2015-06-04 19:54:43 +09:00
Fujii Masao 38d500ac2e Fix some issues in pg_class.relminmxid and pg_database.datminmxid documentation.
- Correct the name of directory which those catalog columns allow to be shrunk.
- Correct the name of symbol which is used as the value of pg_class.relminmxid
  when the relation is not a table.
- Fix "ID ID" typo.

Backpatch to 9.3 where those cataog columns were introduced.
2015-06-04 13:22:49 +09:00
Peter Eisentraut afae1f7854 doc: Fix PDF build with FOP
Because of a bug in the DocBook XSL FO style sheet, an xref to a
varlistentry whose term includes an indexterm fails to build.  One such
instance was introduced in commit
5086dfceba.  Fix by adding the upstream
bug fix to our customization layer.
2015-06-03 20:19:47 -04:00
Fujii Masao 37013621f3 Minor improvement to txid_current() documentation.
Michael Paquier, reviewed by Christoph Berg and Naoya Anzai
2015-06-03 12:12:48 +09:00
Tom Lane 82ec7d2821 Release notes for 9.4.3, 9.3.8, 9.2.12, 9.1.17, 9.0.21.
Also sneak entries for commits 97ff2a564 et al into the sections for
the previous releases in the relevant branches.  Those fixes did go out
in the previous releases, but missed getting documented.
2015-06-01 13:27:43 -04:00
Andrew Dunstan 37def42245 Rename jsonb_replace to jsonb_set and allow it to add new values
The function is given a fourth parameter, which defaults to true. When
this parameter is true, if the last element of the path is missing
in the original json, jsonb_set creates it in the result and assigns it
the new value. If it is false then the function does nothing unless all
elements of the path are present, including the last.

Based on some original code from Dmitry Dolgov, heavily modified by me.

Catalog version bumped.
2015-05-31 20:34:10 -04:00
Stephen Frost d5442cb243 Remove *pgaudit* references also.
Fixes the docs build.
2015-05-28 13:02:09 -04:00
Stephen Frost cde9cf170c Finish removing pg_audit 2015-05-28 12:48:25 -04:00
Tom Lane e70ec8230a Explain CHECK constraint handling in postgres_fdw's IMPORT FOREIGN SCHEMA.
The existing documentation could easily be misinterpreted, and it failed to
explain the inconsistent-evaluation hazard that deterred us from supporting
automatic importing of check constraints.  Revise it.

Etsuro Fujita, further expanded by me
2015-05-25 14:13:02 -04:00
Tom Lane 91e79260f6 Remove no-longer-required function declarations.
Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that
are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that.

Some of these were evidently missed in commit e7128e8dbb, but others
were cargo-culted in in code added since then.  Possibly that can be blamed
in part on the fact that we'd not fixed relevant documentation examples,
which I've now done.
2015-05-24 12:20:23 -04:00
Tom Lane 821b821a24 Still more fixes for lossy-GiST-distance-functions patch.
Fix confusion in documentation, substantial memory leakage if float8 or
float4 are pass-by-reference, and assorted comments that were obsoleted
by commit 98edd617f3.
2015-05-23 15:22:25 -04:00
Andres Freund 631d749007 Remove the new UPSERT command tag and use INSERT instead.
Previously, INSERT with ON CONFLICT DO UPDATE specified used a new
command tag -- UPSERT.  It was introduced out of concern that INSERT as
a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as
some affected rows may actually have been updated.

Alvaro Herrera noticed that the implementation of that new command tag
was incomplete; in subsequent discussion we concluded that having it
doesn't provide benefits that are in line with the compatibility breaks
it requires.

Catversion bump due to the removal of PlannedStmt->isUpsert.

Author: Peter Geoghegan
Discussion: 20150520215816.GI5885@postgresql.org
2015-05-23 00:58:45 +02:00
Fujii Masao 6d1733fa90 Minor enhancement of readability of ALTER TABLE syntax in the doc.
Fabrízio Mello
2015-05-22 21:42:15 +09:00
Robert Haas 160a9aaabf Correct two mistakes in the ALTER FOREIGN TABLE reference page.
Etsuro Fujita
2015-05-21 11:16:33 -04:00
Fujii Masao cad3708960 Correct the names of pgstattuple_approx output columns in the doc. 2015-05-21 20:51:52 +09:00
Heikki Linnakangas 4fc72cc7bb Collection of typo fixes.
Use "a" and "an" correctly, mostly in comments. Two error messages were
also fixed (they were just elogs, so no translation work required). Two
function comments in pg_proc.h were also fixed. Etsuro Fujita reported one
of these, but I found a lot more with grep.

Also fix a few other typos spotted while grepping for the a/an typos.
For example, "consists out of ..." -> "consists of ...". Plus a "though"/
"through" mixup reported by Euler Taveira.

Many of these typos were in old code, which would be nice to backpatch to
make future backpatching easier. But much of the code was new, and I didn't
feel like crafting separate patches for each branch. So no backpatching.
2015-05-20 16:56:22 +03:00
Tom Lane 5cb8519ceb Last-minute updates for release notes.
Revise description of CVE-2015-3166, in line with scaled-back patch.
Change release date.

Security: CVE-2015-3166
2015-05-19 18:33:58 -04:00
Tom Lane afee04352b Revert "Change pg_seclabel.provider and pg_shseclabel.provider to type "name"."
This reverts commit b82a7be603.  There
is a better (less invasive) way to fix it, which I will commit next.
2015-05-19 10:40:04 -04:00
Tom Lane b82a7be603 Change pg_seclabel.provider and pg_shseclabel.provider to type "name".
These were "text", but that's a bad idea because it has collation-dependent
ordering.  No index in template0 should have collation-dependent ordering,
especially not indexes on shared catalogs.  There was general agreement
that provider names don't need to be longer than other identifiers, so we
can fix this at a small waste of table space by changing from text to name.

There's no way to fix the problem in the back branches, but we can hope
that security labels don't yet have widespread-enough usage to make it
urgent to fix.

There needs to be a regression sanity test to prevent us from making this
same mistake again; but before putting that in, we'll need to get rid of
similar brain fade in the recently-added pg_replication_origin catalog.

Note: for lack of a suitable testing environment, I've not really exercised
this change.  I trust the buildfarm will show up any mistakes.
2015-05-18 20:07:53 -04:00
Tom Lane 19d47ed2da Last-minute updates for release notes.
Add entries for security issues.

Security: CVE-2015-3165 through CVE-2015-3167
2015-05-18 12:09:02 -04:00
Noah Misch 85270ac7a2 pgcrypto: Report errant decryption as "Wrong key or corrupt data".
This has been the predominant outcome.  When the output of decrypting
with a wrong key coincidentally resembled an OpenPGP packet header,
pgcrypto could instead report "Corrupt data", "Not text data" or
"Unsupported compression algorithm".  The distinct "Corrupt data"
message added no value.  The latter two error messages misled when the
decrypted payload also exhibited fundamental integrity problems.  Worse,
error message variance in other systems has enabled cryptologic attacks;
see RFC 4880 section "14. Security Considerations".  Whether these
pgcrypto behaviors are likewise exploitable is unknown.

In passing, document that pgcrypto does not resist side-channel attacks.
Back-patch to 9.0 (all supported versions).

Security: CVE-2015-3167
2015-05-18 10:02:31 -04:00
Heikki Linnakangas 4df1328950 Put back stats-collector restarting code, removed accidentally.
Removed that code snippet accidentally in the archive_mode='always' patch.

Also, use varname-tags for archive_command in the docs.

Fujii Masao
2015-05-18 10:20:30 +03:00
Fujii Masao d773b55713 Don't classify REINDEX command as DDL in the pg_audit doc.
The commit a936743 changed the class of REINDEX but forgot to update the doc.
2015-05-18 14:55:07 +09:00
Tom Lane a0891d2d01 Release notes for 9.4.2, 9.3.7, 9.2.11, 9.1.16, 9.0.20. 2015-05-17 15:54:20 -04:00
Magnus Hagander de6109b8cc Fix wording error caused by recent typo fixes
It wasn't just a typo, but bad wording. This should make it
more clear. Pointed out by Tom Lane.
2015-05-17 19:07:36 +02:00
Magnus Hagander 3b075e9d7b Fix typos in comments
Dmitriy Olshevskiy
2015-05-17 14:58:04 +02:00
Magnus Hagander 6b665454e3 Minor docs fixes for pg_audit
Peter Geoghegan
2015-05-17 11:07:19 +02:00
Tom Lane 0563b4c0c3 First-draft release notes for 9.4.2 et al.
As usual, the release notes for older branches will be made by cutting
these down, but put them up for community review first.
2015-05-16 18:09:39 -04:00
Tom Lane c65aa7a87e Fix docs typo
I don't think "respectfully" is what was meant here ...
2015-05-16 13:28:26 -04:00
Simon Riggs f941d03329 Add docs for tablesample system_time() 2015-05-15 21:54:18 -04:00
Andres Freund f3d3118532 Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:46:31 +02:00
Simon Riggs 6e4415c6aa Add docs for tablesample system_rows() 2015-05-15 21:44:53 -04:00
Alvaro Herrera b0b7be6133 Add BRIN infrastructure for "inclusion" opclasses
This lets BRIN be used with R-Tree-like indexing strategies.

Also provided are operator classes for range types, box and inet/cidr.
The infrastructure provided here should be sufficient to create operator
classes for similar datatypes; for instance, opclasses for PostGIS
geometries should be doable, though we didn't try to implement one.

(A box/point opclass was also submitted, but we ripped it out before
commit because the handling of floating point comparisons in existing
code is inconsistent and would generate corrupt indexes.)

Author: Emre Hasegeli.  Cosmetic changes by me
Review: Andreas Karlsson
2015-05-15 18:05:22 -03:00
Simon Riggs 910baf0a96 Tablesample method API docs
Petr Jelinek
2015-05-15 15:40:52 -04:00
Simon Riggs 149f6f1576 TABLESAMPLE system_time(limit)
Contrib module implementing a tablesample method
that allows you to limit the sample by a hard time
limit.

Petr Jelinek

Reviewed by Michael Paquier, Amit Kapila and
Simon Riggs
2015-05-15 15:18:57 -04:00
Simon Riggs 9689290ff0 TABLESAMPLE system_rows(limit)
Contrib module implementing a tablesample method
that allows you to limit the sample by a hard row
limit.

Petr Jelinek

Reviewed by Michael Paquier, Amit Kapila and
Simon Riggs
2015-05-15 15:14:22 -04:00
Robert Haas 92edba2665 doc: CREATE FOREIGN TABLE now allows CHECK ( ... ) NO INHERIT
Etsuro Fujita
2015-05-15 14:42:15 -04:00
Simon Riggs f6d208d6e5 TABLESAMPLE, SQL Standard and extensible
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
2015-05-15 14:37:10 -04:00
Heikki Linnakangas 8b0f105d2d Fix docs build. Oops. 2015-05-15 19:58:56 +03:00
Heikki Linnakangas ffd37740ee Add archive_mode='always' option.
In 'always' mode, the standby independently archives all files it receives
from the primary.

Original patch by Fujii Masao, docs and review by me.
2015-05-15 18:55:24 +03:00
Bruce Momjian f6d65f0c70 docs: consistently uppercase index method and add spacing
Consistently uppercase index method names, e.g. GIN, and add space after
the index method name and the parentheses enclosing the column names.
2015-05-15 11:42:34 -04:00
Heikki Linnakangas 98edd617f3 Fix datatype confusion with the new lossy GiST distance functions.
We can only support a lossy distance function when the distance function's
datatype is comparable with the original ordering operator's datatype.
The distance function always returns a float8, so we are limited to float8,
and float4 (by a hard-coded cast of the float8 to float4).

In light of this limitation, it seems like a good idea to have a separate
'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy
distance function, it still works with lossy quals. There are cases like
that with the build-in or contrib opclasses, but it's plausible.

There was a hidden assumption that the ORDER BY values returned by GiST
match the original ordering operator's return type, but there are plenty
of examples where that's not true, e.g. in btree_gist and pg_trgm. As long
as the distance function is not lossy, we can tolerate that and just not
return the distance to the executor (or rather, always return NULL). The
executor doesn't need the distances if there are no lossy results.

There was another little bug: the recheck variable was not initialized
before calling the distance function. That revealed the bigger issue,
as the executor tried to reorder tuples that didn't need reordering, and
that failed because of the datatype mismatch.
2015-05-15 18:09:31 +03:00
Fujii Masao 458a07701e Support --verbose option in reindexdb.
Sawada Masahiko, reviewed by Fabrízio Mello
2015-05-15 21:45:55 +09:00
Heikki Linnakangas 35fcb1b3d0 Allow GiST distance function to return merely a lower-bound.
The distance function can now set *recheck = false, like index quals. The
executor will then re-check the ORDER BY expressions, and use a queue to
reorder the results on the fly.

This makes it possible to do kNN-searches on polygons and circles, which
don't store the exact value in the index, but just a bounding box.

Alexander Korotkov and me
2015-05-15 14:26:51 +03:00
Fujii Masao ecd222e770 Support VERBOSE option in REINDEX command.
When this option is specified, a progress report is printed as each index
is reindexed.

Per discussion, we agreed on the following syntax for the extensibility of
the options.

    REINDEX (flexible options) { INDEX | ... } name

Sawada Masahiko.
Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi,
Jim Nasby and me.

Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com
2015-05-15 20:09:57 +09:00
Tom Lane 4b8f797f67 Honor traditional SGML NAMELEN limit.
We've conformed to this limit in the past, so might as well continue to.

Aaron Swenson
2015-05-14 22:34:28 -04:00
Peter Eisentraut a486e35706 Add pg_settings.pending_restart column
with input from David G. Johnston, Robert Haas, Michael Paquier
2015-05-14 20:08:51 -04:00
Bruce Momjian 333a870f94 doc: list bigint as mapping to int8 and int64
Report by Paul Jungwirth
2015-05-14 17:37:59 -04:00
Tom Lane 333d077962 Docs: fix erroneous claim about max byte length of GB18030.
This encoding has characters up to 4 bytes long, not 2.
2015-05-14 14:59:00 -04:00
Tom Lane 1dc5ebc907 Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation.  Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype.  There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables.  We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work.  (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad.  In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
2015-05-14 12:08:49 -04:00
Stephen Frost ac52bb0442 Add pg_audit, an auditing extension
This extension provides detailed logging classes, ability to control
logging at a per-object level, and includes fully-qualified object
names for logged statements (DML and DDL) in independent fields of the
log output.

Authors: Ian Barwick, Abhijit Menon-Sen, David Steele
Reviews by: Robert Haas, Tatsuo Ishii, Sawada Masahiko, Fujii Masao,
Simon Riggs

Discussion with: Josh Berkus, Jaime Casanova, Peter Eisentraut,
David Fetter, Yeb Havinga, Alvaro Herrera, Petr Jelinek, Tom Lane,
MauMau, Bruce Momjian, Jim Nasby, Michael Paquier,
Fabrízio de Royes Mello, Neil Tiffin
2015-05-14 10:36:16 -04:00
Andres Freund 5850b20f58 Add pgstattuple_approx() to the pgstattuple extension.
The new function allows to estimate bloat and other table level statics
in a faster, but approximate, way. It does so by using information from
the free space map for pages marked as all visible in the visibility
map. The rest of the table is actually read and free space/bloat is
measured accurately.  In many cases that allows to get bloat information
much quicker, causing less IO.

Author: Abhijit Menon-Sen
Reviewed-By: Andres Freund, Amit Kapila and Tomas Vondra
Discussion: 20140402214144.GA28681@kea.toroid.org
2015-05-13 07:35:06 +02:00
Andrew Dunstan c6947010ce Additional functions and operators for jsonb
jsonb_pretty(jsonb) produces nicely indented json output.
jsonb || jsonb concatenates two jsonb values.
jsonb - text removes a key and its associated value from the json
jsonb - int removes the designated array element
jsonb - text[] removes a key and associated value or array element at
the designated path
jsonb_replace(jsonb,text[],jsonb) replaces the array element designated
by the path or the value associated with the key designated by the path
with the given value.

Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core
by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.
2015-05-12 15:52:45 -04:00
Tom Lane afb9249d06 Add support for doing late row locking in FDWs.
Previously, FDWs could only do "early row locking", that is lock a row as
soon as it's fetched, even though local restriction/join conditions might
discard the row later.  This patch adds callbacks that allow FDWs to do
late locking in the same way that it's done for regular tables.

To make use of this feature, an FDW must support the "ctid" column as a
unique row identifier.  Currently, since ctid has to be of type TID,
the feature is of limited use, though in principle it could be used by
postgres_fdw.  We may eventually allow FDWs to specify another data type
for ctid, which would make it possible for more FDWs to use this feature.

This commit does not modify postgres_fdw to use late locking.  We've
tested some prototype code for that, but it's not in committable shape,
and besides it's quite unclear whether it actually makes sense to do late
locking against a remote server.  The extra round trips required are likely
to outweigh any benefit from improved concurrency.

Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
2015-05-12 14:10:17 -04:00
Bruce Momjian ea12b3ca8c doc build: use unique Makefile variable to control temp install 2015-05-12 12:30:50 -04:00
Andrew Dunstan 72d422a522 Map basebackup tablespaces using a tablespace_map file
Windows can't reliably restore symbolic links from a tar format, so
instead during backup start we create a tablespace_map file, which is
used by the restoring postgres to create the correct links in pg_tblspc.
The backup protocol also now has an option to request this file to be
included in the backup stream, and this is used by pg_basebackup when
operating in tar mode.

This is done on all platforms, not just Windows.

This means that pg_basebackup will not not work in tar mode against 9.4
and older servers, as this protocol option isn't implemented there.

Amit Kapila, reviewed by Dilip Kumar, with a little editing from me.
2015-05-12 09:29:10 -04:00
Alvaro Herrera b488c580ae Allow on-the-fly capture of DDL event details
This feature lets user code inspect and take action on DDL events.
Whenever a ddl_command_end event trigger is installed, DDL actions
executed are saved to a list which can be inspected during execution of
a function attached to ddl_command_end.

The set-returning function pg_event_trigger_ddl_commands can be used to
list actions so captured; it returns data about the type of command
executed, as well as the affected object.  This is sufficient for many
uses of this feature.  For the cases where it is not, we also provide a
"command" column of a new pseudo-type pg_ddl_command, which is a
pointer to a C structure that can be accessed by C code.  The struct
contains all the info necessary to completely inspect and even
reconstruct the executed command.

There is no actual deparse code here; that's expected to come later.
What we have is enough infrastructure that the deparsing can be done in
an external extension.  The intention is that we will add some deparsing
code in a later release, as an in-core extension.

A new test module is included.  It's probably insufficient as is, but it
should be sufficient as a starting point for a more complete and
future-proof approach.

Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick,
Abhijit Menon-Sen.

Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier,
Craig Ringer, David Steele.
Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost,
Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule.

Based on original work by Dimitri Fontaine, though I didn't use his
code.

Discussion:
  https://www.postgresql.org/message-id/m2txrsdzxa.fsf@2ndQuadrant.fr
  https://www.postgresql.org/message-id/20131108153322.GU5809@eldon.alvh.no-ip.org
  https://www.postgresql.org/message-id/20150215044814.GL3391@alvh.no-ip.org
2015-05-11 19:14:31 -03:00
Stephen Frost fa2642438f Allow LOCK TABLE .. ROW EXCLUSIVE MODE with INSERT
INSERT acquires RowExclusiveLock during normal operation and therefore
it makes sense to allow LOCK TABLE .. ROW EXCLUSIVE MODE to be executed
by users who have INSERT rights on a table (even if they don't have
UPDATE or DELETE).

Not back-patching this as it's a behavior change which, strictly
speaking, loosens security restrictions.

Per discussion with Tom and Robert (circa 2013).
2015-05-11 15:44:12 -04:00
Robert Haas b4d4ce1d50 Increase threshold for multixact member emergency autovac to 50%.
Analysis by Noah Misch shows that the 25% threshold set by commit
53bb309d2d is lower than any other,
similar autovac threshold.  While we don't know exactly what value
will be optimal for all users, it is better to err a little on the
high side than on the low side.  A higher value increases the risk
that users might exhaust the available space and start seeing errors
before autovacuum can clean things up sufficiently, but a user who
hits that problem can compensate for it by reducing
autovacuum_multixact_freeze_max_age to a value dependent on their
average multixact size.  On the flip side, if the emergency cap
imposed by that patch kicks in too early, the user will experience
excessive wraparound scanning and will be unable to mitigate that
problem by configuration.  The new value will hopefully reduce the
risk of such bad experiences while still providing enough headroom
to avoid multixact member exhaustion for most users.

Along the way, adjust the documentation to reflect the effects of
commit 04e6d3b877, which taught
autovacuum to run for multixact wraparound even when autovacuum
is configured off.
2015-05-11 12:15:50 -04:00
Bruce Momjian 23c33198b9 docs: add "serialization anomaly" to transaction isolation table
Also distinguish between SQL-standard and Postgres behavior.

Report by David G. Johnston
2015-05-11 12:02:10 -04:00
Tom Lane 1a8a4e5cde Code review for foreign/custom join pushdown patch.
Commit e7cb7ee145 included some design
decisions that seem pretty questionable to me, and there was quite a lot
of stuff not to like about the documentation and comments.  Clean up
as follows:

* Consider foreign joins only between foreign tables on the same server,
rather than between any two foreign tables with the same underlying FDW
handler function.  In most if not all cases, the FDW would simply have had
to apply the same-server restriction itself (far more expensively, both for
lack of caching and because it would be repeated for each combination of
input sub-joins), or else risk nasty bugs.  Anyone who's really intent on
doing something outside this restriction can always use the
set_join_pathlist_hook.

* Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist
to better reflect what they're for, and allow these custom scan tlists
to be used even for base relations.

* Change make_foreignscan() API to include passing the fdw_scan_tlist
value, since the FDW is required to set that.  Backwards compatibility
doesn't seem like an adequate reason to expect FDWs to set it in some
ad-hoc extra step, and anyway existing FDWs can just pass NIL.

* Change the API of path-generating subroutines of add_paths_to_joinrel,
and in particular that of GetForeignJoinPaths and set_join_pathlist_hook,
so that various less-used parameters are passed in a struct rather than
as separate parameter-list entries.  The objective here is to reduce the
probability that future additions to those parameter lists will result in
source-level API breaks for users of these hooks.  It's possible that this
is even a small win for the core code, since most CPU architectures can't
pass more than half a dozen parameters efficiently anyway.  I kept root,
joinrel, outerrel, innerrel, and jointype as separate parameters to reduce
code churn in joinpath.c --- in particular, putting jointype into the
struct would have been problematic because of the subroutines' habit of
changing their local copies of that variable.

* Avoid ad-hocery in ExecAssignScanProjectionInfo.  It was probably all
right for it to know about IndexOnlyScan, but if the list is to grow
we should refactor the knowledge out to the callers.

* Restore nodeForeignscan.c's previous use of the relcache to avoid
extra GetFdwRoutine lookups for base-relation scans.

* Lots of cleanup of documentation and missed comments.  Re-order some
code additions into more logical places.
2015-05-10 14:36:36 -04:00
Stephen Frost f0a4b20bb9 Correct reindexdb documentation
--schema takes a schema, not a table.

Author: Sawada Masahiko
2015-05-09 14:45:54 -04:00
Bruce Momjian da31c5ed79 doc: adjust ordering of pg_stat_statement paragraphs
Clarify installation instructions

Patch by Ian Barwick
2015-05-09 14:11:31 -04:00
Andrew Dunstan cb9fa802b3 Add new OID alias type regnamespace
Catalog version bumped

Kyotaro HORIGUCHI
2015-05-09 13:36:52 -04:00
Andrew Dunstan 0c90f6769d Add new OID alias type regrole
The new type has the scope of whole the database cluster so it doesn't
behave the same as the existing OID alias types which have database
scope,
concerning object dependency. To avoid confusion constants of the new
type are prohibited from appearing where dependencies are made involving
it.

Also, add a note to the docs about possible MVCC violation and
optimization issues, which are general over the all reg* types.

Kyotaro Horiguchi
2015-05-09 13:06:49 -04:00
Stephen Frost 9a0884176f Change default for include_realm to 1
The default behavior for GSS and SSPI authentication methods has long
been to strip the realm off of the principal, however, this is not a
secure approach in multi-realm environments and the use-case for the
parameter at all has been superseded by the regex-based mapping support
available in pg_ident.conf.

Change the default for include_realm to be '1', meaning that we do
NOT remove the realm from the principal by default.  Any installations
which depend on the existing behavior will need to update their
configurations (ideally by leaving include_realm set to 1 and adding a
mapping in pg_ident.conf, but alternatively by explicitly setting
include_realm=0 prior to upgrading).  Note that the mapping capability
exists in all currently supported versions of PostgreSQL and so this
change can be done today.  Barring that, existing users can update their
configurations today to explicitly set include_realm=0 to ensure that
the prior behavior is maintained when they upgrade.

This needs to be noted in the release notes.

Per discussion with Magnus and Peter.
2015-05-08 19:39:42 -04:00
Stephen Frost a97e0c3354 Add pg_file_settings view and function
The function and view added here provide a way to look at all settings
in postgresql.conf, any #include'd files, and postgresql.auto.conf
(which is what backs the ALTER SYSTEM command).

The information returned includes the configuration file name, line
number in that file, sequence number indicating when the parameter is
loaded (useful to see if it is later masked by another definition of the
same parameter), parameter name, and what it is set to at that point.
This information is updated on reload of the server.

This is unfiltered, privileged, information and therefore access is
restricted to superusers through the GRANT system.

Author: Sawada Masahiko, various improvements by me.
Reviewers: David Steele
2015-05-08 19:09:26 -04:00
Andres Freund e8898e9169 Minor ON CONFLICT related comments and doc fixes.
Geoff Winkless, Stephen Frost, Peter Geoghegan and me.
2015-05-08 19:24:14 +02:00
Robert Haas 53bb309d2d Teach autovacuum about multixact member wraparound.
The logic introduced in commit b69bf30b9b
and repaired in commits 669c7d20e6 and
7be47c56af helps to ensure that we don't
overwrite old multixact member information while it is still needed,
but a user who creates many large multixacts can still exhaust the
member space (and thus start getting errors) while autovacuum stands
idly by.

To fix this, progressively ramp down the effective value (but not the
actual contents) of autovacuum_multixact_freeze_max_age as member space
utilization increases.  This makes autovacuum more aggressive and also
reduces the threshold for a manual VACUUM to perform a full-table scan.

This patch leaves unsolved the problem of ensuring that emergency
autovacuums are triggered even when autovacuum=off.  We'll need to fix
that via a separate patch.

Thomas Munro and Robert Haas
2015-05-08 12:53:00 -04:00
Andres Freund 168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Alvaro Herrera db5f98ab4f Improve BRIN infra, minmax opclass and regression test
The minmax opclass was using the wrong support functions when
cross-datatypes queries were run.  Instead of trying to fix the
pg_amproc definitions (which apparently is not possible), use the
already correct pg_amop entries instead.  This requires jumping through
more hoops (read: extra syscache lookups) to obtain the underlying
functions to execute, but it is necessary for correctness.

Author: Emre Hasegeli, tweaked by Álvaro
Review: Andreas Karlsson

Also change BrinOpcInfo to record each stored type's typecache entry
instead of just the OID.  Turns out that the full type cache is
necessary in brin_deform_tuple: the original code used the indexed
type's byval and typlen properties to extract the stored tuple, which is
correct in Minmax; but in other implementations that want to store
something different, that's wrong.  The realization that this is a bug
comes from Emre also, but I did not use his patch.

I also adopted Emre's regression test code (with smallish changes),
which is more complete.
2015-05-07 13:02:22 -03:00
Bruce Momjian 82ec7c95b7 Makefile: Add comment that doc uninstall clears man directories
Report by Mario Valdez
2015-05-07 10:26:08 -04:00
Tom Lane 929ca96584 citext's regexp_matches() functions weren't documented, either. 2015-05-05 16:11:01 -04:00
Peter Eisentraut 53f0965767 doc: Update installation instructions for new shared libperl/libpython handling 2015-05-05 14:41:39 -04:00
Alvaro Herrera 3b6db1f445 Add geometry/range functions to support BRIN inclusion
This commit adds the following functions:
    box(point) -> box
    bound_box(box, box) -> box
    inet_same_family(inet, inet) -> bool
    inet_merge(inet, inet) -> cidr
    range_merge(anyrange, anyrange) -> anyrange

The first of these is also used to implement a new assignment cast from
point to box.

These functions are the first part of a base to implement an "inclusion"
operator class for BRIN, for multidimensional data types.

Author: Emre Hasegeli
Reviewed by: Andreas Karlsson
2015-05-05 15:22:24 -03:00
Peter Eisentraut ad8d6d064c Fix typos
Author: Erik Rijkers <er@xs4all.nl>
2015-05-04 20:40:19 -04:00
Robert Haas e7cb7ee145 Allow FDWs and custom scan providers to replace joins with scans.
Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.

Custom scan providers can use this in a similar way.  Previously,
it was only possible for a custom scan provider to scan a single
relation.  Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.

KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
2015-05-01 08:50:35 -04:00
Andres Freund 2b22795b32 Copy editing of the replication origins patch.
Michael Paquier and myself.
2015-05-01 12:22:13 +02:00
Alvaro Herrera 9d396af463 Fix up some loose ends for CURRENT_USER as RoleSpec
In commit 31eae6028e, some documents were not updated to show the new
capability; fix that.  Also, the error message you get when CURRENT_USER
and SESSION_USER are used in a context that doesn't accept them could be
clearer about it being a problem only in those contexts; so add the
word "here".

Author: Kyotaro HORIGUCHI

His patch submission also included changes to GRANT/REVOKE, but those
seemed more controversial, so I left them out.  We can reconsider these
changes later.
2015-04-30 16:57:05 -03:00
Magnus Hagander da114099af Fix typo
Amit Langote
2015-04-30 09:52:34 +02:00
Robert Haas 49601ab1a6 Add <literal> markup, for consistency.
This file isn't entirely consistent about whether "on" and "off"
should be marked up with <literal>, but it doesn't make much sense
to be inconsistent within a single sentence.

Etsuro Fujita
2015-04-29 18:03:18 -04:00
Andres Freund 5aa2350426 Introduce replication progress tracking infrastructure.
When implementing a replication solution ontop of logical decoding, two
related problems exist:
* How to safely keep track of replication progress
* How to change replication behavior, based on the origin of a row;
  e.g. to avoid loops in bi-directional replication setups

The solution to these problems, as implemented here, consist out of
three parts:

1) 'replication origins', which identify nodes in a replication setup.
2) 'replication progress tracking', which remembers, for each
   replication origin, how far replay has progressed in a efficient and
   crash safe manner.
3) The ability to filter out changes performed on the behest of a
   replication origin during logical decoding; this allows complex
   replication topologies. E.g. by filtering all replayed changes out.

Most of this could also be implemented in "userspace", e.g. by inserting
additional rows contain origin information, but that ends up being much
less efficient and more complicated.  We don't want to require various
replication solutions to reimplement logic for this independently. The
infrastructure is intended to be generic enough to be reusable.

This infrastructure also replaces the 'nodeid' infrastructure of commit
timestamps. It is intended to provide all the former capabilities,
except that there's only 2^16 different origins; but now they integrate
with logical decoding. Additionally more functionality is accessible via
SQL.  Since the commit timestamp infrastructure has also been introduced
in 9.5 (commit 73c986add) changing the API is not a problem.

For now the number of origins for which the replication progress can be
tracked simultaneously is determined by the max_replication_slots
GUC. That GUC is not a perfect match to configure this, but there
doesn't seem to be sufficient reason to introduce a separate new one.

Bumps both catversion and wal page magic.

Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer
Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer
Discussion: 20150216002155.GI15326@awork2.anarazel.de,
    20140923182422.GA15776@alap3.anarazel.de,
    20131114172632.GE7522@alap2.anarazel.de
2015-04-29 19:30:53 +02:00
Bruce Momjian 5086dfceba doc: recommend use of GUC server_version_num for version checks
Patch by Craig Ringer
2015-04-28 20:31:08 -04:00
Stephen Frost dcbf5948e1 Improve qual pushdown for RLS and SB views
The original security barrier view implementation, on which RLS is
built, prevented all non-leakproof functions from being pushed down to
below the view, even when the function was not receiving any data from
the view.  This optimization improves on that situation by, instead of
checking strictly for non-leakproof functions, it checks for Vars being
passed to non-leakproof functions and allows functions which do not
accept arguments or whose arguments are not from the current query level
(eg: constants can be particularly useful) to be pushed down.

As discussed, this does mean that a function which is pushed down might
gain some idea that there are rows meeting a certain criteria based on
the number of times the function is called, but this isn't a
particularly new issue and the documentation in rules.sgml already
addressed similar covert-channel risks.  That documentation is updated
to reflect that non-leakproof functions may be pushed down now, if
they meet the above-described criteria.

Author: Dean Rasheed, with a bit of rework to make things clearer,
along with comment and documentation updates from me.
2015-04-27 12:29:42 -04:00
Peter Eisentraut cac7658205 Add transforms feature
This provides a mechanism for specifying conversions between SQL data
types and procedural languages.  As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.

reviews by Pavel Stěhule and Andres Freund
2015-04-26 10:33:14 -04:00
Stephen Frost e89bd02f58 Perform RLS WITH CHECK before constraints, etc
The RLS capability is built on top of the WITH CHECK OPTION
system which was added for auto-updatable views, however, unlike
WCOs on views (which are mandated by the SQL spec to not fire until
after all other constraints and checks are done), it makes much more
sense for RLS checks to happen earlier than constraint and uniqueness
checks.

This patch reworks the structure which holds the WCOs a bit to be
explicitly either VIEW or RLS checks and the RLS-related checks are
done prior to the constraint and uniqueness checks.  This also allows
better error reporting as we are now reporting when a violation is due
to a WITH CHECK OPTION and when it's due to an RLS policy violation,
which was independently noted by Craig Ringer as being confusing.

The documentation is also updated to include a paragraph about when RLS
WITH CHECK handling is performed, as there have been a number of
questions regarding that and the documentation was previously silent on
the matter.

Author: Dean Rasheed, with some kabitzing and comment changes by me.
2015-04-24 20:34:26 -04:00
Peter Eisentraut d64a9c8c83 doc: Move ALTER TABLE IF EXISTS description to better place
It was previously mixed in with the description of ALTER TABLE
subcommands.  Move it to the Parameters section, which is where it is on
other reference pages.

pointed out by Amit Langote
2015-04-24 13:22:18 -04:00
Peter Eisentraut 9ba978c8cc Fix misspellings
Amit Langote and Thom Brown
2015-04-24 12:00:49 -04:00
Andres Freund cef939c347 Rename pg_replication_slot's new active_in to active_pid.
In d811c037ce active_in was added but discussion since showed that
active_pid is preferred as a name.

Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com
2015-04-22 09:43:40 +02:00
Peter Eisentraut b0a738f428 Move pg_xlogdump from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-21 19:03:49 -04:00
Robert Haas 0275ecf31c Update FDW docs to reflect existence of CHECK constraints.
Generalize the remarks previously made about NOT NULL constraints to
CHECK constraints.

Etsuro Fujita
2015-04-21 17:46:47 -04:00
Andres Freund d811c037ce Add 'active_in' column to pg_replication_slots.
Right now it is visible whether a replication slot is active in any
session, but not in which.  Adding the active_in column, containing the
pid of the backend having acquired the slot, makes it much easier to
associate pg_replication_slots entries with the corresponding
pg_stat_replication/pg_stat_activity row.

This should have been done from the start, but I (Andres) dropped the
ball there somehow.

Author: Craig Ringer, revised by me Discussion:
CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com
2015-04-21 11:51:06 +02:00
Peter Eisentraut 528c2e44ab Move pg_test_timing from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-20 21:30:12 -04:00
Peter Eisentraut 00882d9e5c Move pg_test_fsync from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-19 22:20:49 -04:00
Alvaro Herrera 4cb7d671fd Add new target modulescheck in vcregress.pl
This allows an MSVC build to run regression tests related to modules in
src/test/modules.

Author: Michael Paquier
Reviewed by: Andrew Dunstan
2015-04-16 23:39:52 -03:00
Bruce Momjian 2e5d52a644 pg_upgrade: document need for text search files to be copied
Report by CJ Estel

Backpatch through 9.4
2015-04-16 19:51:24 -04:00
Peter Eisentraut 9fa8b0ee90 Move pg_upgrade from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-14 19:26:38 -04:00
Peter Eisentraut 30982be4e5 Integrate pg_upgrade_support module into backend
Previously, these functions were created in a schema "binary_upgrade",
which was deleted after pg_upgrade was finished.  Because we don't want
to keep that schema around permanently, move them to pg_catalog but
rename them with a binary_upgrade_... prefix.

The provided functions are only small wrappers around global variables
that were added specifically for pg_upgrade use, so keeping the module
separate does not create any modularity.

The functions still check that they are only called in binary upgrade
mode, so it is not possible to call these during normal operation.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-14 19:26:37 -04:00
Peter Eisentraut 81134af3ec Move pgbench from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-13 13:07:16 -04:00
Magnus Hagander 9029f4b374 Add system view pg_stat_ssl
This view shows information about all connections, such as if the
connection is using SSL, which cipher is used, and which client
certificate (if any) is used.

Reviews by Alex Shulgin, Heikki Linnakangas, Andres Freund & Michael Paquier
2015-04-12 19:07:46 +02:00
Peter Eisentraut 83aca89f7c Move pg_archivecleanup from contrib/ to src/bin/
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2015-04-11 23:29:18 -04:00
Magnus Hagander 8ae4600cd9 Fix incorrect punctuation
Amit Langote
2015-04-09 13:35:30 +02:00
Fujii Masao 17d436d2e8 Remove obsolete FORCE option from REINDEX.
FORCE option has been marked "obsolete" since very old version 7.4
but existed for backwards compatibility. Per discussion on pgsql-hackers,
we concluded that it's no longer worth keeping supporting the option.
2015-04-09 11:31:42 +09:00
Simon Riggs 1cdf4d0b6a Fix spelling of author's name 2015-04-07 14:04:29 -04:00