Commit Graph

84 Commits

Author SHA1 Message Date
Tomas Vondra d06215d03b Allow setting statistics target for extended statistics
When building statistics, we need to decide how many rows to sample and
how accurate the resulting statistics should be. Until now, it was not
possible to explicitly define statistics target for extended statistics
objects, the value was always computed from the per-attribute targets
with a fallback to the system-wide default statistics target.

That's a bit inconvenient, as it ties together the statistics target set
for per-column and extended statistics. In some cases it may be useful
to require larger sample / higher accuracy for extended statics (or the
other way around), but with this approach that's not possible.

So this commit introduces a new command, allowing to specify statistics
target for individual extended statistics objects, overriding the value
derived from per-attribute targets (and the system default).

  ALTER STATISTICS stat_name SET STATISTICS target_value;

When determining statistics target for an extended statistics object we
first look at this explicitly set value. When this value is -1, we fall
back to the old formula, looking at the per-attribute targets first and
then the system default. This means the behavior is backwards compatible
with older PostgreSQL releases.

Author: Tomas Vondra
Discussion: https://postgr.es/m/20190618213357.vli3i23vpkset2xd@development
Reviewed-by: Kirk Jamison, Dean Rasheed
2019-09-11 00:25:51 +02:00
Tom Lane 07b39083c2 Fix pg_dump's handling of dependencies for custom opclasses.
Since pg_dump doesn't treat the member operators and functions of operator
classes/families (that is, the pg_amop and pg_amproc entries, not the
underlying operators/functions) as separate dumpable objects, it missed
their dependency information.  I think this was safe when the code was
designed, because the default object sorting rule emits operators and
functions before opclasses, and there were no dependency types that could
mess that up.  However, the introduction of range types in 9.2 broke it:
now a type can have a dependency on an opclass, allowing dependency rules
to push the opclass before the type and hence before custom operators.
Lacking any information showing that it shouldn't do so, pg_dump emitted
the objects in the wrong order.

Fix by teaching getDependencies() to translate pg_depend entries for
pg_amop/amproc rows to look like dependencies for their parent opfamily.

I added a regression test for this in HEAD/v12, but not further back;
life is too short to fight with 002_pg_dump.pl.

Per bug #15934 from Tom Gottfried.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/15934-58b8c8ab7a09ea15@postgresql.org
2019-07-31 15:42:49 -04:00
Peter Eisentraut 06140c201b Add CREATE DATABASE LOCALE option
This sets both LC_COLLATE and LC_CTYPE with one option.  Similar
behavior is already supported in initdb, CREATE COLLATION, and
createdb.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/flat/d9d5043a-dc70-da8a-0166-1e218e6e34d4%402ndquadrant.com
2019-07-23 14:47:24 +02:00
Alvaro Herrera 33a53130a8 Make pg_dump emit ATTACH PARTITION instead of PARTITION OF (reprise)
Using PARTITION OF can result in column ordering being changed from the
database being dumped, if the partition uses a column layout different
from the parent's.  It's not pg_dump's job to editorialize on table
definitions, so this is not acceptable; back-patch all the way back to
pg10, where partitioned tables where introduced.

This change also ensures that partitions end up in the correct
tablespace, if different from the parent's; this is an oversight in
ca4103025d (in pg12 only).  Partitioned indexes (in pg11) don't have
this problem, because they're already created as independent indexes and
attached to their parents afterwards.

This change also has the advantage that the partition is restorable from
the dump (as a standalone table) even if its parent table isn't
restored.

The original commits (3b23552ad8 in branch master) failed to cover
subsidiary column elements correctly, such as NOT NULL constraint and
CHECK constraints, as reported by Rushabh Lathia (initially as a failure
to restore serial columns).  They were reverted.  This recapitulation
commit fixes those problems.

Add some pg_dump tests to verify these things more exhaustively,
including constraints with legacy-inheritance tables, which were not
tested originally.  In branches 10 and 11, add a local constraint to the
pg_dump test partition that was added by commit 2d7eeb1b14 to master.

Author: Álvaro Herrera, David Rowley
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com
Discussion: https://postgr.es/m/20190423185007.GA27954@alvherre.pgsql
Discussion: https://postgr.es/m/CAGPqQf0iQV=PPOv2Btog9J9AwOQp6HmuVd6SbGTR_v3Zp2XT1w@mail.gmail.com
2019-06-10 18:56:23 -04:00
Noah Misch f2c71cb71f Stop using spelling "nonexistant".
The documentation used "nonexistent" exclusively, and the source tree
used it three times as often as "nonexistant".
2019-06-08 10:12:26 -07:00
Amit Kapila 9679345f3c Fix typos.
Reported-by: Alexander Lakhin
Author: Alexander Lakhin
Reviewed-by: Amit Kapila and Tom Lane
Discussion: https://postgr.es/m/7208de98-add8-8537-91c0-f8b089e2928c@gmail.com
2019-05-26 18:28:18 +05:30
Tom Lane db6e2b4c52 Initial pgperltidy run for v12.
Make all the perl code look nice, too (for some value of "nice").
2019-05-22 13:36:19 -04:00
Alvaro Herrera a1ec7402e9 Revert "Make pg_dump emit ATTACH PARTITION instead of PARTITION OF"
... and fallout (from branches 10, 11 and master).  The change was
ill-considered, and it broke a few normal use cases; since we don't have
time to fix it, we'll try again after this week's minor releases.

Reported-by: Rushabh Lathia
Discussion: https://postgr.es/m/CAGPqQf0iQV=PPOv2Btog9J9AwOQp6HmuVd6SbGTR_v3Zp2XT1w@mail.gmail.com
2019-05-06 12:23:49 -04:00
Alvaro Herrera 3b23552ad8 Make pg_dump emit ATTACH PARTITION instead of PARTITION OF
Using PARTITION OF can result in column ordering being changed from the
database being dumped, if the partition uses a column layout different
from the parent's.  It's not pg_dump's job to editorialize on table
definitions, so this is not acceptable; back-patch all the way back to
pg10, where partitioned tables where introduced.

This change also ensures that partitions end up in the correct
tablespace, if different from the parent's; this is an oversight in
ca4103025d (in pg12 only).  Partitioned indexes (in pg11) don't have
this problem, because they're already created as independent indexes and
attached to their parents afterwards.

This change also has the advantage that the partition is restorable from
the dump (as a standalone table) even if its parent table isn't
restored.

Author: David Rowley
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com
Discussion: https://postgr.es/m/20190423185007.GA27954@alvherre.pgsql
2019-04-24 15:30:37 -04:00
Peter Eisentraut cc8d415117 Unified logging system for command-line programs
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.

Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
  source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
  use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.

- As a result of the above several points, more translatable message
  strings can be shared between different components and between
  frontends and backend, without gratuitous punctuation or whitespace
  differences.

- There is support for setting a "log level".  This is not meant to be
  user-facing, but can be used internally to implement debug or
  verbose modes.

- Lazy argument evaluation, so no significant overhead if logging at
  some level is disabled.

- Some color in the messages, similar to gcc and clang.  Set
  PG_COLOR=auto to try it out.  Some colors are predefined, but can be
  customized by setting PG_COLORS.

- Common files (common/, fe_utils/, etc.) can handle logging much more
  simply by just using one API without worrying too much about the
  context of the calling program, requiring callbacks, or having to
  pass "progname" around everywhere.

- Some programs called setvbuf() to make sure that stderr is
  unbuffered, even on Windows.  But not all programs did that.  This
  is now done centrally.

Soft goals:

- Reduces vertical space use and visual complexity of error reporting
  in the source code.

- Encourages more deliberate classification of messages.  For example,
  in some cases it wasn't clear without analyzing the surrounding code
  whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging
  frameworks such as log4j and Python logging.

This is all just about printing stuff out.  Nothing affects program
flow (e.g., fatal exits).  The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.

I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 20:01:35 +02:00
Peter Eisentraut fc22b6623b Generated columns
This is an SQL-standard feature that allows creating columns that are
computed from expressions rather than assigned, similar to a view or
materialized view but on a column basis.

This implements one kind of generated column: stored (computed on
write).  Another kind, virtual (computed on read), is planned for the
future, and some room is left for it.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
2019-03-30 08:15:57 +01:00
Alvaro Herrera 7e413a0f82 pg_dump: allow multiple rows per insert
This is useful to speed up loading data in a different database engine.

Authors: Surafel Temesgen and David Rowley.  Lightly edited by Álvaro.
Reviewed-by: Fabien Coelho
Discussion: https://postgr.es/m/CALAY4q9kumSdnRBzvRJvSRf2+BH20YmSvzqOkvwpEmodD-xv6g@mail.gmail.com
2019-03-07 09:34:17 -03:00
Andres Freund 3b925e905d tableam: Add pg_dump support.
This adds pg_dump support for table AMs in a similar manner to how
tablespaces are handled. That is, instead of specifying the AM for
every CREATE TABLE etc, emit SET default_table_access_method
statements. That makes it easier to change the AM for all/most tables
in a dump, and allows restore to succeed even if some AM is not
available.

This increases the dump archive version, as a tables/matview's AM
needs to be tracked therein.

Author: Dimitri Dolgov, Andres Freund
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Andrew Dunstan 78b408a20a Avoid accidental wildcard expansion in msys shell
Commit f092de05 added a test for pg_dumpall --exclude-database including
the wildcard pattern '*dump*' which matches some files in the source
directory. The test library on msys uses the shell which expands this
and thus the program gets incorrect arguments. This doesn't happen if
the pattern doesn't match any files, so here the pattern is set to
'*dump_test*' which is such a pattern.

Per buildfarm animal jacana.
2019-03-03 11:48:12 -05:00
Andrew Dunstan f092de0503 Add --exclude-database option to pg_dumpall
This option functions similarly to pg_dump's --exclude-table option, but
for database names. The option can be given once, and the argument can
be a pattern including wildcard characters.

Author: Andrew Dunstan.
Reviewd-by: Fabien Coelho and Michael Paquier
Discussion: https://postgr.es/m/43a54a47-4aa7-c70e-9ca6-648f436dd6e6@2ndQuadrant.com
2019-03-01 10:47:44 -05:00
Tom Lane 068503c765 Solve cross-version-upgrade testing problem induced by 1fb57af92.
Renaming varchar_transform to varchar_support had a side effect
I hadn't foreseen: the core regression tests leave around a
transform object that relies on that function, so the name
change breaks cross-version upgrade tests, because the name
used in the older branches doesn't match.

Since the dependency on varchar_transform was chosen with the
aid of a dartboard anyway (it would surely not work as a
language transform support function), fix by just choosing
a different random builtin function with the right signature.
Also add some comments explaining why this isn't horribly unsafe.

I chose to make the same substitution in a couple of other
copied-and-pasted test cases, for consistency, though those
aren't directly contributing to the testing problem.

Per buildfarm.  Back-patch, else it doesn't fix the problem.
2019-02-09 21:02:06 -05:00
Tom Lane 1fb57af920 Create the infrastructure for planner support functions.
Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.

The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them fron "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do.  (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)

Also add DDL and pg_dump support for attaching a support function to a
user-defined function.  Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical.  (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:08:48 -05:00
Peter Eisentraut 0c1f8f166c Use EXECUTE FUNCTION syntax for triggers more
Change pg_dump and ruleutils.c to use the FUNCTION keyword instead of
PROCEDURE in trigger and event trigger definitions.

This completes the pieces of the transition started in
0a63f996e0 that were kept out of
PostgreSQL 11 because of the required catversion change.

Discussion: https://www.postgresql.org/message-id/381bef53-f7be-29c8-d977-948e389161d6@2ndquadrant.com
2019-02-07 09:21:34 +01:00
Michael Paquier f339a998ff Align better test output regex with grammar in pg_dump TAP tests
This enforces one-or-more character matches in the regular expressions
for pg_dump testing on SQL syntax output where zero-or-more matches
implies a syntax error.

Author: Daniel Gustafsson
Reviewed-by: David G. Johnston, Michael Paquier
Discussion: https://postgr.es/m/B313C32C-0E24-4AFB-95FF-6DA0C4E18A89@yesql.se
2019-02-07 10:04:55 +09:00
Michael Paquier d07fb6810e Tighten some regexes with proper character escaping in pg_dump TAP tests
Some tests have been using regular expressions which have been lax in
escaping dots, which may cause tests to pass when they should not.  This
make the whole set of tests more robust where needed.

Author: David Rowley
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f9jD8aVo1BTH+Vgwd=f-ynbuRVrS90XbWMT6UigaOQJTA@mail.gmail.com
2019-02-06 17:33:55 +09:00
Michael Paquier e4fca461ab Include ALTER INDEX SET STATISTICS in pg_dump
The new grammar pattern of ALTER INDEX SET STATISTICS able to use column
numbers on top of the existing column names introduced by commit 5b6d13e
forgot to add support for the feature in pg_dump, so defining statistics
on index columns was missing from the dumps, potentially causing silent
planning problems with a subsequent restore.

pg_dump ought to not use column names in what it generates as these are
automatically generated by the server and could conflict with real
relation attributes with matching patterns.  "expr" and "exprN", N
incremented automatically after the creation of the first one, are used
as default attribute names for index expressions, and that could easily
match what is defined in other relations, causing the dumps to fail if
some of those attributes are renamed at some point.  So to avoid any
problems, the new grammar with column numbers gets used.

Reported-by: Ronan Dunklau
Author: Michael Paquier
Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul
Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com
Backpatch-through: 11, where the new syntax has been introduced.
2018-12-18 09:28:16 +09:00
Stephen Frost eeeb1dfc87 Add pg_dump test for empty OP class
This adds a pg_dump test for an empty operator class.

Author: Michael Paquier
Discussion: https://postgr.es/m/20181208011142.GU3415@tamriel.snowman.net
2018-12-10 10:13:52 -05:00
Stephen Frost 2d7eeb1b14 Add additional partition tests to pg_dump
This adds a few tests for non-inherited constraints.

Author: Amit Langote
Discussion: https://postgr.es/m/20181208001735.GT3415%40tamriel.snowman.net
2018-12-10 09:46:36 -05:00
Andres Freund 12a53c732c Fix pg_upgrade for oid removal.
pg_upgrade previously copied pg_largeobject_metadata over from the old
cluster. That doesn't work, because the table has oids before
578b229718. I missed that.

As most pieces of metadata for large objects already were dumped as
DDL (except for comments overwritten by pg_upgrade, due to the copy of
pg_largeobject_metadata) it seems reasonable to just also dump grants
for large objects.  If we ever consider this a relevant performance
problem, we'd need to fix the rest of the already emitted DDL
too.

There's still an open discussion about whether we'll want to force a
specific ordering for the dumped objects, as currently
pg_largeobjects_metadata potentially has a different ordering
before/after pg_upgrade, which can make automated testing a bit
harder.

Reported-By: Andrew Dunstan
Author: Andres Freund
Discussion: https://postgr.es/m/91a8a980-41bc-412b-fba2-2ba71a141c2b@2ndQuadrant.com
2018-11-26 14:37:08 -08:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
Alvaro Herrera 93ad00c968 Dump foreign keys on partitioned tables
The patch that ended up as commit 3de241dba8 ("Foreign keys on
partitioned tables") lacked pg_dump tests, so the pg_dump code that was
there to support it inadvertently stopped working when in later
development I modified the backend code not to emit pg_trigger rows for
the partitioned table itself.

Bug analysis and code fix is by Michaël.  I (Álvaro) added the test.

Reported-by: amul sul <sulamul@gmail.com>
Co-authored-by: Michaël Paquier <michael@paquier.xyz>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAAJ_b94n=UsNVhgs97vCaWEZAMe-tGDRVuZ73oePQH=eaJKGSA@mail.gmail.com
2018-07-13 13:13:26 -04:00
Tom Lane f755a152d4 Improve spelling of new FINALFUNC_MODIFY aggregate attribute.
I'd used SHARABLE as a value originally, but Peter Eisentraut points out
that dictionaries agree that SHAREABLE is the preferred spelling.
Run around and change that before it's too late.

Discussion: https://postgr.es/m/d2e1afd4-659c-50d6-1b20-7cfd3675e909@2ndquadrant.com
2018-05-21 11:41:42 -04:00
Andrew Dunstan 35361ee788 Restrict vertical tightness to parentheses in Perl code
The vertical tightness settings collapse vertical whitespace between
opening and closing brackets (parentheses, square brakets and braces).
This can make data structures in particular harder to read, and is not
very consistent with our style in non-Perl code. This patch restricts
that setting to parentheses only, and reformats all the perl code
accordingly. Not applying this to parentheses has some unfortunate
effects, so the consensus is to keep the setting for parentheses and not
for the others.

The diff for this patch does highlight some places where structures
should have trailing commas. They can be added manually, as there is no
automatic tool to do so.

Discussion: https://postgr.es/m/a2f2b87c-56be-c070-bfc0-36288b4b41c1@2ndQuadrant.com
2018-05-09 10:14:46 -04:00
Peter Eisentraut 76ece16974 perltidy: Add option --nooutdent-long-comments 2018-04-27 11:37:43 -04:00
Peter Eisentraut d4f16d5071 perltidy: Add option --nooutdent-long-quotes 2018-04-27 11:37:43 -04:00
Tom Lane f04d4ac919 Reindent Perl files with perltidy version 20170521.
Discussion: https://postgr.es/m/CABUevEzK3cNiHZQ18f5tK0guoT+cN_jWeVzhYYxY=r+1Q3SmoA@mail.gmail.com
2018-04-25 14:00:19 -04:00
Peter Eisentraut 039eb6e92f Logical replication support for TRUNCATE
Update the built-in logical replication system to make use of the
previously added logical decoding for TRUNCATE support.  Add the
required truncate callback to pgoutput and a new logical replication
protocol message.

Publications get a new attribute to determine whether to replicate
truncate actions.  When updating a publication via pg_dump from an older
version, this is not set, thus preserving the previous behavior.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
2018-04-07 11:34:11 -04:00
Stephen Frost 446f7f5d78 Rewrite pg_dump TAP tests
This reworks how the tests to run are defined.  Instead of having to
define all runs for all tests, we define those tests which should pass
(generally using one of the defined broad hashes), add in any which
should be specific for this test, and exclude any specific runs that
shouldn't pass for this test.  This ends up removing some 4k+ lines
(more than half the file) but, more importantly, greatly simplifies the
way runs-to-be-tested are defined.

As discussed in the updated comments, for example, take the test which
does CREATE TABLE test_table.  That CREATE TABLE should show up in all
'full' runs of pg_dump, except those cases where 'test_table' is
excluded, of course, and that's exactly how the test gets defined now
(modulo a few other related cases, like where we dump only that table,
or we dump the schema it's in, or we exclude the schema it's in):

like => {
    %full_runs,
    %dump_test_schema_runs,
    only_dump_test_table    => 1,
    section_pre_data        => 1, },
unlike => {
    exclude_dump_test_schema => 1,
    exclude_test_table => 1, }, },

Next, we no longer expect every run to be listed for every test.  If a
run is listed in 'like' (directly or through a hash) then it's a 'like',
unless it's listed in 'unlike' in which case it's an 'unlike'.  If it
isn't listed in either, then it's considered an 'unlike' automatically.

Lastly, this changes the code to no longer use like/unlike but rather to
use 'ok()' with 'diag()' which allows much more control over what gets
spit out to the screen.  Gone are the days of the entire dump being sent
to the console, now you'll just get a couple of lines for each failing
test which say the test that failed and the run that it failed on.

This covers both the pg_dump TAP tests in src/bin/pg_dump and those in
src/test/modules/test_pg_dump.
2018-04-04 15:26:51 -04:00
Peter Eisentraut fd1a421fe6 Add prokind column, replacing proisagg and proiswindow
The new column distinguishes normal functions, procedures, aggregates,
and window functions.  This replaces the existing columns proisagg and
proiswindow, and replaces the convention that procedures are indicated
by prorettype == 0.  Also change prorettype to be VOIDOID for procedures.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2018-03-02 13:48:33 -05:00
Tom Lane 3d2aed664e Avoid using unsafe search_path settings during dump and restore.
Historically, pg_dump has "set search_path = foo, pg_catalog" when
dumping an object in schema "foo", and has also caused that setting
to be used while restoring the object.  This is problematic because
functions and operators in schema "foo" could capture references meant
to refer to pg_catalog entries, both in the queries issued by pg_dump
and those issued during the subsequent restore run.  That could
result in dump/restore misbehavior, or in privilege escalation if a
nefarious user installs trojan-horse functions or operators.

This patch changes pg_dump so that it does not change the search_path
dynamically.  The emitted restore script sets the search_path to what
was used at dump time, and then leaves it alone thereafter.  Created
objects are placed in the correct schema, regardless of the active
search_path, by dint of schema-qualifying their names in the CREATE
commands, as well as in subsequent ALTER and ALTER-like commands.

Since this change requires a change in the behavior of pg_restore
when processing an archive file made according to this new convention,
bump the archive file version number; old versions of pg_restore will
therefore refuse to process files made with new versions of pg_dump.

Security: CVE-2018-1058
2018-02-26 10:18:21 -05:00
Alvaro Herrera eb7ed3f306 Allow UNIQUE indexes on partitioned tables
If we restrict unique constraints on partitioned tables so that they
must always include the partition key, then our standard approach to
unique indexes already works --- each unique key is forced to exist
within a single partition, so enforcing the unique restriction in each
index individually is enough to have it enforced globally.  Therefore we
can implement unique indexes on partitions by simply removing a few
restrictions (and adding others.)

Discussion: https://postgr.es/m/20171222212921.hi6hg6pem2w2t36z@alvherre.pgsql
Discussion: https://postgr.es/m/20171229230607.3iib6b62fn3uaf47@alvherre.pgsql
Reviewed-by: Simon Riggs, Jesper Pedersen, Peter Eisentraut, Jaime
	Casanova, Amit Langote
2018-02-19 17:40:00 -03:00
Tom Lane 5c9f2564fa Fix assorted errors in pg_dump's handling of extended statistics objects.
pg_dump supposed that a stats object necessarily shares the same schema
as its underlying table, and that it doesn't have a separate owner.
These things may have been true during early development of the feature,
but they are not true as of v10 release.

Failure to track the object's schema separately turns out to have only
limited consequences, because pg_get_statisticsobjdef() always schema-
qualifies the target object name in the generated CREATE STATISTICS command
(a decision out of step with the rest of ruleutils.c, but I digress).
Therefore the restored object would be in the right schema, so that the
only problem is that the TOC entry would be mislabeled as to schema.  That
could lead to wrong decisions for schema-selective restores, for example.

The ownership issue is a bit more serious: not only was the TOC entry
potentially mislabeled as to owner, but pg_dump didn't bother to issue an
ALTER OWNER command at all, so that after restore the stats object would
continue to be owned by the restoring superuser.

A final point is that decisions as to whether to dump a stats object or
not were driven by whether the underlying table was dumped or not.  While
that's not wrong on its face, it won't scale nicely to the planned future
extension to cross-table statistics.  Moreover, that design decision comes
out of the view of stats objects as being auxiliary to a particular table,
like a rule or trigger, which is exactly where the above problems came
from.  Since we're now treating stats objects more like independent objects
in their own right, they ought to behave like standalone objects for this
purpose too.  So change to using the generic selectDumpableObject() logic
for them (which presently amounts to "dump if containing schema is to be
dumped").

Along the way to fixing this, restructure so that getExtendedStatistics
collects the identity info (only) for all extended stats objects in one
query, and then for each object actually being dumped, we retrieve the
definition in dumpStatisticsExt.  This is necessary to ensure that
schema-qualification in the generated CREATE STATISTICS command happens
with respect to the search path that pg_dump will now be using at restore
time (ie, the schema the stats object is in, not that of the underlying
table).  It's probably also significantly faster in the typical scenario
where only a minority of tables have extended stats.

Back-patch to v10 where extended stats were introduced.

Discussion: https://postgr.es/m/18272.1518328606@sss.pgh.pa.us
2018-02-11 13:24:15 -05:00
Tom Lane 5955d93419 Improve pg_dump's handling of "special" built-in objects.
We had some pretty ad-hoc handling of the public schema and the plpgsql
extension, which are both presumed to exist in template0 but might be
modified or deleted in the database being dumped.

Up to now, by default pg_dump would emit a CREATE EXTENSION IF NOT EXISTS
command as well as a COMMENT command for plpgsql.  The usefulness of the
former is questionable, and the latter caused annoying errors in
non-superuser dump/restore scenarios.  Let's instead install a rule that
built-in extensions (identified by having low-numbered OIDs) are not to be
dumped.  We were doing it that way already in binary-upgrade mode, so this
just makes regular mode behave the same.  It remains true that if someone
has installed a non-default ACL on the plpgsql language, that will get
dumped thanks to the pg_init_privs mechanism.  This is more consistent with
the handling of built-in objects of other kinds.

Also, change the very ad-hoc mechanism that was used to avoid dumping
creation and comment commands for the public schema.  Instead of hardwiring
a test in _printTocEntry(), make use of the DUMP_COMPONENT_ infrastructure
to mark that schema up-front about what we want to do with it.  This has
the visible effect that the public schema won't be mentioned in the output
at all, except for updating its ACL if it has a non-default ACL.
Previously, while it was normally not mentioned, --clean mode would drop
and recreate it, again causing headaches for non-superuser usage.  This
change likewise makes the public schema less special and more like other
built-in objects.

If plpgsql, or the public schema, has been removed entirely in the source
DB, that situation won't be reproduced in the destination ... but that
was true before.

Discussion: https://postgr.es/m/29048.1516812451@sss.pgh.pa.us
2018-01-25 13:54:42 -05:00
Tom Lane b3f8401205 Move handling of database properties from pg_dumpall into pg_dump.
This patch rearranges the division of labor between pg_dump and pg_dumpall
so that pg_dump itself handles all properties attached to a single
database.  Notably, a database's ACL (GRANT/REVOKE status) and local GUC
settings established by ALTER DATABASE SET and ALTER ROLE IN DATABASE SET
can be dumped and restored by pg_dump.  This is a long-requested
improvement.

"pg_dumpall -g" will now produce only role- and tablespace-related output,
nothing about individual databases.  The total output of a regular
pg_dumpall run remains the same.

pg_dump (or pg_restore) will restore database-level properties only when
creating the target database with --create.  This applies not only to
ACLs and GUCs but to the other database properties it already handled,
that is database comments and security labels.  This is more consistent
and useful, but does represent an incompatibility in the behavior seen
without --create.

(This change makes the proposed patch to have pg_dump use "COMMENT ON
DATABASE CURRENT_DATABASE" unnecessary, since there is no case where
the command is issued that we won't know the true name of the database.
We might still want that patch as a feature in its own right, but pg_dump
no longer needs it.)

pg_dumpall with --clean will now drop and recreate the "postgres" and
"template1" databases in the target cluster, allowing their locale and
encoding settings to be changed if necessary, and providing a cleaner
way to set nondefault tablespaces for them than we had before.  This
means that such a script must now always be started in the "postgres"
database; the order of drops and reconnects will not work otherwise.
Without --clean, the script will not adjust any database-level properties
of those two databases (including their comments, ACLs, and security
labels, which it formerly would try to set).

Another minor incompatibility is that the CREATE DATABASE commands in a
pg_dumpall script will now always specify locale and encoding settings.
Formerly those would be omitted if they matched the cluster's default.
While that behavior had some usefulness in some migration scenarios,
it also posed a significant hazard of unwanted locale/encoding changes.
To migrate to another locale/encoding, it's now necessary to use pg_dump
without --create to restore into a database with the desired settings.

Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off"
is gone: we now dodge that problem by the expedient of not issuing ALTER
DATABASE SET commands until after reconnecting to the target database.
Therefore, such settings won't apply during the restore session.

In passing, improve some shaky grammar in the docs, and add a note pointing
out that pg_dumpall's output can't be expected to load without any errors.
(Someday we might want to fix that, but this is not that patch.)

Haribabu Kommi, reviewed at various times by Andreas Karlsson,
Vaishnavi Prabakaran, and Robert Haas; further hacking by me.

Discussion: https://postgr.es/m/CAJrrPGcUurV0eWTeXODwsOYFN=Ekq36t1s0YnFYUNzsmRfdAyA@mail.gmail.com
2018-01-22 14:09:09 -05:00
Alvaro Herrera 8b08f7d482 Local partitioned indexes
When CREATE INDEX is run on a partitioned table, create catalog entries
for an index on the partitioned table (which is just a placeholder since
the table proper has no data of its own), and recurse to create actual
indexes on the existing partitions; create them in future partitions
also.

As a convenience gadget, if the new index definition matches some
existing index in partitions, these are picked up and used instead of
creating new ones.  Whichever way these indexes come about, they become
attached to the index on the parent table and are dropped alongside it,
and cannot be dropped on isolation unless they are detached first.

To support pg_dump'ing these indexes, add commands
    CREATE INDEX ON ONLY <table>
(which creates the index on the parent partitioned table, without
recursing) and
    ALTER INDEX ATTACH PARTITION
(which is used after the indexes have been created individually on each
partition, to attach them to the parent index).  These reconstruct prior
database state exactly.

Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit
	Langote, Jesper Pedersen, Simon Riggs, David Rowley
Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql
2018-01-19 11:49:22 -03:00
Peter Eisentraut e4128ee767 SQL procedures
This adds a new object type "procedure" that is similar to a function
but does not have a return type and is invoked by the new CALL statement
instead of SELECT or similar.  This implementation is aligned with the
SQL standard and compatible with or similar to other SQL implementations.

This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well
as ALTER/DROP ROUTINE that can refer to either a function or a
procedure (or an aggregate function, as an extension to SQL).  There is
also support for procedures in various utility commands such as COMMENT
and GRANT, as well as support in pg_dump and psql.  Support for defining
procedures is available in all the languages supplied by the core
distribution.

While this commit is mainly syntax sugar around existing functionality,
future features will rely on having procedures as a separate object
type.

Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
2017-11-30 11:03:20 -05:00
Tom Lane 4de2d4fba3 Explicitly track whether aggregate final functions modify transition state.
Up to now, there's been hard-wired assumptions that normal aggregates'
final functions never modify their transition states, while ordered-set
aggregates' final functions always do.  This has always been a bit
limiting, and in particular it's getting in the way of improving the
built-in ordered-set aggregates to allow merging of transition states.
Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure
that lets the finalfn's behavior be declared explicitly.

There are now three possibilities for the finalfn behavior: it's purely
read-only, it trashes the transition state irrecoverably, or it changes
the state in such a way that no more transfn calls are possible but the
state can still be passed to other, compatible finalfns.  There are no
examples of this third case today, but we'll shortly make the built-in
OSAs act like that.

This change allows user-defined aggregates to explicitly disclaim support
for use as window functions, and/or to prevent transition state merging,
if their implementations cannot handle that.  While it was previously
possible to handle the window case with a run-time error check, there was
not any way to prevent transition state merging, which in retrospect is
something commit 804163bc2 should have provided for.  But better late
than never.

In passing, split out pg_aggregate.c's extern function declarations into
a new header file pg_aggregate_fn.h, similarly to what we've done for
some other catalog headers, so that pg_aggregate.h itself can be safe
for frontend files to include.  This lets pg_dump use the symbolic
names for relevant constants.

Discussion: https://postgr.es/m/4834.1507849699@sss.pgh.pa.us
2017-10-14 15:21:39 -04:00
Tom Lane 21d304dfed Final pgindent + perltidy run for v10. 2017-08-14 17:29:33 -04:00
Tom Lane 93f039b494 Fix pg_dump's handling of event triggers.
pg_dump with the --clean option failed to emit DROP EVENT TRIGGER
commands for event triggers.  In a closely related oversight,
it also did not emit ALTER OWNER commands for event triggers.
Since only superusers can create event triggers, the latter oversight
is of little practical consequence ... but if we're going to record
an owner for event triggers, then surely pg_dump should preserve it.

Per complaint from Greg Atkins.  Back-patch to 9.3 where event triggers
were introduced.

Discussion: https://postgr.es/m/20170722191142.yi4e7tzcg3iacclg@gmail.com
2017-07-22 20:20:09 -04:00
Stephen Frost 4500edc7e9 Do not require 'public' to exist for pg_dump -c
Commit 330b84d8c4 didn't contemplate the case where the public schema
has been dropped and introduced a query which fails when there is no
public schema into pg_dump (when used with -c).

Adjust the query used by pg_dump to handle the case where the public
schema doesn't exist and add tests to check that such a case no longer
fails.

Back-patch the specific fix to 9.6, as the prior commit was.

Adding tests for this case involved adding support to the pg_dump
TAP tests to work with multiple databases, which, while not a large
change, is a bit much to back-patch, so that's only done in master.

Addresses bug #14650
Discussion: https://www.postgresql.org/message-id/20170512181801.1795.47483%40wrigleys.postgresql.org
2017-06-28 10:33:57 -04:00
Bruce Momjian ce55481032 Post-PG 10 beta1 pgperltidy run 2017-05-17 19:01:23 -04:00
Alvaro Herrera bc085205c8 Change CREATE STATISTICS syntax
Previously, we had the WITH clause in the middle of the command, where
you'd specify both generic options as well as statistic types.  Few
people liked this, so this commit changes it to remove the WITH keyword
from that clause and makes it accept statistic types only.  (We
currently don't have any generic options, but if we invent in the
future, we will gain a new WITH clause, probably at the end of the
command).

Also, the column list is now specified without parens, which makes the
whole command look more similar to a SELECT command.  This change will
let us expand the command to supporting expressions (not just columns
names) as well as multiple tables and their join conditions.

Tom added lots of code comments and fixed some parts of the CREATE
STATISTICS reference page, too; more changes in this area are
forthcoming.  He also fixed a potential problem in the alter_generic
regression test, reducing verbosity on a cascaded drop to avoid
dependency on message ordering, as we do in other tests.

Tom also closed a security bug: we documented that table ownership was
required in order to create a statistics object on it, but didn't
actually implement it.

Implement tab-completion for statistics objects.  This can stand some
more improvement.

Authors: Alvaro Herrera, with lots of cleanup by Tom Lane
Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
2017-05-12 14:59:35 -03:00
Peter Eisentraut b807f59828 Rework the options syntax for logical replication commands
For CREATE/ALTER PUBLICATION/SUBSCRIPTION, use similar option style as
other statements that use a WITH clause for options.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-05-12 08:57:49 -04:00
Stephen Frost 44c528810a Change the way pg_dump retrieves partitioning info
This gets rid of the code that issued separate queries to retrieve the
partitioning parent-child relationship, parent partition key, and child
partition bound information.  With this patch, the information is
retrieved instead using the queries issued from getTables() and
getInherits(), which is both more efficient than the previous approach
and doesn't require any new code.

Since the partitioning parent-child relationship is now retrieved with
the same old code that handles inheritance, partition attributes receive
a proper flagInhAttrs() treatment (that it didn't receive before), which
is needed so that the inherited NOT NULL constraints are not emitted if
we already emitted it for the parent.

Also, fix a bug in pg_dump's --binary-upgrade code, which caused pg_dump
to emit invalid command to attach a partition to its parent.

Author: Amit Langote, with some additional changes by me.
2017-05-04 22:17:52 -04:00
Alvaro Herrera a93077ef46 Add pg_dump tests for CREATE STATISTICS
CREATE STATISTICS pg_dump support code was not covered at all by
previous tests.

Discussion: https://postgr.es/m/20170503172746.rwftidszir67sgk7@alvherre.pgsql
2017-05-03 15:52:00 -03:00