Commit Graph

249 Commits

Author SHA1 Message Date
Tom Lane 66f1630680 Add string_to_table() function.
This splits a string at occurrences of a delimiter.  It is exactly like
string_to_array() except for producing a set of values instead of an
array of values.  Thus, the relationship of these two functions is
the same as between regexp_split_to_table() and regexp_split_to_array().

Although the same results could be had from unnest(string_to_array()),
this is somewhat faster than that, and anyway it seems reasonable to
have it for symmetry with the regexp functions.

Pavel Stehule, reviewed by Peter Smith

Discussion: https://postgr.es/m/CAFj8pRD8HOpjq2TqeTBhSo_QkzjLOhXzGCpKJ4nCs7Y9SQkuPw@mail.gmail.com
2020-09-02 18:23:56 -04:00
Tom Lane 6ca547cf75 Mark factorial operator, and postfix operators in general, as deprecated.
Per discussion, we're planning to remove parser support for postfix
operators in order to simplify the grammar.  So it behooves us to
put out a deprecation notice at least one release before that.

There is only one built-in postfix operator, ! for factorial.
Label it deprecated in the docs and in pg_description, and adjust
some examples that formerly relied on it.  (The sister prefix
operator !! is also deprecated.  We don't really have to remove
that one, but since we're suggesting that people use factorial()
instead, it seems better to remove both operators.)

Also state in the CREATE OPERATOR ref page that postfix operators
in general are going away.

Although this changes the initial contents of pg_description,
I did not force a catversion bump; it doesn't seem essential.

In v13, also back-patch 4c5cf5431, so that there's someplace for
the <link>s to point to.

Mark Dilger and John Naylor, with some adjustments by me

Discussion: https://postgr.es/m/BE2DF53D-251A-4E26-972F-930E523580E9@enterprisedb.com
2020-08-30 14:37:24 -04:00
Fujii Masao 3e98c0bafb Add pg_backend_memory_contexts system view.
This view displays the usages of all the memory contexts of the server
process attached to the current session. This information is useful to
investigate the cause of backend-local memory bloat.

This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But this technique
cannot be uesd in some environments because no debugger is available there.
And it outputs lots of text messages and it's not easy to analyze them.
So, pg_backend_memory_contexts view allows us to access to backend-local
memory contexts information more easily.

Bump catalog version.

Author: Atsushi Torikoshi, Fujii Masao
Reviewed-by: Tatsuhito Kasahara, Andres Freund, Daniel Gustafsson, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/72a656e0f71d0860161e0b3f67e4d771@oss.nttdata.com
2020-08-19 15:34:43 +09:00
Tom Lane 8a37951eeb Mark built-in coercion functions as leakproof where possible.
Making these leakproof seems helpful since (for example) if you have a
function f(int8) that is leakproof, you don't want it to effectively
become non-leakproof when you apply it to an int4 or int2 column.
But that's what happens today, since the implicit up-coercion will
not be leakproof.

Most of the coercion functions that visibly can't throw errors are
functions that convert numeric datatypes to other, wider ones.
Notable is that float4_numeric and float8_numeric can be marked
leakproof; before commit a57d312a7 they could not have been.
I also marked the functions that coerce strings to "name" as leakproof;
that's okay today because they truncate silently, but if we ever
reconsidered that behavior then they could no longer be leakproof.

I desisted from marking rtrim1() as leakproof; it appears so right now,
but the code seems a little too complex and perhaps subject to change,
since it's shared with other SQL functions.

Discussion: https://postgr.es/m/459322.1595607431@sss.pgh.pa.us
2020-07-25 12:54:58 -04:00
Fujii Masao d05b172a76 Add generic_plans and custom_plans fields into pg_prepared_statements.
There was no easy way to find how many times generic and custom plans
have been executed for a prepared statement. This commit exposes those
numbers of times in pg_prepared_statements view.

Author: Atsushi Torikoshi, Kyotaro Horiguchi
Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
2020-07-20 11:55:50 +09:00
Amit Kapila d973747281 Revert "Track statistics for spilling of changes from ReorderBuffer".
The stats with this commit was available only for WALSenders, however,
users might want to see for backends doing logical decoding via SQL API.
Then, users might want to reset and access these stats across server
restart which was not possible with the current patch.

List of commits reverted:

caa3c4242c   Don't call elog() while holding spinlock.
e641b2a995   Doc: Update the documentation for spilled transaction
statistics.
5883f5fe27   Fix unportable printf format introduced in commit 9290ad198.
9290ad198b   Track statistics for spilling of changes from ReorderBuffer.

Additionaly, remove the release notes entry for this feature.

Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
2020-07-13 08:53:23 +05:30
Michael Paquier b1e48bbe64 Include replication origins in SQL functions for commit timestamp
This includes two changes:
- Addition of a new function pg_xact_commit_timestamp_origin() able, for
a given transaction ID, to return the commit timestamp and replication
origin of this transaction.  An equivalent function existed in
pglogical.
- Addition of the replication origin to pg_last_committed_xact().

The commit timestamp manager includes already APIs able to return the
replication origin of a transaction on top of its commit timestamp, but
the code paths for replication origins were never stressed as those
functions have never looked for a replication origin, and the SQL
functions available have never included this information since their
introduction in 73c986a.

While on it, refactor a test of modules/commit_ts/ to use tstzrange() to
check that a transaction timestamp is within the wanted range, making
the test a bit easier to read.

Bump catalog version.

Author: Movead Li
Reviewed-by: Madan Kumar, Michael Paquier
Discussion: https://postgr.es/m/2020051116430836450630@highgo.ca
2020-07-12 20:47:15 +09:00
Tom Lane f3faf35f37 Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites.  But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).

So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them.  This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.

Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value.  Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.

Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.

Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 15:43:22 -04:00
Alvaro Herrera a8aaa0c786
Morph pg_replication_slots.min_safe_lsn to safe_wal_size
The previous definition of the column was almost universally disliked,
so provide this updated definition which is more useful for monitoring
purposes: a large positive value is good, while zero or a negative value
means danger.  This should be operationally more convenient.

Backpatch to 13, where the new column to pg_replication_slots (and the
feature it represents) were added.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com
2020-07-07 13:08:00 -04:00
Fujii Masao 9bae7e4cde Add +(pg_lsn,numeric) and -(pg_lsn,numeric) operators.
By using these operators, the number of bytes can be added into and
subtracted from LSN.

Bump catalog version.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Asif Rehman
Discussion: https://postgr.es/m/ed9f7f74-e996-67f8-554a-52ebd3779b3b@oss.nttdata.com
2020-06-30 23:55:07 +09:00
Michael Paquier 2c8dd05d6c Make pg_stat_wal_receiver consistent with the WAL receiver's shmem info
d140f2f3 has renamed receivedUpto to flushedUpto, and has added
writtenUpto to the WAL receiver's shared memory information, but
pg_stat_wal_receiver was not consistent with that.  This commit renames
received_lsn to flushed_lsn, and adds a new column called written_lsn.

Bump catalog version.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20200515090817.GA212736@paquier.xyz
2020-05-17 09:22:07 +09:00
Tom Lane 7b48f1b490 Do pre-release housekeeping on catalog data.
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES.  For reference, the command was
./renumber_oids.pl --first-mapped-oid=8000 --target-oid=5032

Also run reformat_dat_file.pl while I'm here.

Renumbering recently-added types changed some results in the opr_sanity
test.  To make those a bit easier to eyeball-verify, change the queries
to show regtype not just bare type OIDs.  (I think we didn't have
regtype when these queries were written.)
2020-05-12 13:03:43 -04:00
Alvaro Herrera c655077639
Allow users to limit storage reserved by replication slots
Replication slots are useful to retain data that may be needed by a
replication system.  But experience has shown that allowing them to
retain excessive data can lead to the primary failing because of running
out of space.  This new feature allows the user to configure a maximum
amount of space to be reserved using the new option
max_slot_wal_keep_size.  Slots that overrun that space are invalidated
at checkpoint time, enabling the storage to be released.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20170228.122736.123383594.horiguchi.kyotaro@lab.ntt.co.jp
2020-04-07 18:35:00 -04:00
Tom Lane 26a944cf29 Adjust bytea get_bit/set_bit to use int8 not int4 for bit numbering.
Since the existing bit number argument can't exceed INT32_MAX, it's
not possible for these functions to manipulate bits beyond the first
256MB of a bytea value.  Lift that restriction by redeclaring the
bit number arguments as int8 (which requires a catversion bump,
hence is not back-patchable).

The similarly-named functions for bit/varbit don't really have a
problem because we restrict those types to at most VARBITMAXLEN bits;
hence leave them alone.

While here, extend the encode/decode functions in utils/adt/encode.c
to allow dealing with values wider than 1GB.  This is not a live bug
or restriction in current usage, because no input could be more than
1GB, and since none of the encoders can expand a string more than 4X,
the result size couldn't overflow uint32.  But it might be desirable
to support more in future, so make the input length values size_t
and the potential-output-length values uint64.

Also add some test cases to improve the miserable code coverage
of these functions.

Movead Li, editorialized some by me; also reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/20200312115135445367128@highgo.ca
2020-04-07 15:57:58 -04:00
Thomas Munro 4c04be9b05 Introduce xid8-based functions to replace txid_XXX.
The txid_XXX family of fmgr functions exposes 64 bit transaction IDs to
users as int8.  Now that we have an SQL type xid8 for FullTransactionId,
define a new set of functions including pg_current_xact_id() and
pg_current_snapshot() based on that.  Keep the old functions around too,
for now.

It's a bit sneaky to use the same C functions for both, but since the
binary representation is identical except for the signedness of the
type, and since older functions are the ones using the wrong signedness,
and since we'll presumably drop the older ones after a reasonable period
of time, it seems reasonable to switch to FullTransactionId internally
and share the code for both.

Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com>
Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
2020-04-07 12:04:32 +12:00
Thomas Munro aeec457de8 Add SQL type xid8 to expose FullTransactionId to users.
Similar to xid, but 64 bits wide.  This new type is suitable for use in
various system views and administration functions.

Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com>
Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
2020-04-07 12:03:59 +12:00
Peter Eisentraut 2991ac5fc9 Add SQL functions for Unicode normalization
This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and
check Unicode normal forms, per SQL standard.

To support fast IS NORMALIZED tests, we pull in a new data file
DerivedNormalizationProps.txt from Unicode and build a lookup table
from that, using techniques similar to ones already used for other
Unicode data.  make update-unicode will keep it up to date.  We only
build and use these tables for the NFC and NFKC forms, because they
are too big for NFD and NFKD and the improvement is not significant
enough there.

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com
2020-04-02 08:56:27 +02:00
Tomas Vondra 28cac71bd3 Collect statistics about SLRU caches
There's a number of SLRU caches used to access important data like clog,
commit timestamps, multixact, asynchronous notifications, etc. Until now
we had no easy way to monitor these shared caches, compute hit ratios,
number of reads/writes etc.

This commit extends the statistics collector to track this information
for a predefined list of SLRUs, and also introduces a new system view
pg_stat_slru displaying the data.

The list of built-in SLRUs is fixed, but additional SLRUs may be defined
in extensions. Unfortunately, there's no suitable registry of SLRUs, so
this patch simply defines a fixed list of SLRUs with entries for the
built-in ones and one entry for all additional SLRUs. Extensions adding
their own SLRU are fairly rare, so this seems acceptable.

This patch only allows monitoring of SLRUs, not tuning. The SLRU sizes
are still fixed (hard-coded in the code) and it's not entirely clear
which of the SLRUs might need a GUC to tune size. In a way, allowing us
to determine that is one of the goals of this patch.

Bump catversion as the patch introduces new functions and system view.

Author: Tomas Vondra
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/flat/20200119143707.gyinppnigokesjok@development
2020-04-02 02:34:21 +02:00
Tom Lane a80818605e Improve selectivity estimation for assorted match-style operators.
Quite a few matching operators such as JSONB's @> used "contsel" and
"contjoinsel" as their selectivity estimators.  That was a bad idea,
because (a) contsel is only a stub, yielding a fixed default estimate,
and (b) that default is 0.001, meaning we estimate these operators as
five times more selective than equality, which is surely pretty silly.

There's a good model for improving this in ltree's ltreeparentsel():
for any "var OP constant" query, we can try applying the operator
to all of the column's MCV and histogram values, taking the latter
as being a random sample of the non-MCV values.  That code is
actually 100% generic, except for the question of exactly what
default selectivity ought to be plugged in when we don't have stats.

Hence, migrate the guts of ltreeparentsel() into the core code, provide
wrappers "matchingsel" and "matchingjoinsel" with a more-appropriate
default estimate, and use those for the non-geometric operators that
formerly used contsel (mostly JSONB containment operators and tsquery
matching).

Also apply this code to some match-like operators in hstore, ltree, and
pg_trgm, including the former users of ltreeparentsel as well as ones
that improperly used contsel.  Since commit 911e70207 just created new
versions of those extensions that we haven't released yet, we can sneak
this change into those new versions instead of having to create an
additional generation of update scripts.

Patch by me, reviewed by Alexey Bashtanov

Discussion: https://postgr.es/m/12237.1582833074@sss.pgh.pa.us
2020-04-01 10:32:33 -04:00
Alexander Korotkov 911e702077 Implement operator class parameters
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing.  These index AMs are GiST, GIN,
SP-GiST and BRIN.  There opclasses define representation of keys, operations on
them and supported search strategies.  So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision.  This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.

This commit doesn't introduce new storage in system catalog.  Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.

In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions.  Options
are set to fn_expr as the constant bytea expression.  It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.

This commit comes with some examples of opclass options usage.  We parametrize
signature length in GiST.  That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops.  Also we parametrize maximum number of integer ranges for
gist__int_ops.  However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.

Catversion is bumped.

Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 19:17:23 +03:00
David Rowley b07642dbcd Trigger autovacuum based on number of INSERTs
Traditionally autovacuum has only ever invoked a worker based on the
estimated number of dead tuples in a table and for anti-wraparound
purposes. For the latter, with certain classes of tables such as
insert-only tables, anti-wraparound vacuums could be the first vacuum that
the table ever receives. This could often lead to autovacuum workers being
busy for extended periods of time due to having to potentially freeze
every page in the table. This could be particularly bad for very large
tables. New clusters, or recently pg_restored clusters could suffer even
more as many large tables may have the same relfrozenxid, which could
result in large numbers of tables requiring an anti-wraparound vacuum all
at once.

Here we aim to reduce the work required by anti-wraparound and aggressive
vacuums in general, by triggering autovacuum when the table has received
enough INSERTs. This is controlled by adding two new GUCs and reloptions;
autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
existing scale factor and threshold controls, only base themselves off the
number of inserts since the last vacuum, rather than the number of dead
tuples. New controls were added rather than reusing the existing
controls, to allow these new vacuums to be tuned independently and perhaps
even completely disabled altogether, which can be done by setting
autovacuum_vacuum_insert_threshold to -1.

We make no attempt to skip index cleanup operations on these vacuums as
they may trigger for an insert-mostly table which continually doesn't have
enough dead tuples to trigger an autovacuum for the purpose of removing
those dead tuples. If we were to skip cleaning the indexes in this case,
then it is possible for the index(es) to become bloated over time.

There are additional benefits to triggering autovacuums based on inserts,
as tables which never contain enough dead tuples to trigger an autovacuum
are now more likely to receive a vacuum, which can mark more of the table
as "allvisible" and encourage the query planner to make use of Index Only
Scans.

Currently, we still obey vacuum_freeze_min_age when triggering these new
autovacuums based on INSERTs. For large insert-only tables, it may be
beneficial to lower the table's autovacuum_freeze_min_age so that tuples
are eligible to be frozen sooner. Here we've opted not to zero that for
these types of vacuums, since the table may just be insert-mostly and we
may otherwise freeze tuples that are still destined to be updated or
removed in the near future.

There was some debate to what exactly the new scale factor and threshold
should default to. For now, these are set to 0.2 and 1000, respectively.
There may be some motivation to adjust these before the release.

Author: Laurenz Albe, Darafei Praliaskouski
Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
2020-03-28 19:20:12 +13:00
Tom Lane 24e2885ee3 Introduce "anycompatible" family of polymorphic types.
This patch adds the pseudo-types anycompatible, anycompatiblearray,
anycompatiblenonarray, and anycompatiblerange.  They work much like
anyelement, anyarray, anynonarray, and anyrange respectively, except
that the actual input values need not match precisely in type.
Instead, if we can find a common supertype (using the same rules
as for UNION/CASE type resolution), then the parser automatically
promotes the input values to that type.  For example,
"myfunc(anycompatible, anycompatible)" can match a call with one
integer and one bigint argument, with the integer automatically
promoted to bigint.  With anyelement in the definition, the user
would have had to cast the integer explicitly.

The new types also provide a second, independent set of type variables
for function matching; thus with "myfunc(anyelement, anyelement,
anycompatible) returns anycompatible" the first two arguments are
constrained to be the same type, but the third can be some other
type, and the result has the type of the third argument.  The need
for more than one set of type variables was foreseen back when we
first invented the polymorphic types, but we never did anything
about it.

Pavel Stehule, revised a bit by me

Discussion: https://postgr.es/m/CAFj8pRDna7VqNi8gR+Tt2Ktmz0cq5G93guc3Sbn_NVPLdXAkqA@mail.gmail.com
2020-03-19 11:43:11 -04:00
Peter Eisentraut a2b1faa0f2 Implement type regcollation
This will be helpful for a following commit and it's also just
generally useful, like the other reg* types.

Author: Julien Rouhaud
Reviewed-by: Thomas Munro and Michael Paquier
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-03-18 21:21:00 +01:00
Tom Lane bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
Peter Geoghegan 612a1ab767 Add equalimage B-Tree support functions.
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4.  This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.

Register an equalimage routine for almost all of the existing B-Tree
opclasses.  We only need two trivial routines for all of the opclasses
that are included with the core distribution.  There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).

This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.

Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2020-02-26 11:28:25 -08:00
Peter Eisentraut 2ed19a488e Set gen_random_uuid() to volatile
It was set to immutable.  This was a mistake in the initial
commit (5925e55498).

Reported-by: hubert depesz lubaczewski <depesz@depesz.com>
Discussion: https://www.postgresql.org/message-id/flat/20200218185452.GA8710%40depesz.com
2020-02-19 20:09:32 +01:00
Tom Lane b78542b9e9 Run "make reformat-dat-files".
Mostly to make sure the previous commit didn't break this.

Discussion: https://postgr.es/m/20200212182337.GZ1412@telsasoft.com
2020-02-15 14:58:30 -05:00
Michael Paquier b025f32e0b Add leader_pid to pg_stat_activity
This new field tracks the PID of the group leader used with parallel
query.  For parallel workers and the leader, the value is set to the
PID of the group leader.  So, for the group leader, the value is the
same as its own PID.  Note that this reflects what PGPROC stores in
shared memory, so as leader_pid is NULL if a backend has never been
involved in parallel query.  If the backend is using parallel query or
has used it at least once, the value is set until the backend exits.

Author: Julien Rouhaud
Reviewed-by: Sergei Kornilov, Guillaume Lelarge, Michael Paquier, Tomas
Vondra
Discussion: https://postgr.es/m/CAOBaU_Yy5bt0vTPZ2_LUM6cUcGeqmYNoJ8-Rgto+c2+w3defYA@mail.gmail.com
2020-02-06 09:18:06 +09:00
Tom Lane 50fc694e43 Invent "trusted" extensions, and remove the pg_pltemplate catalog.
This patch creates a new extension property, "trusted".  An extension
that's marked that way in its control file can be installed by a
non-superuser who has the CREATE privilege on the current database,
even if the extension contains objects that normally would have to be
created by a superuser.  The objects within the extension will (by
default) be owned by the bootstrap superuser, but the extension itself
will be owned by the calling user.  This allows replicating the old
behavior around trusted procedural languages, without all the
special-case logic in CREATE LANGUAGE.  We have, however, chosen to
loosen the rules slightly: formerly, only a database owner could take
advantage of the special case that allowed installation of a trusted
language, but now anyone who has CREATE privilege can do so.

Having done that, we can delete the pg_pltemplate catalog, moving the
knowledge it contained into the extension script files for the various
PLs.  This ends up being no change at all for the in-core PLs, but it is
a large step forward for external PLs: they can now have the same ease
of installation as core PLs do.  The old "trusted PL" behavior was only
available to PLs that had entries in pg_pltemplate, but now any
extension can be marked trusted if appropriate.

This also removes one of the stumbling blocks for our Python 2 -> 3
migration, since the association of "plpythonu" with Python 2 is no
longer hard-wired into pg_pltemplate's initial contents.  Exactly where
we go from here on that front remains to be settled, but one problem
is fixed.

Patch by me, reviewed by Peter Eisentraut, Stephen Frost, and others.

Discussion: https://postgr.es/m/5889.1566415762@sss.pgh.pa.us
2020-01-29 18:42:43 -05:00
Dean Rasheed 13661ddd7e Add functions gcd() and lcm() for integer and numeric types.
These compute the greatest common divisor and least common multiple of
a pair of numbers using the Euclidean algorithm.

Vik Fearing, reviewed by Fabien Coelho.

Discussion: https://postgr.es/m/adbd3e0b-e3f1-5bbc-21db-03caf1cef0f7@2ndquadrant.com
2020-01-25 14:00:59 +00:00
Andrew Dunstan a83586b554 Add a non-strict version of jsonb_set
jsonb_set_lax() is the same as jsonb_set, except that it takes and extra
argument that specifies what to do if the value argument is NULL. The
default is 'use_json_null'. Other possibilities are 'raise_exception',
'return_target' and 'delete_key', all these behaviours having been
suggested as reasonable by various users.

Discussion: https://postgr.es/m/375873e2-c957-3a8d-64f9-26c43c2b16e7@2ndQuadrant.com

Reviewed by: Pavel Stehule
2020-01-17 11:52:39 +10:30
Robert Haas ed10f32e37 Add pg_shmem_allocations view.
This tells you about allocations that have been made from the main
shared memory segment. The original patch also tried to show information
about dynamic shared memory allocation as well, but I decided to
leave that problem for another time.

Andres Freund and Robert Haas, reviewed by Michael Paquier, Marti
Raudsepp, Tom Lane, Álvaro Herrera, and Kyotaro Horiguchi.

Discussion: http://postgr.es/m/20140504114417.GM12715@awork2.anarazel.de
2020-01-09 10:59:07 -05:00
Tom Lane 20d6225d16 Add functions min_scale(numeric) and trim_scale(numeric).
These allow better control of trailing zeroes in numeric values.

Pavel Stehule, based on an old proposal of Marko Tiikkaja's;
review by Karl Pinc

Discussion: https://postgr.es/m/CAFj8pRDjs-navGASeF0Wk74N36YGFJ+v=Ok9_knRa7vDc-qugg@mail.gmail.com
2020-01-06 12:13:53 -05:00
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Tom Lane 8b7ae5a82d Stabilize the results of pg_notification_queue_usage().
This function wasn't touched in commit 51004c717, but that turns out
to be a bad idea, because its results now include any dead space
that exists in the NOTIFY queue on account of our being lazy about
advancing the queue tail.  Notably, the isolation tests now fail
if run twice without a server restart between, because async-notify's
first test of the function will already show a positive value.
It seems likely that end users would be equally unhappy about the
result's instability.  To fix, just make the function call
asyncQueueAdvanceTail before computing its result.  That should end
in producing the same value as before, and it's hard to believe that
there's any practical use-case where pg_notification_queue_usage()
is called so often as to create a performance degradation, especially
compared to what we did before.

Out of paranoia, also mark this function parallel-restricted (it
was volatile, but parallel-safe by default, before).  Although the
code seems to work fine when run in a parallel worker, that's outside
the design scope of async.c, and it's a bit scary to have intentional
side-effects happening in a parallel worker.  There seems no plausible
use-case where it'd be important to try to parallelize this, so let's
not take any risk of introducing new bugs.

In passing, re-pgindent async.c and run reformat-dat-files on
pg_proc.dat, just because I'm a neatnik.

Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
2019-11-24 14:09:33 -05:00
Peter Eisentraut 2e4db241bf Remove configure --disable-float4-byval
This build option was only useful to maintain compatibility for
version-0 functions, but those are no longer supported, so this option
can be removed.

float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.

Discussion: https://www.postgresql.org/message-id/flat/f3e1e576-2749-bbd7-2d57-3f9dcf75255a@2ndquadrant.com
2019-11-21 18:29:21 +01:00
Amit Kapila 9290ad198b Track statistics for spilling of changes from ReorderBuffer.
This adds the statistics about transactions spilled to disk from
ReorderBuffer.  Users can query the pg_stat_replication view to check
these stats.

Author: Tomas Vondra, with bug-fixes and minor changes by Dilip Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-21 08:06:51 +05:30
Alexander Korotkov bffe1bd684 Implement jsonpath .datetime() method
This commit implements jsonpath .datetime() method as it's specified in
SQL/JSON standard.  There are no-argument and single-argument versions of
this method.  No-argument version selects first of ISO datetime formats
matching input string.  Single-argument version accepts template string as
its argument.

Additionally to .datetime() method itself this commit also implements
comparison ability of resulting date and time values.  There is some difficulty
because exising jsonb_path_*() functions are immutable, while comparison of
timezoned and non-timezoned types involves current timezone.  At first, current
timezone could be changes in session.  Moreover, timezones themselves are not
immutable and could be updated.  This is why we let existing immutable functions
throw errors on such non-immutable comparison.  In the same time this commit
provides jsonb_path_*_tz() functions which are stable and support operations
involving timezones.  As new functions are added to the system catalog,
catversion is bumped.

Support of .datetime() method was the only blocker prevents T832 from being
marked as supported.  sql_features.txt is updated correspondingly.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Tom Lane c160b8928c Straighten out leakproofness markings on text comparison functions.
Since we introduced the idea of leakproof functions, texteq and textne
were marked leakproof but their sibling text comparison functions were
not.  This inconsistency seemed justified because texteq/textne just
relied on memcmp() and so could easily be seen to be leakproof, while
the other comparison functions are far more complex and indeed can
throw input-dependent errors.

However, that argument crashed and burned with the addition of
nondeterministic collations, because now texteq/textne may invoke
the exact same varstr_cmp() infrastructure as the rest.  It makes no
sense whatever to give them different leakproofness markings.

After a certain amount of angst we've concluded that it's all right
to consider varstr_cmp() to be leakproof, mostly because the other
choice would be disastrous for performance of many queries where
leakproofness matters.  The input-dependent errors should only be
reachable for corrupt input data, or so we hope anyway; certainly,
if they are reachable in practice, we've got problems with requirements
as basic as maintaining a btree index on a text column.

Hence, run around to all the SQL functions that derive from varstr_cmp()
and mark them leakproof.  This should result in a useful gain in
flexibility/performance for queries in which non-leakproofness degrades
the efficiency of the query plan.

Back-patch to v12 where nondeterministic collations were added.
While this isn't an essential bug fix given the determination
that varstr_cmp() is leakproof, we might as well apply it now that
we've been forced into a post-beta4 catversion bump.

Discussion: https://postgr.es/m/31481.1568303470@sss.pgh.pa.us
2019-09-21 16:56:30 -04:00
Tom Lane ca70bdaefe Fix issues around strictness of SIMILAR TO.
As a result of some long-ago quick hacks, the SIMILAR TO operator
and the corresponding flavor of substring() interpreted "ESCAPE NULL"
as selecting the default escape character '\'.  This is both
surprising and not per spec: the standard is clear that these
functions should return NULL for NULL input.

Additionally, because of inconsistency of the strictness markings
of 3-argument substring() and similar_escape(), the planner could not
inline the SQL definition of substring(), resulting in a substantial
performance penalty compared to the underlying POSIX substring()
function.

The simplest fix for this would be to change the strictness marking
of similar_escape(), but if we do that we risk breaking existing views
that depend on that function.  Hence, leave similar_escape() as-is
as a compatibility function, and instead invent a new function
similar_to_escape() that comes in two strict variants.

There are a couple of other behaviors in this area that are also
not per spec, but they are documented and seem generally at least
as sane as the spec's definition, so leave them alone.  But improve
the documentation to describe them fully.

Patch by me; thanks to Álvaro Herrera and Andrew Gierth for review
and discussion.

Discussion: https://postgr.es/m/14047.1557708214@sss.pgh.pa.us
2019-09-07 14:21:59 -04:00
Alvaro Herrera 8f75e8e446 Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.

Reported-by: Craig Ringer
2019-08-21 11:12:44 -04:00
Peter Geoghegan 71dcd74386 Add sort support routine for the inet data type.
Add sort support for inet, including support for abbreviated keys.
Testing has shown that this reduces the time taken to sort medium to
large inet/cidr inputs by ~50-60% in realistic cases.

Author: Brandur Leach
Reviewed-By: Peter Geoghegan, Edmund Horner
Discussion: https://postgr.es/m/CABR_9B-PQ8o2MZNJ88wo6r-NxW2EFG70M96Wmcgf99G6HUQ3sw@mail.gmail.com
2019-08-01 09:34:14 -07:00
Tom Lane 4886da8327 Mark advisory-lock functions as parallel restricted, not parallel unsafe.
There seems no good reason not to allow a parallel leader to execute
these functions.  (The workers still can't, though.  Although the code
would work, any such lock would go away at worker exit, which is not
the documented behavior of advisory locks.)

Discussion: https://postgr.es/m/11847.1564496844@sss.pgh.pa.us
2019-08-01 11:36:21 -04:00
Peter Eisentraut 5925e55498 Add gen_random_uuid function
This adds a built-in function to generate UUIDs.

PostgreSQL hasn't had a built-in function to generate a UUID yet,
relying on external modules such as uuid-ossp and pgcrypto to provide
one.  Now that we have a strong random number generator built-in, we
can easily provide a version 4 (random) UUID generation function.

This patch takes the existing function gen_random_uuid() from pgcrypto
and makes it a built-in function.  The pgcrypto implementation now
internally redirects to the built-in one.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/6a65610c-46fc-2323-6b78-e8086340a325@2ndquadrant.com
2019-07-14 14:30:27 +02:00
Alexander Korotkov c085e1c1cb Add support for <-> (box, point) operator to GiST box_ops
Index-based calculation of this operator is exact.  So, signature of
gist_bbox_distance() function is changes so that caller is responsible for
setting *recheck flag.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 15:09:15 +03:00
Alexander Korotkov 6254c55f81 Add missing commutators for distance operators
Some of <-> operators between geometric types have their commutators missed.
This commit adds them.  The motivation is upcoming kNN support for some of those
operators.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 14:55:01 +03:00
Tom Lane 0ab1a2e39b Remove dead encoding-conversion functions.
The code for conversions SQL_ASCII <-> MULE_INTERNAL and
SQL_ASCII <-> UTF8 was unreachable, because we long ago changed
the wrapper functions pg_do_encoding_conversion() et al so that
they have hard-wired behaviors for conversions involving SQL_ASCII.
(At least some of those fast paths date back to 2002, though it
looks like we may not have been totally consistent about this until
later.)  Given the lack of complaints, nobody is dissatisfied with
this state of affairs.  Hence, let's just remove the unreachable code.

Also, change CREATE CONVERSION so that it rejects attempts to
define such conversions.  Since we consider that SQL_ASCII represents
lack of knowledge about the encoding in use, such a conversion would
be semantically dubious even if it were reachable.

Adjust a couple of regression test cases that had randomly decided
to rely on these conversion functions rather than any other ones.

Discussion: https://postgr.es/m/41163.1559156593@sss.pgh.pa.us
2019-07-05 14:17:27 -04:00
Michael Paquier 313f87a171 Add min() and max() aggregates for pg_lsn
This is useful for monitoring, when it comes for example to calculations
of WAL retention with replication slots and delays with a set of
standbys.

Bump catalog version.

Author: Fabrízio de Royes Mello
Reviewed-by: Surafel Temesgen
Discussion: https://postgr.es/m/CAFcNs+oc8ZoHhowA4rR1GGCgG8QNgK_TOwPRVYQo5rYy8_PXzA@mail.gmail.com
2019-07-05 12:21:11 +09:00
Tomas Vondra 4d66285adc Fix pg_mcv_list_items() to produce text[]
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.

Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.

Requires changes to pg_proc entry, so bump catversion.

Backpatch to 12, where multi-column MCV lists were introduced.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
2019-07-05 01:32:46 +02:00
Tom Lane c3f67ed6e4 Do pre-release housekeeping on catalog data, and fix jsonpath send/recv.
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES.  (The only change is 8394 -> 3428.)

Also run reformat_dat_file.pl while I'm here.

While looking at the reformat diffs, I chanced to notice that type
jsonpath had typsend and typreceive = '-', which surely is not the
intention given that jsonpath_send and jsonpath_recv exist.
Fix that.  It's safe to assume that these functions have never been
tested :-(.  I didn't try, but somebody should.
2019-04-28 17:16:50 -04:00
Magnus Hagander 77bd49adba Show shared object statistics in pg_stat_database
This adds a row to the pg_stat_database view with datoid 0 and datname
NULL for those objects that are not in a database. This was added
particularly for checksums, but we were already tracking more satistics
for these objects, just not returning it.

Also add a checksum_last_failure column that holds the timestamptz of
the last checksum failure that occurred in a database (or in a
non-dataabase file), if any.

Author: Julien Rouhaud <rjuju123@gmail.com>
2019-04-12 14:04:50 +02:00
Alvaro Herrera 9f06d79ef8 Add facility to copy replication slots
This allows the user to create duplicates of existing replication slots,
either logical or physical, and even changing properties such as whether
they are temporary or the output plugin used.

There are multiple uses for this, such as initializing multiple replicas
using the slot for one base backup; when doing investigation of logical
replication issues; and to select a different output plugins.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek
Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com
2019-04-05 18:05:18 -03:00
Stephen Frost b0b39f72b9 GSSAPI encryption support
On both the frontend and backend, prepare for GSSAPI encryption
support by moving common code for error handling into a separate file.
Fix a TODO for handling multiple status messages in the process.
Eliminate the OIDs, which have not been needed for some time.

Add frontend and backend encryption support functions.  Keep the
context initiation for authentication-only separate on both the
frontend and backend in order to avoid concerns about changing the
requested flags to include encryption support.

In postmaster, pull GSSAPI authorization checking into a shared
function.  Also share the initiator name between the encryption and
non-encryption codepaths.

For HBA, add "hostgssenc" and "hostnogssenc" entries that behave
similarly to their SSL counterparts.  "hostgssenc" requires either
"gss", "trust", or "reject" for its authentication.

Similarly, add a "gssencmode" parameter to libpq.  Supported values are
"disable", "require", and "prefer".  Notably, negotiation will only be
attempted if credentials can be acquired.  Move credential acquisition
into its own function to support this behavior.

Add a simple pg_stat_gssapi view similar to pg_stat_ssl, for monitoring
if GSSAPI authentication was used, what principal was used, and if
encryption is being used on the connection.

Finally, add documentation for everything new, and update existing
documentation on connection security.

Thanks to Michael Paquier for the Windows fixes.

Author: Robbie Harwood, with changes to the read/write functions by me.
Reviewed in various forms and at different times by: Michael Paquier,
   Andres Freund, David Steele.
Discussion: https://www.postgresql.org/message-id/flat/jlg1tgq1ktm.fsf@thriss.redhat.com
2019-04-03 15:02:33 -04:00
Alvaro Herrera ab0dfc961b Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by c16dc1aca5,
adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.

There are two pieces to this: one is index-AM-agnostic, and the other is
AM-specific.  The latter is fairly elaborate for btrees, including
reportage for parallel index builds and the separate phases that btree
index creation uses; other index AMs, which are much simpler in their
building procedures, have simplistic reporting only, but that seems
sufficient, at least for non-concurrent builds.

The index-AM-agnostic part is fairly complete, providing insight into
the CONCURRENTLY wait phases as well as block-based progress during the
index validation table scan.  (The index validation index scan requires
patching each AM, which has not been included here.)

Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada
Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
2019-04-02 15:18:08 -03:00
Tomas Vondra 7300a69950 Add support for multivariate MCV lists
Introduce a third extended statistic type, supported by the CREATE
STATISTICS command - MCV lists, a generalization of the statistic
already built and used for individual columns.

Compared to the already supported types (n-distinct coefficients and
functional dependencies), MCV lists are more complex, include column
values and allow estimation of much wider range of common clauses
(equality and inequality conditions, IS NULL, IS NOT NULL etc.).
Similarly to the other types, a new pseudo-type (pg_mcv_list) is used.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera
Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
2019-03-27 18:32:18 +01:00
Michael Paquier 5bde1651bb Switch function current_schema[s]() to be parallel-unsafe
When invoked for the first time in a session, current_schema() and
current_schemas() can finish by creating a temporary schema.  Currently
those functions are parallel-safe, however if for a reason or another
they get launched across multiple parallel workers, they would fail when
attempting to create a temporary schema as temporary contexts are not
supported in this case.

The original issue has been spotted by buildfarm members crake and
lapwing, after commit c5660e0 has introduced the first regression tests
based on current_schema() in the tree.  After that, 396676b has
introduced a workaround to avoid parallel plans but that was not
completely right either.

Catversion is bumped.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20190118024618.GF1883@paquier.xyz
2019-03-27 11:35:12 +09:00
Alexander Korotkov 641fde2523 Remove ambiguity for jsonb_path_match() and jsonb_path_exists()
There are 2-arguments and 4-arguments versions of jsonb_path_match() and
jsonb_path_exists().  But 4-arguments versions have optional 3rd and 4th
arguments, that leads to ambiguity.  In the same time 2-arguments versions are
needed only for @@ and @? operators.  So, rename 2-arguments versions to
remove the ambiguity.

Catversion is bumped.
2019-03-20 10:30:56 +03:00
Alexander Korotkov 72b6460336 Partial implementation of SQL/JSON path language
SQL 2016 standards among other things contains set of SQL/JSON features for
JSON processing inside of relational database.  The core of SQL/JSON is JSON
path language, allowing access parts of JSON documents and make computations
over them.  This commit implements partial support JSON path language as
separate datatype called "jsonpath".  The implementation is partial because
it's lacking datetime support and suppression of numeric errors.  Missing
features will be added later by separate commits.

Support of SQL/JSON features requires implementation of separate nodes, and it
will be considered in subsequent patches.  This commit includes following
set of plain functions, allowing to execute jsonpath over jsonb values:

 * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_match(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_query(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]).
 * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]).

This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath", which
are wrappers over jsonpath_exists(jsonb, jsonpath) and jsonpath_predicate(jsonb,
jsonpath) correspondingly.  These operators will have an index support
(implemented in subsequent patches).

Catversion bumped, to add new functions and operators.

Code was written by Nikita Glukhov and Teodor Sigaev, revised by me.
Documentation was written by Oleg Bartunov and Liudmila Mantrova.  The work
was inspired by Oleg Bartunov.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov
2019-03-16 12:16:48 +03:00
Tom Lane f1d85aa98e Add support for hyperbolic functions, as well as log10().
The SQL:2016 standard adds support for the hyperbolic functions
sinh(), cosh(), and tanh().  POSIX has long required libm to
provide those functions as well as their inverses asinh(),
acosh(), atanh().  Hence, let's just expose the libm functions
to the SQL level.  As with the trig functions, we only implement
versions for float8, not numeric.

For the moment, we'll assume that all platforms actually do have
these functions; if experience teaches otherwise, some autoconf
effort may be needed.

SQL:2016 also adds support for base-10 logarithm, but with the
function name log10(), whereas the name we've long used is log().
Add aliases named log10() for the float8 and numeric versions.

Lætitia Avrot

Discussion: https://postgr.es/m/CAB_COdguG22LO=rnxDQ2DW1uzv8aQoUzyDQNJjrR4k00XSgm5w@mail.gmail.com
2019-03-12 15:55:09 -04:00
Tom Lane 3aa0395d4e Remove remaining hard-wired OID references in the initial catalog data.
In the v11-era commits that taught genbki.pl to resolve symbolic
OID references in the initial catalog data, we didn't bother to
make every last reference symbolic; some of the catalogs have so
few initial rows that it didn't seem worthwhile.

However, the new project policy that OIDs assigned by new patches
should be automatically renumberable changes this calculus.
A patch that wants to add a row in one of these catalogs would have
a problem when the OID it assigns gets renumbered.  Hence, do the
mop-up work needed to make all OID references in initial data be
symbolic, and establish an associated project policy that we'll
never again write a hard-wired OID reference there.

No catversion bump since the contents of postgres.bki aren't
actually changed by this commit.

Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12 12:30:35 -04:00
Magnus Hagander 6b9e875f72 Track block level checksum failures in pg_stat_database
This adds a column that counts how many checksum failures have occurred
on files belonging to a specific database. Both checksum failures
during normal backend processing and those created when a base backup
detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud
2019-03-09 10:47:30 -08:00
Tom Lane 1b76168da7 Reformat catalog .dat files.
Test run for my previous commit; cleans up formatting issues in some
other recent commits.
2019-03-08 12:01:27 -05:00
Thomas Munro 91595f9d49 Drop the vestigial "smgr" type.
Before commit 3fa2bb31 this type appeared in the catalogs to
select which of several block storage mechanisms each relation
used.

New features under development propose to revive the concept of
different block storage managers for new kinds of data accessed
via bufmgr.c, but don't need to put references to them in the
catalogs.  So, avoid useless maintenance work on this type by
dropping it.  Update some regression tests that were referencing
it where any type would do.

Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
2019-03-07 15:44:04 +13:00
Andres Freund 8586bf7ed8 tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
  ACCESS METHOD ... TYPE TABLE and
  CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.

Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.

Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.

Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.

Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs.  It's quite possible that this behaviour
still needs to be fine tuned.

For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.

Catversion bumped, to add the heap table AM and references to it.

Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
    https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Alvaro Herrera b96f6b1948 pg_partition_ancestors
Adds another introspection feature for partitioning, necessary for
further psql patches.

Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20190226222757.GA31622@alvherre.pgsql
2019-03-04 16:14:29 -03:00
Joe Conway 290e3b77fd Mark pg_config() stable rather than immutable
pg_config() has been marked immutable since its inception. As part of a
larger discussion around the definition of immutable versus stable and
related implications for marking functions parallel safe raised by
Andres, the consensus was clearly that pg_config() is stable, since
it could possibly change output even for the same minor version with
a recompile or installation of a new binary. So mark it stable.

Theoretically this could/should be backpatched, but it was deemed to be not
worth the effort since in practice this is very unlikely to cause problems
in the real world.

Discussion: https://postgr.es/m/20181126234521.rh3grz7aavx2ubjv@alap3.anarazel.de
2019-02-17 09:21:13 -05:00
Tom Lane 74dfe58a59 Allow extensions to generate lossy index conditions.
For a long time, indxpath.c has had the ability to extract derived (lossy)
index conditions from certain operators such as LIKE.  For just as long,
it's been obvious that we really ought to make that capability available
to extensions.  This commit finally accomplishes that, by adding another
API for planner support functions that lets them create derived index
conditions for their functions.  As proof of concept, the hardwired
"special index operator" code formerly present in indxpath.c is pushed
out to planner support functions attached to LIKE and other relevant
operators.

A weak spot in this design is that an extension needs to know OIDs for
the operators, datatypes, and opfamilies involved in the transformation
it wants to make.  The core-code prototypes use hard-wired OID references
but extensions don't have that option for their own operators etc.  It's
usually possible to look up the required info, but that may be slow and
inconvenient.  However, improving that situation is a separate task.

I want to do some additional refactorization around selfuncs.c, but
that also seems like a separate task.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-11 21:26:14 -05:00
Tom Lane a391ff3c3d Build out the planner support function infrastructure.
Add support function requests for estimating the selectivity, cost,
and number of result rows (if a SRF) of the target function.

The lack of a way to estimate selectivity of a boolean-returning
function in WHERE has been a recognized deficiency of the planner
since Berkeley days.  This commit finally fixes it.

In addition, non-constant estimates of cost and number of output
rows are now possible.  We still fall back to looking at procost
and prorows if the support function doesn't service the request,
of course.

To make concrete use of the possibility of estimating output rowcount
for SRFs, this commit adds support functions for array_unnest(anyarray)
and the integer variants of generate_series; the lack of plausible
rowcount estimates for those, even when it's obvious to a human,
has been a repeated subject of complaints.  Obviously, much more
could now be done in this line, but I'm mostly just trying to get
the infrastructure in place.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:32:23 -05:00
Tom Lane 1fb57af920 Create the infrastructure for planner support functions.
Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.

The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them fron "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do.  (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)

Also add DDL and pg_dump support for attaching a support function to a
user-defined function.  Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical.  (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:08:48 -05:00
Michael Paquier 3677a0b26b Add pg_partition_root to display top-most parent of a partition tree
This is useful when looking at partition trees with multiple layers, and
combined with pg_partition_tree, it provides the possibility to show up
an entire tree by just knowing one member at any level.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20181207014015.GP2407@paquier.xyz
2019-02-08 08:56:14 +09:00
Peter Eisentraut f60a0e9677 Add more columns to pg_stat_ssl
Add columns client_serial and issuer_dn to pg_stat_ssl.  These allow
uniquely identifying the client certificate.

Rename the existing column clientdn to client_dn, to make the naming
more consistent and easier to read.

Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
2019-02-01 00:33:47 +01:00
Tom Lane d33faa285b Move the built-in conversions into the initial catalog data.
Instead of running a SQL script to create the standard conversion
functions and pg_conversion entries, put those entries into the
initial data in postgres.bki.

This shaves a few percent off the runtime of initdb, and also allows
accurate comments to be attached to the conversion functions; the
previous script labeled them with machine-generated comments that
were not quite right for multi-purpose conversion functions.
Also, we can get rid of the duplicative Makefile and MSVC perl
implementations of the generation code for that SQL script.

A functional change is that these pg_proc and pg_conversion entries
are now "pinned" by initdb.  Leaving them unpinned was perhaps a
good thing back while the conversions feature was under development,
but there seems no valid reason for it now.

Also, the conversion functions are now marked as immutable, where
before they were volatile by virtue of lacking any explicit
specification.  That seems like it was just an oversight.

To avoid using magic constants in pg_conversion.dat, extend
genbki.pl to allow encoding names to be converted, much as it
does for language, access method, etc names.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 19:47:53 -05:00
Tom Lane 814c9019aa Use symbolic references for pg_language OIDs in the bootstrap data.
This patch teaches genbki.pl to replace pg_language names by OIDs
in much the same way as it already does for pg_am names etc, and
converts pg_proc.dat to use such symbolic references in the prolang
column.

Aside from getting rid of a few more magic numbers in the initial
catalog data, this means that Gen_fmgrtab.pl no longer needs to read
pg_language.dat, since it doesn't have to know the OID of the "internal"
language; now it's just looking for the string "internal".

No need for a catversion bump, since the contents of postgres.bki
don't actually change at all.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 18:38:49 -05:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Tom Lane d01e75d68e Update leakproofness markings on some btree comparison functions.
Mark pg_lsn and oidvector comparison functions as leakproof.  Per
discussion, these clearly are leakproof so we might as well mark them so.

On the other hand, remove leakproof markings from name comparison
functions other than equal/not-equal.  Now that these depend on
varstr_cmp, they can't be considered leakproof if text comparison isn't.
(This was my error in commit 586b98fdf.)

While at it, add some opr_sanity queries to catch cases where related
functions do not have the same volatility and leakproof markings.
This would clearly be bogus for commutator or negator pairs.  In the
domain of btree comparison functions, we do have some exceptions,
because text equality is leakproof but inequality comparisons are not.
That's odd on first glance but is reasonable (for now anyway) given
the much greater complexity of the inequality code paths.

Discussion: https://postgr.es/m/20181231172551.GA206480@gust.leadboat.com
2018-12-31 16:38:11 -05:00
Tom Lane 0a6ea4001a Add a hash opclass for type "tid".
Up to now we've not worried much about joins where the join key is a
relation's CTID column, reasoning that storing a table's CTIDs in some
other table would be pretty useless.  However, there are use-cases for
this sort of query involving self-joins, so that argument doesn't really
hold water.

With larger relations, a merge or hash join is desirable.  We had a btree
opclass for type "tid", allowing merge joins on CTID, but no hash opclass
so that hash joins weren't possible.  Add the missing infrastructure.

This also potentially enables hash aggregation on "tid", though the
use-cases for that aren't too clear.

Discussion: https://postgr.es/m/1853.1545453106@sss.pgh.pa.us
2018-12-30 15:40:04 -05:00
Tom Lane 5bbee34d9f Avoid producing over-length specific_name outputs in information_schema.
information_schema output columns that are declared as being type
sql_identifier are supposed to conform to the implementation's rules
for valid identifiers, in particular the identifier length limit.
Several places potentially violated this limit by concatenating a
function's name and OID.  (The OID is added to ensure name uniqueness
within a schema, since the spec doesn't expect function name overloading.)

Simply truncating the concatenation result to fit in "name" won't do,
since losing part of the OID might wind up giving non-unique results.
Instead, let's truncate the function name as necessary.

The most practical way to do that is to do it in a C function; the
information_schema.sql script doesn't have easy access to the value
of NAMEDATALEN, nor does it have an easy way to truncate on the basis
of resulting byte-length rather than number of characters.

(There are still a couple of places that cast concatenation results to
sql_identifier, but as far as I can see they are guaranteed not to produce
over-length strings, at least with the normal value of NAMEDATALEN.)

Discussion: https://postgr.es/m/23817.1545283477@sss.pgh.pa.us
2018-12-20 16:21:59 -05:00
Tom Lane 2ece7c07dc Add text-vs-name cross-type operators, and unify name_ops with text_ops.
Now that name comparison has effectively the same behavior as text
comparison, we might as well merge the name_ops opfamily into text_ops,
allowing cross-type comparisons to be processed without forcing a
datatype coercion first.  We need do little more than add cross-type
operators to make the opfamily complete, and fix one or two places
in the planner that assumed text_ops was a single-datatype opfamily.

I chose to unify hash name_ops into hash text_ops as well, since the
types have compatible hashing semantics.  This allows marking the
new cross-type equality operators as oprcanhash.

(Note: this doesn't remove the name_ops opclasses, so there's no
breakage of index definitions.  Those opclasses are just reparented
into the text_ops opfamily.)

Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
2018-12-19 17:46:25 -05:00
Michael Paquier 7fee252f6f Add timestamp of last received message from standby to pg_stat_replication
The timestamp generated by the standby at message transmission has been
included in the protocol since its introduction for both the status
update message and hot standby feedback message, but it has never
appeared in pg_stat_replication.  Seeing this timestamp does not matter
much with a cluster which has a lot of activity, but on a mostly-idle
cluster, this makes monitoring able to react faster than the configured
timeouts.

Author: MyungKyu LIM
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/1657809367.407321.1533027417725.JavaMail.jboss@ep2ml404
2018-12-09 16:35:06 +09:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
Tom Lane 600b04d6b5 Add a timezone-specific variant of date_trunc().
date_trunc(field, timestamptz, zone_name) performs truncation using
the named time zone as reference, rather than working in the session
time zone as is the default behavior.  It's equivalent to

date_trunc(field, timestamptz at time zone zone_name) at time zone zone_name

but it's faster, easier to type, and arguably easier to understand.

Vik Fearing and Tom Lane

Discussion: https://postgr.es/m/6249ffc4-2b22-4c1b-4e7d-7af84fedd7c6@2ndquadrant.com
2018-11-14 15:41:07 -05:00
Michael Paquier d5eec4eefd Add pg_partition_tree to display information about partitions
This new function is useful to display a full tree of partitions with a
partitioned table given in output, and avoids the need of any complex
WITH RECURSIVE query when looking at partition trees which are
deep multiple levels.

It returns a set of records, one for each partition, containing the
partition's name, its immediate parent's name, a boolean value telling
if the relation is a leaf in the tree and an integer telling its level
in the partition tree with given table considered as root, beginning at
zero for the root, and incrementing by one each time the scan goes one
level down.

Author: Amit Langote
Reviewed-by: Jesper Pedersen, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/8d00e51a-9a51-ad02-d53e-ba6bf50b2e52@lab.ntt.co.jp
2018-10-30 10:25:06 +09:00
Michael Paquier 10074651e3 Add pg_promote function
This function is able to promote a standby with this new SQL-callable
function.  Execution access can be granted to non-superusers so that
failover tools can observe the principle of least privilege.

Catalog version is bumped.

Author: Laurenz Albe
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/6e7c79b3ec916cf49742fb8849ed17cd87aed620.camel@cybertec.at
2018-10-25 09:46:00 +09:00
Andres Freund cda6a8d01d Remove deprecated abstime, reltime, tinterval datatypes.
These types have been deprecated for a *long* time.

Catversion bump, for obvious reasons.

Author: Andres Freund
Discussion:
    https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de
    https://postgr.es/m/20171213080506.cwjkpcz3bkk6yz2u@alap3.anarazel.de
    https://postgr.es/m/25615.1513115237@sss.pgh.pa.us
2018-10-11 11:59:15 -07:00
Michael Paquier c481016201 Add pg_ls_archive_statusdir function
This function lists the contents of the WAL archive status directory,
and is intended to be used by monitoring tools.  Unlike pg_ls_dir(),
access to it can be granted to non-superusers so that those monitoring
tools can observe the principle of least privilege.  Access is also
given by default to members of pg_monitor.

Author:  Christoph Moench-Tegeder
Reviewed-by: Aya Iwata
Discussion: https://postgr.es/m/20180930205920.GA64534@elch.exwg.net
2018-10-09 22:29:09 +09:00
Tom Lane 07ee62ce9e Propagate xactStartTimestamp and stmtStartTimestamp to parallel workers.
Previously, a worker process would establish values for these based on
its own start time.  In v10 and up, this can trivially be shown to cause
misbehavior of transaction_timestamp(), timestamp_in(), and related
functions which are (perhaps unwisely?) marked parallel-safe.  It seems
likely that other behaviors might diverge from what happens in the parent
as well.

It's not as trivial to demonstrate problems in 9.6 or 9.5, but I'm sure
it's still possible, so back-patch to all branches containing parallel
worker infrastructure.

In HEAD only, mark now() and statement_timestamp() as parallel-safe
(other affected functions already were).  While in theory we could
still squeeze that change into v11, it doesn't seem important enough
to force a last-minute catversion bump.

Konstantin Knizhnik, whacked around a bit by me

Discussion: https://postgr.es/m/6406dbd2-5d37-4cb6-6eb2-9c44172c7e7c@postgrespro.ru
2018-10-06 12:00:09 -04:00
Michael Paquier 9cd92d1a33 Add pg_ls_tmpdir function
This lists the contents of a temporary directory associated to a given
tablespace, useful to get information about on-disk consumption caused
by temporary files used by a session query.  By default, pg_default is
scanned, and a tablespace can be specified as argument.

This function is intended to be used by monitoring tools, and, unlike
pg_ls_dir(), access to them can be granted to non-superusers so that
those monitoring tools can observe the principle of least privilege.
Access is also given by default to members of pg_monitor.

Author: Nathan Bossart
Reviewed-by: Laurenz Albe
Discussion: https://postgr.es/m/92F458A2-6459-44B8-A7F2-2ADD3225046A@amazon.com
2018-10-05 09:21:48 +09:00
Joe Conway c62dd80cdf Document aclitem functions and operators
aclitem functions and operators have been heretofore undocumented.
Fix that. While at it, ensure the non-operator aclitem functions have
pg_description strings.

Does not seem worthwhile to back-patch.

Author: Fabien Coelho, with pg_description from John Naylor, and significant
refactoring and editorialization by me.
Reviewed by: Tom Lane
Discussion: https://postgr.es/m/flat/alpine.DEB.2.21.1808010825490.18204%40lancre
2018-09-24 10:14:57 -04:00
Tom Lane ae5205c8a8 Make argument names of pg_get_object_address consistent, and fix docs.
pg_get_object_address and pg_identify_object_as_address are supposed
to be inverses, but they disagreed as to the names of the arguments
representing the textual form of an object address.  Moreover, the
documented argument names didn't agree with reality at all, either
for these functions or pg_identify_object.

In HEAD and v11, I think we can get away with renaming the input
arguments of pg_get_object_address to match the outputs of
pg_identify_object_as_address.  In theory that might break queries
using named-argument notation to call pg_get_object_address, but
it seems really unlikely that anybody is doing that, or that they'd
have much trouble adjusting if they were.  In older branches, we'll
just live with the lack of consistency.

Aside from fixing the documentation of these functions to match reality,
I couldn't resist the temptation to do some copy-editing.

Per complaint from Jean-Pierre Pelletier.  Back-patch to 9.5 where these
functions were introduced.  (Before v11, this is a documentation change
only.)

Discussion: https://postgr.es/m/CANGqjDnWH8wsTY_GzDUxbt4i=y-85SJreZin4Hm8uOqv1vzRQA@mail.gmail.com
2018-09-05 13:47:28 -04:00
Michael Paquier ce89ad0fa0 Fix argument of pg_create_logical_replication_slot for slot name
All attributes and arguments using a slot name map to the data type
"name", but this function has been using "text".  This is cosmetic, as
even if text is used then the slot name would be truncated to 64
characters anyway and stored as such.  The documentation already said
so and the function already assumed that the argument was of this type
when fetching its value.

Bump catalog version.

Author: Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoADYz_-eAqH5AVFaCaojcRgwpo9PW=u8kgTMys63oB8Cw@mail.gmail.com
2018-07-13 09:32:12 +09:00
Tom Lane 39a96512b3 Mark built-in btree comparison functions as leakproof where it's safe.
Generally, if the comparison operators for a datatype or pair of datatypes
are leakproof, the corresponding btree comparison support function can be
considered so as well.  But we had not originally worried about marking
support functions as leakproof, reasoning that they'd not likely be used in
queries so the marking wouldn't matter.  It turns out there's at least one
place where it does matter: calc_arraycontsel() finds the target datatype's
default btree comparison function and tries to use that to estimate
selectivity, but it will be blocked in some cases if the function isn't
leakproof.  This leads to unnecessarily poor selectivity estimates and bad
plans, as seen in bug #15251.

Hence, run around and apply proleakproof markings where the corresponding
btree comparison operators are leakproof.  (I did eyeball each function
to verify that it wasn't doing anything surprising, too.)

This isn't a full solution to bug #15251, and it's not back-patchable
because of the need for a catversion bump.  A more useful response probably
is to consider whether we can check permissions on the parent table instead
of the child.  However, this change will help in some cases where that
won't, and it's easy enough to do in HEAD, so let's do so.

Discussion: https://postgr.es/m/3876.1531261875@sss.pgh.pa.us
2018-07-11 18:47:31 -04:00
Andrew Dunstan 123efbccea Mark binary_upgrade_set_missing_value as parallel_unsafe
per buildfarm.

Bump catalog version again although in practice nobody is going to use
this in a parallel query.
2018-06-23 08:43:05 -04:00
Andrew Dunstan 2448adf29c Allow for pg_upgrade of attributes with missing values
Commit 16828d5c02 neglected to do this, so upgraded databases would
silently get null instead of the specified default in rows without the
attribute defined.

A new binary upgrade function is provided to perform this and pg_dump is
adjusted to output a call to the function if required in binary upgrade
mode.

Also included is code to drop missing attribute values for dropped
columns. That way if the type is later dropped the missing value won't
have a dangling reference to the type.

Finally the regression tests are adjusted to ensure that there is a row
with a missing value so that this code is exercised in upgrade testing.

Catalog version unfortunately bumped.

Regression test changes from Tom Lane.
Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera

Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us
2018-06-22 08:42:36 -04:00
Tom Lane 45c6d75f8c Clarify handling of special-case values in bootstrap catalog data.
I (tgl) originally coded the special case for pg_proc.pronargs as
though it were a kind of default value.  It seems better though to
treat computable columns as an independent concern: this makes the
code clearer, and probably a bit faster too since we needn't do
work inside the per-column loop.

Improve related comments, as well, in the expectation that there
might be more cases like this in future.

John Naylor, some additional comment-hacking by me

Discussion: https://postgr.es/m/CAJVSVGW-D7OobzU=dybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw@mail.gmail.com
2018-04-28 15:27:16 -04:00
Tom Lane 68c23cba34 Improve consistency of comments in system catalog headers.
Use the term "system catalog" rather than "system relation" in assorted
places where it's clearly referring to a table rather than, say, an
index.  Use more natural word order in the header boilerplate, improve
some of the one-liner catalog descriptions, and fix assorted random
deviations from the normal boilerplate.  All purely neatnik-ism, but
why not.

John Naylor, some additional cleanup by me

Discussion: https://postgr.es/m/CAJVSVGUeJmFB3h-NJ18P32NPa+kzC165nm7GSoGHfPaN80Wxcw@mail.gmail.com
2018-04-19 17:14:09 -04:00
Tom Lane 55d26ff638 Rationalize handling of single and double quotes in bootstrap data.
Change things around so that proper quoting of values interpolated into
the BKI data by initdb is the responsibility of initdb, not something
we half-heartedly handle by putting double quotes into the raw BKI data.
(Note: experimentation shows that it still doesn't work to put a double
quote into the initial superuser username, but that's the fault of
inadequate quoting while interpolating the name into SQL scripts;
the BKI aspect of it works fine now.)

Having done that, we can remove the special-case handling of values
that look like "something" from genbki.pl, and instead teach it to
escape double --- and single --- quotes properly.  This removes the
nowhere-documented need to treat those specially in the BKI source
data; whatever you write will be passed through unchanged into the
inserted data value, modulo Perl's rules about single-quoted strings.

Add documentation explaining the (pre-existing) handling of backslashes
in the BKI data.

Per an earlier discussion with John Naylor.

Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com
2018-04-17 19:53:50 -04:00
Magnus Hagander a228cc13ae Revert "Allow on-line enabling and disabling of data checksums"
This reverts the backend sides of commit 1fde38beaa.
I have, at least for now, left the pg_verify_checksums tool in place, as
this tool can be very valuable without the rest of the patch as well,
and since it's a read-only tool that only runs when the cluster is down
it should be a lot safer.
2018-04-09 19:03:42 +02:00
Tom Lane 4f85f66469 Cosmetic cleanups in initial catalog data.
Write ',' and ';' for typdelim values instead of the obscurantist
ASCII octal equivalents.  Not sure why anybody ever thought the
latter were better; maybe it had something to do with lack of
a better quoting convention, twenty-plus years ago?

Reassign a couple of high-numbered OIDs that were left in during
yesterday's mad rush to commit stuff of uncertain internal
temperature.

The latter requires a catversion bump, though the former wouldn't
since the end-result catalog data is unchanged.
2018-04-08 15:55:49 -04:00
Tom Lane 372728b0d4 Replace our traditional initial-catalog-data format with a better design.
Historically, the initial catalog data to be installed during bootstrap
has been written in DATA() lines in the catalog header files.  This had
lots of disadvantages: the format was badly underdocumented, it was
very difficult to edit the data in any mechanized way, and due to the
lack of any abstraction the data was verbose, hard to read/understand,
and easy to get wrong.

Hence, move this data into separate ".dat" files and represent it in a way
that can easily be read and rewritten by Perl scripts.  The new format is
essentially "key => value" for each column; while it's a bit repetitive,
explicit labeling of each value makes the data far more readable and less
error-prone.  Provide a way to abbreviate entries by omitting field values
that match a specified default value for their column.  This allows removal
of a large amount of repetitive boilerplate and also lowers the barrier to
adding new columns.

Also teach genbki.pl how to translate symbolic OID references into
numeric OIDs for more cases than just "regproc"-like pg_proc references.
It can now do that for regprocedure-like references (thus solving the
problem that regproc is ambiguous for overloaded functions), operators,
types, opfamilies, opclasses, and access methods.  Use this to turn
nearly all OID cross-references in the initial data into symbolic form.
This represents a very large step forward in readability and error
resistance of the initial catalog data.  It should also reduce the
difficulty of renumbering OID assignments in uncommitted patches.

Also, solve the longstanding problem that frontend code that would like to
use OID macros and other information from the catalog headers often had
difficulty with backend-only code in the headers.  To do this, arrange for
all generated macros, plus such other declarations as we deem fit, to be
placed in "derived" header files that are safe for frontend inclusion.
(Once clients migrate to using these pg_*_d.h headers, it will be possible
to get rid of the pg_*_fn.h headers, which only exist to quarantine code
away from clients.  That is left for follow-on patches, however.)

The now-automatically-generated macros include the Anum_xxx and Natts_xxx
constants that we used to have to update by hand when adding or removing
catalog columns.

Replace the former manual method of generating OID macros for pg_type
entries with an automatic method, ensuring that all built-in types have
OID macros.  (But note that this patch does not change the way that
OID macros for pg_proc entries are built and used.  It's not clear that
making that match the other catalogs would be worth extra code churn.)

Add SGML documentation explaining what the new data format is and how to
work with it.

Despite being a very large change in the catalog headers, there is no
catversion bump here, because postgres.bki and related output files
haven't changed at all.

John Naylor, based on ideas from various people; review and minor
additional coding by me; previous review by Alvaro Herrera

Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com
2018-04-08 13:17:27 -04:00