Commit Graph

69 Commits

Author SHA1 Message Date
Andrew Dunstan a83586b554 Add a non-strict version of jsonb_set
jsonb_set_lax() is the same as jsonb_set, except that it takes and extra
argument that specifies what to do if the value argument is NULL. The
default is 'use_json_null'. Other possibilities are 'raise_exception',
'return_target' and 'delete_key', all these behaviours having been
suggested as reasonable by various users.

Discussion: https://postgr.es/m/375873e2-c957-3a8d-64f9-26c43c2b16e7@2ndQuadrant.com

Reviewed by: Pavel Stehule
2020-01-17 11:52:39 +10:30
Robert Haas ed10f32e37 Add pg_shmem_allocations view.
This tells you about allocations that have been made from the main
shared memory segment. The original patch also tried to show information
about dynamic shared memory allocation as well, but I decided to
leave that problem for another time.

Andres Freund and Robert Haas, reviewed by Michael Paquier, Marti
Raudsepp, Tom Lane, Álvaro Herrera, and Kyotaro Horiguchi.

Discussion: http://postgr.es/m/20140504114417.GM12715@awork2.anarazel.de
2020-01-09 10:59:07 -05:00
Tom Lane 20d6225d16 Add functions min_scale(numeric) and trim_scale(numeric).
These allow better control of trailing zeroes in numeric values.

Pavel Stehule, based on an old proposal of Marko Tiikkaja's;
review by Karl Pinc

Discussion: https://postgr.es/m/CAFj8pRDjs-navGASeF0Wk74N36YGFJ+v=Ok9_knRa7vDc-qugg@mail.gmail.com
2020-01-06 12:13:53 -05:00
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Tom Lane 8b7ae5a82d Stabilize the results of pg_notification_queue_usage().
This function wasn't touched in commit 51004c717, but that turns out
to be a bad idea, because its results now include any dead space
that exists in the NOTIFY queue on account of our being lazy about
advancing the queue tail.  Notably, the isolation tests now fail
if run twice without a server restart between, because async-notify's
first test of the function will already show a positive value.
It seems likely that end users would be equally unhappy about the
result's instability.  To fix, just make the function call
asyncQueueAdvanceTail before computing its result.  That should end
in producing the same value as before, and it's hard to believe that
there's any practical use-case where pg_notification_queue_usage()
is called so often as to create a performance degradation, especially
compared to what we did before.

Out of paranoia, also mark this function parallel-restricted (it
was volatile, but parallel-safe by default, before).  Although the
code seems to work fine when run in a parallel worker, that's outside
the design scope of async.c, and it's a bit scary to have intentional
side-effects happening in a parallel worker.  There seems no plausible
use-case where it'd be important to try to parallelize this, so let's
not take any risk of introducing new bugs.

In passing, re-pgindent async.c and run reformat-dat-files on
pg_proc.dat, just because I'm a neatnik.

Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
2019-11-24 14:09:33 -05:00
Peter Eisentraut 2e4db241bf Remove configure --disable-float4-byval
This build option was only useful to maintain compatibility for
version-0 functions, but those are no longer supported, so this option
can be removed.

float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.

Discussion: https://www.postgresql.org/message-id/flat/f3e1e576-2749-bbd7-2d57-3f9dcf75255a@2ndquadrant.com
2019-11-21 18:29:21 +01:00
Amit Kapila 9290ad198b Track statistics for spilling of changes from ReorderBuffer.
This adds the statistics about transactions spilled to disk from
ReorderBuffer.  Users can query the pg_stat_replication view to check
these stats.

Author: Tomas Vondra, with bug-fixes and minor changes by Dilip Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-21 08:06:51 +05:30
Alexander Korotkov bffe1bd684 Implement jsonpath .datetime() method
This commit implements jsonpath .datetime() method as it's specified in
SQL/JSON standard.  There are no-argument and single-argument versions of
this method.  No-argument version selects first of ISO datetime formats
matching input string.  Single-argument version accepts template string as
its argument.

Additionally to .datetime() method itself this commit also implements
comparison ability of resulting date and time values.  There is some difficulty
because exising jsonb_path_*() functions are immutable, while comparison of
timezoned and non-timezoned types involves current timezone.  At first, current
timezone could be changes in session.  Moreover, timezones themselves are not
immutable and could be updated.  This is why we let existing immutable functions
throw errors on such non-immutable comparison.  In the same time this commit
provides jsonb_path_*_tz() functions which are stable and support operations
involving timezones.  As new functions are added to the system catalog,
catversion is bumped.

Support of .datetime() method was the only blocker prevents T832 from being
marked as supported.  sql_features.txt is updated correspondingly.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Tom Lane c160b8928c Straighten out leakproofness markings on text comparison functions.
Since we introduced the idea of leakproof functions, texteq and textne
were marked leakproof but their sibling text comparison functions were
not.  This inconsistency seemed justified because texteq/textne just
relied on memcmp() and so could easily be seen to be leakproof, while
the other comparison functions are far more complex and indeed can
throw input-dependent errors.

However, that argument crashed and burned with the addition of
nondeterministic collations, because now texteq/textne may invoke
the exact same varstr_cmp() infrastructure as the rest.  It makes no
sense whatever to give them different leakproofness markings.

After a certain amount of angst we've concluded that it's all right
to consider varstr_cmp() to be leakproof, mostly because the other
choice would be disastrous for performance of many queries where
leakproofness matters.  The input-dependent errors should only be
reachable for corrupt input data, or so we hope anyway; certainly,
if they are reachable in practice, we've got problems with requirements
as basic as maintaining a btree index on a text column.

Hence, run around to all the SQL functions that derive from varstr_cmp()
and mark them leakproof.  This should result in a useful gain in
flexibility/performance for queries in which non-leakproofness degrades
the efficiency of the query plan.

Back-patch to v12 where nondeterministic collations were added.
While this isn't an essential bug fix given the determination
that varstr_cmp() is leakproof, we might as well apply it now that
we've been forced into a post-beta4 catversion bump.

Discussion: https://postgr.es/m/31481.1568303470@sss.pgh.pa.us
2019-09-21 16:56:30 -04:00
Tom Lane ca70bdaefe Fix issues around strictness of SIMILAR TO.
As a result of some long-ago quick hacks, the SIMILAR TO operator
and the corresponding flavor of substring() interpreted "ESCAPE NULL"
as selecting the default escape character '\'.  This is both
surprising and not per spec: the standard is clear that these
functions should return NULL for NULL input.

Additionally, because of inconsistency of the strictness markings
of 3-argument substring() and similar_escape(), the planner could not
inline the SQL definition of substring(), resulting in a substantial
performance penalty compared to the underlying POSIX substring()
function.

The simplest fix for this would be to change the strictness marking
of similar_escape(), but if we do that we risk breaking existing views
that depend on that function.  Hence, leave similar_escape() as-is
as a compatibility function, and instead invent a new function
similar_to_escape() that comes in two strict variants.

There are a couple of other behaviors in this area that are also
not per spec, but they are documented and seem generally at least
as sane as the spec's definition, so leave them alone.  But improve
the documentation to describe them fully.

Patch by me; thanks to Álvaro Herrera and Andrew Gierth for review
and discussion.

Discussion: https://postgr.es/m/14047.1557708214@sss.pgh.pa.us
2019-09-07 14:21:59 -04:00
Alvaro Herrera 8f75e8e446 Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.

Reported-by: Craig Ringer
2019-08-21 11:12:44 -04:00
Peter Geoghegan 71dcd74386 Add sort support routine for the inet data type.
Add sort support for inet, including support for abbreviated keys.
Testing has shown that this reduces the time taken to sort medium to
large inet/cidr inputs by ~50-60% in realistic cases.

Author: Brandur Leach
Reviewed-By: Peter Geoghegan, Edmund Horner
Discussion: https://postgr.es/m/CABR_9B-PQ8o2MZNJ88wo6r-NxW2EFG70M96Wmcgf99G6HUQ3sw@mail.gmail.com
2019-08-01 09:34:14 -07:00
Tom Lane 4886da8327 Mark advisory-lock functions as parallel restricted, not parallel unsafe.
There seems no good reason not to allow a parallel leader to execute
these functions.  (The workers still can't, though.  Although the code
would work, any such lock would go away at worker exit, which is not
the documented behavior of advisory locks.)

Discussion: https://postgr.es/m/11847.1564496844@sss.pgh.pa.us
2019-08-01 11:36:21 -04:00
Peter Eisentraut 5925e55498 Add gen_random_uuid function
This adds a built-in function to generate UUIDs.

PostgreSQL hasn't had a built-in function to generate a UUID yet,
relying on external modules such as uuid-ossp and pgcrypto to provide
one.  Now that we have a strong random number generator built-in, we
can easily provide a version 4 (random) UUID generation function.

This patch takes the existing function gen_random_uuid() from pgcrypto
and makes it a built-in function.  The pgcrypto implementation now
internally redirects to the built-in one.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/6a65610c-46fc-2323-6b78-e8086340a325@2ndquadrant.com
2019-07-14 14:30:27 +02:00
Alexander Korotkov c085e1c1cb Add support for <-> (box, point) operator to GiST box_ops
Index-based calculation of this operator is exact.  So, signature of
gist_bbox_distance() function is changes so that caller is responsible for
setting *recheck flag.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 15:09:15 +03:00
Alexander Korotkov 6254c55f81 Add missing commutators for distance operators
Some of <-> operators between geometric types have their commutators missed.
This commit adds them.  The motivation is upcoming kNN support for some of those
operators.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 14:55:01 +03:00
Tom Lane 0ab1a2e39b Remove dead encoding-conversion functions.
The code for conversions SQL_ASCII <-> MULE_INTERNAL and
SQL_ASCII <-> UTF8 was unreachable, because we long ago changed
the wrapper functions pg_do_encoding_conversion() et al so that
they have hard-wired behaviors for conversions involving SQL_ASCII.
(At least some of those fast paths date back to 2002, though it
looks like we may not have been totally consistent about this until
later.)  Given the lack of complaints, nobody is dissatisfied with
this state of affairs.  Hence, let's just remove the unreachable code.

Also, change CREATE CONVERSION so that it rejects attempts to
define such conversions.  Since we consider that SQL_ASCII represents
lack of knowledge about the encoding in use, such a conversion would
be semantically dubious even if it were reachable.

Adjust a couple of regression test cases that had randomly decided
to rely on these conversion functions rather than any other ones.

Discussion: https://postgr.es/m/41163.1559156593@sss.pgh.pa.us
2019-07-05 14:17:27 -04:00
Michael Paquier 313f87a171 Add min() and max() aggregates for pg_lsn
This is useful for monitoring, when it comes for example to calculations
of WAL retention with replication slots and delays with a set of
standbys.

Bump catalog version.

Author: Fabrízio de Royes Mello
Reviewed-by: Surafel Temesgen
Discussion: https://postgr.es/m/CAFcNs+oc8ZoHhowA4rR1GGCgG8QNgK_TOwPRVYQo5rYy8_PXzA@mail.gmail.com
2019-07-05 12:21:11 +09:00
Tomas Vondra 4d66285adc Fix pg_mcv_list_items() to produce text[]
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.

Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.

Requires changes to pg_proc entry, so bump catversion.

Backpatch to 12, where multi-column MCV lists were introduced.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
2019-07-05 01:32:46 +02:00
Tom Lane c3f67ed6e4 Do pre-release housekeeping on catalog data, and fix jsonpath send/recv.
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES.  (The only change is 8394 -> 3428.)

Also run reformat_dat_file.pl while I'm here.

While looking at the reformat diffs, I chanced to notice that type
jsonpath had typsend and typreceive = '-', which surely is not the
intention given that jsonpath_send and jsonpath_recv exist.
Fix that.  It's safe to assume that these functions have never been
tested :-(.  I didn't try, but somebody should.
2019-04-28 17:16:50 -04:00
Magnus Hagander 77bd49adba Show shared object statistics in pg_stat_database
This adds a row to the pg_stat_database view with datoid 0 and datname
NULL for those objects that are not in a database. This was added
particularly for checksums, but we were already tracking more satistics
for these objects, just not returning it.

Also add a checksum_last_failure column that holds the timestamptz of
the last checksum failure that occurred in a database (or in a
non-dataabase file), if any.

Author: Julien Rouhaud <rjuju123@gmail.com>
2019-04-12 14:04:50 +02:00
Alvaro Herrera 9f06d79ef8 Add facility to copy replication slots
This allows the user to create duplicates of existing replication slots,
either logical or physical, and even changing properties such as whether
they are temporary or the output plugin used.

There are multiple uses for this, such as initializing multiple replicas
using the slot for one base backup; when doing investigation of logical
replication issues; and to select a different output plugins.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek
Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com
2019-04-05 18:05:18 -03:00
Stephen Frost b0b39f72b9 GSSAPI encryption support
On both the frontend and backend, prepare for GSSAPI encryption
support by moving common code for error handling into a separate file.
Fix a TODO for handling multiple status messages in the process.
Eliminate the OIDs, which have not been needed for some time.

Add frontend and backend encryption support functions.  Keep the
context initiation for authentication-only separate on both the
frontend and backend in order to avoid concerns about changing the
requested flags to include encryption support.

In postmaster, pull GSSAPI authorization checking into a shared
function.  Also share the initiator name between the encryption and
non-encryption codepaths.

For HBA, add "hostgssenc" and "hostnogssenc" entries that behave
similarly to their SSL counterparts.  "hostgssenc" requires either
"gss", "trust", or "reject" for its authentication.

Similarly, add a "gssencmode" parameter to libpq.  Supported values are
"disable", "require", and "prefer".  Notably, negotiation will only be
attempted if credentials can be acquired.  Move credential acquisition
into its own function to support this behavior.

Add a simple pg_stat_gssapi view similar to pg_stat_ssl, for monitoring
if GSSAPI authentication was used, what principal was used, and if
encryption is being used on the connection.

Finally, add documentation for everything new, and update existing
documentation on connection security.

Thanks to Michael Paquier for the Windows fixes.

Author: Robbie Harwood, with changes to the read/write functions by me.
Reviewed in various forms and at different times by: Michael Paquier,
   Andres Freund, David Steele.
Discussion: https://www.postgresql.org/message-id/flat/jlg1tgq1ktm.fsf@thriss.redhat.com
2019-04-03 15:02:33 -04:00
Alvaro Herrera ab0dfc961b Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by c16dc1aca5,
adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.

There are two pieces to this: one is index-AM-agnostic, and the other is
AM-specific.  The latter is fairly elaborate for btrees, including
reportage for parallel index builds and the separate phases that btree
index creation uses; other index AMs, which are much simpler in their
building procedures, have simplistic reporting only, but that seems
sufficient, at least for non-concurrent builds.

The index-AM-agnostic part is fairly complete, providing insight into
the CONCURRENTLY wait phases as well as block-based progress during the
index validation table scan.  (The index validation index scan requires
patching each AM, which has not been included here.)

Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada
Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
2019-04-02 15:18:08 -03:00
Tomas Vondra 7300a69950 Add support for multivariate MCV lists
Introduce a third extended statistic type, supported by the CREATE
STATISTICS command - MCV lists, a generalization of the statistic
already built and used for individual columns.

Compared to the already supported types (n-distinct coefficients and
functional dependencies), MCV lists are more complex, include column
values and allow estimation of much wider range of common clauses
(equality and inequality conditions, IS NULL, IS NOT NULL etc.).
Similarly to the other types, a new pseudo-type (pg_mcv_list) is used.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera
Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
2019-03-27 18:32:18 +01:00
Michael Paquier 5bde1651bb Switch function current_schema[s]() to be parallel-unsafe
When invoked for the first time in a session, current_schema() and
current_schemas() can finish by creating a temporary schema.  Currently
those functions are parallel-safe, however if for a reason or another
they get launched across multiple parallel workers, they would fail when
attempting to create a temporary schema as temporary contexts are not
supported in this case.

The original issue has been spotted by buildfarm members crake and
lapwing, after commit c5660e0 has introduced the first regression tests
based on current_schema() in the tree.  After that, 396676b has
introduced a workaround to avoid parallel plans but that was not
completely right either.

Catversion is bumped.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20190118024618.GF1883@paquier.xyz
2019-03-27 11:35:12 +09:00
Alexander Korotkov 641fde2523 Remove ambiguity for jsonb_path_match() and jsonb_path_exists()
There are 2-arguments and 4-arguments versions of jsonb_path_match() and
jsonb_path_exists().  But 4-arguments versions have optional 3rd and 4th
arguments, that leads to ambiguity.  In the same time 2-arguments versions are
needed only for @@ and @? operators.  So, rename 2-arguments versions to
remove the ambiguity.

Catversion is bumped.
2019-03-20 10:30:56 +03:00
Alexander Korotkov 72b6460336 Partial implementation of SQL/JSON path language
SQL 2016 standards among other things contains set of SQL/JSON features for
JSON processing inside of relational database.  The core of SQL/JSON is JSON
path language, allowing access parts of JSON documents and make computations
over them.  This commit implements partial support JSON path language as
separate datatype called "jsonpath".  The implementation is partial because
it's lacking datetime support and suppression of numeric errors.  Missing
features will be added later by separate commits.

Support of SQL/JSON features requires implementation of separate nodes, and it
will be considered in subsequent patches.  This commit includes following
set of plain functions, allowing to execute jsonpath over jsonb values:

 * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_match(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_query(jsonb, jsonpath[, jsonb, bool]),
 * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]).
 * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]).

This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath", which
are wrappers over jsonpath_exists(jsonb, jsonpath) and jsonpath_predicate(jsonb,
jsonpath) correspondingly.  These operators will have an index support
(implemented in subsequent patches).

Catversion bumped, to add new functions and operators.

Code was written by Nikita Glukhov and Teodor Sigaev, revised by me.
Documentation was written by Oleg Bartunov and Liudmila Mantrova.  The work
was inspired by Oleg Bartunov.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov
2019-03-16 12:16:48 +03:00
Tom Lane f1d85aa98e Add support for hyperbolic functions, as well as log10().
The SQL:2016 standard adds support for the hyperbolic functions
sinh(), cosh(), and tanh().  POSIX has long required libm to
provide those functions as well as their inverses asinh(),
acosh(), atanh().  Hence, let's just expose the libm functions
to the SQL level.  As with the trig functions, we only implement
versions for float8, not numeric.

For the moment, we'll assume that all platforms actually do have
these functions; if experience teaches otherwise, some autoconf
effort may be needed.

SQL:2016 also adds support for base-10 logarithm, but with the
function name log10(), whereas the name we've long used is log().
Add aliases named log10() for the float8 and numeric versions.

Lætitia Avrot

Discussion: https://postgr.es/m/CAB_COdguG22LO=rnxDQ2DW1uzv8aQoUzyDQNJjrR4k00XSgm5w@mail.gmail.com
2019-03-12 15:55:09 -04:00
Tom Lane 3aa0395d4e Remove remaining hard-wired OID references in the initial catalog data.
In the v11-era commits that taught genbki.pl to resolve symbolic
OID references in the initial catalog data, we didn't bother to
make every last reference symbolic; some of the catalogs have so
few initial rows that it didn't seem worthwhile.

However, the new project policy that OIDs assigned by new patches
should be automatically renumberable changes this calculus.
A patch that wants to add a row in one of these catalogs would have
a problem when the OID it assigns gets renumbered.  Hence, do the
mop-up work needed to make all OID references in initial data be
symbolic, and establish an associated project policy that we'll
never again write a hard-wired OID reference there.

No catversion bump since the contents of postgres.bki aren't
actually changed by this commit.

Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12 12:30:35 -04:00
Magnus Hagander 6b9e875f72 Track block level checksum failures in pg_stat_database
This adds a column that counts how many checksum failures have occurred
on files belonging to a specific database. Both checksum failures
during normal backend processing and those created when a base backup
detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud
2019-03-09 10:47:30 -08:00
Tom Lane 1b76168da7 Reformat catalog .dat files.
Test run for my previous commit; cleans up formatting issues in some
other recent commits.
2019-03-08 12:01:27 -05:00
Thomas Munro 91595f9d49 Drop the vestigial "smgr" type.
Before commit 3fa2bb31 this type appeared in the catalogs to
select which of several block storage mechanisms each relation
used.

New features under development propose to revive the concept of
different block storage managers for new kinds of data accessed
via bufmgr.c, but don't need to put references to them in the
catalogs.  So, avoid useless maintenance work on this type by
dropping it.  Update some regression tests that were referencing
it where any type would do.

Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
2019-03-07 15:44:04 +13:00
Andres Freund 8586bf7ed8 tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
  ACCESS METHOD ... TYPE TABLE and
  CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.

Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.

Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.

Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.

Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs.  It's quite possible that this behaviour
still needs to be fine tuned.

For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.

Catversion bumped, to add the heap table AM and references to it.

Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
    https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Alvaro Herrera b96f6b1948 pg_partition_ancestors
Adds another introspection feature for partitioning, necessary for
further psql patches.

Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20190226222757.GA31622@alvherre.pgsql
2019-03-04 16:14:29 -03:00
Joe Conway 290e3b77fd Mark pg_config() stable rather than immutable
pg_config() has been marked immutable since its inception. As part of a
larger discussion around the definition of immutable versus stable and
related implications for marking functions parallel safe raised by
Andres, the consensus was clearly that pg_config() is stable, since
it could possibly change output even for the same minor version with
a recompile or installation of a new binary. So mark it stable.

Theoretically this could/should be backpatched, but it was deemed to be not
worth the effort since in practice this is very unlikely to cause problems
in the real world.

Discussion: https://postgr.es/m/20181126234521.rh3grz7aavx2ubjv@alap3.anarazel.de
2019-02-17 09:21:13 -05:00
Tom Lane 74dfe58a59 Allow extensions to generate lossy index conditions.
For a long time, indxpath.c has had the ability to extract derived (lossy)
index conditions from certain operators such as LIKE.  For just as long,
it's been obvious that we really ought to make that capability available
to extensions.  This commit finally accomplishes that, by adding another
API for planner support functions that lets them create derived index
conditions for their functions.  As proof of concept, the hardwired
"special index operator" code formerly present in indxpath.c is pushed
out to planner support functions attached to LIKE and other relevant
operators.

A weak spot in this design is that an extension needs to know OIDs for
the operators, datatypes, and opfamilies involved in the transformation
it wants to make.  The core-code prototypes use hard-wired OID references
but extensions don't have that option for their own operators etc.  It's
usually possible to look up the required info, but that may be slow and
inconvenient.  However, improving that situation is a separate task.

I want to do some additional refactorization around selfuncs.c, but
that also seems like a separate task.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-11 21:26:14 -05:00
Tom Lane a391ff3c3d Build out the planner support function infrastructure.
Add support function requests for estimating the selectivity, cost,
and number of result rows (if a SRF) of the target function.

The lack of a way to estimate selectivity of a boolean-returning
function in WHERE has been a recognized deficiency of the planner
since Berkeley days.  This commit finally fixes it.

In addition, non-constant estimates of cost and number of output
rows are now possible.  We still fall back to looking at procost
and prorows if the support function doesn't service the request,
of course.

To make concrete use of the possibility of estimating output rowcount
for SRFs, this commit adds support functions for array_unnest(anyarray)
and the integer variants of generate_series; the lack of plausible
rowcount estimates for those, even when it's obvious to a human,
has been a repeated subject of complaints.  Obviously, much more
could now be done in this line, but I'm mostly just trying to get
the infrastructure in place.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:32:23 -05:00
Tom Lane 1fb57af920 Create the infrastructure for planner support functions.
Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.

The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them fron "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do.  (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)

Also add DDL and pg_dump support for attaching a support function to a
user-defined function.  Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical.  (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:08:48 -05:00
Michael Paquier 3677a0b26b Add pg_partition_root to display top-most parent of a partition tree
This is useful when looking at partition trees with multiple layers, and
combined with pg_partition_tree, it provides the possibility to show up
an entire tree by just knowing one member at any level.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20181207014015.GP2407@paquier.xyz
2019-02-08 08:56:14 +09:00
Peter Eisentraut f60a0e9677 Add more columns to pg_stat_ssl
Add columns client_serial and issuer_dn to pg_stat_ssl.  These allow
uniquely identifying the client certificate.

Rename the existing column clientdn to client_dn, to make the naming
more consistent and easier to read.

Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
2019-02-01 00:33:47 +01:00
Tom Lane d33faa285b Move the built-in conversions into the initial catalog data.
Instead of running a SQL script to create the standard conversion
functions and pg_conversion entries, put those entries into the
initial data in postgres.bki.

This shaves a few percent off the runtime of initdb, and also allows
accurate comments to be attached to the conversion functions; the
previous script labeled them with machine-generated comments that
were not quite right for multi-purpose conversion functions.
Also, we can get rid of the duplicative Makefile and MSVC perl
implementations of the generation code for that SQL script.

A functional change is that these pg_proc and pg_conversion entries
are now "pinned" by initdb.  Leaving them unpinned was perhaps a
good thing back while the conversions feature was under development,
but there seems no valid reason for it now.

Also, the conversion functions are now marked as immutable, where
before they were volatile by virtue of lacking any explicit
specification.  That seems like it was just an oversight.

To avoid using magic constants in pg_conversion.dat, extend
genbki.pl to allow encoding names to be converted, much as it
does for language, access method, etc names.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 19:47:53 -05:00
Tom Lane 814c9019aa Use symbolic references for pg_language OIDs in the bootstrap data.
This patch teaches genbki.pl to replace pg_language names by OIDs
in much the same way as it already does for pg_am names etc, and
converts pg_proc.dat to use such symbolic references in the prolang
column.

Aside from getting rid of a few more magic numbers in the initial
catalog data, this means that Gen_fmgrtab.pl no longer needs to read
pg_language.dat, since it doesn't have to know the OID of the "internal"
language; now it's just looking for the string "internal".

No need for a catversion bump, since the contents of postgres.bki
don't actually change at all.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 18:38:49 -05:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Tom Lane d01e75d68e Update leakproofness markings on some btree comparison functions.
Mark pg_lsn and oidvector comparison functions as leakproof.  Per
discussion, these clearly are leakproof so we might as well mark them so.

On the other hand, remove leakproof markings from name comparison
functions other than equal/not-equal.  Now that these depend on
varstr_cmp, they can't be considered leakproof if text comparison isn't.
(This was my error in commit 586b98fdf.)

While at it, add some opr_sanity queries to catch cases where related
functions do not have the same volatility and leakproof markings.
This would clearly be bogus for commutator or negator pairs.  In the
domain of btree comparison functions, we do have some exceptions,
because text equality is leakproof but inequality comparisons are not.
That's odd on first glance but is reasonable (for now anyway) given
the much greater complexity of the inequality code paths.

Discussion: https://postgr.es/m/20181231172551.GA206480@gust.leadboat.com
2018-12-31 16:38:11 -05:00
Tom Lane 0a6ea4001a Add a hash opclass for type "tid".
Up to now we've not worried much about joins where the join key is a
relation's CTID column, reasoning that storing a table's CTIDs in some
other table would be pretty useless.  However, there are use-cases for
this sort of query involving self-joins, so that argument doesn't really
hold water.

With larger relations, a merge or hash join is desirable.  We had a btree
opclass for type "tid", allowing merge joins on CTID, but no hash opclass
so that hash joins weren't possible.  Add the missing infrastructure.

This also potentially enables hash aggregation on "tid", though the
use-cases for that aren't too clear.

Discussion: https://postgr.es/m/1853.1545453106@sss.pgh.pa.us
2018-12-30 15:40:04 -05:00
Tom Lane 5bbee34d9f Avoid producing over-length specific_name outputs in information_schema.
information_schema output columns that are declared as being type
sql_identifier are supposed to conform to the implementation's rules
for valid identifiers, in particular the identifier length limit.
Several places potentially violated this limit by concatenating a
function's name and OID.  (The OID is added to ensure name uniqueness
within a schema, since the spec doesn't expect function name overloading.)

Simply truncating the concatenation result to fit in "name" won't do,
since losing part of the OID might wind up giving non-unique results.
Instead, let's truncate the function name as necessary.

The most practical way to do that is to do it in a C function; the
information_schema.sql script doesn't have easy access to the value
of NAMEDATALEN, nor does it have an easy way to truncate on the basis
of resulting byte-length rather than number of characters.

(There are still a couple of places that cast concatenation results to
sql_identifier, but as far as I can see they are guaranteed not to produce
over-length strings, at least with the normal value of NAMEDATALEN.)

Discussion: https://postgr.es/m/23817.1545283477@sss.pgh.pa.us
2018-12-20 16:21:59 -05:00
Tom Lane 2ece7c07dc Add text-vs-name cross-type operators, and unify name_ops with text_ops.
Now that name comparison has effectively the same behavior as text
comparison, we might as well merge the name_ops opfamily into text_ops,
allowing cross-type comparisons to be processed without forcing a
datatype coercion first.  We need do little more than add cross-type
operators to make the opfamily complete, and fix one or two places
in the planner that assumed text_ops was a single-datatype opfamily.

I chose to unify hash name_ops into hash text_ops as well, since the
types have compatible hashing semantics.  This allows marking the
new cross-type equality operators as oprcanhash.

(Note: this doesn't remove the name_ops opclasses, so there's no
breakage of index definitions.  Those opclasses are just reparented
into the text_ops opfamily.)

Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
2018-12-19 17:46:25 -05:00
Michael Paquier 7fee252f6f Add timestamp of last received message from standby to pg_stat_replication
The timestamp generated by the standby at message transmission has been
included in the protocol since its introduction for both the status
update message and hot standby feedback message, but it has never
appeared in pg_stat_replication.  Seeing this timestamp does not matter
much with a cluster which has a lot of activity, but on a mostly-idle
cluster, this makes monitoring able to react faster than the configured
timeouts.

Author: MyungKyu LIM
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/1657809367.407321.1533027417725.JavaMail.jboss@ep2ml404
2018-12-09 16:35:06 +09:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00