Commit Graph

4650 Commits

Author SHA1 Message Date
Michael Paquier 8961cb9a03 Fix typos in comments
The changes done in this commit impact comments with no direct
user-visible changes, with fixes for incorrect function, variable or
structure names.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
2023-05-02 12:23:08 +09:00
Daniel Gustafsson 69537f5d17 Remove duplicate lines of code
Commit 6df7a9698b accidentally included two identical prototypes for
default_multirange_selectivi() and commit 086cf1458c added a break;
statement where one was already present, thus duplicating it.  While
there is no bug caused by this, fix by removing the duplicated lines
as they provide no value.

Backpatch the fix for duplicate prototypes to v14 and the duplicate
break statement fix to all supported branches to avoid backpatching
hazards due to the removal.

Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://postgr.es/m/0e69cb60-0176-f6d0-7e15-6478b7d85724@postgrespro.ru
2023-04-24 11:16:17 +02:00
David Rowley d91d163529 Fix incorrect function name reference
This function was renamed in 0c9d84427 but this comment wasn't updated.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
2023-04-21 10:46:08 +12:00
Michael Paquier 0ecb87e1fa Remove io prefix from pg_stat_io columns
a9c70b46 added the statistics view pg_stat_io which contained columns
"io_context" and "io_object".  Given that the columns are in the
pg_stat_io view, the "io" prefix is somewhat redundant, so remove it.

The code variables referring to these fields are kept unchanged so as
they can keep their context about I/O.

Bump catalog version.

Author: Melanie Plageman
Reviewed-by: Kyotaro Horiguchi, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CAAKRu_aAQoJWrvT2BYYQvJChFKra_O-5ra3jhzKJZqWsTR1CPQ@mail.gmail.com
2023-04-21 07:21:50 +09:00
Thomas Munro 7d3d72b55e Remove obsolete defense against strxfrm() bugs.
Old versions of Solaris and illumos had buffer overrun bugs in their
strxfrm() implementations.  The bugs were fixed more than a decade ago
and the relevant releases are long out of vendor support.  It's time to
remove the defense added by commit be8b06c3.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+hUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ=meA@mail.gmail.com
2023-04-20 13:20:14 +12:00
David Rowley eef231e816 Fix some typos and some incorrectly duplicated words
Author: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZD3D1QxoccnN8A1V@telsasoft.com
2023-04-18 14:03:49 +12:00
David Rowley b4dbf3e924 Fix various typos
This fixes many spelling mistakes in comments, but a few references to
invalid parameter names, function names and option names too in comments
and also some in string constants

Also, fix an #undef that was undefining the incorrect definition

Author: Alexander Lakhin
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
2023-04-18 13:23:23 +12:00
Peter Geoghegan d6f0f95a6b Harmonize some more function parameter names.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places.  These
inconsistencies were all introduced relatively recently, after the code
base had parameter name mismatches fixed in bulk (see commits starting
with commits 4274dc22 and 035ce1fe).

pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.

Like all earlier commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.
2023-04-13 10:15:20 -07:00
Stephen Frost 6633cfb216 De-Revert "Add support for Kerberos credential delegation"
This reverts commit 3d03b24c3 (Revert Add support for Kerberos
credential delegation) which was committed on the grounds of concern
about portability, but on further review and discussion, it's clear that
we are better off explicitly requiring MIT Kerberos as that appears to
be the only GSSAPI library currently that's under proper maintenance
and ongoing development.  The API used for storing credentials was added
to MIT Kerberos over a decade ago while for the other libraries which
appear to be mainly based on Heimdal, which exists explicitly to be a
re-implementation of MIT Kerberos, the API never made it to a released
version (even though it was added to the Heimdal git repo over 5 years
ago..).

This post-feature-freeze change was approved by the RMT.

Discussion: https://postgr.es/m/ZDDO6jaESKaBgej0%40tamriel.snowman.net
2023-04-13 08:55:07 -04:00
Alvaro Herrera 9ce04b50e1
Revert "Catalog NOT NULL constraints" and fallout
This reverts commit e056c557ae and minor later fixes thereof.

There's a few problems in this new feature -- most notably regarding
pg_upgrade behavior, but others as well.  This new feature is not in any
way critical on its own, so instead of scrambling to fix it we revert it
and try again in early 17 with these issues in mind.

Discussion: https://postgr.es/m/3801207.1681057430@sss.pgh.pa.us
2023-04-12 19:29:21 +02:00
Stephen Frost 3d03b24c35 Revert "Add support for Kerberos credential delegation"
This reverts commit 3d4fa227bc.

Per discussion and buildfarm, this depends on APIs that seem to not
be available on at least one platform (NetBSD).  Should be certainly
possible to rework to be optional on that platform if necessary but bit
late for that at this point.

Discussion: https://postgr.es/m/3286097.1680922218@sss.pgh.pa.us
2023-04-08 07:21:35 -04:00
Thomas Munro db4f21e4a3 Redesign interrupt/cancel API for regex engine.
Previously, a PostgreSQL-specific callback checked by the regex engine
had a way to trigger a special error code REG_CANCEL if it detected that
the next call to CHECK_FOR_INTERRUPTS() would certainly throw via
ereport().

A later proposed bugfix aims to move some complex logic out of signal
handlers, so that it won't run until the next CHECK_FOR_INTERRUPTS(),
which makes the above design impossible unless we split
CHECK_FOR_INTERRUPTS() into two phases, one to run logic and another to
ereport().  We may develop such a system in the future, but for the
regex code it is no longer necessary.

An earlier commit moved regex memory management over to our
MemoryContext system.  Given that the purpose of the two-phase interrupt
checking was to free memory before throwing, something we don't need to
worry about anymore, it seems simpler to inject CHECK_FOR_INTERRUPTS()
directly into cancelation points, and just let it throw.

Since the plan is to keep PostgreSQL-specific concerns separate from the
main regex engine code (with a view to bein able to stay in sync with
other projects), do this with a new macro INTERRUPT(), customizable in
regcustom.h and defaulting to nothing.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 22:10:39 +12:00
Thomas Munro bea3d7e383 Use MemoryContext API for regex memory management.
Previously, regex_t objects' memory was managed with malloc() and free()
directly.  Switch to palloc()-based memory management instead.
Advantages:

 * memory used by cached regexes is now visible with MemoryContext
   observability tools

 * cleanup can be done automatically in certain failure modes
   (something that later commits will take advantage of)

 * cleanup can be done in bulk

On the downside, there may be more fragmentation (wasted memory) due to
per-regex MemoryContext objects.  This is a problem shared with other
cached objects in PostgreSQL and can probably be improved with later
tuning.

Thanks to Noah Misch for suggesting this general approach, which
unblocks later work on interrupts.

Suggested-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 22:08:41 +12:00
Andres Freund 26669757b6 Handle logical slot conflicts on standby
During WAL replay on the standby, when a conflict with a logical slot is
identified, invalidate such slots. There are two sources of conflicts:
1) Using the information added in 6af1793954, logical slots are invalidated if
   required rows are removed
2) wal_level on the primary server is reduced to below logical

Uses the infrastructure introduced in the prior commit. FIXME: add commit
reference.

Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to
interrupt use of a slot, if called in the startup process. The new recovery
conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion for the addition of the pg_stat_database_conflicts column.
Bumps PGSTAT_FILE_FORMAT_ID for the same reason.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-08 00:05:44 -07:00
Stephen Frost 3d4fa227bc Add support for Kerberos credential delegation
Support GSSAPI/Kerberos credentials being delegated to the server by a
client.  With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.

Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
delegated to the server by the client and GSSAPI is used to
authenticate to the remote system.

Authors: Stephen Frost, Peifeng Qiu
Reviewed-By: David Christensen
Discussion: https://postgr.es/m/CO1PR05MB8023CC2CB575E0FAAD7DF4F8A8E29@CO1PR05MB8023.namprd05.prod.outlook.com
2023-04-07 21:58:04 -04:00
Andres Freund ac8d53dae5 Track IO times in pg_stat_io
a9c70b46db and 8aaa04b32S added counting of IO operations to a new view,
pg_stat_io. Now, add IO timing for reads, writes, extends, and fsyncs to
pg_stat_io as well.

This combines the tracking for pgBufferUsage with the tracking for pg_stat_io
into a new function pgstat_count_io_op_time(). This should make it a bit
easier to avoid the somewhat costly instr_time conversion done for
pgBufferUsage.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
2023-04-07 17:04:56 -07:00
Alvaro Herrera e056c557ae
Catalog NOT NULL constraints
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.

We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables.  We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number.  This means we don't require the
constraint names to be identical across a hierarchy.

For now, we omit them from system catalogs.  Maybe this is worth
reconsidering.  We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)

This has been very long in the making.  The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints.  However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.

In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>

Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
2023-04-07 19:59:57 +02:00
Tom Lane 888f2ea0a8 Add array_sample() and array_shuffle() functions.
These are useful in Monte Carlo applications.

Martin Kalcher, reviewed/adjusted by Daniel Gustafsson and myself

Discussion: https://postgr.es/m/9d160a44-7675-51e8-60cf-6d64b76db831@aboutsource.net
2023-04-07 11:47:07 -04:00
Jeff Davis 36320cbc16 Fix MSVC warning introduced in ea1db8ae70.
Discussion: https://postgr.es/m/CA+hUKGJR1BhCORa5WdvwxztD3arhENcwaN1zEQ1Upg20BwjKWA@mail.gmail.com
Reported-by: Thomas Munro
2023-04-04 15:43:18 -07:00
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored; effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
Alvaro Herrera 6ee30209a6
SQL/JSON: support the IS JSON predicate
This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON, as well as on the json and
jsonb types. Each test has IS and IS NOT variants and supports a WITH
UNIQUE KEYS flag. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR

These should be self-explanatory.

The WITH UNIQUE KEYS flag makes these return false when duplicate keys
exist in any object within the value, not necessarily directly contained
in the outermost object.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-31 22:34:04 +02:00
Tom Lane a2a0c7c29e Further tweaking of width_bucket() edge cases.
I realized that the third overflow case I posited in commit b0e9e4d76
actually should be handled in a different way: rather than tolerating
the idea that the quotient could round to 1, we should clamp so that
the output cannot be more than "count" when we know that the operand is
less than bound2.  That being the case, we don't need an overflow-aware
increment in that code path, which leads me to revert the movement of
the pg_add_s32_overflow() call.  (The diff in width_bucket_float8
might be easier to read by comparing against b0e9e4d76^.)

What's more, width_bucket_numeric also has this problem of the quotient
potentially rounding to 1, so add a clamp there too.

As before, I'm not quite convinced that a back-patch is warranted.

Discussion: https://postgr.es/m/391415.1680268470@sss.pgh.pa.us
2023-03-31 16:29:55 -04:00
Andres Freund 8aaa04b32d Track shared buffer hits in pg_stat_io
Among other things, this should make it easier to calculate a useful cache hit
ratio by excluding buffer reads via buffer access strategies. As buffer access
strategies reuse buffers (and thus evict the prior buffer contents), it is
normal to see reads on repeated scans of the same data.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com
2023-03-30 19:24:21 -07:00
Alvaro Herrera 60966f56c3
Fix inconsistencies and style issues in new SQL/JSON code
Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/60483139-5c34-851d-baee-6c0d014e1710@gmail.com
2023-03-30 21:06:31 +02:00
Tom Lane b0e9e4d76c Avoid overflow in width_bucket_float8().
The original coding of this function paid little attention to the
possibility of overflow.  There were actually three different hazards:

1. The range from bound1 to bound2 could exceed DBL_MAX, which on
IEEE-compliant machines produces +Infinity in the subtraction.
At best we'd lose all precision in the result, and at worst
produce NaN due to dividing Inf/Inf.  The range can't exceed
twice DBL_MAX though, so we can fix this case by scaling all the
inputs by 0.5.

2. We computed count * (operand - bound1), which is also at risk of
float overflow, before dividing.  Safer is to do the division first,
producing a quotient that should be in [0,1), and even after allowing
for roundoff error can't be outside [0,1]; then multiplying by count
can't produce a result overflowing an int.  (width_bucket_numeric does
the multiplication first on the grounds that that improves accuracy of
its result, but I don't think that a similar argument can be made in
float arithmetic.)

3. If the division result does round to 1, and count is INT_MAX,
the final addition of 1 would overflow an int.  We took care
of that in the operand >= bound2 case but did not consider that
it could be possible in the main path.  Fix that by moving the
overflow-aware addition of 1 so it is done that way in all cases.

The fix for point 2 creates a possibility that values very close to
a bucket boundary will be rounded differently than they were before.
I'm not troubled by that for HEAD, but it is an argument against
putting this into the stable branches.  Given that the cases being
fixed here are fairly extreme and unlikely to be hit in normal use,
it seems best not to back-patch.

Mats Kindahl and Tom Lane

Discussion: https://postgr.es/m/17876-61f280d1601f978d@postgresql.org
2023-03-30 11:27:36 -04:00
Alvaro Herrera 7081ac46ac
SQL/JSON: add standard JSON constructor functions
This commit introduces the SQL/JSON standard-conforming constructors for
JSON types:

JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

Most of the functionality was already present in PostgreSQL-specific
functions, but these include some new functionality such as the ability
to skip or include NULL values, and to allow duplicate keys or throw
error when they are found, as well as the standard specified syntax to
specify output type and format.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-29 12:11:36 +02:00
Jeff Davis 1671f990dd Validate ICU locales.
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-28 16:34:29 -07:00
Jeff Davis 8b3eb0c584 Fix error inconsistency in older ICU versions.
To support older ICU versions, we rely on
icu_set_collation_attributes() to do error checking that is handled
directly by ucol_open() in newer ICU versions. Commit 3b50275b12
introduced a slight inconsistency, where the error report includes the
fixed-up locale string, rather than the locale string passed to
pg_ucol_open().

Refactor slightly so that pg_ucol_open() handles the errors from both
ucol_open() and icu_set_collation_attributes(), making it easier to
see any differences between the error reports. It also makes
pg_ucol_open() responsible for closing the UCollator on error, which
seems like the right place.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:18 -07:00
Michael Paquier 4efd0bf7ea Generate a few more functions of pgstatfuncs.c with macros
Two new macros are added with their respective functions switched to
use them.  These are for functions with millisecond stats, with and
without "xact" in their names (for the stats that can be tracked within
a transaction).

While on it, prefix the macro for float8 on database entries with "_MS",
as it does a us->ms conversion, based on a suggestion from Andres
Freund.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/6e2efb4f-6fd0-807e-f6bf-94207db8183a@gmail.com
2023-03-28 07:35:33 +09:00
Michael Paquier 850f4b4c8c Generate pg_stat_get_xact*() functions for relations using macros
This change replaces seven functions definitions by macros.

This is the same idea as 8018ffb or 83a1a1b, taking advantage of the
variable rename done in 8089517 for relation entries.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/631e3084-c5d9-8463-7540-fcff4674caa5@gmail.com
2023-03-27 09:57:41 +09:00
Tom Lane 554841699f Fix oversights in array manipulation.
The nested-arrays code path in ExecEvalArrayExpr() used palloc to
allocate the result array, whereas every other array-creating function
has used palloc0 since 18c0b4ecc.  This mostly works, but unused bits
past the end of the nulls bitmap may end up undefined.  That causes
valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could
cause planner misbehavior as cited in 18c0b4ecc.  There seems no very
good reason why we should strive to avoid palloc0 in just this one case,
so fix it the easy way with s/palloc/palloc0/.

While looking at that I noted that we also failed to check for overflow
of "nbytes" and "nitems" while summing the sizes of the sub-arrays,
potentially allowing a crash due to undersized output allocation.
For "nbytes", follow the policy used by other array-munging code of
checking for overflow after each addition.  (As elsewhere, the last
addition of the array's overhead space doesn't need an extra check,
since palloc itself will catch a value between 1Gb and 2Gb.)
For "nitems", there's no very good reason to sum the inputs at all,
since we can perfectly well use ArrayGetNItems' result instead of
ignoring it.

Per discussion of this bug, also remove redundant zeroing of the
nulls bitmap in array_set_element and array_set_slice.

Patch by Alexander Lakhin and myself, per bug #17858 from Alexander
Lakhin; thanks also to Richard Guo.  These bugs are a dozen years old,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
2023-03-26 13:41:06 -04:00
Daniel Gustafsson d435f15fff Add SysCacheGetAttrNotNull for guaranteed not-null attrs
When extracting an attr from a cached tuple in the syscache with
SysCacheGetAttr the isnull parameter must be checked in case the
attr cannot be NULL.  For cases when this is known beforehand, a
wrapper is introduced which perform the errorhandling internally
on behalf of the caller, invoking an elog in case of a NULL attr.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
2023-03-25 22:49:33 +01:00
Jeff Davis 81a6d57e33 Fix abbreviated keys bug introduced in d87d548cd0.
Discussion: http://postgr.es/m/CAMkU=1z17XJatF-rMCY3Cjqcxer-Kyn57x6h3OSCpJ0LpAp0ig@mail.gmail.com
Reported-by: Jeff Janes
2023-03-25 11:08:32 -07:00
Jeff Davis a03b3b6b4a Avoid potential UCollator leak for older ICU versions.
ICU versions 53 and earlier rely on icu_set_collation_attributes() to
process the attributes in the locale string. Avoid leaking the
already-opened UCollator object if an error is encountered.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-24 08:48:03 -07:00
Jeff Davis 9a24289915 pg_locale.c: change ereport() to elog().
Discussion: https://postgr.es/m/73553013-3926-0f34-0fb8-f37909fe4902@enterprisedb.com
Reported-by: Peter Eisentraut
2023-03-24 08:47:42 -07:00
Daniel Gustafsson a04761ac77 Fix typo in header comment
Commit 4c04be9b0 accidentally left off the _id portion of the function
name in the header comment.

Author: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3LP+ytnAXSzR=yiEaQrde+iCybMHsuPn9n=UN3puV_1tw@mail.gmail.com
2023-03-24 09:03:31 +01:00
Peter Eisentraut a9bc04b211 Fix incorrect format placeholders
The fields of NLSVERSIONINFOEX are of type DWORD, which is unsigned
long, so the results of the computations being printed are also of
type unsigned long.
2023-03-24 07:21:40 +01:00
Michael Paquier 8089517ab8 Rename fields in pgstat structures for functions and relations
This commit renames the members of a few pgstat structures related to
functions and relations, by respectively removing their prefix "f_" and
"t_".  The statistics for functions and relations and handled in their
own file, and pgstatfuncs.c associates each field in a structure
variable named based on the object type handled, so no information is
lost with this rename.

This will help with some of the refactoring aimed for pgstatfuncs.c, as
this makes more consistent the field names with the SQL functions
retrieving them.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/9142f62a-a422-145c-bde0-b5bc498a4ada@gmail.com
2023-03-24 08:46:29 +09:00
Peter Geoghegan ae4fdde135 Count updates that move row to a new page.
Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version.  The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.

The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns.  Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.

Corey Huinker, with small tweaks by me.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
2023-03-23 11:16:17 -07:00
Jeff Davis 3b50275b12 Handle the "und" locale in ICU versions 54 and older.
The "und" locale is an alternative spelling of the root locale, but it
was not recognized until ICU 55. To maintain common behavior across
all supported ICU versions, check for "und" and replace with "root"
before opening.

Previously, the lack of support for "und" was dangerous, because
versions 54 and older fall back to the environment when a locale is
not found. If the user specified "und" for the language (which is
expected and documented), it could not only resolve to the wrong
collator, but it could unexpectedly change (which could lead to
corrupt indexes).

This effectively reverts commit d72900bded, which worked around the
problem for the built-in "unicode" collation, and is no longer
necessary.

Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-23 10:08:27 -07:00
Jeff Davis a326aac8f1 Wrap ICU ucol_open().
Hide details of supporting older ICU versions in a wrapper
function. The current code only needs to handle
icu_set_collation_attributes(), but a subsequent commit will add
additional version-specific code.

Discussion: https://postgr.es/m/7ee414ad-deb5-1144-8a0e-b34ae3b71cd5@enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-23 09:15:25 -07:00
Jeff Davis 869650fa86 Support language tags in older ICU versions (53 and earlier).
By calling uloc_canonicalize() before parsing the attributes, the
existing locale attribute parsing logic works on language tags as
well.

Fix a small memory leak, too.

Discussion: http://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-21 16:12:37 -07:00
Tom Lane 75bd846b68 Add functions to do timestamptz arithmetic in a non-default timezone.
Add versions of timestamptz + interval, timestamptz - interval, and
generate_series(timestamptz, ...) in which a timezone can be specified
explicitly instead of defaulting to the TimeZone GUC setting.

The new functions for the first two are named date_add and
date_subtract.  This might seem too generic, but we could use
overloading to add additional variants if that seems useful.

Along the way, improve the docs' pretty inadequate explanation
of how timestamptz +- interval works.

Przemysław Sztoch and Gurjeet Singh; cosmetic changes and most of
the docs work by me

Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
2023-03-18 14:12:16 -04:00
Tom Lane 3e59e5048d Refactor datetime functions' timezone lookup code to reduce duplication.
We already had five copies of essentially the same logic, and an
upcoming patch introduces yet another use-case.  That's past my
threshold of pain, so introduce a common subroutine.  There's not
that much net code savings, but the chance of typos should go down.

Inspired by a patch from Przemysław Sztoch, but different in detail.

Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
2023-03-17 17:47:19 -04:00
Jeff Davis f413941f41 Fix t_isspace(), etc., when datlocprovider=i and datctype=C.
Check whether the datctype is C to determine whether t_isspace() and
related functions use isspace() or iswspace().

Previously, t_isspace() checked whether the database default collation
was C; which is incorrect when the default collation uses the ICU
provider.

Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com
Reviewed-by: Peter Eisentraut
Backpatch-through: 15
2023-03-17 12:08:46 -07:00
Andres Freund 0dc40196f2 Work around spurious compiler warning in inet operators
gcc 12+ has complaints like the following:

../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1893 |                         pdst[nb] = ~pip[nb];
      |                         ~~~~~~~~~^~~~~~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
   27 |         unsigned char ipaddr[16];       /* up to 128 bits of address */
      |                       ^~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16

This is due to a compiler bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986

It has been a year since the bug has been reported without getting fixed. As
the warnings are verbose and use of gcc 12 is becoming more common, it seems
worth working around the bug. Particularly because a simple reformulation of
the loop condition fixes the issue and isn't any less readable.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us
Backpatch: 11-
2023-03-16 14:48:45 -07:00
Tom Lane 5b3c595355 Tighten error checks in datetime input, and remove bogus "ISO" format.
DecodeDateTime and DecodeTimeOnly had support for date input in the
style "Y2023M03D16", which the comments claimed to be an "ISO" format.
However, so far as I can find there is no such format in ISO 8601;
they write units before numbers in intervals, but not in datetimes.
Furthermore, the lesser-known ISO 8601-2 spec actually defines an
incompatible format "2023Y03M16D".  None of our documentation mentions
such a format either.  So let's just drop it.

That leaves us with only two cases for a prefix unit specifier in
datetimes: Julian dates written as Jnnnn, and the "T" separator
defined by ISO 8601.  Add checks to catch misuse of these specifiers,
that is consecutive specifiers or a dangling specifier at the end of
the string.  We do not however disallow a specifier that is separated
from the field that it disambiguates (by noise words or unrelated
fields).  That being the case, remove some overly-aggressive error
checks from the ISOTIME cases.

Joseph Koshakow, editorialized a bit by me; thanks also to
Peter Eisentraut for some standards-reading.

Discussion: https://postgr.es/m/CAAvxfHf2Q1gKLiHGnuPOiyf0ASvKUM4BnMfsXuwgtYEb_Gx0Zw@mail.gmail.com
2023-03-16 14:18:33 -04:00
Tom Lane 2333803d84 Use "data directory" not "current directory" in error messages.
The user receiving the message might not understand where the
server's "current directory" is.  "Data directory" seems clearer.
(This would not be good for frontend code, but both of these
messages are only issued in the backend.)

Kyotaro Horiguchi

Discussion: https://postgr.es/m/20230316.111646.1564684434328830712.horikyota.ntt@gmail.com
2023-03-16 12:04:08 -04:00
Michael Paquier e731aeac89 Remove PgStat_BackendFunctionEntry
This structure included only PgStat_FunctionCounts, and removing it
facilitates some upcoming refactoring for pgstatfuncs.c to use more
macros rather that mostly-duplicated functions.

Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/11d531fe-52fc-c6ea-7e8e-62f1b6ec626e@gmail.com
2023-03-16 14:22:34 +09:00
Tom Lane 483bdb2afe Support [NO] INDENT option in XMLSERIALIZE().
This adds the ability to pretty-print XML documents ... according to
libxml's somewhat idiosyncratic notions of what's pretty, anyway.
One notable divergence from a strict reading of the spec is that
libxml is willing to collapse empty nodes "<node></node>" to just
"<node/>", whereas SQL and the underlying XML spec say that this
option should only result in whitespace tweaks.  Nonetheless,
it seems close enough to justify using the SQL-standard syntax.

Jim Jones, reviewed by Peter Smith and myself

Discussion: https://postgr.es/m/2f5df461-dad8-6d7d-4568-08e10608a69b@uni-muenster.de
2023-03-15 16:59:09 -04:00