Commit Graph

5595 Commits

Author SHA1 Message Date
Robert Haas 482675987b Add a run_as_owner option to subscriptions.
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.

On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.

Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
2023-04-04 12:03:03 -04:00
Alvaro Herrera 6ee30209a6
SQL/JSON: support the IS JSON predicate
This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON, as well as on the json and
jsonb types. Each test has IS and IS NOT variants and supports a WITH
UNIQUE KEYS flag. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR

These should be self-explanatory.

The WITH UNIQUE KEYS flag makes these return false when duplicate keys
exist in any object within the value, not necessarily directly contained
in the outermost object.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-31 22:34:04 +02:00
Tom Lane a2a0c7c29e Further tweaking of width_bucket() edge cases.
I realized that the third overflow case I posited in commit b0e9e4d76
actually should be handled in a different way: rather than tolerating
the idea that the quotient could round to 1, we should clamp so that
the output cannot be more than "count" when we know that the operand is
less than bound2.  That being the case, we don't need an overflow-aware
increment in that code path, which leads me to revert the movement of
the pg_add_s32_overflow() call.  (The diff in width_bucket_float8
might be easier to read by comparing against b0e9e4d76^.)

What's more, width_bucket_numeric also has this problem of the quotient
potentially rounding to 1, so add a clamp there too.

As before, I'm not quite convinced that a back-patch is warranted.

Discussion: https://postgr.es/m/391415.1680268470@sss.pgh.pa.us
2023-03-31 16:29:55 -04:00
Tom Lane f0d65c0eaf Reject system columns as elements of foreign keys.
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key.  Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys.  The lack of complaints about that seems
like good evidence that no one is trying to do it.  Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin.  Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
2023-03-31 11:18:49 -04:00
Daniel Gustafsson 558fff0adf pg_regress: Emit TAP compliant output
This converts pg_regress output format to emit TAP compliant output
while keeping it as human readable as possible for use without TAP
test harnesses. As verbose harness related information isn't really
supported by TAP this also reduces the verbosity of pg_regress runs
which makes scrolling through log output in buildfarm/CI runs a bit
easier as well.

As the meson TAP parser conumes whitespace, the leading indentation
for differentiating parallel tests from sequential tests has been
changed to a single character prefix.

This patch has been around for an extended period of time, reviewers
listed below may have been involved in reviewing a version quite
different from the version in this commit.  The original idea for
this patch was a hacking session with Jinbao Chen.

TAP format testing is also enabled in meson as of this.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nikolay Shaplov <dhyan@nataraj.su>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/BD4B107D-7E53-4794-ACBA-275BEB4327C9@yesql.se
Discussion: https://postgr.es/m/20220221164736.rq3ornzjdkmwk2wo@alap3.anarazel.de
2023-03-31 13:00:02 +02:00
Andres Freund 8aaa04b32d Track shared buffer hits in pg_stat_io
Among other things, this should make it easier to calculate a useful cache hit
ratio by excluding buffer reads via buffer access strategies. As buffer access
strategies reuse buffers (and thus evict the prior buffer contents), it is
normal to see reads on repeated scans of the same data.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com
2023-03-30 19:24:21 -07:00
Thomas Munro 11c2d6fdf5 Parallel Hash Full Join.
Full and right outer joins were not supported in the initial
implementation of Parallel Hash Join because of deadlock hazards (see
discussion).  Therefore FULL JOIN inhibited parallelism, as the other
join strategies can't do that in parallel either.

Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on
the inner side of one batch's hash table.  For now, sidestep the
deadlock problem by terminating parallelism there.  The last process to
arrive at that phase emits the unmatched tuples, while others detach and
are free to go and work on other batches, if there are any, but
otherwise they finish the join early.

That unfairness is considered acceptable for now, because it's better
than no parallelism at all.  The build and probe phases are run in
parallel, and the new scan-for-unmatched phase, while serial, is usually
applied to the smaller of the two relations and is either limited by
some multiple of work_mem, or it's too big and is partitioned into
batches and then the situation is improved by batch-level parallelism.

Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
2023-03-31 11:34:03 +13:00
Tom Lane e9d202a149 Clean up role created in new subscription test.
This oversight broke repeated runs of "make installcheck".
2023-03-30 13:07:04 -04:00
Robert Haas c3afe8cf5a Add new predefined role pg_create_subscription.
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink.  A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property.  In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
2023-03-30 11:37:19 -04:00
Tom Lane b0e9e4d76c Avoid overflow in width_bucket_float8().
The original coding of this function paid little attention to the
possibility of overflow.  There were actually three different hazards:

1. The range from bound1 to bound2 could exceed DBL_MAX, which on
IEEE-compliant machines produces +Infinity in the subtraction.
At best we'd lose all precision in the result, and at worst
produce NaN due to dividing Inf/Inf.  The range can't exceed
twice DBL_MAX though, so we can fix this case by scaling all the
inputs by 0.5.

2. We computed count * (operand - bound1), which is also at risk of
float overflow, before dividing.  Safer is to do the division first,
producing a quotient that should be in [0,1), and even after allowing
for roundoff error can't be outside [0,1]; then multiplying by count
can't produce a result overflowing an int.  (width_bucket_numeric does
the multiplication first on the grounds that that improves accuracy of
its result, but I don't think that a similar argument can be made in
float arithmetic.)

3. If the division result does round to 1, and count is INT_MAX,
the final addition of 1 would overflow an int.  We took care
of that in the operand >= bound2 case but did not consider that
it could be possible in the main path.  Fix that by moving the
overflow-aware addition of 1 so it is done that way in all cases.

The fix for point 2 creates a possibility that values very close to
a bucket boundary will be rounded differently than they were before.
I'm not troubled by that for HEAD, but it is an argument against
putting this into the stable branches.  Given that the cases being
fixed here are fairly extreme and unlikely to be hit in normal use,
it seems best not to back-patch.

Mats Kindahl and Tom Lane

Discussion: https://postgr.es/m/17876-61f280d1601f978d@postgresql.org
2023-03-30 11:27:36 -04:00
Alvaro Herrera 7081ac46ac
SQL/JSON: add standard JSON constructor functions
This commit introduces the SQL/JSON standard-conforming constructors for
JSON types:

JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

Most of the functionality was already present in PostgreSQL-specific
functions, but these include some new functionality such as the ability
to skip or include NULL values, and to allow duplicate keys or throw
error when they are found, as well as the standard specified syntax to
specify output type and format.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-29 12:11:36 +02:00
Amit Kapila 062a844424 Avoid syncing data twice for the 'publish_via_partition_root' option.
When there are multiple publications for a subscription and one of those
publishes via the parent table by using publish_via_partition_root and the
other one directly publishes the child table, we end up copying the same
data twice during initial synchronization. The reason for this was that we
get both the parent and child tables from the publisher and try to copy
the data for both of them.

This patch extends the function pg_get_publication_tables() to take a
publication list as its input parameter. This allows us to exclude a
partition table whose ancestor is published by the same publication list.

This problem does exist in back-branches but we decide to fix it there in
a separate commit if required. The fix for back-branches requires quite
complicated changes to fetch the required table information from the
publisher as we can't update the function pg_get_publication_tables() in
back-branches. We are not sure whether we want to deviate and complicate
the code in back-branches for this problem as there are no field reports
yet.

Author: Wang wei
Reviewed-by: Peter Smith, Jacob Champion, Kuroda Hayato, Vignesh C, Osumi Takamichi, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-03-29 10:46:58 +05:30
Jeff Davis 1671f990dd Validate ICU locales.
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-28 16:34:29 -07:00
Tom Lane 326a33a289 Fix corner-case planner failure for MERGE.
MERGE planning could fail with "variable not found in subplan target
list" if the target table is partitioned and all its partitions are
excluded at plan time, or in the case where it has no partitions but
used to have some.  This happened because distribute_row_identity_vars
thought it didn't need to make the target table's reltarget list
fully valid; but if we generate a join plan then that is required
because the dummy Result node's tlist will be made from the reltarget.

The same logic appears in distribute_row_identity_vars in v14,
but AFAICS the problem is unreachable in that branch for lack of
MERGE.  In other updating statements, the target table is always
inner-joined to any other tables, so if the target is known dummy
then the whole plan reduces to dummy, so no join nodes are created.
So I'll refrain from back-patching this code change to v14 for now.

Per report from Alvaro Herrera.

Discussion: https://postgr.es/m/20230328112248.6as34mlx5sr4kltg@alvherre.pgsql
2023-03-28 11:39:24 -04:00
Tom Lane a3c9d35ae1 Reject attempts to alter composite types used in indexes.
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not.  Teach it to detect such cases and error out.  We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.

This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure.  That seems of
much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
2023-03-27 15:04:15 -04:00
Daniel Gustafsson b577743000 Make SCRAM iteration count configurable
Replace the hardcoded value with a GUC such that the iteration
count can be raised in order to increase protection against
brute-force attacks.  The hardcoded value for SCRAM iteration
count was defined to be 4096, which is taken from RFC 7677, so
set the default for the GUC to 4096 to match.  In RFC 7677 the
recommendation is at least 15000 iterations but 4096 is listed
as a SHOULD requirement given that it's estimated to yield a
0.5s processing time on a mobile handset of the time of RFC
writing (late 2015).

Raising the iteration count of SCRAM will make stored passwords
more resilient to brute-force attacks at a higher computational
cost during connection establishment.  Lowering the count will
reduce computational overhead during connections at the tradeoff
of reducing strength against brute-force attacks.

There are however platforms where even a modest iteration count
yields a too high computational overhead, with weaker password
encryption schemes chosen as a result.  In these situations,
SCRAM with a very low iteration count still gives benefits over
weaker schemes like md5, so we allow the iteration count to be
set to one at the low end.

The new GUC is intentionally generically named such that it can
be made to support future SCRAM standards should they emerge.
At that point the value can be made into key:value pairs with
an undefined key as a default which will be backwards compatible
with this.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se
2023-03-27 09:46:29 +02:00
Tom Lane 3c05284d83 Invent GENERIC_PLAN option for EXPLAIN.
This provides a very simple way to see the generic plan for a
parameterized query.  Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.

One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given.  That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.

Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself

Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
2023-03-24 17:07:22 -04:00
Peter Geoghegan ae4fdde135 Count updates that move row to a new page.
Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version.  The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.

The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns.  Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.

Corey Huinker, with small tweaks by me.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
2023-03-23 11:16:17 -07:00
Jeff Davis 3b50275b12 Handle the "und" locale in ICU versions 54 and older.
The "und" locale is an alternative spelling of the root locale, but it
was not recognized until ICU 55. To maintain common behavior across
all supported ICU versions, check for "und" and replace with "root"
before opening.

Previously, the lack of support for "und" was dangerous, because
versions 54 and older fall back to the environment when a locale is
not found. If the user specified "und" for the language (which is
expected and documented), it could not only resolve to the wrong
collator, but it could unexpectedly change (which could lead to
corrupt indexes).

This effectively reverts commit d72900bded, which worked around the
problem for the built-in "unicode" collation, and is no longer
necessary.

Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-23 10:08:27 -07:00
Jeff Davis 869650fa86 Support language tags in older ICU versions (53 and earlier).
By calling uloc_canonicalize() before parsing the attributes, the
existing locale attribute parsing logic works on language tags as
well.

Fix a small memory leak, too.

Discussion: http://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-21 16:12:37 -07:00
Andres Freund ef719e7b32 Stabilize pg_stat_io writes test
Counting writes only for io_context = 'normal' is unreliable, as backends
using a buffer access strategy could flush all of the dirty buffers out from
under the other backends and checkpointer. Change the test to count writes in
any context. This achieves roughly the same coverage anyway.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/ZAnWU8WbXEDjrfUE%40telsasoft.com
2023-03-20 18:16:06 -07:00
Tomas Vondra 19d8e2308b Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.

A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.

Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.

Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
2023-03-20 11:02:42 +01:00
Tomas Vondra e858312683 Fix netmask handling in inet_minmax_multi_ops
When calculating distance in brin_minmax_multi_distance_inet(), the
netmask was applied incorrectly. This results in (seemingly) incorrect
ordering of values, triggering an assert.

For builds without asserts this is mostly harmless - we may merge other
ranges, possibly resulting in slightly less efficient index. But it's
still correct and the greedy algorithm doesn't guarantee optimality
anyway.

Backpatch to 14, where minmax-multi indexes were introduced.

Reported by Dmitry Dolgov, investigation and fix by me.

Reported-by: Dmitry Dolgov
Backpatch-through: 14
Discussion: https://postgr.es/m/17774-c6f3e36dd4471e67@postgresql.org
2023-03-20 10:24:14 +01:00
Tom Lane 75bd846b68 Add functions to do timestamptz arithmetic in a non-default timezone.
Add versions of timestamptz + interval, timestamptz - interval, and
generate_series(timestamptz, ...) in which a timezone can be specified
explicitly instead of defaulting to the TimeZone GUC setting.

The new functions for the first two are named date_add and
date_subtract.  This might seem too generic, but we could use
overloading to add additional variants if that seems useful.

Along the way, improve the docs' pretty inadequate explanation
of how timestamptz +- interval works.

Przemysław Sztoch and Gurjeet Singh; cosmetic changes and most of
the docs work by me

Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
2023-03-18 14:12:16 -04:00
Peter Eisentraut de4d456b40 Improve several permission-related error messages.
Mainly move some detail from errmsg to errdetail, remove explicit
mention of superuser where appropriate, since that is implied in most
permission checks, and make messages more uniform.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230316234701.GA903298@nathanxps13
2023-03-17 10:33:09 +01:00
David Rowley eb7d043c9b Fix incorrect logic for determining safe WindowAgg run conditions
The logic added in 9d9c02ccd to determine when a qual can be used as a
WindowClause run condition failed to correctly check for subqueries in the
qual.  This was being done correctly for normal subquery qual pushdowns,
it's just that 9d9c02ccd failed to follow the lead on that.

This also fixes various other cases where transforming the qual into a
WindowClause run condition in the subquery should have been disallowed.

Bug: #17826
Reported-by: Anban Company
Discussion: https://postgr.es/m/17826-7d8750952f19a5f5@postgresql.org
Backpatch-through: 15, where 9d9c02ccd was introduced.
2023-03-17 15:49:53 +13:00
Tom Lane 9bfd2822b3 Enable use of Memoize atop an Append that came from UNION ALL.
create_append_path() would only apply get_baserel_parampathinfo
when the path is for a partitioned table, but it's also potentially
useful for paths for UNION ALL appendrels.  Specifically, that
supports building a Memoize path atop this one.

While we're in the vicinity, delete some dead code in
create_merge_append_plan(): there's no need for it to support
parameterized MergeAppend paths, and it doesn't look like that
is going to change anytime soon.  It'll be easy enough to undo
this when/if it becomes useful.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_ABSu4PWG2rE1q10tJugEXHWgru3U8dAgkoFvgrb6aEA@mail.gmail.com
2023-03-16 18:13:45 -04:00
Tom Lane 5b3c595355 Tighten error checks in datetime input, and remove bogus "ISO" format.
DecodeDateTime and DecodeTimeOnly had support for date input in the
style "Y2023M03D16", which the comments claimed to be an "ISO" format.
However, so far as I can find there is no such format in ISO 8601;
they write units before numbers in intervals, but not in datetimes.
Furthermore, the lesser-known ISO 8601-2 spec actually defines an
incompatible format "2023Y03M16D".  None of our documentation mentions
such a format either.  So let's just drop it.

That leaves us with only two cases for a prefix unit specifier in
datetimes: Julian dates written as Jnnnn, and the "T" separator
defined by ISO 8601.  Add checks to catch misuse of these specifiers,
that is consecutive specifiers or a dangling specifier at the end of
the string.  We do not however disallow a specifier that is separated
from the field that it disambiguates (by noise words or unrelated
fields).  That being the case, remove some overly-aggressive error
checks from the ISOTIME cases.

Joseph Koshakow, editorialized a bit by me; thanks also to
Peter Eisentraut for some standards-reading.

Discussion: https://postgr.es/m/CAAvxfHf2Q1gKLiHGnuPOiyf0ASvKUM4BnMfsXuwgtYEb_Gx0Zw@mail.gmail.com
2023-03-16 14:18:33 -04:00
Tom Lane 483bdb2afe Support [NO] INDENT option in XMLSERIALIZE().
This adds the ability to pretty-print XML documents ... according to
libxml's somewhat idiosyncratic notions of what's pretty, anyway.
One notable divergence from a strict reading of the spec is that
libxml is willing to collapse empty nodes "<node></node>" to just
"<node/>", whereas SQL and the underlying XML spec say that this
option should only result in whitespace tweaks.  Nonetheless,
it seems close enough to justify using the SQL-standard syntax.

Jim Jones, reviewed by Peter Smith and myself

Discussion: https://postgr.es/m/2f5df461-dad8-6d7d-4568-08e10608a69b@uni-muenster.de
2023-03-15 16:59:09 -04:00
Tom Lane e3ac85014e Support PlaceHolderVars in MERGE actions.
preprocess_targetlist thought PHVs couldn't appear here.
It was mistaken, as per report from Önder Kalacı.

Surveying other pull_var_clause calls, I noted no similar errors,
but I did notice that qual_is_pushdown_safe's assertion about
!contain_window_function was pointless, because the following
pull_var_clause call would complain about them anyway.  In HEAD
only, remove the redundant Assert and improve the commentary.

Discussion: https://postgr.es/m/CACawEhUuum-gC_2S3sXLTcsk7bUSPSHOD+g1ZpfKaDK-KKPPWA@mail.gmail.com
2023-03-15 11:59:18 -04:00
Tom Lane b081fe4199 Fix corner case bug in numeric to_char() some more.
The band-aid applied in commit f0bedf3e4 turns out to still need
some work: it made sure we didn't set Np->last_relevant too small
(to the left of the decimal point), but it didn't prevent setting
it too large (off the end of the partially-converted string).
This could result in fetching data beyond the end of the allocated
space, which with very bad luck could cause a SIGSEGV, though
I don't see any hazard of interesting memory disclosure.

Per bug #17839 from Thiago Nunes.  The bug's pretty ancient,
so back-patch to all supported versions.

Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org
2023-03-14 19:17:31 -04:00
Dean Rasheed d5d574146d Add support for the error functions erf() and erfc().
Expose the standard error functions as SQL-callable functions. These
are expected to be useful to people working with normal distributions,
and we use them here to test the distribution from random_normal().

Since these functions are defined in the POSIX and C99 standards, they
should in theory be available on all supported platforms. If that
turns out not to be the case, more work will be needed.

On all platforms tested so far, using extra_float_digits = -1 in the
regression tests is sufficient to allow for variations between
implementations. However, past experience has shown that there are
almost certainly going to be additional unexpected portability issues,
so these tests may well need further adjustments, based on the
buildfarm results.

Dean Rasheed, reviewed by Nathan Bossart and Thomas Munro.

Discussion: https://postgr.es/m/CAEZATCXv5fi7+Vu-POiyai+ucF95+YMcCMafxV+eZuN1B-=MkQ@mail.gmail.com
2023-03-14 09:17:36 +00:00
Tom Lane 25a7812cd0 Fix JSON error reporting for many cases of erroneous string values.
The majority of error exit cases in json_lex_string() failed to
set lex->token_terminator, causing problems for the error context
reporting code: it would see token_terminator less than token_start
and do something more or less nuts.  In v14 and up the end result
could be as bad as a crash in report_json_context().  Older
versions accidentally avoided that fate; but all versions produce
error context lines that are far less useful than intended,
because they'd stop at the end of the prior token instead of
continuing to where the actually-bad input is.

To fix, invent some macros that make it less notationally painful
to do the right thing.  Also add documentation about what the
function is actually required to do; and in >= v14, add an assertion
in report_json_context about token_terminator being sufficiently
far advanced.

Per report from Nikolay Shaplov.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
2023-03-13 15:19:00 -04:00
Tom Lane 30dbdbe753 Fix failure to detect some cases of improperly-nested aggregates.
check_agg_arguments_walker() supposed that it needn't descend into
the arguments of a lower-level aggregate function, but this is
just wrong in the presence of multiple levels of sub-select.  The
oversight would lead to executor failures on queries that should
be rejected.  (Prior to v11, they actually were rejected, thanks
to a "redundant" execution-time check.)

Per bug #17835 from Anban Company.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17835-4f29f3098b2d0ba4@postgresql.org
2023-03-13 12:40:28 -04:00
Andrew Dunstan 9f8377f7a2 Add a DEFAULT option to COPY FROM
This allows for a string which if an input field matches causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows and not others, for which
non-default data is available.

The file_fdw extension is also modified to take allow use of this
option.

Israel Barth Rubio

Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
2023-03-13 10:01:56 -04:00
Dean Rasheed 7b14e20b12 Fix MERGE command tag for actions blocked by BEFORE ROW triggers.
This ensures that the row count in the command tag for a MERGE is
correctly computed in the case where UPDATEs or DELETEs are skipped
due to a BEFORE ROW trigger returning NULL (the INSERT case was
already handled correctly by ExecMergeNotMatched() calling
ExecInsert()).

Back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCU8XEmR0JWKDtyb7iZ%3DqCffxS9uyJt0iOZ4TV4RT%2Bow1w%40mail.gmail.com
2023-03-13 11:12:20 +00:00
Peter Eisentraut b2bd9a6796 Fix expected test output
For builds without lz4, for 208bf364a9.
2023-03-13 11:15:21 +01:00
Peter Eisentraut 208bf364a9 Remove incidental md5() function uses from main regression tests
Most of these calls were to generate some random data.  These can be
replaced by appropriately adapted sha256() calls.  To keep the diff
smaller, we wrap this into a helper function that produces the same
output format and length as the md5() call.

This will eventually allow these tests to pass in OpenSSL FIPS mode
(which does not allow MD5 use).

Similar work for other test suites will follow later.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/dbbd927f-ef1f-c9a1-4ec6-c759778ac852@enterprisedb.com
2023-03-13 10:53:28 +01:00
Tom Lane 767c598954 Work around implementation restriction in adjust_appendrel_attrs.
adjust_appendrel_attrs can't transfer nullingrel labeling to a non-Var
translation expression (mainly because it's too late to wrap such an
expression in a PlaceHolderVar).  I'd supposed in commit 2489d76c4
that that restriction was unreachable because we'd not attempt to push
problematic clauses down to an appendrel child relation.  I forgot that
set_append_rel_size blindly converts all the parent rel's joininfo
clauses to child clauses, and that list could well contain clauses
from above a nulling outer join.

We might eventually have to devise a direct fix for this implementation
restriction, but for now it seems enough to filter out troublesome
clauses while constructing the child's joininfo list.  Such clauses
are certainly not useful while constructing paths for the child rel;
they'll have to be applied later when we join the completed appendrel
to something else.  So we don't need them here, and omitting them from
the list should save a few cycles while processing the child rel.

Per bug #17832 from Marko Tiikkaja.

Discussion: https://postgr.es/m/17832-d0a8106cdf1b722e@postgresql.org
2023-03-12 14:20:34 -04:00
Tom Lane d66bb048c3 Ensure COPY TO on an RLS-enabled table copies no more than it should.
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have.  However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached.  Fix that.

Report and patch by Antonin Houska (comment adjustments and test case
by me).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/3472.1675251957@antos
2023-03-10 13:52:44 -05:00
Peter Eisentraut 3e623ebc7a Fix tests for non-ICU build
missed in 0d21d4b9bc
2023-03-10 14:27:55 +01:00
Peter Eisentraut 0d21d4b9bc Add standard collation UNICODE
This adds a new predefined collation named UNICODE, which sorts by the
default Unicode collation algorithm specifications, per SQL standard.

This only works if ICU support is built.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/1293e382-2093-a2bf-a397-c04e8f83d3c2@enterprisedb.com
2023-03-10 13:35:43 +01:00
Peter Eisentraut 012ee84259 Add a test for UCS_BASIC collation 2023-03-10 11:18:08 +01:00
Tom Lane bcc704b524 Reject combining "epoch" and "infinity" with other datetime fields.
Datetime input formerly accepted combinations such as
'1995-08-06 infinity', but this seems like a clear error.
Reject any combination of regular y/m/d/h/m/s fields with
these special tokens.

Joseph Koshakow, reviewed by Keisuke Kuroda and myself

Discussion: https://postgr.es/m/CAAvxfHdm8wwXwG_FFRaJ1nTHiMWb7YXS2YKCzCt8Q0a2ZoMcHg@mail.gmail.com
2023-03-09 16:49:03 -05:00
Peter Eisentraut 30a53b7929 Allow tailoring of ICU locales with custom rules
This exposes the ICU facility to add custom collation rules to a
standard collation.

New options are added to CREATE COLLATION, CREATE DATABASE, createdb,
and initdb to set the rules.

Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com
2023-03-08 16:56:37 +01:00
Tom Lane 99be6feec9 Fix more bugs caused by adding columns to the end of a view.
If a view is defined atop another view, and then CREATE OR REPLACE
VIEW is used to add columns to the lower view, then when the upper
view's referencing RTE is expanded by ApplyRetrieveRule we will have
a subquery RTE with fewer eref->colnames than output columns.  This
confuses various code that assumes those lists are always in sync,
as they are in plain parser output.

We have seen such problems before (cf commit d5b760ecb), and now
I think the time has come to do what was speculated about in that
commit: let's make ApplyRetrieveRule synthesize some column names to
preserve the invariant that holds in parser output.  Otherwise we'll
be chasing this class of bugs indefinitely.  Moreover, it appears from
testing that this actually gives us better results in the test case
d5b760ecb added, and likely in other corner cases that we lack
coverage for.

In HEAD, I replaced d5b760ecb's hack to make expandRTE exit early with
an elog(ERROR) call, since the case is now presumably unreachable.
But it seems like changing that in back branches would bring more risk
than benefit, so there I just updated the comment.

Per bug #17811 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17811-d31686b78f0dffc9@postgresql.org
2023-03-07 18:21:53 -05:00
Peter Eisentraut ce1215d9b0 Add support for unit "B" to pg_size_bytes()
This makes it consistent with the units support in GUC.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0106914a-9eb5-22be-40d8-652cc88c827d%40enterprisedb.com
2023-03-07 20:31:16 +01:00
Andres Freund 1be0fdb9de Fix flakey pg_stat_io test
Wrap test of pg_stat_io's tracking of shared buffer reads in a transaction to
prevent concurrent accesses (e.g. by autovacuum) causing spurious test
failures.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230306190919.ai6mxdq3sygyyths%40awork3.anarazel.de
2023-03-07 10:08:38 -08:00
Tom Lane 7fee7871b4 Fix some more cases of missed GENERATED-column updates.
If UPDATE is forced to retry after an EvalPlanQual check, it neglected
to repeat GENERATED-column computations, even though those might well
have changed since we're dealing with a different tuple than before.
Fixing this is mostly a matter of looping back a bit further when
we retry.  In v15 and HEAD that's most easily done by altering the API
of ExecUpdateAct so that it includes computing GENERATED expressions.

Also, if an UPDATE in a partitioned table turns into a cross-partition
INSERT operation, we failed to recompute GENERATED columns.  That's a
bug since 8bf6ec3ba allowed partitions to have different generation
expressions; although it seems to have no ill effects before that.
Fixing this is messier because we can now have situations where the same
query needs both the UPDATE-aligned set of GENERATED columns and the
INSERT-aligned set, and it's unclear which set will be generated first
(else we could hack things by forcing the INSERT-aligned set to be
generated, which is indeed how fe9e658f4 made it work for MERGE).
The best fix seems to be to build and store separate sets of expressions
for the INSERT and UPDATE cases.  That would create ABI issues in the
back branches, but so far it seems we can leave this alone in the back
branches.

Per bug #17823 from Hisahiro Kauchi.  The first part of this affects all
branches back to v12 where GENERATED columns were added.

Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
2023-03-06 18:31:27 -05:00
Michael Paquier 4211fbd841 Add PROCESS_MAIN to VACUUM
Disabling this option is useful to run VACUUM (with or without FULL) on
only the toast table of a relation, bypassing the main relation.  This
option is enabled by default.

Running directly VACUUM on a toast table was already possible without
this feature, by using the non-deterministic name of a toast relation
(as of pg_toast.pg_toast_N, where N would be the OID of the parent
relation) in the VACUUM command, and it required a scan of pg_class to
know the name of the toast table.  So this feature is basically a
shortcut to be able to run VACUUM or VACUUM FULL on a toast relation,
using only the name of the parent relation.

A new switch called --no-process-main is added to vacuumdb, to work as
an equivalent of PROCESS_MAIN.

Regression tests are added to cover VACUUM and VACUUM FULL, looking at
pg_stat_all_tables.vacuum_count to see how many vacuums have run on
each table, main or toast.

Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
2023-03-06 16:41:05 +09:00
Michael Paquier 46d490ac19 Improve the regression tests of VACUUM (PROCESS_TOAST)
All the regression tests of VACUUM (PROCESS_TOAST) were only checking if
the commands were able to run, without checking if VACUUM was really
running on what it should.  This expands this set of tests so as we now
look at pg_stat_all_tables.vacuum_count to see how many vacuums have
been run on a given table and its toast relation.

Extracted from a larger patch by the same author, as this is useful on
its own.

Special thanks to Álvaro Herrera for the idea of using
pg_stat_all_tables to check the state of the toast relation.

Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
2023-03-06 15:40:56 +09:00
Peter Eisentraut 102a5c164a SQL JSON path enhanced numeric literals
Add support for non-decimal integer literals and underscores in
numeric literals to SQL JSON path language.  This follows the rules of
ECMAScript, as referred to by the SQL standard.

Internally, all the numeric literal parsing of jsonpath goes through
numeric_in, which already supports all this, so this patch is just a
bit of lexer work and some tests and documentation.

Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b11b25bb-6ec1-d42f-cedd-311eae59e1fb@enterprisedb.com
2023-03-05 15:19:58 +01:00
Tom Lane 6949b921d5 Avoid failure when altering state of partitioned foreign-key triggers.
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s).  The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs.  We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.

Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.

While here, update the documentation, which was not touched when
the semantics were changed.

Per bug #17817 from Alan Hodgson.  Back-patch to v15; older versions
do not have this behavior.

Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
2023-03-04 13:32:35 -05:00
Peter Eisentraut b6a0d469ca meson: Prevent installation of test files during main install
Previously, meson installed modules under src/test/modules/ as part of
a normal installation, even though these files are only meant for use
by tests.  This is because there is no way to set up up the build
system to install extra things only when told.

This patch fixes that with a workaround: We don't install these
modules as part of meson install, but we create a new "test" that runs
before the real tests whose action it is to install these files.  The
installation is done by manual copies using a small helper script.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/2a039e8e-f31f-31e8-afe7-bab3130ad2de%40enterprisedb.com
2023-03-03 07:45:52 +01:00
Tom Lane 98a88bc2bc Harden new test case against force_parallel_mode = regress.
Per buildfarm: worker processes can't see a role created in
the current transaction.
2023-03-02 17:47:20 -05:00
Tom Lane 3dfae91f7a Show "internal name" not "source code" in psql's \df+ command.
Our previous habit of showing the full function body is really
pretty unfriendly for tabular viewing of functions, and now that
we have \sf and \ef commands there seems no good reason why \df+
has to do it.  It still seems to make sense to show prosrc for
internal and C-language functions, since in those cases prosrc
is just the C function name; but then let's rename the column to
"Internal name" which is a more accurate descriptor.

Isaac Morland

Discussion: https://postgr.es/m/CAMsGm5eqKc6J1=Lwn=ZONG=6ZDYWRQ4cgZQLqMuZGB1aVt_JBg@mail.gmail.com
2023-03-02 17:15:13 -05:00
Tom Lane d7056bc1c7 Avoid fetching one past the end of translate()'s "to" parameter.
This is usually harmless, but if you were very unlucky it could
provoke a segfault due to the "to" string being right up against
the end of memory.  Found via valgrind testing (so we might've
found it earlier, except that our regression tests lacked any
exercise of translate()'s deletion feature).

Fix by switching the order of the test-for-end-of-string and
advance-pointer steps.  While here, compute "to_ptr + tolen"
just once.  (Smarter compilers might figure that out for
themselves, but let's just make sure.)

Report and fix by Daniil Anisimov, in bug #17816.

Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
2023-03-01 11:30:31 -05:00
Andrew Dunstan b5737efea0 Remove unnecessary and problematic collate.windows.win1252 tests
Some windows instances can't handle setting lc_time to a non BCP 47
locale, and the removed tests in any case don't really make lots of
sense here.

Juan José Santamaría Flecha

Discussion: https://postgr.es/m/237b255b-e063-a82e-66e1-c63a12bf9664@dunslane.net
2023-02-28 15:47:07 -05:00
Michael Paquier 737668d9e3 Fix expected output of xml_2.out
Per buildfarm members snakefly, parula and prion, that reflect the
results coming from the latest versions of libxml2.

Oversight in b8da37b in the shape of an incorrect copy-paste.  The CI
was green, but it does not stress this expected output.
2023-02-28 08:37:37 +09:00
Michael Paquier b8da37b3ad Rework pg_input_error_message(), now renamed pg_input_error_info()
pg_input_error_info() is now a SQL function able to return a row with
more than just the error message generated for incorrect data type
inputs when these are able to handle soft failures, returning more
contents of ErrorData, as of:
- The error message (same as before).
- The error detail, if set.
- The error hint, if set.
- SQL error code.

All the regression tests that relied on pg_input_error_message() are
updated to reflect the effects of the rename.

Per discussion with Tom Lane and Andrew Dunstan.

Author: Nathan Bossart
Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net
2023-02-28 08:04:13 +09:00
Tom Lane 87f3667ec0 Fix MULTIEXPR_SUBLINK with partitioned target tables, yet again.
We already tried to fix this in commits 3f7323cbb et al (and follow-on
fixes), but now it emerges that there are still unfixed cases;
moreover, these cases affect all branches not only pre-v14.  I thought
we had eliminated all cases of making multiple clones of an UPDATE's
target list when we nuked inheritance_planner.  But it turns out we
still do that in some partitioned-UPDATE cases, notably including
INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks
it's okay to clone and modify the parent's targetlist.

This fix is based on a suggestion from Andres Freund: let's stop
abusing the ParamExecData.execPlan mechanism, which was only ever
meant to handle initplans, and instead solve the execution timing
problem by having the expression compiler move MULTIEXPR_SUBLINK steps
to the front of their expression step lists.  This is feasible because
(a) all branches still in support compile the entire targetlist of
an UPDATE into a single ExprState, and (b) we know that all
MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried
inside a CASE, for example.  There is a minor semantics change
concerning the order of execution of the MULTIEXPR's subquery versus
other parts of the parent targetlist, but that seems like something
we can get away with.  By doing that, we no longer need to worry
about whether different clones of a MULTIEXPR_SUBLINK share output
Params; their usage of that data structure won't overlap.

Per bug #17800 from Alexander Lakhin.  Back-patch to all supported
branches.  In v13 and earlier, we can revert 3f7323cbb and follow-on
fixes; however, I chose to keep the SubPlan.subLinkId field added
in ccbb54c72.  We don't need that anymore in the core code, but it's
cheap enough to fill, and removing a plan node field in a minor
release seems like it'd be asking for trouble.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org
2023-02-25 14:44:14 -05:00
Dean Rasheed a7d71c41db Fix mishandling of OLD/NEW references in subqueries in rule actions.
If a rule action contains a subquery that refers to columns from OLD
or NEW, then those are really lateral references, and the planner will
complain if it sees such things in a subquery that isn't marked as
lateral. However, at rule-definition time, the user isn't required to
mark the subquery with LATERAL, and so it can fail when the rule is
used.

Fix this by marking such subqueries as lateral in the rewriter, at the
point where they're used.

Dean Rasheed and Tom Lane, per report from Alexander Lakhin.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/5e09da43-aaba-7ea7-0a51-a2eb981b058b%40gmail.com
2023-02-25 14:41:12 +00:00
Daniel Gustafsson d959523257 Disallow NULLS NOT DISTINCT indexes for primary keys
A unique index which is created with non-distinct NULLS cannot be
used for backing a primary key constraint.  Make sure to disallow
such table alterations and teach pg_dump to drop the non-distinct
NULLS clause on indexes where this has been set.

Bug: 17720
Reported-by: Reiner Peterke <zedaardv@drizzle.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/17720-dab8ee0fa85d316d@postgresql.org
2023-02-24 11:09:50 +01:00
Tom Lane 739f1d6218 Fix mis-handling of outer join quals generated by EquivalenceClasses.
It's possible, in admittedly-rather-contrived cases, for an eclass
to generate a derived "join" qual that constrains the post-outer-join
value(s) of some RHS variable(s) without mentioning the LHS at all.
While the mechanisms were set up to work for this, we fell foul of
the "get_common_eclass_indexes" filter installed by commit 3373c7155:
it could decide that such an eclass wasn't relevant to the join, so
that the required qual clause wouldn't get emitted there or anywhere
else.

To fix, apply get_common_eclass_indexes only at inner joins, where
its rule is still valid.  At an outer join, fall back to examining all
eclasses that mention either input (or the OJ relid, though it should
be impossible for an eclass to mention that without mentioning either
input).  Perhaps we can improve on that later, but the cost/benefit of
adding more complexity to skip some irrelevant eclasses is dubious.

To allow cheaply distinguishing outer from inner joins, pass the
ojrelid to generate_join_implied_equalities as a separate argument.
This also allows cleaning up some sloppiness that had crept into
the definition of its join_relids argument, and it allows accurate
calculation of nominal_join_relids for a child outer join.  (The
latter oversight seems not to have been a live bug, but it certainly
could have caused problems in future.)

Also fix what might be a live bug in check_index_predicates: it was
being sloppy about what it passed to generate_join_implied_equalities.

Per report from Richard Guo.

Discussion: https://postgr.es/m/CAMbWs4-DsTBfOvXuw64GdFss2=M5cwtEhY=0DCS7t2gT7P6hSA@mail.gmail.com
2023-02-23 11:05:58 -05:00
Dean Rasheed 75c737636b Fix multi-row DEFAULT handling for INSERT ... SELECT rules.
Given an updatable view with a DO ALSO INSERT ... SELECT rule, a
multi-row INSERT ... VALUES query on the view fails if the VALUES list
contains any DEFAULTs that are not replaced by view defaults. This
manifests as an "unrecognized node type" error, or an Assert failure,
in an assert-enabled build.

The reason is that when RewriteQuery() attempts to replace the
remaining DEFAULT items with NULLs in any product queries, using
rewriteValuesRTEToNulls(), it assumes that the VALUES RTE is located
at the same rangetable index in each product query. However, if the
product query is an INSERT ... SELECT, then the VALUES RTE is actually
in the SELECT part of that query (at the same index), rather than the
top-level product query itself.

Fix, by descending to the SELECT in such cases. Note that we can't
simply use getInsertSelectQuery() for this, since that expects to be
given a raw rule action with OLD and NEW placeholder entries, so we
duplicate its logic instead.

While at it, beef up the checks in getInsertSelectQuery() by checking
that the jointree->fromlist node is indeed a RangeTblRef, and that the
RTE it points to has rtekind == RTE_SUBQUERY.

Per bug #17803, from Alexander Lakhin. Back-patch to all supported
branches.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/17803-53c63ed4ecb4eac6%40postgresql.org
2023-02-23 10:53:01 +00:00
Daniel Gustafsson 337903a16f Consider a failed process as a failed test in pg_regress
Commit 55de145d1c added reporting of child process failures, but the
test suite is still allowed to pass even if the process failed. Since
regress tests are higher level tests, a false positive is more likely
in this case so report failed test processes as failed tests.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/82C46B5E-1821-4039-82C2-56BCA5992989@yesql.se
Discussion: https://postgr.es/m/20221122235636.4frx7hjterq6bmls@awork3.anarazel.de
2023-02-23 09:25:47 +01:00
Tom Lane a75ff55c83 Fix some issues with wrong placement of pseudo-constant quals.
initsplan.c figured that it could push Var-free qual clauses to
the top of the current JoinDomain, which is okay in the abstract.
But if the current domain is inside some outer join, and we later
commute an inside-the-domain outer join with one outside it,
we end up placing the pushed-up qual clause incorrectly.

In distribute_qual_to_rels, avoid this by using the syntactic scope
of the qual clause; with the exception that if we're in the top-level
join domain we can still use the full query relid set, ensuring the
resulting gating Result node goes to the top of the plan.  (This is
approximately as smart as the pre-v16 code was.  Perhaps we can do
better later, but it's not clear that such cases are worth a lot of
sweat.)

In process_implied_equality, we don't have a clear notion of syntactic
scope, but we do have the results of SpecialJoinInfo construction.
Thumb through those and remove any lower outer joins that might get
commuted to above the join domain.  Again, we can make an exception
for the top-level join domain.  It'd be possible to work harder here
(for example, by keeping outer joins that aren't shown as potentially
commutable), but I'm going to stop here for the moment.  This issue
has convinced me that the current representation of join domains
probably needs further refinement, so I'm disinclined to write
inessential dependent logic just yet.

In passing, tighten the qualscope passed to process_implied_equality
by generate_base_implied_equalities_no_const; there's no need for
it to be larger than the rel we are currently considering.

Tom Lane and Richard Guo, per report from Tender Wang.

Discussion: https://postgr.es/m/CAHewXNk9eJ35ru5xATWioTV4+xZPHptjy9etdcNPjUfY9RQ+uQ@mail.gmail.com
2023-02-22 12:39:11 -05:00
Dean Rasheed 0d3b49d4af Fix Assert failure for MERGE into a partitioned table with RLS.
In ExecInitPartitionInfo(), the Assert when building the WITH CHECK
OPTION list for the new partition assumed that the command would be an
INSERT or UPDATE, but it can also be a MERGE. This can be triggered by
a MERGE into a partitioned table with RLS checks to enforce.

Fix, and back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCWWFtQmW67F3XTyMU5Am10Oxa_b8oe0x%2BNu5Mo%2BCdRErg%40mail.gmail.com
2023-02-22 10:51:34 +00:00
Dean Rasheed 80a48e0f21 Fix MERGE command tag for cross-partition updates.
This ensures that the row count in the command tag for a MERGE is
correctly computed. Previously, if MERGE updated a partitioned table,
the row count would be incorrect if any row was moved to a different
partition, since such updates were counted twice.

Back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCWRMG7XX2QEsVL1LswmNo2d_YG8tKTLkpD3=Lp644S7rg@mail.gmail.com
2023-02-22 09:39:09 +00:00
Peter Eisentraut 2ddab010c2 Implement ANY_VALUE aggregate
SQL:2023 defines an ANY_VALUE aggregate whose purpose is to emit an
implementation-dependent (i.e. non-deterministic) value from the
aggregated rows.

Author: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5cff866c-10a8-d2df-32cb-e9072e6b04a2@postgresfriends.org
2023-02-22 09:33:07 +01:00
Tom Lane 8028e294b4 Detect overflow in timestamp[tz] subtraction.
It's possible to overflow the int64 microseconds field of the
output interval when subtracting two timestamps.  Detect that
instead of silently returning a bogus result.

Nick Babadzhanian

Discussion: https://postgr.es/m/CABw73Uq2oJ3E+kYvvDuY04EkhhkChim2e-PaghBDjOmgUAMWGw@mail.gmail.com
2023-02-20 17:26:25 -05:00
Tom Lane f0d0394e84 Fix parsing of ISO-8601 interval fields with exponential notation.
Historically we've accepted interval input like 'P.1e10D'.  This
is probably an accident of having used strtod() to do the parsing,
rather than something anyone intended, but it's been that way for
a long time.  Commit e39f99046 broke this by trying to parse the
integer and fractional parts separately, without accounting for
the possibility of an exponent.  In principle that coding allowed
for precise conversions of field values wider than 15 decimal
digits, but that does not seem like a goal worth sweating bullets
for.  So, rather than trying to manage an exponent on top of the
existing complexity, let's just revert to the previous coding that
used strtod() by itself.  We can still improve on the old code to
the extent of allowing the value to range up to 1.0e15 rather than
only INT_MAX.  (Allowing more than that risks creating problems
due to precision loss: the converted fractional part might have
absolute value more than 1.  Perhaps that could be dealt with in
some way, but it really does not seem worth additional effort.)

Per bug #17795 from Alexander Lakhin.  Back-patch to v15 where
the faulty code came in.

Discussion: https://postgr.es/m/17795-748d6db3ed95d313@postgresql.org
2023-02-20 16:55:59 -05:00
Tom Lane f6db76c555 Prevent join removal from removing the query's result relation.
This was not something that required consideration before MERGE
was invented; but MERGE builds a join tree that left-joins to the
result relation, meaning that remove_useless_joins will consider
removing it.  That should generally be stopped by the query's use
of output variables from the result relation.  However, if the
result relation is inherited (e.g. a partitioned table) then
we don't add any row identity variables to the query until
expand_inherited_rtentry, which happens after join removal.

This was exposed as of commit 3c569049b, which made it possible
to deduce that a partitioned table could contain at most one row
matching a join key, enabling removal of the not-yet-expanded
result relation.  Ooops.

To fix, let's just teach join_is_removable that the query result
rel is never removable.  It's a cheap enough test in any case,
and it'll save some cycles that we'd otherwise expend in proving
that it's not removable, even in the cases we got right.

Back-patch to v15 where MERGE was added.  Although I think the
case cannot be reached in v15, this seems like cheap insurance.

Per investigation of a report from Alexander Lakhin.

Discussion: https://postgr.es/m/36bee393-b351-16ac-93b2-d46d83637e45@gmail.com
2023-02-20 15:18:32 -05:00
Tom Lane 393430f575 Print the correct aliases for DML target tables in ruleutils.
ruleutils.c blindly printed the user-given alias (or nothing if there
hadn't been one) for the target table of INSERT/UPDATE/DELETE queries.
That works a large percentage of the time, but not always: for queries
appearing in WITH, it's possible that we chose a different alias to
avoid conflict with outer-scope names.  Since the chosen alias would
be used in any Var references to the target table, this'd lead to an
inconsistent printout with consequences such as dump/restore failures.

The correct logic for printing (or not) a relation alias was embedded
in get_from_clause_item.  Factor it out to a separate function so that
we don't need a jointree node to use it.  (Only a limited part of that
function can be reached from these new call sites, but this seems like
the cleanest non-duplicative factorization.)

In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql.

Initial report from Jonathan Katz; thanks to Vignesh C for analysis.
This has been broken for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
2023-02-17 16:40:34 -05:00
Tom Lane a0fa18cc0d Fix check for child column generation status matching parent.
In commit 8bf6ec3ba, I mistakenly supposed that MergeAttributes'
loop over saved_schema was reprocessing column definitions that
had already been checked earlier: there is a variant syntax for
creating a child partition in which that's not true.  So we need
to duplicate the full check appearing further up.

(Actually, I believe that the "if (restdef->identity)" part is
not reachable, because we reject identity on partitions earlier.
But it seems wise to keep the check, in case that's ever relaxed,
and to keep this code in sync with the other instance.)

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/4a8200ca-8378-653e-38ed-b2e1f1611aa6@gmail.com
2023-02-16 18:51:55 -05:00
David Rowley 5352ca22e0 Rename force_parallel_mode to debug_parallel_query
force_parallel_mode is meant to be used to allow us to exercise the
parallel query infrastructure to ensure that it's working as we expect.
It seems some users think this GUC is for forcing the query planner into
picking a parallel plan regardless of the costs.  A quick look at the
documentation would have made them realize that they were wrong, but the
GUC is likely too conveniently named which, evidently, seems to often
result in users expecting that it forces the planner into usefully
parallelizing queries.

Here we rename the GUC to something which casual users are less likely to
mistakenly think is what they need to make their query run more quickly.

For now, the old name can still be used.  We'll revisit if the old name
mapping can be removed once the buildfarm configs are all updated.

Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvrsOi92_uA7PEaHZMH-S4Xv+MGhQWA+GrP8b1kjpS1HjQ@mail.gmail.com
2023-02-15 21:21:59 +13:00
Tom Lane c7468c73f7 Fix buggy recursion in flatten_rtes_walker().
Must save-and-restore the context we are modifying.
Oversight in commit a61b1f748.

Tender Wang

Discussion: https://postgr.es/m/CAHewXNnnNySD_YcKNuFpQDV2gxWA7_YLWqHmYVcyoOYxn8kY2A@mail.gmail.com
Discussion: https://postgr.es/m/20230212233711.GA1316@telsasoft.com
2023-02-13 12:19:58 -05:00
Tom Lane f50f029c49 Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.
Late in the development of commit 2489d76c4, I (tgl) incorrectly
concluded that the new function have_unsafe_outer_join_ref couldn't
ever reach its inner loop.  That should be the case if the inner
rel's parameterization is based on just one Var, but it could be
based on Vars from several relations, and then not only is the
inner loop reachable but it's wrongly coded.

Despite those errors, it still appears that the whole thing is
redundant given previous join_is_legal checks, so let's arrange
to only run it in assert-enabled builds.

Diagnosis and patch by Richard Guo, per fuzz testing by Justin Pryzby.

Discussion: https://postgr.es/m/20230212235823.GW1653@telsasoft.com
2023-02-13 11:45:32 -05:00
Alvaro Herrera 9a31256231
Fix object identity string for transforms
In commit ad89a5d115, we added an unhelpful 'ON' that doesn't match
the input syntax.  This was discovered while adding code to support for
DDL in logical replication.

No backpatch because of the change of behavior, however improbable it
may be that somebody is depending on this.

Author: Zheng Li <zhengli10@gmail.com>
Discussion: https://postgr.es/m/CAAD30UKg8rXeGM8Oy_MAmxKBL_K5DiHXdeNF=hUefcu1C_6VfQ@mail.gmail.com
2023-02-13 14:03:09 +01:00
David Rowley 7da51590ed Fix incorrect presorted DISTINCT aggregate if condition
Here we fix a faulty "if" condition which failed to correctly handle two
or more consecutive NULL transition values when checking if the new value
is DISTINCT from the old value for presorted aggregates.  Given a suitably
non-strict aggregate transition function, a byref aggregate could cause a
crash due to calling the type's equality function and passing along a
(Datum) 0 value to test for equality, the equality function would then try
to dereference that 0 Datum and segfault.  For byval types, there'd have
been no crash and the equality function would have seen that the two 0
Datums matched, which (only by chance) meant the calling code would have
worked correctly.

Here we ensure that we only call the equality function when neither of
the input values are NULL.

This code is all new as of 1349d2790, so no backpatch needed.

Reported-by: Fujii Masao
Discussion: https://postgr.es/m/860c6d6f-a3c5-3ae9-9da2-827177bede06@oss.nttdata.com
2023-02-13 20:38:37 +13:00
Tom Lane 5e80d35154 Avoid dereferencing an undefined pointer in DecodeInterval().
Commit e39f99046 moved some code up closer to the start of
DecodeInterval(), without noticing that it had been implicitly
relying on previous checks to reject the case of empty input.
Given empty input, we'd now dereference a pointer that hadn't been
set, possibly leading to a core dump.  (But if we fail to provoke
a SIGSEGV, nothing bad happens, and the expected syntax error is
thrown a bit later.)

Per bug #17788 from Alexander Lakhin.  Back-patch to v15 where
the fault was introduced.

Discussion: https://postgr.es/m/17788-dabac9f98f7eafd5@postgresql.org
2023-02-12 12:50:55 -05:00
Andres Freund 10a082bf72 Add tests for pg_stat_io
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-11 10:05:04 -08:00
Andres Freund 9c83bbcf7e Create regress_tblspc in test_setup
An upcoming test needs to use a tablespace as part of its test. Historically,
we wanted tablespace creation be done in a dedicated file, so it's easy to
disable when testing replication. But that is not necessary anymore, due to
allow_in_place_tablespaces.

Create regress_tblspace tablespace in test_setup. Move the tablespace test to
the end of the parallel schedule, so other tests can use it.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-11 10:03:04 -08:00
Andres Freund a9c70b46db Add pg_stat_io view, providing more detailed IO statistics
Builds on 28e626bde0 and f30d62c2fc. See the former for motivation.

Rows of the view show IO operations for a particular backend type, IO target
object, IO context combination (e.g. a client backend's operations on
permanent relations in shared buffers) and each column in the view is the
total number of IO Operations done (e.g. writes). So a cell in the view would
be, for example, the number of blocks of relation data written from shared
buffers by client backends since the last stats reset.

In anticipation of tracking WAL IO and non-block-oriented IO (such as
temporary file IO), the "op_bytes" column specifies the unit of the "reads",
"writes", and "extends" columns for a given row.

Rows for combinations of IO operation, backend type, target object and context
that never occur, are ommitted entirely. For example, checkpointer will never
operate on temporary relations.

Similarly, if an IO operation never occurs for such a combination, the IO
operation's cell will be null, to distinguish from 0 observed IO
operations. For example, bgwriter should not perform reads.

Note that some of the cells in the view are redundant with fields in
pg_stat_bgwriter (e.g. buffers_backend). For now, these have been kept for
backwards compatibility.

Bumps catversion.

Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Samay Sharma <smilingsamay@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-11 09:52:15 -08:00
Tom Lane 44e56baa80 Fix join removal logic to clean up sub-RestrictInfos of OR clauses.
analyzejoins.c took care to clean out removed relids from the
clause_relids and required_relids of RestrictInfos associated with
the doomed rel ... but it paid no attention to the fact that if such a
RestrictInfo contains an OR clause, there will be sub-RestrictInfos
containing similar fields.

I'm more than a bit surprised that this oversight hasn't caused
visible problems before.  In any case, it's certainly broken now,
so add logic to clean out the sub-RestrictInfos recursively.
We might need to back-patch this someday.

Per bug #17786 from Robins Tharakan.

Discussion: https://postgr.es/m/17786-f1ea7fbdab97daec@postgresql.org
2023-02-10 14:52:36 -05:00
Tom Lane acc5821e4d Further fixes in qual nullingrel adjustment for outer join commutation.
One of the add_nulling_relids calls in deconstruct_distribute_oj_quals
added an OJ relid to too few Vars, while the other added it to too
many.  We should consider the syntactic structure not
min_left/righthand while deciding which Vars to decorate, and when
considering pushing up a lower outer join pursuant to transforming the
second form of OJ identity 3 to the first form, we only want to
decorate Vars coming from its LHS.

In a related bug, I realized that make_outerjoininfo was failing to
check a very basic property that's needed to apply OJ identity 3:
the syntactically-upper outer join clause can't refer to the lower
join's LHS.  This didn't break the join order restriction logic,
but it led to setting bogus commute_xxx bits, possibly resulting
in bogus nullingrel markings in modified quals.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/CAMbWs497CmBruMx1SOjepWEz+T5NWa4scqbdE9v7ZzSXqH_gQw@mail.gmail.com
Discussion: https://postgr.es/m/CAEP4nAx9C5gXNBfEA0JBfz7B+5f1Bawt-RWQWyhev-wdps8BZA@mail.gmail.com
2023-02-10 13:31:00 -05:00
Michael Paquier ef7002dbe0 Fix various typos in code and tests
Most of these are recent, and the documentation portions are new as of
v16 so there is no need for a backpatch.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20230208155644.GM1653@telsasoft.com
2023-02-09 14:43:53 +09:00
Tom Lane 798c017634 remove_rel_from_query() must clean up PlaceHolderVar.phrels fields.
While we got away with this sloppiness before, it's not okay now
that fee7b77b9 caused build_joinrel_tlist() to make use of phrels.
Per report from Robins Tharakan.

Richard Guo (some cosmetic tweaks by me)

Discussion: https://postgr.es/m/CAMbWs4_ngw9sKxpTE8hqk=-ooVX_CQP3DarA4HzkRMz_JKpTrA@mail.gmail.com
2023-02-08 14:08:46 -05:00
Michael Paquier b7e84c65d5 Remove SQL regression tests for GUCs related to NO_SHOW_ALL
No GUCs that use NO_SHOW_ALL are reported in pg_show_all_settings(),
hence trying to check combinations of flags related to it is pointless.

These queries have been introduced by d10e41d, so backpatch down to 15
to keep all the branches consistent.  Equivalent checks based on
NO_SHOW_ALL could be added in check_GUC_init() when a GUC is initially
loaded, but this can be done only on HEAD.

Author: Nitin Jadhav
Discussion: https://postgr.es/m/CAMm1aWaYe0muu3ABo7iSAgK+OWDS9yNe8GGRYnCyeEpScYKa+g@mail.gmail.com
Backpatch-through: 15
2023-02-08 16:56:21 +09:00
Tom Lane fee7b77b90 Rethink nullingrel marking rules in build_joinrel_tlist().
The logic for when to add the current outer join's own relid
to the nullingrels sets of output Vars and PHVs was overly
complicated and underly correct.  Not sure why I didn't think
of this before, but since what we want is marking per the
syntactic structure, we can just consult our records about
the syntactic structure, ie syn_righthand/syn_lefthand.

Also, tighten the rule about when to add the commute_above_r
bits, in hopes of eliminating some squishy reasoning.  I do not
know of a reason to think that that's broken as-is, but this way
seems better.

Per bug #17781 from Robins Tharakan.

Discussion: https://postgr.es/m/17781-c0405c8b3cd5e072@postgresql.org
2023-02-07 18:26:16 -05:00
Tom Lane 2cbbffff05 Remove leftover code in deconstruct_distribute_oj_quals().
The initial "put back OJ relids" adjustment of ojscope was
incorrect and unnecessary; it seems to be a leftover from
when I (tgl) was trying to get this function to work at all.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4-L2C47ZGZPabBAi5oDZsKmsbvhYcGCy5o=gCjsaG_ZQA@mail.gmail.com
2023-02-07 11:56:43 -05:00
Tom Lane cad5692051 Fix up join removal's interaction with PlaceHolderVars.
The portion of join_is_removable() that checks PlaceHolderVars
can be made a little more accurate and intelligible than it was.
The key point is that we can allow join removal even if a PHV
mentions the target rel in ph_eval_at, if that mention was only
added as a consequence of forcing the PHV up to a join level
that's at/above the outer join we're trying to get rid of.
We can check that by testing for the OJ's relid appearing in
ph_eval_at, indicating that it's supposed to be evaluated after
the outer join, plus the existing test that the contained
expression doesn't actually mention the target rel.

While here, add an explicit check that there'll be something left
in ph_eval_at after we remove the target rel and OJ relid.  There
is an Assert later on about that, and I'm not too sure that the
case could happen for a PHV satisfying the other constraints,
but let's just check.  (There was previously a bms_is_subset test
that meant to cover this risk, but it's broken now because it
doesn't account for the fact that we'll also remove the OJ relid.)

The real reason for revisiting this code though is that the
Assert I left behind in 8538519db turns out to be easily
reachable, because if a PHV of this sort appears in an upper-level
qual clause then that clause's clause_relids will include the
PHV's ph_eval_at relids.  This is a mirage though: we have or soon
will remove these relids from the PHV's ph_eval_at, and therefore
they no longer belong in qual clauses' clause_relids either.
Remove that Assert in join_is_removable, and replace the similar
one in remove_rel_from_query with code to remove the deleted relids
from clause_relids.

Per bug #17773 from Robins Tharakan.

Discussion: https://postgr.es/m/17773-a592e6cedbc7bac5@postgresql.org
2023-02-06 15:45:03 -05:00
Tom Lane b2d0e13a0a Fix over-optimistic updating of info about commutable outer joins.
make_outerjoininfo was set up to update SpecialJoinInfo's
commute_below, commute_above_l, commute_above_r fields as soon as
it found a pair of outer joins that look like they can commute.
However, this decision could be negated later in the same loop due
to finding an intermediate outer join that prevents commutation.
That left us with commute_xxx fields that were contradictory to the
join order restrictions expressed in min_lefthand/min_righthand.
The latter fields would keep us from actually choosing a bad join
order; but the inconsistent commute_xxx fields could bollix details
such as the varnullingrels values created for intermediate join
relation targetlists, ending in an assertion failure in setrefs.c.

To fix, wait till the end of make_outerjoininfo where we have
accurate values for min_lefthand/min_righthand, and then insert
only relids not present in those sets into the commute_xxx fields.

Per SQLSmith testing by Robins Tharakan.  Note that while Robins
bisected the failure to commit b448f1c8d, it's really the fault of
2489d76c4.  The outerjoin_delayed logic removed in the later commit
was keeping us from deciding that troublesome join pairs commute,
at least in the specific example seen here.

Discussion: https://postgr.es/m/CAEP4nAyAORgE8K_RHSmvWbE9UaChhjbEL1RrDU3neePwwRUB=A@mail.gmail.com
2023-02-05 14:25:10 -05:00
Tom Lane 9f452feeeb Fix thinko in qual distribution.
deconstruct_distribute tweaks the outer join scope (ojscope)
it passes to distribute_qual_to_rels when considering an outer
join qual that's above potentially-commutable outer joins.
However, if the current join is *not* potentially commutable,
we shouldn't do that.  The argument that distribute_qual_to_rels
will not do something wrong with the bogus ojscope falls flat
if we don't pass it non-null postponed_oj_qual_list.  Moreover,
there's no need to play games in this case since we aren't going
to commute anything.

Per SQLSmith testing by Robins Tharakan.

Discussion: https://postgr.es/m/CAEP4nAw74k4b-=93gmfCNX3MOY3y4uPxqbk_MnCVEpdsqHJVsg@mail.gmail.com
2023-02-04 17:40:35 -05:00
Tom Lane 8538519db1 Fix thinko in outer-join removal.
If we have a RestrictInfo that mentions both the removal-candidate
relation and the outer join's relid, then that is a pushed-down
condition not a join condition, so it should be grounds for deciding
that we can't remove the outer join.  In commit 2489d76c4, I'd blindly
included the OJ's relid into "joinrelids" as per the new standard
convention, but the checks of attr_needed and ph_needed should only
allow the join's input rels to be mentioned.

Having done that, the check for references in pushed-down quals
a few lines further down should be redundant.  I left it in place
as an Assert, though.

While researching this I happened across a couple of comments that
worried about the effects of update_placeholder_eval_levels.
That's gone as of b448f1c8d, so we can remove some worry.

Per bug #17769 from Robins Tharakan.  The submitted test case
triggers this more or less accidentally because we flatten out
a LATERAL sub-select after we've done join strength reduction;
if we did that in the other order, this problem would be masked
because the outer join would get simplified to an inner join.
To ensure that the committed test case will continue to test
what it means to even if we make that happen someday, use a
test clause involving COALESCE(), which will prevent us from
using it to do join strength reduction.

Patch by me, but thanks to Richard Guo for initial investigation.

Discussion: https://postgr.es/m/17769-e4f7a5c9d84a80a7@postgresql.org
2023-02-04 15:19:54 -05:00
Tom Lane 5840c20272 Rethink treatment of "postponed" quals in deconstruct_jointree().
After pulling up LATERAL subqueries, we may have qual clauses that
refer to relations outside their syntactic scope.  Before doing any
such pullup, prepjointree.c checks to make sure that it wouldn't
create a semantically-invalid situation; but we leave it to
deconstruct_jointree() to actually move these quals up the join
tree to a place where they can be evaluated.  In commit 2489d76c4,
I (tgl) refactored deconstruct_jointree() in a way that caused
assertion failures while moving such quals, because the new logic
failed to distinguish "this jointree node is a parent of the source
one" from "this jointree node is processed after the source
one in depth-first order".

Fix this, and at the same time reduce the overhead a bit, by
getting rid of the common PostponedQual list and instead making each
JoinTreeItem contain a list of quals that needed to be postponed to
its level.  We can help distribute_qual_to_rels find the appropriate
JoinTreeItem efficiently by adding parent-item links to the
JoinTreeItem data structure.  This ends up being the same number
of relid subset checks as the original (pre-bug) logic, but less
list manipulation is required during multi-level postponements.

Richard Guo and Tom Lane, per bug #17768 from Robins Tharakan.

Discussion: https://postgr.es/m/17768-5ac8730ece54478f@postgresql.org
2023-02-04 12:45:53 -05:00
Dean Rasheed faff8f8e47 Allow underscores in integer and numeric constants.
This allows underscores to be used in integer and numeric literals,
and their corresponding type input functions, for visual grouping.
For example:

    1_500_000_000
    3.14159_26535_89793
    0xffff_ffff
    0b_1001_0001

A single underscore is allowed between any 2 digits, or immediately
after the base prefix indicator of non-decimal integers, per SQL:202x
draft.

Peter Eisentraut and Dean Rasheed

Discussion: https://postgr.es/m/84aae844-dc55-a4be-86d9-4f0fa405cc97%40enterprisedb.com
2023-02-04 09:48:51 +00:00
Tom Lane eae0e20def Remove over-optimistic Assert.
In commit 2489d76c4, I'd thought it'd be safe to assert that a
PlaceHolderVar appearing in a scan-level expression has empty
nullingrels.  However this is not so, as when we determine that a
join relation is certainly empty we'll put its targetlist into a
Result-with-constant-false-qual node, and nothing is done to adjust
the nullingrels of the Vars or PHVs therein.  (Arguably, a Result
used in this way isn't really a scan-level node, but it certainly
isn't an upper node either ...)

It's not clear this is worth any close analysis, so let's just
take out the faulty Assert.

Per report from Robins Tharakan.  I added a test case based on
his example, just in case somebody tries to tighten this up.

Discussion: https://postgr.es/m/CAEP4nAz7Enq3+DEthGG7j27DpuwSRZnW0Nh6jtNh75yErQ_nbA@mail.gmail.com
2023-01-31 11:57:47 -05:00
Tom Lane b448f1c8d8 Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it.  One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation.  The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.

Remove the outerjoin_delayed mechanism.  We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars.  Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.

Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.

Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529.  If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause.  But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it.  That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end.  This improves the generated plans for
such cases by not having to test a redundant join clause.  We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.

Patch by me; thanks to Richard Guo for review.

Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 13:44:36 -05:00
Tom Lane 2489d76c49 Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees.  This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.

To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle.  PlaceHolderVars
receive similar decoration for the same reason.

In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos.  This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.

This change affects FDWs that want to plan foreign joins.  They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.

Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup.  (The README additions
and comments mention some stuff that will appear in the follow-up.)

Patch by me; thanks to Richard Guo for review.

Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 13:16:20 -05:00