Commit Graph

23985 Commits

Author SHA1 Message Date
Peter Eisentraut 8dd43894b1 Fix XLogPageRead() comment
7fcbf6a and 2ff6555 changed the function signature of XLogPageRead()
but did not update the comment.

XLogReaderRoutine contains up to date information about the API, so no
need to repeat all that at XLogPageRead(), but fix the mentions of the
no longer existing function arguments.
2023-01-23 21:46:30 +01:00
Dean Rasheed 6dfacbf72b Add non-decimal integer support to type numeric.
This enhances the numeric type input function, adding support for
hexadecimal, octal, and binary integers of any size, up to the limits
of the numeric type.

Since 6fcda9aba8, such non-decimal integers have been accepted by the
parser as integer literals and passed through to numeric_in(). This
commit gives numeric_in() the ability to handle them.

While at it, simplify the handling of NaN and infinities, reducing the
number of calls to pg_strncasecmp(), and arrange for pg_strncasecmp()
to not be called at all for regular numbers. This gives a significant
performance improvement for decimal inputs, more than offsetting the
small performance hit of checking for non-decimal input.

Discussion: https://postgr.es/m/CAEZATCV8XShnmT9HZy25C%2Bo78CVOFmUN5EM9FRAZ5xvYTggPMg%40mail.gmail.com
2023-01-23 19:21:22 +00:00
Dean Rasheed 0aa38db56b Optimise numeric division for 3 and 4 base-NBASE digit divisors.
On platforms with 128-bit integer support, introduce a new function
div_var_int64(), along the same lines as div_var_int() added in
d1b307eef2 for divisors with 1 or 2 base-NBASE digits, and use it to
speed up div_var() and div_var_fast() in a similar way when the
divisor has 3 or 4 base-NBASE digits.

This gives significant performance gains for divisors with 9-16
decimal digits.

Joel Jacobson.

Discussion:
  https://postgr.es/m/b7a5893d-af18-4c0b-8918-96de5f1bbf39%40app.fastmail.com
  https://postgr.es/m/CAEZATCXGm%3DDyTq%3DFrcOqC0gPMVveKUYTaD5KRRoajrUTiWxVMw%40mail.gmail.com
2023-01-23 11:58:28 +00:00
David Rowley 009dbdea02 Run pgindent on heapam.c
An upcoming patch by Melanie Plageman does some refactoring work in this
area.  Run pgindent on that file now before making any changes so that
it's easier to maintain/evolve each of the individual patches doing the
refactor work.  Additionally, add a few new required typedefs to the list
to make it easier to do future pgindent runs on this file during the
refactor work.

Discussion: https://postgr.es/m/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
2023-01-23 23:08:38 +13:00
Heikki Linnakangas 236f1ea84c Fix and clarify function comment on LogicalTapeSetCreate.
Commit c4649cce39 removed the "shared" and "ntapes" arguments, but the
comment still talked about "shared". It also talked about "a shared
file handle", which was technically correct because even before commit
c4649cce39, the "shared file handle" referred to the "fileset"
argument, not "shared". But it was very confusing. Improve the
comment.

Also add a comment on what the "preallocate" argument does.

Backpatch to v15, just to make backpatching other patches easier in
the future.

Discussion: https://www.postgresql.org/message-id/af989685-91d5-aad4-8f60-1d066b5ec309@enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-01-23 11:56:43 +02:00
David Rowley 16fd03e956 Allow parallel aggregate on string_agg and array_agg
This adds combine, serial and deserial functions for the array_agg() and
string_agg() aggregate functions, thus allowing these aggregates to
partake in partial aggregations.  This allows both parallel aggregation to
take place when these aggregates are present and also allows additional
partition-wise aggregation plan shapes to include plans that require
additional aggregation once the partially aggregated results from the
partitions have been combined.

Author: David Rowley
Reviewed-by: Andres Freund, Tomas Vondra, Stephen Frost, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A@mail.gmail.com
2023-01-23 17:35:01 +13:00
Tom Lane 5a3a95385b Track logrep apply workers' last start times to avoid useless waits.
Enforce wal_retrieve_retry_interval on a per-subscription basis,
rather than globally, and arrange to skip that delay in case of
an intentional worker exit.  This probably makes little difference
in the field, where apply workers wouldn't be restarted often;
but it has a significant impact on the runtime of our logical
replication regression tests (even though those tests use
artificially-small wal_retrieve_retry_interval settings already).

Nathan Bossart, with mostly-cosmetic editorialization by me

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-22 14:08:46 -05:00
Tom Lane c9f7f92648 Allow REPLICA IDENTITY to be set on an index that's not (yet) valid.
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail.  We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.

Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes.  I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid.  Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.

The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way.  I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.

Per bug #17756 from Sergey Belyashov.

Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
2023-01-21 13:10:29 -05:00
Noah Misch e52daaabf8 Reject CancelRequestPacket having unexpected length.
When the length was too short, the server read outside the allocation.
That yielded the same log noise as sending the correct length with
(backendPID,cancelAuthCode) matching nothing.  Change to a message about
the unexpected length.  Given the attacker's lack of control over the
memory layout and the general lack of diversity in memory layouts at the
code in question, we doubt a would-be attacker could cause a segfault.
Hence, while the report arrived via security@postgresql.org, this is not
a vulnerability.  Back-patch to v11 (all supported versions).

Andrey Borodin, reviewed by Tom Lane.  Reported by Andrey Borodin.
2023-01-21 06:08:00 -08:00
Andres Freund 25b2aba0c3 Zero initialize uses of instr_time about to trigger compiler warnings
These are all not necessary from a correctness POV. However, in the near
future instr_time will be simplified to an int64, at which point gcc would
otherwise start to warn about the changed places.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20230116023639.rn36vf6ajqmfciua@awork3.anarazel.de
2023-01-20 21:16:47 -08:00
Michael Paquier 8eba3e3f02 Move queryjumble.c code to src/backend/nodes/
This will ease a follow-up move that will generate automatically this
code.  The C file is renamed, for consistency with the node-related
files whose code are generated by gen_node_support.pl:
- queryjumble.c -> queryjumblefuncs.c
- utils/queryjumble.h -> nodes/queryjumble.h

Per a suggestion from Peter Eisentraut.

Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/Y5BHOUhX3zTH/ig6@paquier.xyz
2023-01-21 11:48:37 +09:00
Robert Haas 6e2775e4d4 Add new GUC reserved_connections.
This provides a way to reserve connection slots for non-superusers.
The slots reserved via the new GUC are available only to users who
have the new predefined role pg_use_reserved_connections.
superuser_reserved_connections remains as a final reserve in case
reserved_connections has been exhausted.

Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
2023-01-20 15:39:13 -05:00
Robert Haas fe00fec1f5 Rename ReservedBackends variable to SuperuserReservedConnections.
This is in preparation for adding a new reserved_connections GUC,
but aligning the GUC name with the variable name is also a good
idea on general principle.

Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
2023-01-20 15:32:08 -05:00
Robert Haas 6c1d5ba486 Update docs and error message for superuser_reserved_connections.
Commit ea92368cd1 made max_wal_senders
a separate pool of backends from max_connections, but the documentation
and error message for superuser_reserved_connections weren't updated
at the time, and as a result are somewhat misleading. Update.

This is arguably a back-patchable bug fix, but because it seems quite
minor, no back-patch.

Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
2023-01-20 15:23:04 -05:00
Andres Freund d137cb52cb Remove SHM_QUEUE
Prior patches got rid of all the uses of SHM_QUEUE. ilist.h style lists are
more widely used and have an easier to use interface. As there are no users
left, remove SHM_QUEUE.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version)
Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de
Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de
2023-01-19 18:55:51 -08:00
Andres Freund 9600371764 Use dlists instead of SHM_QUEUE for predicate locking
Part of a series to remove SHM_QUEUE. ilist.h style lists are more widely used
and have an easier to use interface.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version)
Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de
Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de
2023-01-19 18:55:51 -08:00
Michael Paquier efb6f4a4f9 Support the same patterns for pg-user in pg_ident.conf as in pg_hba.conf
While pg_hba.conf has support for non-literal username matches, and
this commit extends the capabilities that are supported for the
PostgreSQL user listed in an ident entry part of pg_ident.conf, with
support for:
1. The "all" keyword, where all the requested users are allowed.
2. Membership checks using the + prefix.
3. Using a regex to match against multiple roles.

1. is a feature that has been requested by Jelte Fennema, 2. something
that has been mentioned independently by Andrew Dunstan, and 3. is
something I came up with while discussing how to extend the first one,
whose implementation is facilitated by 8fea868.

This allows matching certain system users against many different
postgres users with a single line in pg_ident.conf.  Without this, one
would need one line for each of the postgres users that a system user
can log in as, which can be cumbersome to maintain.

Tests are added to the TAP test of peer authentication to provide
coverage for all that.

Note that this introduces a set of backward-incompatible changes to be
able to detect the new patterns, for the following cases:
- A role named "all".
- A role prefixed with '+' characters, which is something that would not
have worked in HBA entries anyway.
- A role prefixed by a slash character, similarly to 8fea868.
Any of these can be still be handled by using quotes in the Postgres
role defined in an ident entry.

A huge advantage of this change is that the code applies the same checks
for the Postgres roles in HBA and ident entries, via the common routine
check_role().

**This compatibility change should be mentioned in the release notes.**

Author: Jelte Fennema
Discussion: https://postgr.es/m/DBBPR83MB0507FEC2E8965012990A80D0F7FC9@DBBPR83MB0507.EURPRD83.prod.outlook.com
2023-01-20 11:21:55 +09:00
David Rowley 9f1ca6ce65 Use appendStringInfoSpaces in more places
This adjusts a few places which were appending a string constant
containing spaces onto a StringInfo.  We have appendStringInfoSpaces for
that job, so let's use that instead.

For the change to jsonb.c's add_indent() function, appendStringInfoString
was being called inside a loop to append 4 spaces on each loop.  This
meant that enlargeStringInfo would get called once per loop.  Here it
should be much more efficient to get rid of the loop and just calculate
the number of spaces with "level * 4" and just append all the spaces in
one go.

Here we additionally adjust the appendStringInfoSpaces function so it
makes use of memset rather than a while loop to apply the required spaces
to the StringInfo.  One of the problems with the while loop was that it
was incrementing one variable and decrementing another variable once per
loop.  That's more work than what's required to get the job done.  We may
as well use memset for this rather than trying to optimize the existing
loop.  Some testing has shown memset is faster even for very small sizes.

Discussion: https://postgr.es/m/CAApHDvp_rKkvwudBKgBHniNRg67bzXVjyvVKfX0G2zS967K43A@mail.gmail.com
2023-01-20 13:07:24 +13:00
Tom Lane 1ca604c201 Improve comment about GetWALAvailability's WALAVAIL_REMOVED code.
Sirisha Chamarthi and Kyotaro Horiguchi

Discussion: https://postgr.es/m/CAKrAKeXt-=bgm=d+EDmcC9kWoikp8kbVb3LH0K3K+AGGsykpHQ@mail.gmail.com
2023-01-19 18:41:08 -05:00
Tom Lane 5a617d75d3 Fix ts_headline() to handle ORs and phrase queries more honestly.
This patch largely reverts what I did in commits c9b0c678d and
78e73e875.  The maximum cover length limit that I added in 78e73e875
(to band-aid over c9b0c678d's performance issues) creates too many
user-visible behavior discrepancies, as complained of for example in
bug #17691.  The real problem with hlCover() is not what I thought
at the time, but more that it seems to have been designed with only
AND tsquery semantics in mind.  It doesn't work quite right for OR,
and even less so for NOT or phrase queries.  However, we can improve
that situation by building a variant of TS_execute() that returns a
list of match locations.  We already get an ExecPhraseData struct
representing match locations for the primitive case of a simple match,
as well as one for a phrase match; we just need to add some logic to
combine these for AND and OR operators.  The result is a list of
ExecPhraseDatas, which hlCover can regard as having simple AND
semantics, so that its old algorithm works correctly.

There's still a lot not to like about ts_headline's behavior, but
I think the remaining issues have to do with the heuristics used
in mark_hl_words and mark_hl_fragments (which, likewise, were not
revisited when phrase search was added).  Improving those is a task
for another day.

Patch by me; thanks to Alvaro Herrera for review.

Discussion: https://postgr.es/m/840.1669405935@sss.pgh.pa.us
2023-01-19 16:21:44 -05:00
Tom Lane 44e9e34266 Log the correct ending timestamp in recovery_target_xid mode.
When ending recovery based on recovery_target_xid matching with
recovery_target_inclusive = off, we printed an incorrect timestamp
(always 2000-01-01) in the "recovery stopping before ... transaction"
log message.  This is a consequence of sloppy refactoring in
c945af80c: the code to fetch recordXtime out of the commit/abort
record used to be executed unconditionally, but it was changed
to get called only in the RECOVERY_TARGET_TIME case.  We need only
flip the order of operations to restore the intended behavior.

Per report from Torsten Förtsch.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/CAKkG4_kUevPqbmyOfLajx7opAQk6Cvwkvx0HRcFjSPfRPTXanA@mail.gmail.com
2023-01-19 12:23:20 -05:00
Alvaro Herrera 438e6b7240
Remove some dead code in selfuncs.c
RelOptInfo.userid is the same for all relations in a given inheritance
tree, so the code in examine_variable() and example_simple_variable()
that repeats the ACL checks on the root parent rel instead of a given
leaf child relations need not recompute userid too.

Author: Amit Langote <amitlangote09@gmail.com>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20221210201753.GA27893@telsasoft.com
2023-01-19 12:54:15 +01:00
Michael Paquier 7e8a80d1fe Add missing assign hook for GUC checkpoint_completion_target
This is wrong since 88e9823, that has switched the WAL sizing
configuration from checkpoint_segments to min_wal_size and
max_wal_size.  This missed the recalculation of the internal value of
the internal "CheckPointSegments", that works as a mapping of the old
GUC checkpoint_segments, on reload, for example, and it controls the
timing of checkpoints depending on the volume of WAL generated.

Most users tend to leave checkpoint_completion_target at 0.9 to smooth
the I/O workload, which is why I guess this has gone unnoticed for so
long, still it can be useful to tweak and reload the value dynamically
in some cases to control the timing of checkpoints.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACXgPPAm28mruojSBno+F_=9cTOOxHAywu_dfZPeBdybQw@mail.gmail.com
Backpatch-through: 11
2023-01-19 13:13:05 +09:00
Andres Freund 12605414a7 Use dlists instead of SHM_QUEUE for syncrep queue
Part of a series to remove SHM_QUEUE. ilist.h style lists are more widely used
and have an easier to use interface.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version)
Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de
Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de
2023-01-18 12:15:05 -08:00
Andres Freund 5764f611e1 Use dlist/dclist instead of PROC_QUEUE / SHM_QUEUE for heavyweight locks
Part of a series to remove SHM_QUEUE. ilist.h style lists are more widely used
and have an easier to use interface.

As PROC_QUEUE is now unused, remove it.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version)
Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de
Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de
2023-01-18 11:41:14 -08:00
Andres Freund 2b16208753 Fix ILIST_DEBUG build
In c8ad4d8166 dlist_member_check()'s arguments were made const. Unfortunately
the implementation of dlist_member_check() used dlist_foreach(), which
currently doesn't work for const lists.

As a workaround, open-code the list iteration. The other check functions
already do so.

Discussion: https://postgr.es/m/20230118182214.co7dp4oahiunwg57@awork3.anarazel.de
2023-01-18 10:26:15 -08:00
Tom Lane 47bb9db759 Get rid of the "new" and "old" entries in a view's rangetable.
The rule system needs "old" and/or "new" pseudo-RTEs in rule actions
that are ON INSERT/UPDATE/DELETE.  Historically it's put such entries
into the ON SELECT rules of views as well, but those are really quite
vestigial.  The only thing we've used them for is to carry the
view's relid forward to AcquireExecutorLocks (so that we can
re-lock the view to verify it hasn't changed before re-using a plan)
and to carry its relid and permissions data forward to execution-time
permissions checks.  What we can do instead of that is to retain
these fields of the RTE_RELATION RTE for the view even after we
convert it to an RTE_SUBQUERY RTE.  This requires a tiny amount of
extra complication in the planner and AcquireExecutorLocks, but on
the other hand we can get rid of the logic that moves that data from
one place to another.

The principal immediate benefit of doing this, aside from a small
saving in the pg_rewrite data for views, is that these pseudo-RTEs
no longer trigger ruleutils.c's heuristic about qualifying variable
names when the rangetable's length is more than 1.  That results
in quite a number of small simplifications in regression test outputs,
which are all to the good IMO.

Bump catversion because we need to dump a few more fields of
RTE_SUBQUERY RTEs.  While those will always be zeroes anyway in
stored rules (because we'd never populate them until query rewrite)
they are useful for debugging, and it seems like we'd better make
sure to transmit such RTEs accurately in plans sent to parallel
workers.  I don't think the executor actually examines these fields
after startup, but someday it might.

This is a second attempt at committing 1b4d280ea.  The difference
from the first time is that now we can add some filtering rules to
AdjustUpgrade.pm to allow cross-version upgrade testing to pass
despite all the cosmetic changes in CREATE VIEW outputs.

Amit Langote (filtering rules by me)

Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com
Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
2023-01-18 13:23:57 -05:00
Tom Lane 8d83a5d0a2 Remove redundant grouping and DISTINCT columns.
Avoid explicitly grouping by columns that we know are redundant
for sorting, for example we need group by only one of x and y in
	SELECT ... WHERE x = y GROUP BY x, y
This comes up more often than you might think, as shown by the
changes in the regression tests.  It's nearly free to detect too,
since we are just piggybacking on the existing logic that detects
redundant pathkeys.  (In some of the existing plans that change,
it's visible that a sort step preceding the grouping step already
didn't bother to sort by the redundant column, making the old plan
a bit silly-looking.)

To do this, build processed_groupClause and processed_distinctClause
lists that omit any provably-redundant sort items, and consult those
not the originals where relevant.  This means that within the
planner, one should usually consult root->processed_groupClause or
root->processed_distinctClause if one wants to know which columns
are to be grouped on; but to check whether grouping or distinct-ing
is happening at all, check non-NIL-ness of parse->groupClause or
parse->distinctClause.  This is comparable to longstanding rules
about handling the HAVING clause, so I don't think it'll be a huge
maintenance problem.

nodeAgg.c also needs minor mods, because it's now possible to generate
AGG_PLAIN and AGG_SORTED Agg nodes with zero grouping columns.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/185315.1672179489@sss.pgh.pa.us
2023-01-18 12:37:57 -05:00
Amit Kapila d540a02a72 Display the leader apply worker's PID for parallel apply workers.
Add leader_pid to pg_stat_subscription. leader_pid is the process ID of
the leader apply worker if this process is a parallel apply worker. If
this field is NULL, it indicates that the process is a leader apply
worker or a synchronization worker. The new column makes it easier to
distinguish parallel apply workers from other kinds of workers and helps
to identify the leader for the parallel workers corresponding to a
particular subscription.

Additionally, update the leader_pid column in pg_stat_activity as well to
display the PID of the leader apply worker for parallel apply workers.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Sawada Masahiko, Amit Kapila, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-18 09:03:12 +05:30
Michael Paquier 14bdb3f13d Refactor code for restoring files via shell commands
Presently, restore_command uses a different code path than
archive_cleanup_command and recovery_end_command.  These code paths
are similar and can be easily combined, as long as it is possible to
identify if a command should:
- Issue a FATAL on signal.
- Exit immediately on SIGTERM.

While on it, this removes src/common/archive.c and its associated
header.  Since the introduction of c96de2c, BuildRestoreCommand() has
become a simple wrapper of replace_percent_placeholders() able to call
make_native_path().  This simplifies shell_restore.c as long as
RestoreArchivedFile() includes a call to make_native_path().

Author: Nathan Bossart
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/20221227192449.GA3672473@nathanxps13
2023-01-18 11:15:48 +09:00
Michael Paquier 2f31f405e1 Constify the arguments of copydir.h functions
This makes sure that the internal logic of these functions does not
attempt to change the value of the arguments constified, and it removes
one unconstify() in basic_archive.c.

Author: Nathan Bossart
Reviewed-by: Andrew Dunstan, Peter Eisentraut
Discussion: https://postgr.es/m/20230114231126.GA2580330@nathanxps13
2023-01-18 08:55:26 +09:00
Peter Eisentraut 2a1d7071c4 Refactor recordExtObjInitPriv()
Instead of half a dozen of mostly-duplicate conditional branches,
write one common one that can handle most catalogs.  We already have
all the information we need, such as which system catalog corresponds
to which catalog table and which column is the ACL column.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/504bc485-6bd6-dd1b-fe10-e7351aeb310d@enterprisedb.com
2023-01-17 20:06:06 +01:00
Peter Eisentraut 13b345df64 Remove AggregateRelationId from recordExtObjInitPriv()
This was erroneous, because AggregateRelationId has no OID, so it
cannot be part of an extension directly.  (Aggregates are registered
via pg_proc.)  No harm in practice, but better to make it correct.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/504bc485-6bd6-dd1b-fe10-e7351aeb310d@enterprisedb.com
2023-01-17 20:06:06 +01:00
John Naylor e29c565343 Remove dead code in formatting.c
Remove some code guarded by IS_MINUS() or IS_PLUS(), where the entire
stanza is inside an else-block where both of these are false. This
should slightly improve test coverage.

While at it, remove coding that apparently assumes that unsetting a
bit is so expensive that we have to first check if it's already set
in the first place.

Per Coverity report from Ranier Vilela
Analysis and review by Justin Pryzby

Discussion: https://www.postgresql.org/message-id/20221223010818.GP1153%40telsasoft.com
2023-01-17 13:55:49 +07:00
Amit Kapila c981d9145d Improve the code to decide and process the apply action.
The code that decides the apply action missed to handle non-transactional
messages and we didn't catch it in our testing as currently such messages
are simply ignored by the apply worker. This was introduced by changes in
commit 216a784829.

While testing this, I noticed that we forgot to reset stream_xid after
processing the stream stop message which could also result in the wrong
apply action after the fix for non-transactional messages.

In passing, change assert to elog for unexpected apply action in some of
the routines so as to catch the problems in the production environment, if
any.

Reported-by: Tomas Vondra
Author: Amit Kapila
Reviewed-by: Tomas Vondra, Sawada Masahiko, Hou Zhijie
Discussion: https://postgr.es/m/984ff689-adde-9977-affe-cd6029e850be@enterprisedb.com
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-17 11:28:22 +05:30
David Rowley da5800d5fa Don't presort ORDER BY/DISTINCT Aggrefs with volatile functions
In 1349d2790, we gave the planner the ability to provide ORDER BY/DISTINCT
Aggrefs with presorted input so that nodeAgg would not have to perform
sorts during execution.  That commit failed to properly consider the
implications of if the Aggref had a volatile function in its ORDER
BY/DISTINCT clause.  As it happened, this resulted in an ERROR about the
volatile function being missing from the targetlist.

Here, instead of adding the volatile function to the targetlist, we just
never consider an Aggref with a volatile function in its ORDER BY/DISTINCT
clause when choosing which Aggrefs we should sort by.  We do this as if we
were to choose a plan which provided these aggregates with presorted
input, then if there were many such aggregates which could all share the
same sort order, then it may be surprising if they all shared the same
sort sometimes and didn't at other times when some other set of aggregates
were given presorted results.  We can avoid this inconsistency by just
never providing these volatile function aggregates with presorted input.

Reported-by: Dean Rasheed
Discussion: https://postgr.es/m/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk=u_Nvd7S1UXzuQ@mail.gmail.com
2023-01-17 16:37:06 +13:00
Peter Geoghegan 980ae17310 Tighten up VACUUM's approach to setting VM bits.
Tighten up the way that visibilitymap_set() is called: request that both
the all-visible and all-frozen bits get set whenever the all-frozen bit
is set, regardless of what we think we know about the present state of
the all-visible bit.  Also make sure that the page level PD_ALL_VISIBLE
flag is set in the same code path.

In practice there doesn't seem to be a concrete scenario in which the
previous approach could lead to inconsistencies.  It was almost possible
in scenarios involving concurrent HOT updates from transactions that
abort, but (unlike pruning) freezing can never remove XIDs > VACUUM's
OldestXmin, even those from transactions that are known to have aborted.
That was protective here.

These issues have been around since commit a892234f83, which added the
all-frozen bit to the VM fork.  There is no known live bug here, so no
backpatch.

In passing, add some defensive assertions to catch the issue, and stop
reading the existing state of the VM when setting the VM in VACUUM's
final heap pass.  We already know that affected pages must have had at
least one LP_DEAD item before we set it LP_UNUSED, so there is no point
in reading the VM when it is set like this.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
2023-01-16 09:34:37 -08:00
Peter Eisentraut 20428d344a Add BufFileRead variants with short read and EOF detection
Most callers of BufFileRead() want to check whether they read the full
specified length.  Checking this at every call site is very tedious.
This patch provides additional variants BufFileReadExact() and
BufFileReadMaybeEOF() that include the length checks.

I considered changing BufFileRead() itself, but this function is also
used in extensions, and so changing the behavior like this would
create a lot of problems there.  The new names are analogous to the
existing LogicalTapeReadExact().

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f3501945-c591-8cc3-5ef0-b72a2e0eaa9c@enterprisedb.com
2023-01-16 11:01:31 +01:00
Peter Eisentraut 1561612e3b Fix some BufFileRead() error reporting
Remove "%m" from error messages where errno would be bogus.  Add short
read byte counts where appropriate.

This is equivalent to what was done in
7897e3bb90, but some code was apparently
developed concurrently to that and not updated accordingly.

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f3501945-c591-8cc3-5ef0-b72a2e0eaa9c@enterprisedb.com
2023-01-16 09:44:04 +01:00
Michael Paquier 9a740f81eb Refactor code in charge of running shell-based recovery commands
The code specific to the execution of archive_cleanup_command,
recovery_end_command and restore_command is moved to a new file named
shell_restore.c.  The code is split into three functions:
- shell_restore(), that attempts the execution of a shell-based
restore_command.
- shell_archive_cleanup(), for archive_cleanup_command.
- shell_recovery_end(), for recovery_end_command.

This introduces no functional changes, with failure patterns and logs
generated in consequence being the same as before (one case actually
generates one less DEBUG2 message "could not restore" when a restore
command succeeds but the follow-up stat() to check the size fails, but
that only matters with a elevel high enough).

This is preparatory work for allowing recovery modules, a facility
similar to archive modules, with callbacks shaped similarly to the
functions introduced here.

Author: Nathan Bossart
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/20221227192449.GA3672473@nathanxps13
2023-01-16 16:31:43 +09:00
Michael Paquier 02d3448f4f Store IdentLine->pg_user as an AuthToken
While system_user was stored as an AuthToken in IdentLine, pg_user was
stored as a plain string.  This commit changes the code as we start
storing pg_user as an AuthToken too.

This does not have any functional changes, as all the operations on
pg_user only use the string from the AuthToken.  There is no regexp
compiled and no check based on its quoting, yet.  This is in preparation
of more features that intend to extend its capabilities, like support
for regexps and group membership.

Author: Jelte Fennema
Discussion: https://postgr.es/m/CAGECzQRNow4MwkBjgPxywXdJU_K3a9+Pm78JB7De3yQwwkTDew@mail.gmail.com
2023-01-16 13:58:07 +09:00
Tom Lane 647fa50054 Remove arbitrary FUNC_MAX_ARGS limit in int2vectorin and oidvectorin.
int2vectorin limited the number of array elements it'd take to
FUNC_MAX_ARGS, which is probably fine for the traditional use-cases.
But now that pg_publication_rel.prattrs is an int2vector, it's not
fine at all: it's easy to construct cases where that can have up to
about MaxTupleAttributeNumber entries.  Trying to replicate such
tables leads to logical-replication failures.

As long as we have to touch this code anyway, let's just remove
the a-priori limit altogether, and let it accept any size that'll
be allowed by repalloc.  (Note that since int2vector isn't toastable,
we cannot store arrays longer than about BLCKSZ/2; but there is no
good excuse for letting int2vectorin depend on that.  Perhaps we
will lift the no-toast restriction someday.)

While at it, also improve the equivalent logic in oidvectorin.
I don't know of any practical use-case for long oidvectors right
now, but doing it right actually makes the code shorter.

Per report from Erik Rijkers.  Back-patch to v15 where
pg_publication_rel.prattrs was added.

Discussion: https://postgr.es/m/668ba539-33c5-8190-ca11-def2913cb94b@xs4all.nl
2023-01-15 17:32:09 -05:00
Tom Lane 3f244d020f Make new GENERATED-expressions code more bulletproof.
In commit 8bf6ec3ba I assumed that no code path could reach
ExecGetExtraUpdatedCols without having gone through
ExecInitStoredGenerated.  That turns out not to be the case in
logical replication: if there's an ON UPDATE trigger on the target
table, trigger.c will call this code before anybody has set up its
generated columns.  Having seen that, I don't have a lot of faith in
there not being other such paths.  ExecGetExtraUpdatedCols can call
ExecInitStoredGenerated for itself, as long as we are willing to
assume that it is only called in CMD_UPDATE operations, which on
the whole seems like a safer leap of faith.

Per report from Vitaly Davydov.

Discussion: https://postgr.es/m/d259d69652b8c2ff50e14cda3c236c7f@postgrespro.ru
2023-01-15 13:14:52 -05:00
Jeff Davis ff9618e82a Fix MAINTAIN privileges for toast tables and partitions.
Commit 60684dd8 left loose ends when it came to maintaining toast
tables or partitions.

For toast tables, simply skip the privilege check if the toast table
is an indirect target of the maintenance command, because the main
table privileges have already been checked.

For partitions, allow the maintenance command if the user has the
MAINTAIN privilege on the partition or any parent.

Also make CLUSTER emit "skipping" messages when the user doesn't have
privileges, similar to VACUUM.

Author: Nathan Bossart
Reported-by: Pavel Luzanov
Reviewed-by: Pavel Luzanov, Ted Yu
Discussion: https://postgr.es/m/20230113231339.GA2422750@nathanxps13
2023-01-14 00:16:23 -08:00
Andres Freund 250c8ee07e Manual cleanup and pgindent of pgstat and bufmgr related code
This is in preparation for commiting a larger patch series in the area.

Discussion: https://postgr.es/m/CAAKRu_bHwGEbzNxxy+MQDkrsgog6aO6iUvajJ4d6PD98gFU7+w@mail.gmail.com
2023-01-13 15:23:17 -08:00
Jeff Davis d46a9792a8 Clean up useless "skipping" messages for VACUUM/ANALYZE.
When VACUUM/ANALYZE are run on an entire database, it warns of
skipping relations for which the user doesn't have sufficient
privileges. That only makes sense for tables, so skip such messages
for indexes, etc.

Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/c0a85c2e83158560314b576b6241c8ed0aea1745.camel%40j-davis.com
2023-01-13 14:42:03 -08:00
Jeff Davis c44f6334ca Simplify permissions for LOCK TABLE.
The prior behavior was confusing and hard to document. For instance,
if you had UPDATE privileges, you could lock a table in any lock mode
except ACCESS SHARE mode.

Now, if granted a privilege to lock at a given mode, one also has
privileges to lock at a less-conflicting mode. MAINTAIN, UPDATE,
DELETE, and TRUNCATE privileges allow any lock mode. INSERT privileges
allow ROW EXCLUSIVE (or below). SELECT privileges allow ACCESS SHARE.

Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/9550c76535404a83156252b25a11babb4792ea1e.camel%40j-davis.com
2023-01-13 14:33:19 -08:00
Amit Kapila b7ae039536 Ignore dropped and generated columns from the column list.
We don't allow different column lists for the same table in the different
publications of the single subscription. A publication with a column list
except for dropped and generated columns should be considered the same as
a publication with no column list (which implicitly includes all columns
as part of the columns list). However, as we were not excluding the
dropped and generated columns from the column list combining such
publications leads to an error "cannot use different column lists for
table ...".

We decided not to backpatch this fix as there is a risk of users seeing
this as a behavior change and also we didn't see any field report of this
case.

Author: Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OSZPR01MB631091CCBC56F195B1B9ACB0FDFE9@OSZPR01MB6310.jpnprd01.prod.outlook.com
2023-01-13 14:49:23 +05:30
Amit Kapila dca8b01f5f Avoid creating parallel apply state hash table unless required.
This hash table is used to cache the state of streaming transactions being
applied by the parallel apply workers. So, this should be created only
when we are successful in launching at least one worker. This avoids rare
case memory leak when we are never able to launch any worker.

Author: Ted Yu
Discussion: https://postgr.es/m/CALte62wg0rBR3Vj2beV=HiWo2qG9L0hzKcX=yULNER0wmf4aEw@mail.gmail.com
2023-01-13 08:28:05 +05:30
Thomas Munro f1821b58fc Fix WaitEventSetWait() buffer overrun.
The WAIT_USE_EPOLL and WAIT_USE_KQUEUE implementations of
WaitEventSetWaitBlock() confused the size of their internal buffer with
the size of the caller's output buffer, and could ask the kernel for too
many events.  In fact the set of events retrieved from the kernel needs
to be able to fit in both buffers, so take the smaller of the two.

The WAIT_USE_POLL and WAIT_USE WIN32 implementations didn't have this
confusion.

This probably didn't come up before because we always used the same
number in both places, but commit 7389aad6 calculates a dynamic size at
construction time, while using MAXLISTEN for its output event buffer on
the stack.  That seems like a reasonable thing to want to do, so
consider this to be a pre-existing bug worth fixing.

As discovered by valgrind on skink.

Back-patch to all supported releases for epoll, and to release 13 for
the kqueue part, which copied the incorrect epoll code.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/901504.1673504836%40sss.pgh.pa.us
2023-01-13 11:02:12 +13:00
Alexander Korotkov 3161ae86ce Fix jsonpath existense checking of missing variables
The current jsonpath code assumes that the referenced variable always exists.
It could only throw an error at the value valuation time.  At the same time
existence checking assumes variable is present without valuation, and error
suppression doesn't work for missing variables.

This commit makes existense checking trigger an error for missing variables.
This makes the overall behavior consistent.

Backpatch to 12 where jsonpath was introduced.

Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwbeytffJkVnEqDyLZ%3DrQsznoTh1OgDoOF3VmOMkxcTMjA%40mail.gmail.com
Author: Alexander Korotkov, David G. Johnston
Backpatch-through: 12
2023-01-12 18:16:34 +03:00
Peter Eisentraut c8ad4d8166 Constify the arguments of ilist.c/h functions
Const qualifiers ensure that we don't do something stupid in the
function implementation.  Additionally they clarify the interface.  As
an example:

    void
    slist_delete(slist_head *head, const slist_node *node)

Here one can instantly tell that node->next is not going to be set to
NULL.  Finally, const qualifiers potentially allow the compiler to do
more optimizations.  This being said, no benchmarking was done for
this patch.

The functions that return non-const pointers like slist_next_node(),
dclist_next_node() etc. are not affected by the patch intentionally.

Author: Aleksander Alekseev
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CAJ7c6TM2%3D08mNKD9aJg8vEY9hd%2BG4L7%2BNvh30UiNT3kShgRgNg%40mail.gmail.com
2023-01-12 08:00:51 +01:00
Peter Eisentraut 881fa869c6 Code cleanup
for commit c96de2ce17

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230111185434.GA1912982@nathanxps13
2023-01-12 07:37:39 +01:00
Michael Paquier 8607630d74 Rename some variables related to ident files in hba.{c,h}
The code that handles authentication for user maps was pretty confusing
with its choice of variable names.  It involves two types of users: a
system user and a Postgres user (well, role), and these were not named
consistently throughout the code that processes the user maps loaded
from pg_ident.conf at authentication.

This commit changes the following things to improve the situation:
- Rename "pg_role" to "pg_user" and "token" to "system_user" in
IndetLine.  These choices are more consistent with the pg_ident.conf
example in the docs, as well.  "token" has been introduced recently in
fc579e1, and it is way worse than the choice before that, "ident_user".
- Switch the order of the fields in IdentLine to map with the order of
the items in the ident files, as of map name, system user and PG user.
- In check_ident_usermap(), rename "regexp_pgrole" to "expanded_pg_user"
when processing a regexp for the system user entry in a user map.  This
variable does not store a regular expression at all: it would be either
a string or a substitution to \1 if the Postgres role is specified as
such.

Author: Jelte Fennema
Discussion: https://postgr.es/m/CAGECzQTkwELHUOAKhvdA+m3tWbUQySHHkExJV8GAZ1pwgbEgXg@mail.gmail.com
2023-01-12 14:23:20 +09:00
Michael Paquier 5ad165d2c0 Acquire spinlock when updating 2PC slot data during logical decoding creation
The creation of a logical decoding context in CreateDecodingContext()
updates some data of its slot for two-phase transactions if enabled by
the caller, but the code forgot to acquire a spinlock when updating
these fields like any other code paths.  This could lead to the read of
inconsistent data.

Oversight in a8fd13c.

Author: Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoAD8_fp47191LKuecjDd3DYhoQ4TaucFco1_TEr_jQ-Zw@mail.gmail.com
Backpatch-through: 15
2023-01-12 13:40:33 +09:00
Tom Lane f0e6d6d3c9 Revert "Get rid of the "new" and "old" entries in a view's rangetable."
This reverts commit 1b4d280ea1.
It's broken the buildfarm members that run cross-version-upgrade tests,
because they're not prepared to deal with cosmetic differences between
CREATE VIEW commands emitted by older servers and HEAD.  Even if we had
a solution to that, which we don't, it'd take some time to roll it out
to the affected animals.  This improvement isn't valuable enough to
justify addressing that problem on an emergency basis, so revert it
for now.
2023-01-11 23:01:22 -05:00
Thomas Munro 5a26c7b310 Refactor DetermineSleepTime() to use milliseconds.
Since we're not using select() anymore, we don't need to bother with
struct timeval.  We can work directly in milliseconds, which the latch
API wants.

Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
2023-01-12 16:32:30 +13:00
Thomas Munro 7389aad636 Use WaitEventSet API for postmaster's event loop.
Switch to a design similar to regular backends, instead of the previous
arrangement where signal handlers did non-trivial state management and
called fork().  The main changes are:

* The postmaster now has its own local latch to wait on.  (For now, we
  don't want other backends setting its latch directly, but that could
  probably be made to work with more research on robustness.)

* The existing signal handlers are cut in two: a handle_pm_XXX() part
  that just sets pending_pm_XXX flags and the latch, and a
  process_pm_XXX() part that runs later when the latch is seen.

* Signal handlers are now installed with the regular pqsignal()
  function rather than the special pqsignal_pm() function; historical
  portability concerns about the effect of SA_RESTART on select() are no
  longer relevant, and we don't need to block signals anymore.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
2023-01-12 16:32:20 +13:00
Peter Geoghegan d30b499997 Make lazy_vacuum_heap_rel match lazy_scan_heap.
Make lazy_vacuum_heap_rel variable names match those from lazy_scan_heap
where that makes sense.

Extracted from a larger patch to deal with issues with how vacuumlazy.c
sets pages all-frozen.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
2023-01-11 18:45:32 -08:00
Peter Geoghegan 8c233cf86b vacuumlazy.c: Tweak local variable name.
Make a local variable name consistent with the name from its WAL record.

Extracted from a larger patch to deal with issues with how vacuumlazy.c
sets pages all-frozen.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
2023-01-11 17:57:18 -08:00
Peter Geoghegan 50767705ed Rename and relocate freeze plan dedup routines.
Rename the heapam.c freeze plan deduplication routines added by commit
9e540599 to names that follow conventions for functions in heapam.c.
Also relocate the functions so that they're next to their caller, which
runs during original execution, when FREEZE_PAGE WAL records are built.

The routines were initially placed next to (and followed the naming
conventions of) conceptually related REDO routine code, but that scheme
turned out to be kind of jarring when considered in a wider context.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230109214308.icz26oqvt3k2274c@awork3.anarazel.de
2023-01-11 17:30:42 -08:00
Tom Lane 1b4d280ea1 Get rid of the "new" and "old" entries in a view's rangetable.
The rule system needs "old" and/or "new" pseudo-RTEs in rule actions
that are ON INSERT/UPDATE/DELETE.  Historically it's put such entries
into the ON SELECT rules of views as well, but those are really quite
vestigial.  The only thing we've used them for is to carry the
view's relid forward to AcquireExecutorLocks (so that we can
re-lock the view to verify it hasn't changed before re-using a plan)
and to carry its relid and permissions data forward to execution-time
permissions checks.  What we can do instead of that is to retain
these fields of the RTE_RELATION RTE for the view even after we
convert it to an RTE_SUBQUERY RTE.  This requires a tiny amount of
extra complication in the planner and AcquireExecutorLocks, but on
the other hand we can get rid of the logic that moves that data from
one place to another.

The principal immediate benefit of doing this, aside from a small
saving in the pg_rewrite data for views, is that these pseudo-RTEs
no longer trigger ruleutils.c's heuristic about qualifying variable
names when the rangetable's length is more than 1.  That results
in quite a number of small simplifications in regression test outputs,
which are all to the good IMO.

Bump catversion because we need to dump a few more fields of
RTE_SUBQUERY RTEs.  While those will always be zeroes anyway in
stored rules (because we'd never populate them until query rewrite)
they are useful for debugging, and it seems like we'd better make
sure to transmit such RTEs accurately in plans sent to parallel
workers.  I don't think the executor actually examines these fields
after startup, but someday it might.

Amit Langote

Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com
2023-01-11 19:41:09 -05:00
Peter Geoghegan af3855cb77 Improve TransactionIdDidAbort() documentation.
Document that TransactionIdDidAbort() won't indicate that transactions
that were in-progress during a crash have aborted.  Tie this to existing
discussion of the TransactionIdDidCommit() and TransactionIdDidCommit()
protocol that code in heapam_visibility.c (and a few other places) must
observe.

Follow-up to bugfix commit eb5ad4ff.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wzn4bEEqgmaUQL3aJ73yM9gAeK-wE4ngi7kjRjLztb+P0w@mail.gmail.com
2023-01-11 15:31:42 -08:00
Tom Lane 8bf6ec3ba3 Improve handling of inherited GENERATED expressions.
In both partitioning and traditional inheritance, require child
columns to be GENERATED if and only if their parent(s) are.
Formerly we allowed the case of an inherited column being
GENERATED when its parent isn't, but that results in inconsistent
behavior: the column can be directly updated through an UPDATE
on the parent table, leading to it containing a user-supplied
value that might not match the generation expression.  This also
fixes an oversight that we enforced partition-key-columns-can't-
be-GENERATED against parent tables, but not against child tables
that were dynamically attached to them.

Also, remove the restriction that the child's generation expression
be equivalent to the parent's.  In the wake of commit 3f7836ff6,
there doesn't seem to be any reason that we need that restriction,
since generation expressions are always computed per-table anyway.
By removing this, we can also allow a child to merge multiple
inheritance parents with inconsistent generation expressions, by
overriding them with its own expression, much as we've long allowed
for DEFAULT expressions.

Since we're rejecting a case that we used to accept, this doesn't
seem like a back-patchable change.  Given the lack of field
complaints about the inconsistent behavior, it's likely that no
one is doing this anyway, but we won't change it in minor releases.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
2023-01-11 15:55:02 -05:00
Peter Eisentraut c96de2ce17 Common function for percent placeholder replacement
There are a number of places where a shell command is constructed with
percent-placeholders (like %x).  It's cumbersome to have to open-code
this several times.  This factors out this logic into a separate
function.  This also allows us to ensure consistency for and document
some subtle behaviors, such as what to do with unrecognized
placeholders.

The unified handling is now that incorrect and unknown placeholders
are an error, where previously in most cases they were skipped or
ignored.  This affects the following settings:

- archive_cleanup_command
- archive_command
- recovery_end_command
- restore_command
- ssl_passphrase_command

The following settings are part of this refactoring but already had
stricter error handling and should be unchanged in their behavior:

- basebackup_to_shell.command

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5238bbed-0b01-83a6-d4b2-7eb0562a054e%40enterprisedb.com
2023-01-11 10:42:35 +01:00
Michael Paquier 5f6401f81c Fix typos in code and comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20230110045722.GD9837@telsasoft.com
2023-01-11 15:16:38 +09:00
David Rowley 3c6fc58209 Have the planner consider Incremental Sort for DISTINCT
Prior to this, we only considered a full sort on the cheapest input path
and uniquifying any path which was already sorted in the required sort
order.  Here we adjust create_final_distinct_paths() so that it also
adds an Incremental Sort path on any path which has presorted keys.

Additionally, this adjusts the parallel distinct code so that we now
consider sorting the cheapest partial path and incrementally sorting any
partial paths with presorted keys.  Previously we didn't consider any
sorting for parallel distinct and only added a unique path atop any path
which had the required pathkeys already.

Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvo8Lz2H=42urBbfP65LTcEUOh288MT7DsG2_EWtW1AXHQ@mail.gmail.com
2023-01-11 10:25:43 +13:00
Robert Haas e5b8a4c098 Add new GUC createrole_self_grant.
Can be set to the empty string, or to either or both of "set" or
"inherit". If set to a non-empty value, a non-superuser who creates
a role (necessarily by relying up the CREATEROLE privilege) will
grant that role back to themselves with the specified options.

This isn't a security feature, because the grant that this feature
triggers can also be performed explicitly. Instead, it's a user experience
feature. A superuser would necessarily inherit the privileges of any
created role and be able to access all such roles via SET ROLE;
with this patch, you can configure createrole_self_grant = 'set, inherit'
to provide a similar experience for a user who has CREATEROLE but not
SUPERUSER.

Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
2023-01-10 12:44:49 -05:00
Robert Haas cf5eb37c5e Restrict the privileges of CREATEROLE users.
Previously, CREATEROLE users were permitted to make nearly arbitrary
changes to roles that they didn't create, with certain exceptions,
particularly superuser roles.  Instead, allow CREATEROLE users to make such
changes to roles for which they possess ADMIN OPTION, and to
grant membership only in roles for which they possess ADMIN OPTION.

When a CREATEROLE user who is not a superuser creates a role, grant
ADMIN OPTION on the newly-created role to the creator, so that they
can administer roles they create or for which they have been given
privileges.

With these changes, CREATEROLE users still have very significant
powers that unprivileged users do not receive: they can alter, rename,
drop, comment on, change the password for, and change security labels
on roles.  However, they can now do these things only for roles for
which they possess appropriate privileges, rather than all
non-superuser roles; moreover, they cannot grant a role such as
pg_execute_server_program unless they themselves possess it.

Patch by me, reviewed by Mark Dilger.

Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
2023-01-10 12:44:30 -05:00
Dean Rasheed f026c16a2c Fix MERGE's test for unreachable WHEN clauses.
The former code would only detect an unreachable WHEN clause if it had
an AND condition. Fix, so that unreachable unconditional WHEN clauses
are also detected.

Back-patch to v15, where MERGE was added.

Discussion: https://postgr.es/m/CAEZATCVQ=7E2z4cSBB49jjeGGsB6WeoYQY32NDeSvcHiLUZ=ow@mail.gmail.com
2023-01-10 14:17:47 +00:00
Peter Eisentraut d952373a98 New header varatt.h split off from postgres.h
This new header contains all the variable-length data types support
(TOAST support) from postgres.h, which isn't needed by large parts of
the backend code.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/ddcce239-0f29-6e62-4b47-1f8ca742addf%40enterprisedb.com
2023-01-10 05:54:36 +01:00
Amit Kapila f745739697 Fix the display of lock information for specktoken.
A transaction id is now displayed in the transactionid field and
speculative insertion token is displayed in the objid field.

Author: Sawada Masahiko
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoCEKxZztULP1CDm45aSNNR1QO-Bh1q6LMTspQ78PBuJrw@mail.gmail.com
2023-01-10 08:53:47 +05:30
Tom Lane 38d81760c4 Invent random_normal() to provide normally-distributed random numbers.
There is already a version of this in contrib/tablefunc, but it
seems sufficiently widely useful to justify having it in core.

Paul Ramsey

Discussion: https://postgr.es/m/CACowWR0DqHAvOKUCNxTrASFkWsDLqKMd6WiXvVvaWg4pV1BMnQ@mail.gmail.com
2023-01-09 12:44:00 -05:00
John Naylor 2673ebf49a Remove redundant setting of tuplesort status
Also add an explanatory comment to match other similar coding within
tuplesort_performsort().

Xing Guo

Reviewed by Richard Guo and Cary Huang
Discussion: https://www.postgresql.org/message-id/CACpMh%2BAQ4GXRKKi9ib2ioUH%2BqwNaSAVbetssJ0tMPfxAWuL2yg%40mail.gmail.com
2023-01-09 16:53:21 +07:00
Amit Kapila c06caa0d62 Fix the file mode of worker.c changed by the commit 216a784829.
Reported-by: Japin Li
Discussion: https://postgr.es/m/MEYP282MB166970D1559B7CC74D3E339BB6FE9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2023-01-09 14:04:14 +05:30
David Rowley 3c569049b7 Allow left join removals and unique joins on partitioned tables
This allows left join removals and unique joins to work with partitioned
tables.  The planner just lacked sufficient proofs that a given join
would not cause any row duplication.  Unique indexes currently serve as
that proof, so have get_relation_info() populate the indexlist for
partitioned tables too.

Author: Arne Roland
Reviewed-by: Alvaro Herrera, Zhihong Yu, Amit Langote, David Rowley
Discussion: https://postgr.es/m/c3b2408b7a39433b8230bbcd02e9f302@index.de
2023-01-09 17:15:08 +13:00
Amit Kapila 216a784829 Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.

In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode -  in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply to avoid deadlocks. We keep this parallel apply assigned till the
transaction commit is received and also wait for the worker to finish at
commit. This preserves commit ordering and avoid writing to and reading
from files in most cases. We still need to spill if there is no worker
available.

This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).

In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.

Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 07:52:45 +05:30
Alexander Korotkov cd9479af2a Improve GIN cost estimation
GIN index scans were not taking any descent CPU-based cost into account.  That
made them look cheaper than other types of indexes when they shouldn't be.

We use the same heuristic as for btree indexes, but multiply it by the number
of searched entries.

Additionally, the CPU cost for the tree was based largely on a
genericcostestimate.  For a GIN index, we should not charge index quals per
tuple, but per entry. On top of this, charge cpu_index_tuple_cost per actual
tuple.

This should fix the cases where a GIN index is preferred over a btree and
the ones where a memoize node is not added on top of the GIN index scan
because it seemed too cheap.

We don't packpatch this to evade unexpected plan changes in stable versions.

Discussion: https://postgr.es/m/CABs3KGQnOkyQ42-zKQqiE7M0Ks9oWDSee%3D%2BJx3-TGq%3D68xqWYw%40mail.gmail.com
Discussion: https://postgr.es/m/3188617.44csPzL39Z%40aivenronan
Author: Ronan Dunklau
Reported-By: Hung Nguyen
Reviewed-by: Tom Lane, Alexander Korotkov
2023-01-08 22:51:43 +03:00
Alexander Korotkov eb5c4e953b Extract the multiplier for CPU process cost of index page into a macro
B-tree, GiST and SP-GiST all charge 50.0 * cpu_operator_cost for processing
an index page.  Extract this to a macro to avoid repeated magic numbers.

Discussion: https://mail.google.com/mail/u/0/?ik=a20b091faa&view=om&permmsgid=msg-f%3A1751459697261369543
Author: Ronan Dunklau
2023-01-08 22:37:33 +03:00
Amit Kapila 2b6df05461 Remove the streaming files for incomplete xacts after restart.
After restart, we try to stream the changes for large transactions that
were not sent before server crash and restart. However, we forget to send
the abort message for such transactions. This leads to spurious streaming
files on the subscriber which won't be cleaned till the apply worker or
the subscriber server restarts.

Reported-by: Dilip Kumar
Author: Hou Zhijie
Reviewed-by: Dilip Kumar and Amit Kapila
Backpatch-through: 14
Discussion: https://postgr.es/m/OS0PR01MB5716A773F46768A1B75BE24394FB9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-01-07 12:17:14 +05:30
David Rowley a14a583292 Add additional regression tests for select_active_windows
During the development of 728202b63, which was aimed at reducing the
number of sorts required to evaluate multiple window functions with
different WindowClause definitions, the code written sorted the
WindowClauses in reverse tleSortGroupRef order.  There appears to be no
discussion in the thread which was opened to discuss the development of
this patch and no comments mentioning the fact that having the
WindowClauses in reverse tleSortGroupRef order makes it more likely that
the final WindowClause to be evaluated will provide presorted input to
the query's DISTINCT or ORDER BY clause.  The reason for this is that the
tleSortGroupRef indexes are assigned for the DISTINCT and ORDER BY clauses
before they are for the WindowClauses PARTITION BY and ORDER BY clauses.
Putting the WindowClause with the lowest tleSortGroupRef last means that
it's more likely that no additional sorting is required for the query's
DISTINCT or ORDER BY clause.

All we're doing here is adding some tests and a comment to help ensure
that remains true and that we don't accidentally forget to consider this
again should we ever rewrite that code.

Author: Ankit Kumar Pandey, David Rowley
Discussion: https://postgr.es/m/CAApHDvq=g2=ny59f1bvwRVvupsgPHK-KjLPBsSL25fVuGZ4idQ@mail.gmail.com
2023-01-07 15:24:35 +13:00
Tom Lane c6e1f62e2c Wake up a subscription's replication worker processes after DDL.
Waken related worker processes immediately at commit of a transaction
that has performed ALTER SUBSCRIPTION (including the RENAME and
OWNER variants).  This reduces the response time for such operations.
In the real world that might not be worth much, but it shaves several
seconds off the runtime for the subscription test suite.

In the case of PREPARE, we just throw away this notification state;
it doesn't seem worth the work to preserve it.  The workers will
still react after the eventual COMMIT PREPARED, but not as quickly.

Nathan Bossart

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-06 17:27:58 -05:00
Tom Lane 4c032dd804 Check for two_phase change at end of process_syncing_tables_for_apply.
Previously this function checked to see if we were ready to switch
to two_phase mode at its start, but that's silly: we should check
at the end, after we've done the work that might make us ready.
This simple change removes one sleep cycle from the time needed to
switch to two_phase mode.  In the real world that might not be
worth much, but it shaves a few seconds off the runtime for the
subscription test suite.

Nathan Bossart

Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
2023-01-06 17:27:58 -05:00
Tom Lane a46a7011b2 Add options to control whether VACUUM runs vac_update_datfrozenxid.
VACUUM normally ends by running vac_update_datfrozenxid(), which
requires a scan of pg_class.  Therefore, if one attempts to vacuum a
database one table at a time --- as vacuumdb has done since v12 ---
we will spend O(N^2) time in vac_update_datfrozenxid().  That causes
serious performance problems in databases with tens of thousands of
tables, and indeed the effect is measurable with only a few hundred.
To add insult to injury, only one process can run
vac_update_datfrozenxid at the same time per DB, so this behavior
largely defeats vacuumdb's -j option.

Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS
to allow applications to postpone vac_update_datfrozenxid() until the
end of a series of VACUUM requests, and teach vacuumdb to use them.

Per bug #17717 from Gunnar L.  Sadly, this answer doesn't seem
like something we'd consider back-patching, so the performance
problem will remain in v12-v15.

Tom Lane and Nathan Bossart

Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org
2023-01-06 14:17:25 -05:00
Tom Lane cd4b2334db Invalidate pgoutput's replication-decisions cache upon schema rename.
A schema rename should cause reporting the new qualified names of
tables to logical replication subscribers, but that wasn't happening.
Flush the RelationSyncCache to make it happen.

(If you ask me, the new test case shows that the behavior in this area
is still pretty dubious, but apparently it's operating as designed.)

Vignesh C

Discussion: https://postgr.es/m/CALDaNm32vLRv5KdrDFeVC-CU+4Wg1daA55hMqOxDGJBzvd76-w@mail.gmail.com
2023-01-06 11:11:51 -05:00
Peter Eisentraut 4037c5e2fe Fix typo
This doesn't affect the correctness of the code, but it was clearly
inconsistent before this change.
2023-01-06 14:25:19 +01:00
Thomas Munro 72aea955d4 Fix pg_truncate() on Windows.
Commit 57faaf376 added pg_truncate(const char *path, off_t length), but
"length" was ignored under WIN32 and the file was unconditionally
truncated to 0.

There was no live bug, since the only caller passes 0.

Fix, and back-patch to 14 where the function arrived.

Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20230106031652.GR3109%40telsasoft.com
2023-01-06 16:42:47 +13:00
Robert Haas 39cffe95f2 Pass down current user ID to AddRoleMems and DelRoleMems.
This is just refactoring; there should be no functonal change. It
might have the effect of slightly reducing the number of calls to
GetUserId(), but the real point is to facilitate future work in
this area.

Patch by me, reviewed by Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmobFzTLkLwOquFrAcdsWBsOWDr-_H-jw+qBvfx-wSzMwDA@mail.gmail.com
2023-01-05 14:33:35 -05:00
Robert Haas 25bb03166b Refactor permissions-checking for role grants.
Instead of having checks in AddRoleMems() and DelRoleMems(), have
the callers perform checks where it's required. In some cases it
isn't, either because the caller has already performed a check for
the same condition, or because the check couldn't possibly fail.

The "Skip permission check if nothing to do" check in each of
AddRoleMems() and DelRoleMems() is pointless. Some call sites
can't pass an empty list. Others can, but in those cases, the role
being modified is one that the current user has just created.
Therefore, they must have permission to modify it, and so no
permission check is required at all.

This patch is intended to have no user-visible consequences. It is
intended to simplify future work in this area.

Patch by me, reviewed by Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmobFzTLkLwOquFrAcdsWBsOWDr-_H-jw+qBvfx-wSzMwDA@mail.gmail.com
2023-01-05 14:30:40 -05:00
Tom Lane 3f7836ff65 Fix calculation of which GENERATED columns need to be updated.
We were identifying the updatable generated columns of inheritance
children by transposing the calculation made for their parent.
However, there's nothing that says a traditional-inheritance child
can't have generated columns that aren't there in its parent, or that
have different dependencies than are in the parent's expression.
(At present it seems that we don't enforce that for partitioning
either, which is likely wrong to some degree or other; but the case
clearly needs to be handled with traditional inheritance.)

Hence, drop the very-klugy-anyway "extraUpdatedCols" RTE field
in favor of identifying which generated columns depend on updated
columns during executor startup.  In HEAD we can remove
extraUpdatedCols altogether; in back branches, it's still there but
always empty.  Another difference between the HEAD and back-branch
versions of this patch is that in HEAD we can add the new bitmap field
to ResultRelInfo, but that would cause an ABI break in back branches.
Like 4b3e37993, add a List field at the end of struct EState instead.

Back-patch to v13.  The bogus calculation is also being made in v12,
but it doesn't have the same visible effect because we don't use it
to decide which generated columns to recalculate; as a consequence of
which the patch doesn't apply easily.  I think that there might still
be a demonstrable bug associated with trigger firing conditions, but
that's such a weird corner-case usage that I'm content to leave it
unfixed in v12.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/CA+HiwqFshLKNvQUd1DgwJ-7tsTp=dwv7KZqXC4j2wYBV1aCDUA@mail.gmail.com
Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
2023-01-05 14:12:17 -05:00
David Rowley b82557ecc2 Fix some compiler warnings in aset.c and generation.c
This fixes a couple of unused variable warnings that could be seen when
compiling with MEMORY_CONTEXT_CHECKING but not USE_ASSERT_CHECKING.
Defining MEMORY_CONTEXT_CHECKING without asserts is a little unusual,
however, we shouldn't be producing any warnings from such a build.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_D-vgLEh7eO47p=73u1jWO78NWf6Qfv1FndY1kG-Q-jA@mail.gmail.com
2023-01-05 12:56:17 +13:00
Peter Geoghegan eb5ad4ff05 Check that xmax didn't commit in freeze check.
We cannot rely on TransactionIdDidAbort here, since in general it may
report transactions that were in-progress at the time of an earlier hard
crash as not aborted, effectively behaving as if they were still in
progress even after crash recovery completes.  Go back to defensively
verifying that xmax didn't commit instead.

Oversight in commit 79d4bf4e.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230104035636.hy5djyr2as4gbc4q@awork3.anarazel.de
2023-01-03 21:48:27 -08:00
Peter Geoghegan 5212d447fa Update obsolete multixact.c comments.
Commit 4f627f89 switched SLRU truncation for multixacts back to being a
task performed during VACUUM, but missed some comments that continued to
reference truncation happening as part of checkpointing.  Update those
comments now.

Also update comments that became obsolete when commit c3ffa731 changed
the way that vacuum_multixact_freeze_min_age is applied by VACUUM as it
computes its MultiXactCutoff cutoff (which is used by VACUUM to decide
what to freeze).  Explain the same issues by referencing how OldestMxact
is the latest valid value that relminmxid can ever be advanced to at the
end of a VACUUM (following the work in commit 0b018fab).
2023-01-03 16:54:35 -08:00
Peter Geoghegan 54afdcd618 vacuumlazy.c: Save get_database_name() in vacrel.
This brings dbname strings in line with namespace and relation name
strings.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkQ1TKU-DdNvnGeL870di3+CU1UTo-7nw7xFDpVE-XGjA@mail.gmail.com
2023-01-03 11:48:47 -08:00
Peter Geoghegan 79d4bf4eff Delay commit status checks until freezing executes.
pg_xact lookups are relatively expensive.  Move the xmin/xmax commit
status checks from the point that freeze plans are prepared to the point
that they're actually executed.  Otherwise we'll repeat many commit
status checks whenever multiple successive VACUUM operations scan the
same pages and decide against freezing each time, which is a waste of
cycles.

Oversight in commit 1de58df4, which added page-level freezing.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkZpe4K6qMfEt8H4qYJCKc2R7TPvKsBva7jc9w7iGXQSw@mail.gmail.com
2023-01-03 11:22:36 -08:00
Peter Geoghegan b37a083239 Refine the definition of page-level freezing.
Improve comments added by commit 1de58df4 which describe the
lazy_scan_prune "freeze the page" path.  These newly revised comments
are based on suggestions from Jeff Davis.

In passing, remove nearby visibility_cutoff_xid comments left over from
commit 6daeeb1f.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/ebc857107fe3edd422ef8a65191ca4a8da568b9b.camel@j-davis.com
2023-01-03 10:08:55 -08:00
Peter Eisentraut bf03cfd162 Windows support in pg_import_system_collations
Windows can enumerate the locales that are either installed or
supported by calling EnumSystemLocalesEx(), similar to what is already
done in the READ_LOCALE_A_OUTPUT switch.  We can refactor some of the
logic already used in that switch into a new function
create_collation_from_locale().

The enumerated locales have BCP 47 shape, that is with a hyphen
between language and territory, instead of POSIX's underscore.  The
created collations will retain the BCP 47 shape, but we will also
create a POSIX alias, so xx-YY will have an xx_YY alias.

A new test collate.windows.win1252 is added that is like
collate.linux.utf8.

Author: Juan Jose Santamaria Flecha <juanjo.santamaria@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac@postgrespro.ru
2023-01-03 14:21:56 +01:00
Michael Paquier 33ab0a2a52 Fix typos in comments, code and documentation
While on it, newlines are removed from the end of two elog() strings.
The others are simple grammar mistakes.  One comment in pg_upgrade
referred incorrectly to sequences since a7e5457.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com
Backpatch-through: 11
2023-01-03 16:26:14 +09:00
Tom Lane 92957ed98c Avoid reference to nonexistent array element in ExecInitAgg().
When considering an empty grouping set, we fetched
phasedata->eqfunctions[-1].  Because the eqfunctions array is
palloc'd, that would always be an aset pointer in released versions,
and thus the code accidentally failed to malfunction (since it would
do nothing unless it found a null pointer).  Nonetheless this seems
like trouble waiting to happen, so add a check for length == 0.

It's depressing that our valgrind testing did not catch this.
Maybe we should reconsider the choice to not mark that word NOACCESS?

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com
2023-01-02 16:17:00 -05:00
Bruce Momjian c8e1ba736b Update copyright for 2023
Backpatch-through: 11
2023-01-02 15:00:37 -05:00
Peter Geoghegan 325bc54eed Adjust VACUUM hastup LP_REDIRECT comments.
The term "truncation" has been ambiguous since commit 10a8d13823 added
line pointer array truncation during heap pruning.  Clear things up by
specifying that we're talking about rel truncation here, to match nearby
comments that apply to tuples with storage.
2023-01-02 10:18:22 -08:00
Peter Geoghegan 6daeeb1f91 Avoid special XID snapshotConflictHorizon values.
Don't allow VACUUM to WAL-log the value FrozenTransactionId as the
snapshotConflictHorizon of freezing or visibility map related WAL
records.

The only special XID value that's an allowable snapshotConflictHorizon
is InvalidTransactionId, which is interpreted as "record definitely
doesn't require a recovery conflict".

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
2023-01-02 10:16:51 -08:00
Peter Eisentraut e351f85418 Push lpp variable closer to usage in heapgetpage()
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
2023-01-02 09:39:31 +01:00
Tom Lane 2ceea5adb0 Accept "+infinity" in date and timestamp[tz] input.
The float and numeric types accept this variant spelling of
"infinity", so it seems like the datetime types should too.

Vik Fearing, some cosmetic mods by me

Discussion: https://postgr.es/m/d0bef637-2dbd-0a5d-e539-48243b6f6c5e@postgresfriends.org
2023-01-01 14:16:07 -05:00
Tomas Vondra 02699bc1fd Fix assert in BRIN build_distances
When brin_minmax_multi_union merges summaries, we may end up with just a
single range after merge_overlapping_ranges. The summaries may contain
just one range each, and they may overlap (or be exactly the same).

With a single range there's no distance to calculate, but we happen to
call build_distances anyway - which is fine, we don't calculate the
distance in this case, except that with asserts this failed due to a
check there are at least two ranges.

The assert is unnecessarily strict, so relax it a bit and bail out if
there's just a single range. The relaxed assert would be enough, but
this way we don't allocate unnecessary memory for distance.

Backpatch to 14, where minmax-multi opclasses were introduced.

Reported-by: Jaime Casanova
Backpatch-through: 14
Discussion: https://postgr.es/m/YzVA55qS0hgz8P3r@ahch-to
2022-12-30 20:49:50 +01:00
Michael Paquier 7aa81c61ec Fix precision handling for some COERCE_SQL_SYNTAX functions
f193883 has been incorrectly setting up the precision used in the
timestamp compilations returned by the following functions:
- LOCALTIME
- LOCALTIMESTAMP
- CURRENT_TIME
- CURRENT_TIMESTAMP

Specifying an out-of-range precision for CURRENT_TIMESTAMP and
LOCALTIMESTAMP was raising a WARNING without adjusting the precision,
leading to a subsequent error.  LOCALTIME and CURRENT_TIME raised a
WARNING without an error, still the precision given to the internal
routines was not correct, so let's be clean.

Ian has reported the problems in timestamp.c, while I have noticed the
ones in date.c.  Regression tests are added for all of them with
precisions high enough to provide coverage for the warnings, something
that went missing up to this commit.

Author: Ian Lawrence Barwick, Michael Paquier
Discussion: https://postgr.es/m/CAB8KJ=jQEnn9sYG+N752spt68wMrhmT-ocHCh4oeNmHF82QMWA@mail.gmail.com
2022-12-30 20:47:57 +09:00
Peter Eisentraut 1f605b82ba Change argument of appendBinaryStringInfo from char * to void *
There is some code that uses this function to assemble some kind of
packed binary layout, which requires a bunch of casts because of this.
Functions taking binary data plus length should take void * instead,
like memcpy() for example.

Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
2022-12-30 11:05:09 +01:00
Peter Eisentraut 33a33f0ba4 Use appendStringInfoString instead of appendBinaryStringInfo where possible
For the jsonpath output, we don't need to squeeze out every bit of
performance, so instead use a more robust coding style.  There are
similar calls in jsonb.c, which we leave alone here since there is
indeed a performance impact for bulk exports.

Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
2022-12-30 11:05:09 +01:00
Peter Eisentraut faf3750657 Add const to BufFileWrite
Make data buffer argument to BufFileWrite a const pointer and bubble
this up to various callers and related APIs.  This makes the APIs
clearer and more consistent.

Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
2022-12-30 10:12:24 +01:00
Peter Eisentraut 5f2f99c9c6 Remove unnecessary casts
Some code carefully cast all data buffer arguments for data write and
read function calls to void *, even though the respective arguments
are already void *.  Remove this unnecessary clutter.

Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
2022-12-30 10:12:24 +01:00
Peter Geoghegan 1de58df4fe Add page-level freezing to VACUUM.
Teach VACUUM to decide on whether or not to trigger freezing at the
level of whole heap pages.  Individual XIDs and MXIDs fields from tuple
headers now trigger freezing of whole pages, rather than independently
triggering freezing of each individual tuple header field.

Managing the cost of freezing over time now significantly influences
when and how VACUUM freezes.  The overall amount of WAL written is the
single most important freezing related cost, in general.  Freezing each
page's tuples together in batch allows VACUUM to take full advantage of
the freeze plan WAL deduplication optimization added by commit 9e540599.

Also teach VACUUM to trigger page-level freezing whenever it detects
that heap pruning generated an FPI.  We'll have already written a large
amount of WAL just to do that much, so it's very likely a good idea to
get freezing out of the way for the page early.  This only happens in
cases where it will directly lead to marking the page all-frozen in the
visibility map.

In most cases "freezing a page" removes all XIDs < OldestXmin, and all
MXIDs < OldestMxact.  It doesn't quite work that way in certain rare
cases involving MultiXacts, though.  It is convenient to define "freeze
the page" in a way that gives FreezeMultiXactId the leeway to put off
the work of processing an individual tuple's xmax whenever it happens to
be a MultiXactId that would require an expensive second pass to process
aggressively (allocating a new multi is especially worth avoiding here).
FreezeMultiXactId is eager when processing is cheap (as it usually is),
and lazy in the event of an individual multi that happens to require
expensive second pass processing.  This avoids regressions related to
processing of multis that page-level freezing might otherwise cause.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
2022-12-28 08:50:47 -08:00
Tom Lane adb5c32eb5 Suppress uninitialized-variable warning from a61b1f748.
Some compilers complain about sub_rteperminfos not being
initialized, evidently because they don't detect that it
is only used and set if isGeneralSelect is true.
Make it follow the long-established pattern for its
sibling variable sub_rtable.

Per reports from Pavel Stehule and the buildfarm.

Discussion: https://postgr.es/m/CAFj8pRDOvGOi-n616kM0Cc7qSbg_nGoS=-haB+D785sUXADqSg@mail.gmail.com
2022-12-27 18:07:48 -05:00
Tom Lane a5434c5258 Remove new locale dependency in regproc regression test.
The modified error message for regcollationin failure includes
the database encoding, which it should've occurred to me is a
portability hazard for the regression tests.  Adjust the test
so the expected output doesn't include that.

In passing, fix a comment typo introduced in b8c0ffbd2.

Per buildfarm.
2022-12-27 13:06:42 -05:00
Tom Lane 3ea7329c9a Simplify the implementations of the to_reg* functions.
Given the soft-input-error feature, we can reduce these functions
to be just thin wrappers around a soft-error call of the
corresponding datatype input function.  This means less code and
more certainty that the to_reg* functions match the normal input
behavior.

Notably, it also means that they will accept numeric OID input,
which they didn't before.  It's not clear to me if that omission
had more than laziness behind it, but it doesn't seem like
something we need to work hard to preserve.

Discussion: https://postgr.es/m/3910031.1672095600@sss.pgh.pa.us
2022-12-27 12:33:04 -05:00
Tom Lane 858e776c84 Convert the reg* input functions to report (most) errors softly.
This is not really complete, but it catches most cases of practical
interest.  The main omissions are:

* regtype, regprocedure, and regoperator parse type names by
calling the main grammar, so any grammar-detected syntax error
will still be a hard error.  Also, if one includes a type
modifier in such a type specification, errors detected by the
typmodin function will be hard errors.

* Lookup errors are handled just by passing missing_ok = true
to the relevant catalog lookup function.  Because we've used
quite a restrictive definition of "missing_ok", this means that
edge cases such as "the named schema exists, but you lack
USAGE permission on it" are still hard errors.

It would make sense to me to replace most/all missing_ok
parameters with an escontext parameter and then allow these
additional lookup failure cases to be trapped too.  But that's
a job for some other day.

Discussion: https://postgr.es/m/3342239.1671988406@sss.pgh.pa.us
2022-12-27 12:26:01 -05:00
Tom Lane 78212f2101 Convert tsqueryin and tsvectorin to report errors softly.
This is slightly tedious because the adjustments cascade through
a couple of levels of subroutines, but it's not very hard.
I chose to avoid changing function signatures more than absolutely
necessary, by passing the escontext pointer in existing structs
where possible.

tsquery's nuisance NOTICEs about empty queries are suppressed in
soft-error mode, since they're not errors and we surely don't want
them to be shown to the user anyway.  Maybe that whole behavior
should be reconsidered.

Discussion: https://postgr.es/m/3824377.1672076822@sss.pgh.pa.us
2022-12-27 12:00:31 -05:00
Tom Lane eb8312a22a Detect bad input for types xid, xid8, and cid.
Historically these input functions just called strtoul or strtoull
and returned the result, with no error detection whatever.  Upgrade
them to reject garbage input and out-of-range values, similarly to
our other numeric input routines.

To share the code for this with type oid, adjust the existing
"oidin_subr" to be agnostic about the SQL name of the type it is
handling, and move it to numutils.c; then clone it for 64-bit types.

Because the xid types previously accepted hex and octal input by
reason of calling strtoul[l] with third argument zero, I made the
common subroutine do that too, with the consequence that type oid
now also accepts hex and octal input.  In view of 6fcda9aba, that
seems like a good thing.

While at it, simplify the existing over-complicated handling of
syntax errors from strtoul: we only need one ereturn not three.

Discussion: https://postgr.es/m/3526121.1672000729@sss.pgh.pa.us
2022-12-27 11:40:01 -05:00
Peter Geoghegan 63c844a0a5 Remove overzealous MultiXact freeze assertion.
When VACUUM determines that an existing MultiXact should use a freeze
plan that sets xmax to InvalidTransactionId, the original Multi may or
may not be before OldestMxact.  Remove an incorrect assertion that
expected it to always be from before OldestMxact.

Oversight in commit 4ce3af.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/TYAPR01MB5866B24104FD80B5D7E65C3EF5ED9@TYAPR01MB5866.jpnprd01.prod.outlook.com
2022-12-26 23:36:02 -08:00
Amit Kapila 5de94a041e Add 'logical_decoding_mode' GUC.
This enables streaming or serializing changes immediately in logical
decoding. This parameter is intended to be used to test logical decoding
and replication of large transactions for which otherwise we need to
generate the changes till logical_decoding_work_mem is reached.

This helps in reducing the timing of existing tests related to logical
replication of in-progress transactions and will help in writing tests for
for the upcoming feature for parallelly applying large in-progress
transactions.

Author: Shi yu
Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi
Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-12-26 08:58:16 +05:30
Tom Lane 442e25d248 Convert enum_in() to report errors softly.
I missed this in my initial survey, probably because I examined
the contents of pg_type in the postgres database, which lacks
any enumerated types.

Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
2022-12-25 14:32:30 -05:00
Andrew Dunstan e37fe1db6e Convert jsonpath's input function to report errors softly
Reviewed by Tom Lane

Discussion: https://postgr.es/m/a8dc5700-c341-3ba8-0507-cc09881e6200@dunslane.net
2022-12-24 15:21:20 -05:00
Tom Lane 780ec9f1b2 Make the numeric-OID cases of regprocin and friends be non-throwing.
While at it, use a common subroutine already.

This doesn't move the needle very far in terms of making these
functions non-throwing; the only case we're now able to trap is
numeric-OID-is-out-of-range.  Still, it seems like a pretty
non-controversial step in that direction.
2022-12-24 15:01:21 -05:00
David Rowley bbfdf7180d Fix bug in translate_col_privs_multilevel
Fix incorrect code which was trying to convert a Bitmapset of columns at
the attnums according to a parent table and transform them into the
equivalent Bitmapset with same attnums according to the given child table.
This code is new as of a61b1f748 and was failing to do the correct
translation when there was an intermediate parent table between 'rel' and
'top_parent_rel'.

Reported-by: Ranier Vilela
Author: Richard Guo, Amit Langote
Discussion: https://postgr.es/m/CAEudQArohfB_Gy%2BhcH2-bANUkxgjJiP%3DABq01_LgTNTbcNijag%40mail.gmail.com
2022-12-24 00:58:34 +13:00
Thomas Munro b5d0f8ec01 Allow parent's WaitEventSets to be freed after fork().
An epoll fd belonging to the parent should be closed in the child.  A
kqueue fd is automatically closed by fork(), but we should still adjust
our counter.  For poll and Windows systems, nothing special is required.
On all systems we free the memory.

No caller yet, but we'll need this if we start using WaitEventSet in the
postmaster as planned.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
2022-12-23 20:34:03 +13:00
Thomas Munro 1f0019de2f Don't leak a signalfd when using latches in the postmaster.
At the time of commit 6a2a70a02 we didn't use latch infrastructure in
the postmaster.  We're planning to start doing that, so we'd better make
sure that the signalfd inherited from a postmaster is not duplicated and
then leaked in the child.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
2022-12-23 20:24:41 +13:00
Thomas Munro 30829e52ff Add WL_SOCKET_ACCEPT event to WaitEventSet API.
To be able to handle incoming connections on a server socket with
the WaitEventSet API, we'll need a new kind of event to indicate that
the the socket is ready to accept a connection.

On Unix, it's just the same as WL_SOCKET_READABLE, but on Windows there
is a different underlying kernel event that we need to map our
abstraction to.

No user yet, but a proposed patch would use this.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
2022-12-23 20:21:47 +13:00
Michael Paquier 2fcf685f6d Fix come incorrect elog() messages in aclchk.c
Three error strings used with cache lookup failures were referring to
incorrect object types for ACL checks:
- Schemas
- Types
- Foreign Servers
There errors should never be triggered, but if they do incorrect
information would be reported.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20221222153041.GN1153@telsasoft.com
Backpatch-through: 11
2022-12-23 10:04:18 +09:00
Michael Paquier 13e0d7a603 Rename pg_dissect_walfile_name() to pg_split_walfile_name()
The former name was discussed as being confusing, so use "split", as per
a suggestion from Magnus Hagander.

While on it, one of the output arguments is renamed from "segno" to
"segment_number", as per a suggestion from Kyotaro Horiguchi.

The documentation is updated to reflect all these changes.

Bump catalog version.

Author: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/CABUevEytQVaOOhGdoh0D7hGwe3fuKcRF6NthsSW7ww04EmtFgQ@mail.gmail.com
2022-12-23 09:15:01 +09:00
David Rowley ed1a88ddac Allow window functions to adjust their frameOptions
WindowFuncs such as row_number() don't care if it's called with ROWS
UNBOUNDED PRECEDING AND CURRENT ROW or with RANGE UNBOUNDED PRECEDING AND
CURRENT ROW.  The latter is less efficient as the RANGE option requires
that the executor check for peer rows, so using the ROW option instead
would cause less overhead.  Because RANGE is part of the default frame
options for WindowClauses, it means WindowAgg is, by default, working much
harder than it needs to for window functions where the ROWS / RANGE option
has no effect on the window function's result.

On a test query from the discussion thread, a performance improvement of
344% was seen by using ROWS instead of RANGE.

Here we add a new support function node type to allow support functions to
be called for window functions so that the most optimal version of the
frame options can be set.  The planner has been adjusted so that the frame
options are changed only if all window functions sharing the same window
clause agree on what the optimized frame options are.

Here we give the ability for row_number(), rank(), dense_rank(),
percent_rank(), cume_dist() and ntile() to alter their WindowClause's
frameOptions.

Reviewed-by: Vik Fearing, Erwin Brandstetter, Zhihong Yu
Discussion: https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6+S88z+A@mail.gmail.com
2022-12-23 12:43:52 +13:00
Thomas Munro cc15059634 Improve notation of cacheinfo table in syscache.c.
Use C99 designated initializer syntax for the array elements, instead of
writing the enumerator name and position in a comment.  Replace nkeys
and key with a local variadic macro, for a shorter notation.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/CA%2BhUKGKdpDjKL2jgC-GpoL4DGZU1YPqnOFHbDqFkfRQcPaR5DQ%40mail.gmail.com
2022-12-23 10:40:18 +13:00
Peter Geoghegan 07eef53955 Use scanned_pages to decide when to failsafe check.
Perform a failsafe check every time VACUUM's first heap scan scans a
further FAILSAFE_EVERY_PAGES pages, rather than using an approach based
on the number of physical blocks that our current blkno is from the
blkno at the time of the previous failsafe check.  That way VACUUM will
perform a failsafe check every time it has scanned a uniform number of
pages, without it mattering when or how VACUUM skipped pages using the
visibility map.

Sami Imseih, with changes to FAILSAFE_EVERY_PAGES comments added by me.

Author: Sami Imseih <simseih@amazon.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/401CE010-4049-4B94-9961-0B610A5D254D%40amazon.com
2022-12-22 10:41:40 -08:00
Peter Geoghegan 4ce3afb82e Refactor how VACUUM passes around its XID cutoffs.
Use a dedicated struct for the XID/MXID cutoffs used by VACUUM, such as
FreezeLimit and OldestXmin.  This state is initialized in vacuum.c, and
then passed around by code from vacuumlazy.c to heapam.c freezing
related routines.  The new convention is that everybody works off of the
same cutoff state, which is passed around via pointers to const.

Also simplify some of the logic for dealing with frozen xmin in
heap_prepare_freeze_tuple: add dedicated "xmin_already_frozen" state to
clearly distinguish xmin XIDs that we're going to freeze from those that
were already frozen from before.  That way the routine's xmin handling
code is symmetrical with the existing xmax handling code.  This is
preparation for an upcoming commit that will add page level freezing.

Also refactor the control flow within FreezeMultiXactId(), while adding
stricter sanity checks.  We now test OldestXmin directly, instead of
using FreezeLimit as an inexact proxy for OldestXmin.  This is further
preparation for the page level freezing work, which will make the
function's caller cede control of page level freezing to the function
where appropriate (where heap_prepare_freeze_tuple sees a tuple that
happens to contain a MultiXactId in its xmax).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CAH2-WznS9TxXmz2_=SY+SyJyDFbiOftKofM9=aDo68BbXNBUMA@mail.gmail.com
2022-12-22 09:37:59 -08:00
Tom Lane e42e312430 Avoid O(N^2) cost when pulling up lots of UNION ALL subqueries.
perform_pullup_replace_vars() knows how to scan the whole parent
query tree when we are replacing Vars during a subquery flattening
operation.  However, for the specific case of flattening a
UNION ALL leaf query, that's mostly wasted work: the only place
where relevant Vars could exist is in the AppendRelInfo that we
just made for this leaf.  Teaching perform_pullup_replace_vars()
to just deal with that and exit is worthwhile because, if we have
N such subqueries to pull up, we were spending O(N^2) work uselessly
mutating the AppendRelInfos for all the other subqueries.

While we're at it, avoid calling substitute_phv_relids if there are no
PlaceHolderVars, and remove an obsolete check of parse->hasSubLinks.

Andrey Lepikhov and Tom Lane

Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
2022-12-22 11:02:03 -05:00
Tom Lane 5beb7881fb Add some recursion and looping defenses in prepjointree.c.
Andrey Lepikhov demonstrated a case where we spend an unreasonable
amount of time in pull_up_subqueries().  Not only is that recursing
with no explicit check for stack overrun, but the code seems not
interruptable by control-C.  Let's stick a CHECK_FOR_INTERRUPTS
there, along with sprinkling some stack depth checks.

An actual fix for the excessive time consumption seems a bit
risky to back-patch; but this isn't, so let's do so.

Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
2022-12-22 10:35:02 -05:00
Peter Eisentraut 73bb72f0ca Remove dead code
The second appearance of NamespaceRelationId in this if-else chain is
in error and can be removed.
2022-12-22 08:12:41 +01:00
Michael Paquier 01be9d498f Fix operator typo in tablecmds.c
A bitwise operator was getting used on two bools in
ATAddCheckConstraint() to track if constraints should be merged or not
with the existing ones of a relation, though obviously this should use
a boolean OR operator.  This led to the same result, but let's be
clean.

Oversight in 074c5cf.

Author: Ranier Vilela
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAEudQAp2R2fbbi0OHHhv_n4=Ch0t1VtjObR9YMqtGKHJ+faUFQ@mail.gmail.com
2022-12-22 12:08:45 +09:00
David Rowley 439f61757f Add palloc_aligned() to allow aligned memory allocations
This introduces palloc_aligned() and MemoryContextAllocAligned() which
allow callers to obtain memory which is allocated to the given size and
also aligned to the specified alignment boundary.  The alignment
boundaries may be any power-of-2 value.  Currently, the alignment is
capped at 2^26, however, we don't expect values anything like that large.
The primary expected use case is to align allocations to perhaps CPU
cache line size or to maybe I/O page size.  Certain use cases can benefit
from having aligned memory by either having better performance or more
predictable performance.

The alignment is achieved by requesting 'alignto' additional bytes from
the underlying allocator function and then aligning the address that is
returned to the requested alignment.  This obviously does waste some
memory, so alignments should be kept as small as what is required.

It's also important to note that these alignment bytes eat into the
maximum allocation size.  So something like:

palloc_aligned(MaxAllocSize, 64, 0);

will not work as we cannot request MaxAllocSize + 64 bytes.

Additionally, because we're just requesting the requested size plus the
alignment requirements from the given MemoryContext, if that context is
the Slab allocator, then since slab can only provide chunks of the size
that's specified when the slab context is created, then this is not going
to work.  Slab will generate an error to indicate that the requested size
is not supported.

The alignment that is requested in palloc_aligned() is stored along with
the allocated memory.  This allows the alignment to remain intact through
repalloc() calls.

Author: Andres Freund, David Rowley
Reviewed-by: Maxim Orlov, Andres Freund, John Naylor
Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com
2022-12-22 13:32:05 +13:00
Andrew Dunstan 33dd895ef3 Introduce float4in_internal
This is the guts of float4in, callable as a routine to input floats,
which will be useful in an upcoming patch for allowing soft errors in
the seg module's input function.

A similar operation was performed some years ago for float8in in
commit 50861cd683.

Reviewed by Tom Lane

Discussion: https://postgr.es/m/cee4e426-d014-c0b7-aa22-a659f2cd9130@dunslane.net
2022-12-21 16:55:52 -05:00
David Rowley eb706fde83 Fix newly introduced bug in slab.c
d21ded75f changed the way slab.c works but introduced a bug that meant we
could end up with the slab's curBlocklistIndex pointing to the wrong list.
The condition which was checking for this was failing to account for two
things:

1. The curBlocklistIndex could be 0 as we've currently got no non-full
blocks to put chunks on.  In this case, the dlist_is_empty() check cannot
be performed as there can be any number of completely full blocks at that
index.

2. The curBlocklistIndex may be greater than the index we just moved the
block onto.  Since we need to ensure we fill up fuller blocks first, we
must reset curBlocklistIndex when changing any blocklist element that's
less than the curBlocklistIndex too.

Reported-by: Takamichi Osumi
Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
2022-12-22 09:57:49 +13:00
Michael Paquier 22e3b55805 Switch some system functions to use get_call_result_type()
This shaves some code by replacing the combinations of
CreateTemplateTupleDesc()/TupleDescInitEntry() hardcoding a mapping of
the attributes listed in pg_proc.dat by get_call_result_type() to build
the TupleDesc needed for the rows generated.

get_call_result_type() is more expensive than the former style, but this
removes some duplication with the lists of OUT parameters (pg_proc.dat
and the attributes hardcoded in these code paths).  This is applied to
functions that are not considered as critical (aka that could be called
repeatedly for monitoring purposes).

Author: Bharath Rupireddy
Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com
2022-12-21 10:11:22 +09:00
Andrew Dunstan 8284cf5f74 Add copyright notices to meson files
Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
2022-12-20 07:54:39 -05:00
Etsuro Fujita 594f8d3776 Allow batching of inserts during cross-partition updates.
Commit 927f453a9 disallowed batching added by commit b663a4136 to be
used for the inserts performed as part of cross-partition updates of
partitioned tables, mainly because the previous code in
nodeModifyTable.c couldn't handle pending inserts into foreign-table
partitions that are also UPDATE target partitions.  But we don't have
such a limitation anymore (cf. commit ffbb7e65a), so let's allow for
this by removing from execPartition.c the restriction added by commit
927f453a9 that batching is only allowed if the query command type is
CMD_INSERT.

In postgres_fdw, since commit 86dc90056 changed it to effectively
disable cross-partition updates in the case where a foreign-table
partition chosen to insert rows into is also an UPDATE target partition,
allow batching in the case where a foreign-table partition chosen to
do so is *not* also an UPDATE target partition.  This is enabled by the
"batch_size" option added by commit b663a4136, which is disabled by
default.

This patch also adjusts the test case added by commit 927f453a9 to
confirm that the inserts performed as part of a cross-partition update
of a partitioned table indeed uses batching.

Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu,
Bharath Rupireddy, Hou Zhijie, Vignesh C, and me.

Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com
2022-12-20 19:05:00 +09:00
David Rowley 3226f47282 Add enable_presorted_aggregate GUC
1349d279 added query planner support to allow more efficient execution of
aggregate functions which have an ORDER BY or a DISTINCT clause.  Prior to
that commit, the planner would only request that the lower planner produce
a plan with the order required for the GROUP BY clause and it would be
left up to nodeAgg.c to perform the final sort of records within each
group so that the aggregate transition functions were called in the
correct order.  Now that the planner requests the lower planner produce a
plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind,
there is the possibility that the planner chooses a plan which could be
less efficient than what would have been produced before 1349d279.

While developing 1349d279, I had in mind that Incremental Sort would help
us in cases where an index exists only on the GROUP BY column(s).
Incremental Sort would just replace the implicit tuplesorts which are
being performed in nodeAgg.c.  However, because the planner has the
flexibility to instead choose a plan which just performs a full sort on
both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
potential for the planner to make a bad choice.  The costing for
Incremental Sort is not perfect as it assumes an even distribution of rows
to sort within each sort group.

Here we add an escape hatch in the form of the enable_presorted_aggregate
GUC.  This will allow users to get the pre-PG16 behavior in cases where
they have no other means to convince the query planner to produce a plan
which only sorts on the GROUP BY column(s).

Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com
2022-12-20 22:28:58 +13:00
David Rowley d21ded75fd Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators.  Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory.  Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.

Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.

Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks.  When a block is first
allocated all chunks are unused.  These chunks only make it onto the
free list when they are pfree'd.  When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk.  When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.

Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists.  We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements.  This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.

Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks.  We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again.  Now only once we have 10 of these, we free
the block.  This does raise the high water mark for the total memory that
a slab context can consume.  It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant.  Let's wait and see if there is any evidence to
support that this is required before doing it.

Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 21:48:51 +13:00
John Naylor 995a9fb14f Move variable increment to the end of the loop
This is less error prone and matches the placement of other code
in the file.

Justin Pryzby

Reviewed by Tom Lane
Discussion: https://www.postgresql.org/message-id/20221123172436.GJ11463@telsasoft.com
2022-12-20 14:13:14 +07:00
Michael Paquier cca1863489 Add pg_dissect_walfile_name()
This function takes in input a WAL segment name and returns a tuple made
of the segment sequence number (dependent on the WAL segment size of the
cluster) and its timeline, as of a thin SQL wrapper around the existing
XLogFromFileName().

This function has multiple usages, like being able to compile a LSN from
a file name and an offset, or finding the timeline of a segment without
having to do to some maths based on the first eight characters of the
segment.

Bump catalog version.

Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
2022-12-20 13:36:27 +09:00
Michael Paquier b3bb7d12af Remove hardcoded dependency to cryptohash type in the internals of SCRAM
SCRAM_KEY_LEN was a variable used in the internal routines of SCRAM to
size a set of fixed-sized arrays used in the SHA and HMAC computations
during the SASL exchange or when building a SCRAM password.  This had a
hard dependency on SHA-256, reducing the flexibility of SCRAM when it
comes to the addition of more hash methods.  A second issue was that
SHA-256 is assumed as the cryptohash method to use all the time.

This commit renames SCRAM_KEY_LEN to a more generic SCRAM_KEY_MAX_LEN,
which is used as the size of the buffers used by the internal routines
of SCRAM.  This is aimed at tracking centrally the maximum size
necessary for all the hash methods supported by SCRAM.  A global
variable has the advantage of keeping the code in its simplest form,
reducing the need of more alloc/free logic for all the buffers used in
the hash calculations.

A second change is that the key length (SHA digest length) and hash
types are now tracked by the state data in the backend and the frontend,
the common portions being extended to handle these as arguments by the
internal routines of SCRAM.  There are a few RFC proposals floating
around to extend the SCRAM protocol, including some to use stronger
cryptohash algorithms, so this lifts some of the existing restrictions
in the code.

The code in charge of parsing and building SCRAM secrets is extended to
rely on the key length and on the cryptohash type used for the exchange,
assuming currently that only SHA-256 is supported for the moment.  Note
that the mock authentication simply enforces SHA-256.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut, Jonathan Katz
Discussion: https://postgr.es/m/Y5k3Qiweo/1g9CG6@paquier.xyz
2022-12-20 08:53:22 +09:00
Robert Haas 10ea0f924a Expose some information about backend subxact status.
A new function pg_stat_get_backend_subxact() can be used to get
information about the number of subtransactions in the cache of
a particular backend and whether that cache has overflowed. This
can be useful for tracking down performance problems that can
result from overflowed snapshots.

Dilip Kumar, reviewed by Zhihong Yu, Nikolay Samokhvalov,
Justin Pryzby, Nathan Bossart, Ashutosh Sharma, Julien
Rouhaud. Additional design comments from Andres Freund,
Tom Lane, Bruce Momjian, and David G. Johnston.

Discussion: http://postgr.es/m/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN=Q@mail.gmail.com
2022-12-19 14:43:09 -05:00
Tom Lane 935277b241 Fix inability to reference CYCLE column from inside its CTE.
Such references failed with "cache lookup failed for type 0"
because we didn't resolve the type of the CYCLE column until after
analyzing the CTE's query.  We can just move that processing
to before the recursive parse_sub_analyze call, though.

While here, invent a couple of local variables to make this
code less egregiously wider-than-80-columns.

Per bug #17723 from Vik Fearing.  Back-patch to v14 where
the CYCLE feature was added.

Discussion: https://postgr.es/m/17723-2c4985ff111e7bba@postgresql.org
2022-12-16 13:07:42 -05:00
Tom Lane c4939f1215 Clean up dubious error handling in wellformed_xml().
This ancient bit of code was summarily trapping any ereport longjmp
whatsoever and assuming that it must represent an invalid-XML report.
It's not really appropriate to handle OOM-like situations that way:
maybe the input is valid or maybe not, but we couldn't find out.
And it'd be a seriously bad idea to ignore, say, a query cancel
error that way.  (Perhaps that can't happen because there is no
CHECK_FOR_INTERRUPTS anywhere within xml_parse, but even if that's
true today it's obviously a very fragile assumption.)

But in the wake of the previous commit, we can drop the PG_TRY
here altogether, and use the soft error mechanism to catch only
the kinds of errors that are legitimate to treat as invalid-XML.

(This is our first use of the soft error mechanism for something
not directly related to a datatype input function.  It won't be
the last.)

xml_is_document can be converted in the same way.  That one is
not actively broken, because it was checking specifically for
ERRCODE_INVALID_XML_DOCUMENT rather than trapping everything;
but the code is still shorter and probably faster this way.

Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us
2022-12-16 11:10:40 -05:00
Tom Lane 37bef842f5 Convert xml_in to report errors softly.
The key idea here is that xml_parse must distinguish hard errors
from soft errors.  We want to throw a hard error for libxml
initialization failures: those might be out-of-memory, or something
else, but in any case they are not the fault of the input string.
If we get to the point of parsing the input, and something goes
wrong, we can fairly consider that to mean bad input.

One thing that arguably does mean bad input, but I didn't trouble
to handle softly, is encoding conversion failure while converting
the server encoding to UTF8.  This might be something to improve
later, but it seems like a pretty low-probability scenario.

Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us
2022-12-16 11:10:40 -05:00
David Rowley 4a29eabd1d Remove pessimistic cost penalization from Incremental Sort
When incremental sorts were added in v13 a 1.5x pessimism factor was added
to the cost modal.  Seemingly this was done because the cost modal only
has an estimate of the total number of input rows and the number of
presorted groups.  It assumes that the input rows will be evenly
distributed throughout the presorted groups.  The 1.5x pessimism factor
was added to slightly reduce the likelihood of incremental sorts being
used in the hope to avoid performance regressions where an incremental
sort plan was picked and turned out slower due to a large skew in the
number of rows in the presorted groups.

An additional quirk with the path generation code meant that we could
consider both a sort and an incremental sort on paths with presorted keys.
This meant that with the pessimism factor, it was possible that we opted
to perform a sort rather than an incremental sort when the given path had
presorted keys.

Here we remove the 1.5x pessimism factor to allow incremental sorts to
have a fairer chance at being chosen against a full sort.

Previously we would generally create a sort path on the cheapest input
path (if that wasn't sorted already) and incremental sort paths on any
path which had presorted keys.  This meant that if the cheapest input path
wasn't completely sorted but happened to have presorted keys, we would
create a full sort path *and* an incremental sort path on that input path.
Here we change this logic so that if there are presorted keys, we only
create an incremental sort path, and create sort paths only when a full
sort is required.

Both the removal of the cost pessimism factor and the changes made to the
path generation make it more likely that incremental sorts will now be
chosen.  That, of course, as with teaching the planner any new tricks,
means an increased likelihood that the planner will perform an incremental
sort when it's not the best method.  Our standard escape hatch for these
cases is an enable_* GUC.  enable_incremental_sort already exists for
this.

This came out of a report by Pavel Luzanov where he mentioned that the
master branch was choosing to perform a Seq Scan -> Sort -> Group
Aggregate for his query with an ORDER BY aggregate function.  The v15 plan
for his query performed an Index Scan -> Group Aggregate, of course, the
aggregate performed the final sort internally in nodeAgg.c for the
aggregate's ORDER BY.  The ideal plan would have been to use the index,
which provided partially sorted input then use an incremental sort to
provide the aggregate with the sorted input.  This was not being chosen
due to the pessimism in the incremental sort cost modal, so here we remove
that and rationalize the path generation so that sort and incremental sort
plans don't have to needlessly compete.  We assume that it's senseless
to ever use a full sort on a given input path where an incremental sort
can be performed.

Reported-by: Pavel Luzanov
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/9f61ddbf-2989-1536-b31e-6459370a6baa%40postgrespro.ru
2022-12-16 15:22:23 +13:00
David Rowley ac99802080 Speed up creation of command completion tags
The building of command completion tags could often be seen showing up in
profiles when running high tps workloads.

The query completion tags were being built with snprintf, which is slow at
the best of times when compared with more manual ways of formatting
strings.  Here we introduce BuildQueryCompletionString() to do this job
for us.  We also now store the completion tag's strlen in the
CommandTagBehavior struct so that we can quickly memcpy this number of
bytes into the completion tag string.  Appending the rows affected is done
via pg_ulltoa_n.  BuildQueryCompletionString returns the length of the
built string.  This saves us having to call strlen to figure out how many
bytes to pass to pq_putmessage().

Author: David Rowley, Andres Freund
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CAHoyFK-Xwqc-iY52shj0G+8K9FJpse+FuZ36XBKy78wDVnd=Qg@mail.gmail.com
2022-12-16 10:31:25 +13:00
Tom Lane d35a1af468 Convert range_in and multirange_in to report errors softly.
This is mostly straightforward, except that if the range type
has a canonical function, that might throw an error during range
input.  (Such errors probably only occur for edge cases: in the
in-core canonical functions, it happens only if a bound has the
maximum valid value for the underlying type.)  Hence, this patch
extends the soft-error regime to allow canonical functions to
return errors softly as well.  Extensions implementing range
canonical functions will need modification anyway because of the
API change for range_serialize(); while at it, they might want
to do something similar to what's been done here in the in-core
canonical functions.

Discussion: https://postgr.es/m/3284599.1671075185@sss.pgh.pa.us
2022-12-15 12:18:36 -05:00
Peter Eisentraut 75f49221c2 Static assertions cleanup
Because we added StaticAssertStmt() first before StaticAssertDecl(),
some uses as well as the instructions in c.h are now a bit backwards
from the "native" way static assertions are meant to be used in C.
This updates the guidance and moves some static assertions to better
places.

Specifically, since the addition of StaticAssertDecl(), we can put
static assertions at the file level.  This moves a number of static
assertions out of function bodies, where they might have been stuck
out of necessity, to perhaps better places at the file level or in
header files.

Also, when the static assertion appears in a position where a
declaration is allowed, then using StaticAssertDecl() is more native
than StaticAssertStmt().

Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com
2022-12-15 10:10:32 +01:00
Tom Lane 3b9d2deb67 Convert a few more datatype input functions to report errors softly.
Convert the remaining string-category input functions
(bpcharin, varcharin, byteain) to the new style.

Discussion: https://postgr.es/m/3038346.1671060258@sss.pgh.pa.us
2022-12-14 19:42:05 -05:00
Tom Lane 90161dad4d Convert a few more datatype input functions to report errors softly.
Convert cash_in and uuid_in to the new style.

Amul Sul, minor mods by me

Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
2022-12-14 18:03:11 -05:00
Tom Lane 47f3f97fcd Convert a few more datatype input functions to report errors softly.
Convert assorted internal-ish datatypes, namely aclitemin,
int2vectorin, oidin, oidvectorin, pg_lsn_in, pg_snapshot_in,
and tidin to the new style.

(Some others you might expect to find in this group, such as
cidin and xidin, need no changes because they never throw
errors at all.  That seems a little cheesy ... but it is not in
the charter of this patch series to add new error conditions.)

Amul Sul, minor mods by me

Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
2022-12-14 17:50:24 -05:00
Tom Lane 332741e739 Convert the geometric input functions to report errors softly.
Convert box_in, circle_in, line_in, lseg_in, path_in, point_in,
and poly_in to the new style.

line_in still throws hard errors for overflows/underflows that can occur
when the input is specified as two points rather than in the canonical
"Ax + By + C = 0" style.  I'm not too concerned about that: it won't be
reached in normal dump/restore cases, and it's fairly debatable that
such conversion should ever have been made part of a type input function
in the first place.  But in any case, I don't want to extend the soft
error conventions into float.h without more discussion than this patch
has had.

Amul Sul, minor mods by me

Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
2022-12-14 16:10:20 -05:00
Tom Lane 17407a8eaa Convert a few more datatype input functions to report errors softly.
Convert bit_in, varbit_in, inet_in, cidr_in, macaddr_in, and
macaddr8_in to the new style.

Amul Sul, minor mods by me

Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
2022-12-14 13:22:08 -05:00
Peter Eisentraut b18c2decd7 Rearrange some static assertions for consistency
Put lengthof first.

Reported-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAHut+PsUDMySVRuRc=h+P5N3+=TGvj4W_mi32XXg9dt4o-BXbA@mail.gmail.com
2022-12-14 16:08:13 +01:00
Peter Eisentraut 6fcda9aba8 Non-decimal integer literals
Add support for hexadecimal, octal, and binary integer literals:

    0x42F
    0o273
    0b100101

per SQL:202x draft.

This adds support in the lexer as well as in the integer type input
functions.

Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
2022-12-14 06:17:07 +01:00
Jeff Davis 60684dd834 Add grantable MAINTAIN privilege and pg_maintain role.
Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER,
and LOCK TABLE.

Effectively reverts 4441fc704d. Instead of creating separate
privileges for VACUUM, ANALYZE, and other maintenance commands, group
them together under a single MAINTAIN privilege.

Author: Nathan Bossart
Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13
Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us
2022-12-13 17:33:28 -08:00
Tom Lane 20432f8731 Rethink handling of [Prevent|Is]InTransactionBlock in pipeline mode.
Commits f92944137 et al. made IsInTransactionBlock() set the
XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false",
on the grounds that that kept its API promises equivalent to those of
PreventInTransactionBlock().  This turns out to be a bad idea though,
because it allows an ANALYZE in a pipelined series of commands to
cause an immediate commit, which is unexpected.

Furthermore, if we return "false" then we have another issue,
which is that ANALYZE will decide it's allowed to do internal
commit-and-start-transaction sequences, thus possibly unexpectedly
committing the effects of previous commands in the pipeline.

To fix the latter situation, invent another transaction state flag
XACT_FLAGS_PIPELINING, which explicitly records the fact that we
have executed some extended-protocol command and not yet seen a
commit for it.  Then, require that flag to not be set before allowing
InTransactionBlock() to return "false".

Having done that, we can remove its setting of NEEDIMMEDIATECOMMIT
without fear of causing problems.  This means that the API guarantees
of IsInTransactionBlock now diverge from PreventInTransactionBlock,
which is mildly annoying, but it seems OK given the very limited usage
of IsInTransactionBlock.  (In any case, a caller preferring the old
behavior could always set NEEDIMMEDIATECOMMIT for itself.)

For consistency also require XACT_FLAGS_PIPELINING to not be set
in PreventInTransactionBlock.  This too is meant to prevent commands
such as CREATE DATABASE from silently committing previous commands
in a pipeline.

Per report from Peter Eisentraut.  As before, back-patch to all
supported branches (which sadly no longer includes v10).

Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com
2022-12-13 14:23:58 -05:00
Peter Eisentraut 369f09e420 Refactor ExecGrant_*() functions
Instead of half a dozen of mostly-duplicate ExecGrant_Foo() functions,
write one common function ExecGrant_generic() that can handle most of
them.  We already have all the information we need, such as which
system catalog corresponds to which catalog table and which column is
the ACL column.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://www.postgresql.org/message-id/flat/22c7e802-4e7d-8d87-8b71-cba95e6f4bcf%40enterprisedb.com
2022-12-13 07:52:04 +01:00
Tom Lane b0feda79fd Fix jsonb subscripting to cope with toasted subscript values.
jsonb_get_element() was incautious enough to use VARDATA() and
VARSIZE() directly on an arbitrary text Datum.  That of course
fails if the Datum is short-header, compressed, or out-of-line.
The typical result would be failing to match any element of a
jsonb object, though matching the wrong one seems possible as well.

setPathObject() was slightly brighter, in that it used VARDATA_ANY
and VARSIZE_ANY_EXHDR, but that only kept it out of trouble for
short-header Datums.  push_path() had the same issue.  This could
result in faulty subscripted insertions, though keys long enough to
cause a problem are likely rare in the wild.

Having seen these, I looked around for unsafe usages in the rest
of the adt/json* files.  There are a couple of places where it's not
immediately obvious that the Datum can't be compressed or out-of-line,
so I added pg_detoast_datum_packed() to cope if it is.  Also, remove
some other usages of VARDATA/VARSIZE on Datums we just extracted from
a text array.  Those aren't actively broken, but they will become so
if we ever start allowing short-header array elements, which does not
seem like a terribly unreasonable thing to do.  In any case they are
not great coding examples, and they could also do with comments
pointing out that we're assuming we don't need pg_detoast_datum_packed.

Per report from exe-dealer@yandex.ru.  Patch by me, but thanks to
David Johnston for initial investigation.  Back-patch to v14 where
jsonb subscripting was introduced.

Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru
2022-12-12 16:17:54 -05:00
Robert Haas 45f5c81ad2 Fix failure to advance content pointer in sendFileWithContent.
If sendFileWithContent were used to send a file larger than the
bbsink buffer size, this would result in corruption. The only
files that are sent via sendFileWithContent are the backup label
file, the tablespace map file, and .done files for WAL segments
included in the backup. Of these, it seems that only the
tablespace_map file can become large enough to cause a problem,
and then only if you have a lot of tablespaces. If you do have
that situation, you might end up with a corrupted
tablespace_map file, which would be bad.

My commit bef47ff85d introduced
this problem.

Report and patch by Antonin Houska.

Discussion: http://postgr.es/m/15764.1670528645@antos
2022-12-12 10:26:48 -05:00
Peter Eisentraut df8b8968d4 Order getopt arguments
Order the letters in the arguments of getopt() and getopt_long(), as
well as in the subsequent switch statements.  In most cases, I used
alphabetical with lower case first.  In a few cases, existing
different orders (e.g., upper case first) was kept to reduce the diff
size.

Discussion: https://www.postgresql.org/message-id/flat/3efd0fe8-351b-f836-9122-886002602357%40enterprisedb.com
2022-12-12 15:20:00 +01:00
Alvaro Herrera 840ff5f451
Get rid of recursion-marker values in enum AlterTableType
During ALTER TABLE execution, when prep-time handling of subcommands of
certain types determine that execution-time handling requires recursion,
they signal this by changing the subcommand type to a special value.
This can be done in a simpler way by using a separate flag introduced by
commit ec0925c22a, so do that.

Catversion bumped.  It's not clear to me that ALTER TABLE subcommands
are stored anywhere in catalogs (CREATE FUNCTION rejects it in BEGIN
ATOMIC function bodies), but we do have both write and read support for
them, so be safe.

Discussion: https://postgr.es/m/20220929090033.zxuaezcdwh2fgfjb@alvherre.pgsql
2022-12-12 11:13:26 +01:00
Michael Paquier eae7fe4859 Remove direct call to GetNewObjectId() for pg_auth_members.oid
This routine should not be called directly as mentioned at its top, so
replace it by GetNewOidWithIndex().  Issue introduced by 6566133 when
pg_auth_members.oid got added, so no backpatch is needed.

Author: Maciek Sakrejda
Discussion: https://postgr.es/m/CAOtHd0Ckbih7Ur7XeVyLAJ26VZOfTNcq9qV403bNF4uTGtAN+Q@mail.gmail.com
2022-12-12 09:01:39 +09:00
Tom Lane b8c0ffbd2c Convert domain_in to report errors softly.
This is straightforward as far as it goes.  However, it does not
attempt to trap errors occurring during the execution of domain
CHECK constraints.  Since those are general user-defined
expressions, the only way to do that would involve starting up a
subtransaction for each check.  Of course the entire point of
the soft-errors feature is to not need subtransactions, so that
would be self-defeating.  For now, we'll rely on the assumption
that domain checks are written to avoid throwing errors.

Discussion: https://postgr.es/m/1181028.1670635727@sss.pgh.pa.us
2022-12-11 12:56:54 -05:00
Tom Lane c60c9badba Convert json_in and jsonb_in to report errors softly.
This requires a bit of further infrastructure-extension to allow
trapping errors reported by numeric_in and pg_unicode_to_server,
but otherwise it's pretty straightforward.

In the case of jsonb_in, we are only capturing errors reported
during the initial "parse" phase.  The value-construction phase
(JsonbValueToJsonb) can also throw errors if assorted implementation
limits are exceeded.  We should improve that, but it seems like a
separable project.

Andrew Dunstan and Tom Lane

Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net
2022-12-11 11:28:15 -05:00
Tom Lane 50428a301d Change JsonSemAction to allow non-throw error reporting.
Formerly, semantic action functions for the JSON parser returned void,
so that there was no way for them to affect the parser's behavior.
That means in particular that they can't force an error exit except by
longjmp'ing.  That won't do in the context of our project to make input
functions return errors softly.  Hence, change them to return the same
JsonParseErrorType enum value as the parser itself uses.  If an action
function returns anything besides JSON_SUCCESS, the parse is abandoned
and that error code is returned.

Action functions can thus easily return the same error conditions that
the parser already knows about.  As an escape hatch for expansion, also
invent a code JSON_SEM_ACTION_FAILED that the core parser does not know
the exact meaning of.  When returning this code, an action function
must use some out-of-band mechanism for reporting the error details.

This commit simply makes the API change and causes all the existing
action functions to return JSON_SUCCESS, so that there is no actual
change in behavior here.  This is long enough and boring enough that
it seemed best to commit it separately from the changes that make
real use of the new mechanism.

In passing, remove a duplicate assignment of
transform_string_values_scalar.

Discussion: https://postgr.es/m/1436686.1670701118@sss.pgh.pa.us
2022-12-11 10:39:05 -05:00
Tom Lane d02ef65bce Standardize error reports in unimplemented I/O functions.
We chose a specific wording of the not-implemented errors for
pseudotype I/O functions and other cases where there's little
value in implementing input and/or output.  gtsvectorin never
got that memo though, nor did most of contrib.  Make these all
fall in line, mostly because I'm a neatnik but also to remove
unnecessary translatable strings.

gbtreekey_in needs a bit of extra love since it supports
multiple SQL types.  Sadly, gbtreekey_out doesn't have the
ability to do that, but I think it's unreachable anyway.

Noted while surveying datatype input functions to see what we
have left to fix.
2022-12-10 18:26:43 -05:00
Tom Lane e730718072 Use the macro, not handwritten code, to construct anymultirange_in().
Apparently anymultirange_in was written before we converted all
these pseudotype input functions to use a common macro, and it didn't
get fixed before committing.  Sloppy merging probably explains its
unintuitive ordering, too, so rearrange.

Noted while surveying datatype input functions to see what we
have left to fix.  I'm inclined to leave the pseudotypes as
throwing hard errors, because it's difficult to see a reason why
anyone would need something else.  But in any case, if we want
to change that, we shouldn't have to change multiple copies of
the code.
2022-12-10 17:22:16 -05:00
David Rowley 94985c2102 Add subquery pullup handling for WindowClause runCondition
9d9c02ccd added code to allow WindowAgg to take some shortcuts when a
monotonic WindowFunc reached some value that it could never come back
from due to the function's monotonic nature.  That commit added a
runCondition field to WindowClause to store the condition which, when it
becomes false we can start taking shortcuts in nodeWindowAgg.c.

Here we fix an issue where subquery pullups didn't properly update the
runCondition to update the Vars to properly reference the new query level.

Here we also add a missing call to preprocess_expression() for the
WindowClause's runCondtion.  The WindowFuncs in the targetlist will have
had this process done, so we must also do it for the WindowFuncs in the
runCondition so that they can be correctly found in the targetlist
during setrefs.c

Bug: #17709
Reported-by: Alexey Makhmutov
Author: Richard Guo, David Rowley
Discussion: https://postgr.es/m/17709-4f557160e3e8ee9a@postgresql.org
Backpatch-through: 15, where 9d9c02ccd was introduced
2022-12-10 19:27:20 +13:00
Michael Paquier 66dcb09246 Fix macro definitions in pgstatfuncs.c
Buildfarm member wrasse has been complaining about empty declarations
as an effect of 8018ffb and 83a1a1b due to extra semicolons.

While on it, remove also the last backslash of the macros definitions,
causing more lines to be eaten in it than necessary, per comment from
Tom Lane.

Reported-by: Tom Lane, and buildfarm member wrasse
Author: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/1188769.1670640236@sss.pgh.pa.us
2022-12-10 13:28:02 +09:00
Tom Lane 4dd687502d Restructure soft-error handling in formatting.c.
Replace the error trapping scheme introduced in 5bc450629 with our
shiny new errsave/ereturn mechanism.  This doesn't have any real
functional impact (although I think that the new coding is able
to report a few more errors softly than v15 did).  And I doubt
there's any measurable performance difference either.  But this
gets rid of an ad-hoc, one-of-a-kind design in favor of a mechanism
that will be widely used going forward, so it should be a net win
for code readability.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 20:15:56 -05:00
Tom Lane c60488b474 Convert datetime input functions to use "soft" error reporting.
This patch converts the input functions for date, time, timetz,
timestamp, timestamptz, and interval to the new soft-error style.
There's some related stuff in formatting.c that remains to be
cleaned up, but that seems like a separable project.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 16:07:49 -05:00
Tom Lane 2661469d86 Allow DateTimeParseError to handle bad-timezone error messages.
Pay down some ancient technical debt (dating to commit 022fd9966):
fix a couple of places in datetime parsing that were throwing
ereport's immediately instead of returning a DTERR code that could be
interpreted by DateTimeParseError.  The reason for that was that there
was no mechanism for passing any auxiliary data (such as a zone name)
to DateTimeParseError, and these errors seemed to really need it.
Up to now it didn't matter that much just where the error got thrown,
but now we'd like to have a hard policy that datetime parse errors
get thrown from just the one place.

Hence, invent a "DateTimeErrorExtra" struct that can be used to
carry any extra values needed for specific DTERR codes.  Perhaps
in the future somebody will be motivated to use this to improve
the specificity of other DateTimeParseError messages, but for now
just deal with the timezone-error cases.

This is on the way to making the datetime input functions report
parse errors softly; but it's really an independent change, so
commit separately.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 13:30:47 -05:00
Tom Lane bad5116957 Const-ify a couple of datetime parsing subroutines.
More could be done in this line, but I just grabbed some low-hanging
fruit.  Principal objective was to remove the need for several ugly
unconstify() usages in formatting.c.
2022-12-09 10:43:45 -05:00
Tom Lane ccff2d20ed Convert a few datatype input functions to use "soft" error reporting.
This patch converts the input functions for bool, int2, int4, int8,
float4, float8, numeric, and contrib/cube to the new soft-error style.
array_in and record_in are also converted.  There's lots more to do,
but this is enough to provide proof-of-concept that the soft-error
API is usable, as well as reference examples for how to convert
input functions.

This patch is mostly by me, but it owes very substantial debt to
earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul.
Thanks to Andres Freund for review.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 10:14:53 -05:00
Tom Lane 1939d26282 Add test scaffolding for soft error reporting from input functions.
pg_input_is_valid() returns boolean, while pg_input_error_message()
returns the primary error message if the input is bad, or NULL
if the input is OK.  The main reason for having two functions is
so that we can test both the details-wanted and the no-details-wanted
code paths.

Although these are primarily designed with testing in mind,
it could well be that they'll be useful to end users as well.

This patch is mostly by me, but it owes very substantial debt to
earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul.
Thanks to Andres Freund for review.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 10:08:44 -05:00
Tom Lane d9f7f5d32f Create infrastructure for "soft" error reporting.
Postgres' standard mechanism for reporting errors (ereport() or elog())
is used for all sorts of error conditions.  This means that throwing
an exception via ereport(ERROR) requires an expensive transaction or
subtransaction abort and cleanup, since the exception catcher dare not
make many assumptions about what has gone wrong.  There are situations
where we would rather have a lighter-weight mechanism for dealing
with errors that are known to be safe to recover from without a full
transaction cleanup.  This commit creates infrastructure to let us
adapt existing error-reporting code for that purpose.  See the
included documentation changes for details.  Follow-on commits will
provide test code and usage examples.

The near-term plan is to convert most if not all datatype input
functions to report invalid input "softly".  This will enable
implementing some SQL/JSON features cleanly and without the cost
of subtransactions, and it will also allow creating COPY options
to deal with bad input without cancelling the whole COPY.

This patch is mostly by me, but it owes very substantial debt to
earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul.
Thanks also to Andres Freund for review.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-09 09:58:38 -05:00
Alexander Korotkov 096dd80f3c Add USER SET parameter values for pg_db_role_setting
The USER SET flag specifies that the variable should be set on behalf of an
ordinary role.  That lets ordinary roles set placeholder variables, which
permission requirements are not known yet.  Such a value wouldn't be used if
the variable finally appear to require superuser privileges.

The new flags are stored in the pg_db_role_setting.setuser array.  Catversion
is bumped.

This commit is inspired by the previous work by Steve Chavez.

Discussion: https://postgr.es/m/CAPpHfdsLd6E--epnGqXENqLP6dLwuNZrPMcNYb3wJ87WR7UBOQ%40mail.gmail.com
Author: Alexander Korotkov, Steve Chavez
Reviewed-by: Pavel Borisov, Steve Chavez
2022-12-09 13:12:20 +03:00
Peter Eisentraut 07c29ca7fe Remove unnecessary casts
Some code carefully cast all data buffer arguments for BufFileWrite()
and BufFileRead() to void *, even though the arguments are already
void * (and AFAICT were never anything else).  Remove this unnecessary
clutter.

Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
2022-12-08 08:58:15 +01:00
Peter Eisentraut 2d4f1ba6cf Update types in File API
Make the argument types of the File API match stdio better:

- Change the data buffer to void *, from char *.
- Change FileWrite() data buffer to const on top of that.
- Change amounts to size_t, from int.

In passing, change the FilePrefetch() amount argument from int to
off_t, to match the underlying posix_fadvise().

Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
2022-12-08 08:58:15 +01:00
Etsuro Fujita 4b3e379932 Remove new structure member from ResultRelInfo.
In commit ffbb7e65a, I added a ModifyTableState member to ResultRelInfo
to save the owning ModifyTableState for use by nodeModifyTable.c when
performing batch inserts, but as pointed out by Tom Lane, that changed
the array stride of es_result_relations, and that would break any
previously-compiled extension code that accesses that array.  Fix by
removing that member from ResultRelInfo and instead adding a List member
at the end of EState to save such ModifyTableStates.

Per report from Tom Lane.  Back-patch to v14, like the previous commit;
I chose to apply the patch to HEAD as well, to make back-patching easy.

Discussion: http://postgr.es/m/4065383.1669395453%40sss.pgh.pa.us
2022-12-08 16:15:00 +09:00
Amit Kapila bf07ab492c Avoid unnecessary streaming of transactions during logical replication.
After restart, we don't perform streaming of an in-progress transaction if
it was previously decoded and confirmed by the client. To achieve that we
were comparing the END location of the WAL record being decoded with the
WAL location we have already decoded and confirmed by the client. While
decoding the commit record, to decide whether to process and send the
complete transaction, we compare its START location with the WAL location
we have already decoded and confirmed by the client. Now, if we need to
queue some change in the transaction while decoding the commit record
(e.g. snapshot), it is possible that we decide to stream the transaction
but later commit processing decides to skip it. In such a case, we would
needlessly send the changes and later when we decide to skip it, we will
send stream abort.

We also sometimes decide to stream the changes when we actually just need
to process them locally like a change for invalidations. This will lead us
to send empty streams. To avoid this, while queuing each change for
decoding, we remember whether the transaction has any change that actually
needs to be sent downstream and use that information later to decide
whether to stream the transaction or not.

Note, we can't avoid all cases where we have to send empty streams like
the case where the plugin later decides that the change is not
publishable. However, we will no longer need to send stream_abort when we
skip sending a particular transaction.

Author: Dilip Kumar
Reviewed-by: Hou Zhijie, Ashutosh Bapat, Shi yu, Amit Kapila
Discussion: https://postgr.es/m/CAFiTN-tHK=7LzfrPs8fbT2ksrOJGQbzywcgXst2bM9-rJJAAUg@mail.gmail.com
2022-12-08 06:05:09 +05:30
Tom Lane 8305629afe Minor code refactoring in elog.c (no functional change).
Combine some duplicated code stanzas by creating small functions.
Most of these duplications arose at a time when I wouldn't have
trusted C compilers to auto-inline small functions intelligently,
but they're probably poor practice now.  Similarly split out some
bits that aren't actually duplicative as the code stands, but would
become so after an upcoming patch to add another error-handling
code path.

Take the opportunity to add some lengthier comments about what
we're doing here, too.  Re-order one function that seemed not
very well-placed.

Patch by me, per suggestions from Andres Freund.

Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
2022-12-07 14:39:25 -05:00
Peter Eisentraut 5e9b122059 Fix FK comment think-o
from commit d6f96ed94e

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/6a7c7338-1aa2-4689-d171-0b0b294fdd84%40illuminatedcomputing.com
2022-12-07 17:06:50 +01:00
Alvaro Herrera 29861e228a
Update outdated comment in ApplyRetrieveRule
After a61b1f7482.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGZm7hb2VAy8HGM22-fTDaQzqE6T=5GbAk=GkT9H0hJEg@mail.gmail.com
2022-12-07 12:35:59 +01:00
Andres Freund 9db49fc5bf autoconf: Move export_dynamic determination to configure
Previously export_dynamic was set in src/makefiles/Makefile.$port. For solaris
this required exporting with_gnu_ld. The determination of with_gnu_ld would be
nontrivial to copy for meson PGXS compatibility.  It's also nice to delete
libtool.m4.

This uses -Wl,--export-dynamic on all platforms, previously all platforms but
FreeBSD used -Wl,-E. The likelihood of a name conflict seems lower with the
longer spelling.

Discussion: https://postgr.es/m/20221005200710.luvw5evhwf6clig6@awork3.anarazel.de
2022-12-06 18:55:28 -08:00
Michael Paquier 8018ffbf58 Generate pg_stat_get*() functions for databases using macros
The same code pattern is repeated 21 times for int64 counters (0 for
missing entry) and 5 times for doubles (0 for missing entry) on database
entries.  This code is switched to use macros for the basic code
instead, shaving a few hundred lines of originally-duplicated code
patterns.  The function names remain the same, but some fields of
PgStat_StatDBEntry have to be renamed to cope with the new style.

This is in the same spirit as 83a1a1b.

Author: Michael Paquier
Reviewed-by: Nathan Bossart, Bertrand Drouvot
Discussion: https://postgr.es/m/Y46stlxQ2LQE20Na@paquier.xyz
2022-12-07 09:11:48 +09:00
Alvaro Herrera a61b1f7482
Rework query relation permission checking
Currently, information about the permissions to be checked on relations
mentioned in a query is stored in their range table entries.  So the
executor must scan the entire range table looking for relations that
need to have permissions checked.  This can make the permission checking
part of the executor initialization needlessly expensive when many
inheritance children are present in the range range.  While the
permissions need not be checked on the individual child relations, the
executor still must visit every range table entry to filter them out.

This commit moves the permission checking information out of the range
table entries into a new plan node called RTEPermissionInfo.  Every
top-level (inheritance "root") RTE_RELATION entry in the range table
gets one and a list of those is maintained alongside the range table.
This new list is initialized by the parser when initializing the range
table.  The rewriter can add more entries to it as rules/views are
expanded.  Finally, the planner combines the lists of the individual
subqueries into one flat list that is passed to the executor for
checking.

To make it quick to find the RTEPermissionInfo entry belonging to a
given relation, RangeTblEntry gets a new Index field 'perminfoindex'
that stores the corresponding RTEPermissionInfo's index in the query's
list of the latter.

ExecutorCheckPerms_hook has gained another List * argument; the
signature is now:
typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable,
					      List *rtePermInfos,
					      bool ereport_on_violation);
The first argument is no longer used by any in-core uses of the hook,
but we leave it in place because there may be other implementations that
do.  Implementations should likely scan the rtePermInfos list to
determine which operations to allow or deny.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
2022-12-06 16:09:24 +01:00
David Rowley a858327221 Fix 32-bit build dangling pointer issue in WindowAgg
9d9c02ccd added window "run conditions", which allows the evaluation of
monotonic window functions to be skipped when the run condition is no
longer true.  Prior to this commit, once the run condition was no longer
true and we stopped evaluating the window functions, we simply just left
the ecxt_aggvalues[] and ecxt_aggnulls[] arrays alone to store whatever
value was stored there the last time the window function was evaluated.
Leaving a stale value in there isn't really a problem on 64-bit builds as
all of the window functions which we recognize as monotonic all return
int8, which is passed by value on 64-bit builds.  However, on 32-bit
builds, this was a problem as the value stored in the ecxt_values[]
element would be a by-ref value and it would be pointing to some memory
which would get reset once the tuple context is destroyed.  Since the
WindowAgg node will output these values in the resulting tupleslot, this
could be problematic for the top-level WindowAgg node which must look at
these values to filter out the rows that don't meet its filter condition.

Here we fix this by just zeroing the ecxt_aggvalues[] and setting the
ecxt_aggnulls[] array to true when the run condition first becomes false.
This results in the WindowAgg's output having NULLs for the WindowFunc's
columns rather than the stale or pointer pointing to possibly freed
memory.  These tuples with the NULLs can only make it as far as the
top-level WindowAgg node before they're filtered out.  To ensure that
these tuples *are* always filtered out, we now insist that OpExprs making
up the run condition are strict OpExprs.  Currently, all the window
functions which the planner recognizes as monotonic return INT8 and the
operator which is used for the run condition must be a member of a btree
opclass.  In reality, these restrictions exclude nothing that's built-in
to Postgres and are unlikely to exclude anyone's custom operators due to
the requirement that the operator is part of a btree opclass.  It would be
unusual if those were not strict.

Reported-by: Sergey Shinderuk, using valgrind
Reviewed-by: Richard Guo, Sergey Shinderuk
Discussion: https://postgr.es/m/29184c50-429a-ebd7-f1fb-0589c6723a35@postgrespro.ru
Backpatch-through: 15, where 9d9c02ccd was added
2022-12-07 00:09:36 +13:00
Michael Paquier 83a1a1b566 Generate pg_stat_get*() functions for tables using macros
The same code pattern is repeated 17 times for int64 counters (0 for
missing entry) and 5 times for timestamps (NULL for missing entry) on
table entries.  This code is switched to use a macro for the basic code
instead, shaving a few hundred lines of originally-duplicated code.  The
function names remain the same, but some fields of PgStat_StatTabEntry
have to be renamed to cope with the new style.

Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart
Discussion: https:/postgr.es/m/20221204173207.GA2669116@nathanxps13
2022-12-06 10:46:35 +09:00
Alexander Korotkov 941aa6a626 Check the snapshot argument of index_beginscan and family
Passing a NULL snapshot (InvalidSnapshot) is going to work but only as long
as the index can't find any matching rows.  This can be confusing for
the extension authors, so add an explicit check for this argument.  The check
is implemented with Assert() in order to avoid overhead in release builds.

Reported-by: Sven Klemm
Discussion: https://postgr.es/m/CAJ7c6TPxitD4vbKyP-mpmC1XwyHdPPqvjLzm%2BVpB88h8LGgneQ%40mail.gmail.com
Author: Aleksander Alekseev
Reviewed-by: Pavel Borisov
2022-12-06 03:29:18 +03:00
Tom Lane d69d01ba9d Fix Memoize to work with partitionwise joining.
A couple of places weren't up to speed for this.  By sheer good
luck, we didn't fail but just selected a non-memoized join plan,
at least in the test case we have.  Nonetheless, it's a bug,
and I'm not quite sure that it couldn't have worse consequences
in other examples.  So back-patch to v14 where Memoize came in.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs48GkNom272sfp0-WeD6_0HSR19BJ4H1c9ZKSfbVnJsvRg@mail.gmail.com
2022-12-05 12:36:40 -05:00
Michael Paquier 71cb84ec69 Add LSN location in some error messages related to WAL pages
The error messages reported during any failures while reading or
validating the header of a WAL currently includes only the offset of the
page but not the compiled LSN referring to the page, requiring an extra
step to compile it if looking at the surroundings with pg_waldump or
similar.  Adding this information costs a bit in translation, but also
eases debugging.

Author: Bharath Rupireddy
Reviewed-by:  Álvaro Herrera, Kyotaro Horiguchi, Maxim Orlov, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
2022-12-05 09:28:29 +09:00
David Rowley 8692f6644e Fix thinko introduced in 6b423ec67
As pointed out by Dean Rasheed, we really should be using tmp >
-(PG_INTNN_MIN / 10) rather than tmp > (PG_INTNN_MAX / 10) for checking
for overflows in the accumulation in the pg_strtointNN functions.  This
does happen to be the same number when dividing by 10, but there is a
pending patch which adds other bases and this is not the same number if we
were to divide by 2 rather than 10, for example.  If the base 2 parsing
was to follow this example then we could accidentally think a string
containing the value of PG_INT32_MIN was an overflow in pg_strtoint32.
Clearly that shouldn't overflow.

This does not fix any actual live bugs, only some bad examples of overflow
checks for future bases.

Reported-by: Dean Rasheed
Discussion: https://postgr.es/m/CAEZATCVEtwfhdm-K-etZYFB0=qsR0nT6qXta_W+GQx4RYph1dg@mail.gmail.com
2022-12-05 11:55:05 +13:00
Tom Lane 92c4dafe1e Re-pgindent a few files.
Just because I'm a neatnik, and I'm currently working on
code in this area.  It annoys me to not be able to pgindent
my patches without working around unrelated changes.
2022-12-04 14:25:53 -05:00
Tom Lane e76913802c Fix broken MemoizePath support in reparameterize_path().
It neglected to recurse to the subpath, meaning you'd get back
a path identical to the input.  This could produce wrong query
results if the omission meant that the subpath fails to enforce
some join clause it should be enforcing.  We don't have a test
case for this at the moment, but the code is obviously broken
and the fix is equally obvious.  Back-patch to v14 where
Memoize was introduced.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_R=ORpz=Lkn2q3ebPC5EuWyfZF+tmfCPVLBVK5W39mHA@mail.gmail.com
2022-12-04 13:48:12 -05:00
Tom Lane 6eb2f0ed4c Add missing MaterialPath support in reparameterize_path[_by_child].
These two functions failed to cover MaterialPath.  That's not a
fatal problem, but we can generate better plans in some cases
if we support it.

Tom Lane and Richard Guo

Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us
2022-12-04 13:35:42 -05:00
Tom Lane fe12f2f8fa Fix generate_partitionwise_join_paths() to tolerate failure.
We might fail to generate a partitionwise join, because
reparameterize_path_by_child() does not support all path types.
This should not be a hard failure condition: we should just fall back
to a non-partitioned join.  However, generate_partitionwise_join_paths
did not consider this possibility and would emit the (misleading)
error "could not devise a query plan for the given query" if we'd
failed to make any paths for a child join.  Fix it to give up on
partitionwise joining if so.  (The accepted technique for giving up
appears to be to set rel->nparts = 0, which I find pretty bizarre,
but there you have it.)

I have not added a test case because there'd be little point:
any omissions of this sort that we identify would soon get fixed
by extending reparameterize_path_by_child(), so the test would stop
proving anything.  However, right now there is a known test case based
on failure to cover MaterialPath, and with that I've found that this
is broken in all supported versions.  Hence, patch all the way back.

Original report and patch by me; thanks to Richard Guo for
identifying a test case that works against committed versions.

Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us
2022-12-04 13:17:18 -05:00
David Rowley 6b423ec677 Improve performance of pg_strtointNN functions
Experiments have shown that modern versions of both gcc and clang are
unable to fully optimize the multiplication by 10 that we're doing in the
pg_strtointNN functions.  Both compilers seem to be making use of "imul",
which is not the most efficient way to multiply by 10.  This seems to be
due to the overflow checking that we're doing.  Without the overflow
checks, both those compilers switch to a more efficient method of
multiplying by 10.  In absence of overflow concern, integer multiplication
by 10 can be done by bit-shifting left 3 places to multiply by 8 and then
adding the original value twice.

To allow compilers this flexibility, here we adjust the code so that we
accumulate the number as an unsigned version of the type and remove the
use of pg_mul_sNN_overflow() and pg_sub_sNN_overflow().  The overflow
checking can be done simply by checking if the accumulated value has gone
beyond a 10th of the maximum *signed* value for the given type.  If it has
then the accumulation of the next digit will cause an overflow.  After
this is done, we do a final overflow check before converting the unsigned
version of the number back to its signed counterpart.

Testing has shown about an 8% speedup of a COPY into a table containing 2
INT columns.

Author: David Rowley, Dean Rasheed
Discussion: https://postgr.es/m/CAApHDvrL6_+wKgPqRHr7gH_6xy3hXM6a3QCsZ5ForurjDFfenA@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvrdYByjfj-=WbmVNFgmVZg88-dE7heukw8p55aJ+W=qxQ@mail.gmail.com
2022-12-04 16:18:18 +13:00
Tom Lane 29452de734 Doc: flesh out fmgr/README's description of context-node usage.
I wrote this to provide a home for a planned discussion of error
return conventions for non-error-throwing functions.  But it seems
useful as documentation of existing code no matter what becomes of
that proposal, so commit separately.
2022-12-03 10:50:39 -05:00
Dean Rasheed 2605643a3a Fix DEFAULT handling for multi-row INSERT rules.
When updating a relation with a rule whose action performed an INSERT
from a multi-row VALUES list, the rewriter might skip processing the
VALUES list, and therefore fail to replace any DEFAULTs in it. This
would lead to an "unrecognized node type" error.

The reason was that RewriteQuery() assumed that a query doing an
INSERT from a multi-row VALUES list would necessarily only have one
item in its fromlist, pointing to the VALUES RTE to read from. That
assumption is correct for the original query, but not for product
queries produced for rule actions. In such cases, there may be
multiple items in the fromlist, possibly including multiple VALUES
RTEs.

What is required instead is for RewriteQuery() to skip any RTEs from
the product query's originating query, which might include one or more
already-processed VALUES RTEs. What's left should then include at most
one VALUES RTE (from the rule action) to be processed.

Patch by me. Thanks to Tom Lane for reviewing.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAEZATCV39OOW7LAR_Xq4i%2BLc1Byux%3DeK3Q%3DHD_pF1o9LBt%3DphA%40mail.gmail.com
2022-12-03 12:11:33 +00:00
Andres Freund cb2e7ddfe5 Prevent pgstats from getting confused when relkind of a relation changes
When the relkind of a relache entry changes, because a table is converted into
a view, pgstats can get confused in 15+, leading to crashes or assertion
failures.

For HEAD, Tom fixed this in b23cd185fd, by removing support for converting a
table to a view, removing the source of the inconsistency. This commit just
adds an assertion that a relcache entry's relkind does not change, just in
case we end up with another case of that in the future. As there's no cases of
changing relkind anymore, we can't add a test that that's handled correctly.

For 15, fix the problem by not maintaining the association with the old pgstat
entry when the relkind changes during a relcache invalidation processing. In
that case the pgstat entry needs to be unlinked first, to avoid
PgStat_TableStatus->relation getting out of sync. Also add a test reproducing
the issues.

No known problem exists in 11-14, so just add the test there.

Reported-by: vignesh C <vignesh21@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALDaNm2yXz+zOtv7y5zBd5WKT8O0Ld3YxikuU3dcyCvxF7gypA@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm3oZA-8Wbps2Jd1g5_Gjrr-x3YWrJPek-mF5Asrrvz2Dg@mail.gmail.com
Backpatch: 15-
2022-12-02 18:10:30 -08:00
Jeff Davis 7ac0f8d384 Fix broken hash function hashbpcharextended().
Ignore trailing spaces for non-deterministic collations when
hashing.

The previous behavior could lead to tuples falling into the wrong
partitions when hash partitioning is combined with the BPCHAR type and
a non-deterministic collation. Fortunately, it did not affect hash
indexes, because hash indexes do not use extended hash functions.

Decline to backpatch, per discussion.

Discussion: https://postgr.es/m/eb83d0ac7b299eb08f9b900dd08a5a0c5d90e517.camel@j-davis.com
Reviewed-by: Richard Guo, Tom Lane
2022-12-02 14:06:31 -08:00
Tom Lane 4c689a69ee Remove gen_node_support.pl's special treatment of EquivalenceClasses.
It seems better to deal with this by explicit annotations on the
fields in question, instead of magic knowledge embedded in the
script.  While that creates a risk-of-omission from failing to
annotate fields, the preceding commit should catch any such
oversights.

Discussion: https://postgr.es/m/263413.1669513145@sss.pgh.pa.us
2022-12-02 15:20:30 -05:00
Tom Lane b6bd5def3a Add some error cross-checks to gen_node_support.pl.
Check that if we generate a call to copy, compare, write, or read
a specific node type, that node type does have the appropriate
support function.  (This doesn't protect against trying to invoke
nonexistent code when considering generic field types such as
"Node *", but it seems like a useful check anyway.)

Check that array_size() refers to a field appearing earlier in
the struct.  Aside from catching obvious errors like a misspelled
field name, this protects against a more subtle mistake: if the
size field appears later in the struct than the array field, then
compare and read functions would misbehave.  There is actually
exactly that situation in PlannerInfo, but it's okay since we
do not need compare or read functionality for that (today anyway).

Discussion: https://postgr.es/m/263413.1669513145@sss.pgh.pa.us
2022-12-02 15:09:51 -05:00
Tom Lane cabfb8241d Fix psql's \sf and \ef for new-style SQL functions.
Some options of these commands need to be able to identify the start
of the function body within the output of pg_get_functiondef().
It used to be that that always began with "AS", but since the
introduction of new-style SQL functions, it might also start with
"BEGIN" or "RETURN".  Fix that on the psql side, and add some
regression tests.

Noted by me awhile ago, but I didn't do anything about it.
Thanks to David Johnston for a nag.

Discussion: https://postgr.es/m/AM9PR01MB8268D5CDABDF044EE9F42173FE8C9@AM9PR01MB8268.eurprd01.prod.exchangelabs.com
2022-12-02 14:24:44 -05:00
Tom Lane b23cd185fd Remove logic for converting a table to a view.
Up to now we have allowed manual creation of an ON SELECT rule on
a table to convert it into a view.  That was never anything but a
horrid, error-prone hack though.  pg_dump used to rely on that
behavior to deal with cases involving circular dependencies,
where a dependency loop could be broken by separating the creation
of a view from installation of its ON SELECT rule.  However, we
changed pg_dump to use CREATE OR REPLACE VIEW for that in commit
d8c05aff5 (which was later back-patched as far as 9.4), so there's
not a good argument anymore for continuing to support the behavior.

The proximate reason for axing it now is that we found that the
new statistics code has failure modes associated with the relkind
change caused by this behavior.  We'll patch around that in v15,
but going forward it seems like a better idea to get rid of the
need to support relkind changes.

Discussion: https://postgr.es/m/CALDaNm2yXz+zOtv7y5zBd5WKT8O0Ld3YxikuU3dcyCvxF7gypA@mail.gmail.com
2022-12-02 12:14:32 -05:00
Alvaro Herrera fb958b5da8
Generalize ri_RootToPartitionMap to use for non-partition children
ri_RootToPartitionMap is currently only initialized for tuple routing
target partitions, though a future commit will need the ability to use
it even for the non-partition child tables, so make adjustments to the
decouple it from the partitioning code.

Also, make it lazily initialized via ExecGetRootToChildMap(), making
that function its preferred access path.  Existing third-party code
accessing it directly should no longer do so; consequently, it's been
renamed to ri_RootToChildMap, which also makes it consistent with
ri_ChildToRootMap.

ExecGetRootToChildMap() houses the logic of setting the map appropriately
depending on whether a given child relation is partition or not.

To support this, also add a separate entry point for TupleConversionMap
creation that receives an AttrMap.  No new code here, just split an
existing function in two.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqEYUhDXSK5BTvG_xk=eaAEJCD4GS3C6uH7ybBvv+Z_Tmg@mail.gmail.com
2022-12-02 10:35:55 +01:00
Amit Kapila 40b1491357 Fix incorrect output from pgoutput when using column lists.
For Updates and Deletes, we were not honoring the columns list for old
tuple values while sending tuple data via pgoutput. This results in
pgoutput emitting more columns than expected.

This is not a problem for built-in logical replication as we simply ignore
additional columns based on the relation information sent previously which
didn't have those columns. However, some other users of pgoutput plugin
may expect the columns as per the column list. Also, sending extra columns
unnecessarily consumes network bandwidth defeating the purpose of the
column list feature.

Reported-by: Gunnar Morling
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Backpatch-through: 15
Discussion: https://postgr.es/m/CADGJaX9kiRZ-OH0EpWF5Fkyh1ZZYofoNRCrhapBfdk02tj5EKg@mail.gmail.com
2022-12-02 10:52:58 +05:30
Andres Freund 069de07eae autoconf: Don't AC_SUBST() LD in configure
The only use of $(LD) in Makefiles is for AIX, to generate the export file for
the backend. We only support the system linker on AIX and we already hardcode
the path to a number of other binaries. Removing LD substitution will simplify
the upcoming meson PGXS compatibility.

While at it, add a comment why -r is used.

A subsequent commit will remove the determination of LD from configure as
well.

Discussion: https://postgr.es/m/20221005200710.luvw5evhwf6clig6@awork3.anarazel.de
2022-12-01 19:03:26 -08:00
Jeff Davis edf12e7bbd Fix memory leak for hashing with nondeterministic collations.
Backpatch through 12, where nondeterministic collations were
introduced (5e1963fb76).

Backpatch-through: 12
2022-12-01 11:49:15 -08:00
Tom Lane 1dd6700f44 Fix under-parenthesized display of AT TIME ZONE constructs.
In commit 40c24bfef, I forgot to use get_rule_expr_paren() for the
arguments of AT TIME ZONE, resulting in possibly not printing parens
for expressions that need it.  But get_rule_expr_paren() wouldn't have
gotten it right anyway, because isSimpleNode() hadn't been taught that
COERCE_SQL_SYNTAX parent nodes don't guarantee sufficient parentheses.
Improve all that.  Also use this methodology for F_IS_NORMALIZED, so
that we don't print useless parens for that.

In passing, remove a comment that was obsoleted later.

Per report from Duncan Sands.  Back-patch to v14 where this code
came in.  (Before that, we didn't try to print AT TIME ZONE that way,
so there was no bug just ugliness.)

Discussion: https://postgr.es/m/f41566aa-a057-6628-4b7c-b48770ecb84a@deepbluecap.com
2022-12-01 11:38:14 -05:00
Alvaro Herrera ec38694894
Move PartitioPruneInfo out of plan nodes into PlannedStmt
The planner will now add a given PartitioPruneInfo to
PlannedStmt.partPruneInfos instead of directly to the
Append/MergeAppend plan node.  What gets set instead in the
latter is an index field which points to the list element
of PlannedStmt.partPruneInfos containing the PartitioPruneInfo
belonging to the plan node.

A later commit will make AcquireExecutorLocks() do the initial
partition pruning to determine a minimal set of partitions to be
locked when validating a plan tree and it will need to consult the
PartitioPruneInfos referenced therein to do so.  It would be better
for the PartitioPruneInfos to be accessible directly than requiring
a walk of the plan tree to find them, which is easier when it can be
done by simply iterating over PlannedStmt.partPruneInfos.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2022-12-01 12:56:21 +01:00
Michael Paquier 43351557d0 Make materialized views participate in predicate locking
Matviews have been discarded from needing predicate locks since 3bf3ab8
and their introduction.  At this point, there was no concurrent flavor
of REFRESH yet, hence there was no meaning in having materialized views
look at read/write conflicts with concurrent transactions using
themselves the serializable isolation level because they could only be
refreshed with an access exclusive lock.  CONCURRENTLY, on the contrary,
allows reads and writes during a refresh as it holds a share update
exclusive lock.

Some isolation tests are added to show the effect of the change, with a
combination of one table and a matview based on it, using a mix of
REFRESH CONCURRENTLY and read/write queries.

This could arguably be considered as a bug, but as it is a subtle
behavior change potentially impacting applications no backpatch is
done.

Author: Yugo Nagata
Reviewed-by: Richard Guo, Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/20220726164434.42d4e33911b4b4fcf751c4e7@sraoss.co.jp
2022-12-01 15:41:13 +09:00
Alvaro Herrera 599b33b949
Stop accessing checkAsUser via RTE in some cases
A future commit will move the checkAsUser field from RangeTblEntry
to a new node that, unlike RTEs, will only be created for tables
mentioned in the query but not for the inheritance child relations
added to the query by the planner.  So, checkAsUser value for a
given child relation will have to be obtained by referring to that
for its ancestor mentioned in the query.

In preparation, it seems better to expand the use of RelOptInfo.userid
during planning in place of rte->checkAsUser so that there will be
fewer places to adjust for the above change.

Given that the child-to-ancestor mapping is not available during the
execution of a given "child" ForeignScan node, add a checkAsUser
field to ForeignScan to carry the child relation's RelOptInfo.userid.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGFCs2uq7VRKi7g+FFKbP6Ea_2_HkgZb2HPhUfaAKT3ng@mail.gmail.com
2022-11-30 12:07:03 +01:00
Tom Lane 8242752f9c Improve heuristics for compressing the KnownAssignedXids array.
Previously, we'd compress only when the active range of array entries
reached Max(4 * PROCARRAY_MAXPROCS, 2 * pArray->numKnownAssignedXids).
If max_connections is large, the first term could result in not
compressing for a long time, resulting in much wastage of cycles in
hot-standby backends scanning the array to take snapshots.  Get rid
of that term, and just bound it to 2 * pArray->numKnownAssignedXids.

That however creates the opposite risk, that we might spend too much
effort compressing.  Hence, consider compressing only once every 128
commit records.  (This frequency was chosen by benchmarking.  While
we only tried one benchmark scenario, the results seem stable over
a fairly wide range of frequencies.)

Also, force compression when processing RecoveryInfo WAL records
(which should be infrequent); the old code could perform compression
then, but would do so only after the same array-range check as for
the transaction-commit path.

Also, opportunistically run compression if the startup process is about
to wait for WAL, though not oftener than once a second.  This should
prevent cases where we waste lots of time by leaving the array
not-compressed for long intervals due to low WAL traffic.

Lastly, add a simple check to keep us from uselessly compressing
when the array storage is already compact.

Back-patch, as the performance problem is worse in pre-v14 branches
than in HEAD.

Simon Riggs and Michail Nikolaev, with help from Tom Lane and
Andres Freund.

Discussion: https://postgr.es/m/CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com
2022-11-29 15:43:17 -05:00
Tom Lane 8b47ccb624 Prevent clobbering of utility statements in SQL function caches.
This is an oversight in commit 7c337b6b5: I apparently didn't think
about the possibility of a SQL function being executed multiple
times within a query.  In that case, functions.c's primitive caching
mechanism allows the same utility parse tree to be presented for
execution more than once.  We have to tell ProcessUtility to make
a working copy of the parse tree, or bad things happen.

Normally I'd add a regression test, but I think the reported crasher
is dependent on some rather random implementation choices that are
nowhere near functions.c, so its usefulness as a long-lived test
feels questionable.  In any case, this fix is clearly correct given
the design choices of 7c337b6b5.

Per bug #17702 from Xin Wen.  Thanks to Daniel Gustafsson for
analysis.  Back-patch to v14 where the faulty commit came in
(before that, the responsibility for copying scribble-able
utility parse trees lay elsewhere).

Discussion: https://postgr.es/m/17702-ad24fdcdd1e9047a@postgresql.org
2022-11-29 11:46:33 -05:00
Tom Lane 51dfaa0b01 Remove bogus Assert and dead code in remove_useless_results_recurse().
The JOIN_SEMI case Assert'ed that there are no PlaceHolderVars that
need to be evaluated at the semijoin's RHS, which is wrong because
there could be some in the semijoin's qual condition.  However, there
could not be any references further up than that, and within the qual
there is not any way that such a PHV could have gone to null yet, so
we don't really need the PHV and there is no need to avoid making the
RHS-removal optimization.  The upshot is that there's no actual bug
in production code, and we ought to just remove this misguided Assert.

While we're here, also drop the JOIN_RIGHT case, which is dead code
because reduce_outer_joins() already got rid of JOIN_RIGHT.

Per bug #17700 from Xin Wen.  Uselessness of the JOIN_RIGHT case
pointed out by Richard Guo.  Back-patch to v12 where this code
was added.

Discussion: https://postgr.es/m/17700-2b5c10d917c30687@postgresql.org
2022-11-29 10:52:44 -05:00
Alvaro Herrera ad86d159b6
Add 'missing_ok' argument to build_attrmap_by_name
When it's given as true, return a 0 in the position of the missing
column rather than raising an error.

This is currently unused, but it allows us to reimplement column
permission checking in a subsequent commit.  It seems worth breaking
into a separate commit because it affects unrelated code.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFfiai=qBxPDTjaio_ZcaqUKh+FC=prESrB8ogZgFNNNQ@mail.gmail.com
2022-11-29 09:39:36 +01:00
Michael Paquier 8aa03f3caa Fix comment in snapbuild.c
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoAmf-PkSnMGAJg2DtGhp7O7vpHoexCxfQLKZg8xrbRwsg@mail.gmail.com
2022-11-29 08:53:01 +09:00
Thomas Munro cd4329d939 Remove promote_trigger_file.
Previously, an idle startup (recovery) process would wake up every 5
seconds to have a chance to poll for promote_trigger_file, even if that
GUC was not configured.  That promotion triggering mechanism was
effectively superseded by pg_ctl promote and pg_promote() a long time
ago.  There probably aren't many users left and it's very easy to change
to the modern mechanisms, so we agreed to remove the feature.

This is part of a campaign to reduce wakeups on idle systems.

Author: Simon Riggs <simon.riggs@enterprisedb.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com>
Discussion: https://postgr.es/m/CANbhV-FsjnzVOQGBpQ589%3DnWuL1Ex0Ykn74Nh1hEjp2usZSR5g%40mail.gmail.com
2022-11-29 12:08:38 +13:00
Andrew Dunstan 4441fc704d Provide non-superuser predefined roles for vacuum and analyze
This provides two new predefined roles: pg_vacuum_all_tables and
pg_analyze_all_tables. Roles which have been granted these roles can
perform vacuum or analyse respectively on any or all tables as if they
were a superuser. This removes the need to grant superuser privilege to
roles just so they can perform vacuum and/or analyze.

Nathan Bossart

Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert
Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael
Paquier.

Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13
2022-11-28 12:08:14 -05:00
Andrew Dunstan b5d6382496 Provide per-table permissions for vacuum and analyze.
Currently a table can only be vacuumed or analyzed by its owner or
a superuser. This can now be extended to any user by means of an
appropriate GRANT.

Nathan Bossart

Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert
Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael
Paquier.

Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13
2022-11-28 12:08:14 -05:00
Michael Paquier 02ac05b4c0 Fix typo in hba.c
Spotted while reading through the git history.
2022-11-26 10:14:18 +09:00
Peter Eisentraut d94f29cb89 Correct some SQL feature names
Maybe these were initially typos or they have been changed upstream
since we first added them.  This brings it all in line with SQL:2016.
2022-11-25 16:56:05 +01:00
Andrew Dunstan 50617a9aa3 Fix gen_node_support.pl for changed AclMode size
omitted from 7b378237aa, mea culpa.

Complaint and fix from Amit Langote.
2022-11-25 08:55:56 -05:00
Dean Rasheed 7b2ccc5e03 Fix rule-detection code for MERGE.
Use the relation's rd_rules structure to test whether it has rules,
rather than the relhasrules flag, which might be out of date.

Reviewed by Tom Lane.

Backpatch to 15, where MERGE was added.

Discussion: https://postgr.es/m/CAEZATCVkBVZABfw71sYvkcPf6tarcOFST5Bc6AOi-LFT9YdccQ%40mail.gmail.com
2022-11-25 13:31:48 +00:00
Etsuro Fujita ffbb7e65a8 Fix handling of pending inserts in nodeModifyTable.c.
Commit b663a4136, which allowed FDWs to INSERT rows in bulk, added to
nodeModifyTable.c code to flush pending inserts to the foreign-table
result relation(s) before completing processing of the ModifyTable node,
but the code failed to take into account the case where the INSERT query
has modifying CTEs, leading to incorrect results.

Also, that commit failed to flush pending inserts before firing BEFORE
ROW triggers so that rows are visible to such triggers.

In that commit we scanned through EState's
es_tuple_routing_result_relations or es_opened_result_relations list to
find the foreign-table result relations to which pending inserts are
flushed, but that would be inefficient in some cases.  So to fix, 1) add
a List member to EState to record the insert-pending result relations,
and 2) modify nodeModifyTable.c so that it adds the foreign-table result
relation to the list in ExecInsert() if appropriate, and flushes pending
inserts properly using the list where needed.

While here, fix a copy-and-pasteo in a comment in ExecBatchInsert(),
which was added by that commit.

Back-patch to v14 where that commit appeared.

Discussion: https://postgr.es/m/CAPmGK16qutyCmyJJzgQOhfBq%3DNoGDqTB6O0QBZTihrbqre%2BoxA%40mail.gmail.com
2022-11-25 17:45:00 +09:00
Michael Paquier d13b684117 Introduce variables for initial and max nesting depth on configuration files
The code has been assuming already in a few places that the initial
recursion nesting depth is 0, and the recent changes in hba.c (mainly
783e8c6) have relies on this assumption in more places.  The maximum
recursion nesting level is assumed to be 10 for hba.c and GUCs.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20221124090724.n7amf5kpdhx6vb76@jrouhaud
2022-11-25 07:40:12 +09:00
David Rowley 2d1f3bce97 Fix some 32-bit shift warnings in MSVC
7b378237a widened AclMode to 64 bits which resulted in 3 new additional
warnings on MSVC.  Here we make use of UINT64CONST to reassure the
compiler that we do intend the bit shift expression to yield a 64-bit
result.

Discussion: https://postgr.es/m/CAApHDvo=pn01Y_3zASZZqn+cotF1c4QFCwWgk6MiF0VscaE5ug@mail.gmail.com
2022-11-25 11:05:22 +13:00
David Rowley ec5affdbc2 Improve indenting in _hash_pgaddtup
The Assert added in d09dbeb9b came out rather ugly after having run
pgindent on that code.  Here we adjust things to use some local variables
so that the Assert remains within the 80-character margin.

Author: Ted Yu
Discussion: https://postgr.es/m/CALte62wLSir1=x93Jf0xZvHaO009FEJfhVMFwnaR8q=csPP8kQ@mail.gmail.com
2022-11-25 10:10:44 +13:00
Alvaro Herrera 2cf41cd309
Make multixact error message more explicit
There are recent reports involving a very old error message that we have
no history of hitting -- perhaps a recently introduced bug.  Improve the
error message in an attempt to improve our chances of investigating the
bug.

Per reports from Dimos Stamatakis and Bob Krier.

Backpatch to 11.

Discussion: https://postgr.es/m/CO2PR0801MB2310579F65529380A4E5EDC0E20A9@CO2PR0801MB2310.namprd08.prod.outlook.com
Discussion: https://postgr.es/m/17518-04e368df5ad7f2ee@postgresql.org
2022-11-24 10:45:10 +01:00
Michael Paquier af205152ef Add the database name to the ps display of logical WAL senders
Logical WAL senders display now as follows, gaining a database name:
postgres: walsender USER DATABASE HOST(PORT) STATE

Physical WAL senders show up the same, as of:
postgres: walsender USER HOST(PORT) STATE

This information was missing, hence it was not possible to know from ps
if a WAL sender was a logical or a physical one, and on which database
it is connected when it is logical.

Author: Tatsuhiro Nakamori
Reviewed-by: Fujii Masao, Bharath Rupireddy
Discussion: https://postgr.es/m/36a3b137e82e0ea9fe7e4234f03b64a1@oss.nttdata.com
2022-11-24 16:07:59 +09:00
Michael Paquier a54b658ce7 Add support for file inclusions in HBA and ident configuration files
pg_hba.conf and pg_ident.conf gain support for three record keywords:
- "include", to include a file.
- "include_if_exists", to include a file, ignoring it if missing.
- "include_dir", to include a directory of files.  These are classified
by name (C locale, mostly) and need to be prefixed by ".conf", hence
following the same rules as GUCs.

This commit relies on the refactoring pieces done in efc9816, ad6c528,
783e8c6 and 1b73d0b, adding a small wrapper to build a list of
TokenizedAuthLines (tokenize_include_file), and the code is shaped to
offer some symmetry with what is done for GUCs with the same options.

pg_hba_file_rules and pg_ident_file_mappings gain a new field called
file_name, to track from which file a record is located, taking
advantage of the addition of rule_number in c591300 to offer an
organized view of the HBA or ident records loaded.

Bump catalog version.

Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
2022-11-24 13:51:34 +09:00
David Rowley d09dbeb9bd Speedup hash index builds by skipping needless binary searches
When building hash indexes using the spool method, tuples are added to the
index page in hashkey order.  Because of this, we can safely skip
performing the binary search on the existing tuples on the page to find
the location to insert the tuple based on its hashkey value.  For this
case, we can just always put the tuple at the end of the item array as the
tuples will always arrive in hashkey order.

Testing has shown that this can improve hash index build speeds by 5-15%
with a unique set of integer values.

Author: Simon Riggs
Reviewed-by: David Rowley
Tested-by: David Zhang, Tomas Vondra
Discussion: https://postgr.es/m/CANbhV-GBc5JoG0AneUGPZZW3o4OK5LjBGeKe_icpC3R1McrZWQ@mail.gmail.com
2022-11-24 17:21:44 +13:00
Michael Paquier d46ad72f46 Create memory context for tokenization after opening top-level file in hba.c
The memory context was created before attempting to open the first HBA
or ident file, which would cause it to leak.  This had no consequences
for the system views for HBA and ident files, but this would cause
memory leaks in the postmaster on reload if the initial HBA and/or ident
files are missing, which is a valid behavior while the backend is
running.

Oversight in efc9816.

Author: Ted Yu
Discussion: https://postgr.es/m/CALte62xH6ivgiKKzPRJgfekPZC6FKLB3xbnf3=tZmc_gKj78dw@mail.gmail.com
2022-11-24 10:27:38 +09:00
Michael Paquier d5566fbfeb Add missing initialization in tokenize_expand_file() for output list
This should have been added in efc9816, but it looks like I have found a
way to mess up a bit a patch split.  This should have no consequence in
practice, but let's be clean.

Discussion: https://postgr.es/m/Y324HvGKiWxW2yxe@paquier.xyz
2022-11-24 10:03:11 +09:00
Michael Paquier efc981627a Rework memory contexts in charge of HBA/ident tokenization
The list of TokenizedAuthLines generated at parsing for the HBA and
ident files is now stored in a static context called tokenize_context,
where only all the parsed tokens are stored.  This context is created
when opening the first authentication file of a HBA/ident set (hba_file
or ident_file), and is cleaned up once we are done all the work around
it through a new routine called free_auth_file().  One call of
open_auth_file() should have one matching call of free_auth_file(), the
creation and deletion of the tokenization context is controlled by the
recursion depth of the tokenization.

Rather than having tokenize_auth_file() return a memory context that
includes all the records, the tokenization logic now creates and deletes
one memory context each time this function is called.  This will
simplify recursive calls to this routine for the upcoming inclusion
record logic.

While on it, rename tokenize_inc_file() to tokenize_expand_file() as
this would conflict with the upcoming patch that will add inclusion
records for HBA/ident files.  An '@' file has its tokens added to an
existing list.

Reloading HBA/indent configuration in a tight loop shows no leaks, as of
one type of test done (with and without -DEXEC_BACKEND).

Author: Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/Y324HvGKiWxW2yxe@paquier.xyz
2022-11-24 08:21:55 +09:00
Alexander Korotkov cee1209514 Support for custom slots in the custom executor nodes
Some custom table access method may have their tuple format and use custom
executor nodes for their custom scan types. The ability to set a custom slot
would save them from tuple format conversion. Other users of custom executor
nodes may also benefit.

Discussion: https://postgr.es/m/CAPpHfduJUU6ToecvTyRE_yjxTS80FyPpct4OHaLFk3OEheMTNA@mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Pavel Borisov
2022-11-24 00:36:11 +03:00
Andrew Dunstan b7a5ef17cf Simplify WARNING messages from skipped vacuum/analyze on a table
This will more easily accomodate adding new permissions for vacuum and
analyze.

Nathan Bossart following a suggestion from Kyotaro Horiguchi

Discussion: https://postgr.es/m/20220726.104712.912995710251150228.horikyota.ntt@gmail.com
2022-11-23 14:43:16 -05:00
Andrew Dunstan 7b378237aa Expand AclMode to 64 bits
We're running out of bits for new permissions. This change doubles the
number of permissions we can accomodate from 16 to 32, so the
forthcoming new ones for vacuum/analyze don't exhaust the pool.

Nathan Bossart

Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert
Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael
Paquier.

Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13
2022-11-23 14:43:16 -05:00
Peter Geoghegan b6074846ce Simplify vacuum_set_xid_limits() signature.
Pass VACUUM parameters (VacuumParams state) to vacuum_set_xid_limits()
directly, rather than passing most individual VacuumParams fields as
separate arguments.

Also make vacuum_set_xid_limits() output parameter symbol names match
those used by its vacuumlazy.c caller.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=TE7gW5DgSahDkf0UEZigFGAoHNNN6EvSrdzC=Kn+hrA@mail.gmail.com
2022-11-23 11:10:06 -08:00
Peter Geoghegan 02d647bbf0 Don't test HEAP_XMAX_INVALID when freezing xmax.
We shouldn't ever need to rely on whether HEAP_XMAX_INVALID is set in
t_infomask when considering whether or not an xmax should be deemed
already frozen, since that status flag is just a hint.  The only
acceptable representation for an "xmax_already_frozen" raw xmax field is
the transaction ID value zero (also known as InvalidTransactionId).

Adjust code that superficially appeared to rely on HEAP_XMAX_INVALID to
make the rule about xmax_already_frozen clear.  Also avoid needlessly
rereading the tuple's raw xmax.

Oversight in bugfix commit d2599ecf.  There is no evidence that this
ever led to incorrect behavior, so no backpatch.  The worst consequence
of this bug was that VACUUM could hypothetically fail to notice and
report on certain kinds of corruption, which seems fairly benign.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzkh3DMCDRPfhZxj9xCq9v3WmzvmbiCpf1dNKUBPadhCbQ@mail.gmail.com
2022-11-23 10:49:39 -08:00