Commit Graph

23126 Commits

Author SHA1 Message Date
Michael Paquier f1431f3bff Handle NULL for short descriptions of custom GUC variables
If a short description is specified as NULL in one of the various
DefineCustomXXXVariable() functions available to external modules to
define a custom parameter, SHOW ALL would crash.  This change teaches
SHOW ALL to properly handle NULL short descriptions, as well as any code
paths that manipulate it, to gain in flexibility.  Note that
help_config.c was already able to do that, when describing a set of GUCs
for postgres --describe-config.

Author: Steve Chavez
Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com
Backpatch-through: 10
2022-05-28 12:12:40 +09:00
David Rowley 3e9abd2eb1 Teach remove_unused_subquery_outputs about window run conditions
9d9c02ccd added code to allow the executor to take shortcuts when quals
on monotonic window functions guaranteed that once the qual became false
it could never become true again.  When possible, baserestrictinfo quals
are converted to become these quals, which we call run conditions.

Unfortunately, in 9d9c02ccd, I forgot to update
remove_unused_subquery_outputs to teach it about these run conditions.
This could cause a WindowFunc column which was unused in the target list
but referenced by an upper-level WHERE clause to be removed from the
subquery when the qual in the WHERE clause was converted into a window run
condition.  Because of this, the entire WindowClause would be removed from
the query resulting in additional rows making it into the resultset when
they should have been filtered out by the WHERE clause.

Here we fix this by recording which target list items in the subquery have
run conditions. That gets passed along to remove_unused_subquery_outputs
to tell it not to remove these items from the target list.

Bug: #17495
Reported-by: Jeremy Evans
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17495-7ffe2fa0b261b9fa@postgresql.org
2022-05-27 10:37:58 +12:00
Tom Lane 2b65de7fc2 Remove misguided SSL key file ownership check in libpq.
Commits a59c79564 et al. tried to sync libpq's SSL key file
permissions checks with what we've used for years in the backend.
We did not intend to create any new failure cases, but it turns out
we did: restricting the key file's ownership breaks cases where the
client is allowed to read a key file despite not having the identical
UID.  In particular a client running as root used to be able to read
someone else's key file; and having seen that I suspect that there are
other, less-dubious use cases that this restriction breaks on some
platforms.

We don't really need an ownership check, since if we can read the key
file despite its having restricted permissions, it must have the right
ownership --- under normal conditions anyway, and the point of this
patch is that any additional corner cases where that works should be
deemed allowable, as they have been historically.  Hence, just drop
the ownership check, and rearrange the permissions check to get rid
of its faulty assumption that geteuid() can't be zero.  (Note that the
comparable backend-side code doesn't have to cater for geteuid() == 0,
since the server rejects that very early on.)

This does have the end result that the permissions safety check used
for a root user's private key file is weaker than that used for
anyone else's.  While odd, root really ought to know what she's doing
with file permissions, so I think this is acceptable.

Per report from Yogendra Suralkar.  Like the previous patch,
back-patch to all supported branches.

Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com
2022-05-26 14:14:05 -04:00
Tom Lane 6217053f4e Avoid ERRCODE_INTERNAL_ERROR in oracle_compat.c functions.
repeat() checked for integer overflow during its calculation of the
required output space, but it just passed the resulting integer to
palloc().  This meant that result sizes between 1GB and 2GB led to
ERRCODE_INTERNAL_ERROR, "invalid memory alloc request size" rather
than ERRCODE_PROGRAM_LIMIT_EXCEEDED, "requested length too large".
That seems like a bit of a wart, so add an explicit AllocSizeIsValid
check to make these error cases uniform.

Do likewise in the sibling functions lpad() etc.  While we're here,
also modernize their overflow checks to use pg_mul_s32_overflow() etc
instead of expensive divisions.

Per complaint from Japin Li.  This is basically cosmetic, so I don't
feel a need to back-patch.

Discussion: https://postgr.es/m/ME3P282MB16676ED32167189CB0462173B6D69@ME3P282MB1667.AUSP282.PROD.OUTLOOK.COM
2022-05-26 12:25:10 -04:00
Andres Freund 98f897339b Fix stats_fetch_consistency default value indicated in postgresql.conf.sample.
Mistake in 5891c7a8ed, likely made when switching the default value from none
to fetch during development.

Reported-By: Nathan Bossart <nathandbossart@gmail.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/20220524220147.GA1298892@nathanxps13
2022-05-24 21:26:39 -07:00
Michael Paquier c9dfe2e83a Remove duplicated words in comments of pgstat.c and pgstat_internal.h
Author: Atsushi Torikoshi
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/d00ddbf29f9d09b3a471e64977560de1@oss.nttdata.com
2022-05-24 11:00:41 +09:00
John Naylor 6e647ef0e7 Remove debug messages from tuplesort_sort_memtuples()
These were of value only during development.

Reported by Justin Pryzby
Discussion: https://www.postgresql.org/message-id/20220519201254.GU19626%40telsasoft.com
2022-05-23 13:11:43 +07:00
Tom Lane c7461fc255 Show 'AS "?column?"' explicitly when it's important.
ruleutils.c was coded to suppress the AS label for a SELECT output
expression if the column name is "?column?", which is the parser's
fallback if it can't think of something better.  This is fine, and
avoids ugly clutter, so long as (1) nothing further up in the parse
tree relies on that column name or (2) the same fallback would be
assigned when the rule or view definition is reloaded.  Unfortunately
(2) is far from certain, both because ruleutils.c might print the
expression in a different form from how it was originally written
and because FigureColname's rules might change in future releases.
So we shouldn't rely on that.

Detecting exactly whether there is any outer-level use of a SELECT
column name would be rather expensive.  This patch takes the simpler
approach of just passing down a flag indicating whether there *could*
be any outer use; for example, the output column names of a SubLink
are not referenceable, and we also do not care about the names exposed
by the right-hand side of a setop.  This is sufficient to suppress
unwanted clutter in all but one case in the regression tests.  That
seems like reasonable evidence that it won't be too much in users'
faces, while still fixing the cases we need to fix.

Per bug #17486 from Nicolas Lutic.  This issue is ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
2022-05-21 14:45:58 -04:00
Tom Lane a916cb9d5a Avoid overflow hazard when clamping group counts to "long int".
Several places in the planner tried to clamp a double value to fit
in a "long" by doing
	(long) Min(x, (double) LONG_MAX);
This is subtly incorrect, because it casts LONG_MAX to double and
potentially back again.  If long is 64 bits then the double value
is inexact, and the platform might round it up to LONG_MAX+1
resulting in an overflow and an undesirably negative output.

While it's not hard to rewrite the expression into a safe form,
let's put it into a common function to reduce the risk of someone
doing it wrong in future.

In principle this is a bug fix, but since the problem could only
manifest with group count estimates exceeding 2^63, it seems unlikely
that anyone has actually hit this or will do so anytime soon.  We're
fixing it mainly to satisfy fuzzer-type tools.  That being the case,
a HEAD-only fix seems sufficient.

Andrey Lepikhov

Discussion: https://postgr.es/m/ebbc2efb-7ef9-bf2f-1ada-d6ec48f70e58@postgrespro.ru
2022-05-21 13:13:44 -04:00
Alvaro Herrera 6029861916
Fix DDL deparse of CREATE OPERATOR CLASS
When an implicit operator family is created, it wasn't getting reported.
Make it do so.

This has always been missing.  Backpatch to 10.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
2022-05-20 18:52:55 +02:00
Alvaro Herrera 8d061acd12
Repurpose PROC_COPYABLE_FLAGS as PROC_XMIN_FLAGS
This is a slight, convenient semantics change from what commit
0f0cfb4940 ("Fix parallel operations that prevent oldest xmin from
advancing") introduced that lets us simplify the coding in the one place
where it is used.

Backpatch to 13.  This is related to commit 6fea65508a ("Tighten
ComputeXidHorizons' handling of walsenders") rewriting the code site
where this is used, which has not yet been backpatched, but it may well
be in the future.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/202204191637.eldwa2exvguw@alvherre.pgsql
2022-05-19 16:20:32 +02:00
Amit Kapila 0ff20288e1 Extend pg_publication_tables to display column list and row filter.
Commit 923def9a53 and 52e4f0cd47 allowed to specify column lists and row
filters for publication tables. This commit extends the
pg_publication_tables view and pg_get_publication_tables function to
display that information.

This information will be useful to users and we also need this for the
later commit that prohibits combining multiple publications with different
column lists for the same table.

Author: Hou Zhijie
Reviewed By: Amit Kapila, Alvaro Herrera, Shi Yu, Takamichi Osumi
Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql
2022-05-19 08:20:55 +05:30
Alvaro Herrera 12e423e21d
Fix EXPLAIN MERGE output when no tuples are processed
An 'else' clause was misplaced in commit 598ac10be1, making zero-rows
output look a bit silly.  Add a test case for it.

Pointed out by Tom Lane.

Discussion: https://postgr.es/m/21030.1652893083@sss.pgh.pa.us
2022-05-18 21:20:49 +02:00
Alvaro Herrera 0fbf011200
Check column list length in XMLTABLE/JSON_TABLE alias
We weren't checking the length of the column list in the alias clause of
an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was
possible to make the server crash by passing an overly long one.  Fix it
by throwing an error in that case, like the other places that deal with
alias lists.

In passing, modify the equivalent test used for join RTEs to look like
the other ones, which was different for no apparent reason.

This bug came in when XMLTABLE was born in version 10; backpatch to all
stable versions.

Reported-by: Wang Ke <krking@zju.edu.cn>
Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org
2022-05-18 20:28:31 +02:00
Alvaro Herrera 598ac10be1
Make EXPLAIN MERGE output format more compact
We can use a single line to print all tuple counts that MERGE processed,
for conciseness, and elide those that are zeroes.  Non-text formats
report all numbers, as is typical.

Per comment from Justin Pryzby <pryzby@telsasoft.com>

Discussion: https://postgr.es/m/20220511163350.GL19626@telsasoft.com
2022-05-18 18:33:04 +02:00
Michael Paquier bbf7c2d9e9 Fix typo in walreceiver.c
s/primary_slotname/primary_slot_name/.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACX3=pHkCpoGG-z+O=7Gp5YZv70jmfTyGnNV7YF3SkK73g@mail.gmail.com
2022-05-18 09:06:22 +09:00
Peter Eisentraut 6a8a7b1ccb Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: dde45df385dab9032155c1f867b677d55695310c
2022-05-16 11:12:42 +02:00
David Rowley 1e731ed12a Fix incorrect row estimates used for Memoize costing
In order to estimate the cache hit ratio of a Memoize node, one of the
inputs we require is the estimated number of times the Memoize node will
be rescanned.  The higher this number, the large the cache hit ratio is
likely to become.  Unfortunately, the value being passed as the number of
"calls" to the Memoize was incorrectly using the Nested Loop's
outer_path->parent->rows instead of outer_path->rows.  This failed to
account for the fact that the outer_path might be parameterized by some
upper-level Nested Loop.

This problem could lead to Memoize plans appearing more favorable than
they might actually be.  It could also lead to extended executor startup
times when work_mem values were large due to the planner setting overly
large MemoizePath->est_entries resulting in the Memoize hash table being
initially made much larger than might be required.

Fix this simply by passing outer_path->rows rather than
outer_path->parent->rows.  Also, adjust the expected regression test
output for a plan change.

Reported-by: Pavel Stehule
Author: David Rowley
Discussion: https://postgr.es/m/CAFj8pRAMp%3DQsMi6sPQJ4W3hczoFJRvyXHJV3AZAZaMyTVM312Q%40mail.gmail.com
Backpatch-through: 14, where Memoize was introduced
2022-05-16 16:07:56 +12:00
Michael Paquier fcab82a2d7 Fix comment in pg_proc.c
pgstat_create_function() creates stats for a function in a transactional
fashion, so the stats would be dropped if transaction creating the
function is aborted, not committed.

Author: Amul Sul
Discussion: https://postgr.es/m/CAAJ_b97x1T3xgAMWNj4w7kSgN0nTuG-vLrQJ4NB-dsNr0Kudxw@mail.gmail.com
2022-05-14 08:27:59 +09:00
Alvaro Herrera c4f113e8fe
Clean up newlines following left parentheses
Like commit c9d2977519.
2022-05-13 23:52:35 +02:00
Tom Lane 3ab9a63cb6 Rename JsonIsPredicate.value_type, fix JSON backend/nodes/ infrastructure.
I started out with the intention to rename value_type to item_type to
avoid a collision with a typedef name that appears on some platforms.

Along the way, I noticed that the adjacent field "format" was not being
correctly handled by the backend/nodes/ infrastructure functions:
copyfuncs.c erroneously treated it as a scalar, while equalfuncs,
outfuncs, and readfuncs omitted handling it at all.  This looks like
it might be cosmetic at the moment because the field is always NULL
after parse analysis; but that's likely a bug in itself, and the code's
certainly not very future-proof.  Let's fix it while we can still do so
without forcing an initdb on beta testers.

Further study found a few other inconsistencies in the backend/nodes/
infrastructure for the recently-added JSON node types, so fix those too.

catversion bumped because of potential change in stored rules.

Discussion: https://postgr.es/m/526703.1652385613@sss.pgh.pa.us
2022-05-13 11:40:08 -04:00
Robert Haas 4f2400cb3f Add a new shmem_request_hook hook.
Currently, preloaded libraries are expected to request additional
shared memory and LWLocks in _PG_init().  However, it is not unusal
for such requests to depend on MaxBackends, which won't be
initialized at that time.  Such requests could also depend on GUCs
that other modules might change.  This introduces a new hook where
modules can safely use MaxBackends and GUCs to request additional
shared memory and LWLocks.

Furthermore, this change restricts requests for shared memory and
LWLocks to this hook.  Previously, libraries could make requests
until the size of the main shared memory segment was calculated.
Unlike before, we no longer silently ignore requests received at
invalid times.  Instead, we FATAL if someone tries to request
additional shared memory or LWLocks outside of the hook.

Nathan Bossart and Julien Rouhaud

Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13
Discussion: https://postgr.es/m/Yn2jE/lmDhKtkUdr@paquier.xyz
2022-05-13 09:31:06 -04:00
Peter Eisentraut 30ed71e423 Indent C code in flex and bison files
In the style of pgindent, done semi-manually.

Discussion: https://www.postgresql.org/message-id/flat/7d062ecc-7444-23ec-a159-acd8adf9b586%40enterprisedb.com
2022-05-13 07:17:29 +02:00
Andres Freund 0cf16cb8ca Don't report stats in LogicalRepApplyLoop() when in xact.
pgstat_report_stat() is only supposed to be called outside of transactions. In
5891c7a8ed I added a pgstat_report_stat() call into LogicalRepApplyLoop()'s
timeout branch. While not commonly reached inside a transaction, it is
reachable (e.g. due to network bottlenecks or the sender being stalled / slow
for some reason).

To fix, add a !IsTransactionState() check.

No test added because there's no easy way to reproduce this case without
patching the code.

Reported-By: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/b3463b8c-2328-dcac-0136-af95715493c1@xs4all.nl
2022-05-12 18:54:26 -07:00
Andres Freund 07d683b54a Remove function declaration for function in pg_proc.
The declaration is automatically generated. Noticed when experimenting with
adding PGDLLIMPORT markers for functions.

Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
2022-05-12 12:39:33 -07:00
Andres Freund 0699b1ae2d Add missing binary_upgrade.h includes.
A few places used binary_upgrade_* variables without including the header,
which worked without warnings because the variables are defined in those
places. However that can cause linker complaints with MSVC - except that we
don't see them right now, due to the use of a symbol export file.

Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
2022-05-12 12:39:33 -07:00
Andres Freund 09cd33f47b Add 'static' to file-local variables missing it.
Noticed when comparing the set of exported symbols without / with
-fvisibility=hidden after adding PGDLLIMPORT to intentionally exported
symbols.

Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de
2022-05-12 12:39:33 -07:00
Tom Lane 23e7b38bfe Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
2022-05-12 15:17:30 -04:00
Andres Freund b5f44225b8 Mark a few 'bbsink' related functions / variables static.
Discussion: https://postgr.es/m/20220506234924.6mxxotl3xl63db3l@alap3.anarazel.de
2022-05-12 09:11:31 -07:00
Tom Lane 79b58c6f68 Make pull_var_clause() handle GroupingFuncs exactly like Aggrefs.
This follows in the footsteps of commit 2591ee8ec by removing one more
ill-advised shortcut from planning of GroupingFuncs.  It's true that
we don't intend to execute the argument expression(s) at runtime, but
we still have to process any Vars appearing within them, or we risk
failure at setrefs.c time (or more fundamentally, in EXPLAIN trying
to print such an expression).  Vars in upper plan nodes have to have
referents in the next plan level, whether we ever execute 'em or not.

Per bug #17479 from Michael J. Sullivan.  Back-patch to all supported
branches.

Richard Guo

Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org
2022-05-12 11:31:46 -04:00
Robert Haas ab02d702ef Remove non-functional code for unloading loadable modules.
The code for unloading a library has been commented-out for over 12
years, ever since commit 602a9ef5a7, and we're
no closer to supporting it now than we were back then.

Nathan Bossart, reviewed by Michael Paquier and by me.

Discussion: http://postgr.es/m/Ynsc9bRL1caUSBSE@paquier.xyz
2022-05-11 15:30:30 -04:00
Michael Paquier 45edde037e Fix typos and grammar in code and test comments
This fixes the grammar of some comments in a couple of tests (SQL and
TAP), and in some C files.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220511020334.GH19626@telsasoft.com
2022-05-11 15:38:55 +09:00
Thomas Munro 0d3431497d Add logging for excessive ProcSignalBarrier waits.
To enable diagnosis of systems that are not processing ProcSignalBarrier
requests promptly, add a LOG message every 5 seconds if we seem to be
wedged.  Although you could already see this state as a wait event in
pg_stat_activity, the log message also shows the PID of the process that
is preventing progress.

Also add DEBUG1 logging around the whole wait loop.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BTgmoYJ03r5359gQutRGP9BtigYCg3_UskcmnVjBf-QO3-0pQ%40mail.gmail.com
2022-05-11 18:03:03 +12:00
Amit Kapila f95d53eded Fix the logical replication timeout during large transactions.
The problem is that we don't send keep-alive messages for a long time
while processing large transactions during logical replication where we
don't send any data of such transactions. This can happen when the table
modified in the transaction is not published or because all the changes
got filtered. We do try to send the keep_alive if necessary at the end of
the transaction (via WalSndWriteData()) but by that time the
subscriber-side can timeout and exit.

To fix this we try to send the keepalive message if required after
processing certain threshold of changes.

Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
2022-05-11 11:11:44 +05:30
Michael Paquier 8bbf8461a3 Silence extra logging when using "postgres -C" on runtime-computed GUCs
Presently, the server may emit a variety of log messages when inspecting
a runtime-computed GUC, mostly in the shape of one LOG message with the
default configuration, related to the startup sequence launched as such
GUCs require a load of the control file and of external shared
libraries.

For example, the server will always emit a "database system is shut
down" LOG (unless the user has set log_min_messages higher than LOG),
which is an annoying behavior as "postgres -C" is expected to only emit
in its output the parameter value we are looking for.  The parameter
value is sent to stdout, while the logs are sent to stderr so we could
recommend to use a redirection, but there was not much love for this
workaround either.

To avoid such extra log messages, per discussion, this change sets
log_min_messages to FATAL internally when -C is used on a
runtime-computed GUC (even if set to PANIC in postgresql.conf).  At
FATAL, the user will still receive messages explaining why a GUC value
cannot be inspected, and will know if the command is attempted on a
server already running, something not supported yet for a
runtime-computed GUC.

Reported-by: Magnus Hagander, Bruce Momjian
Author: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/Yni6ZHkGotUU+RSf@paquier.xyz
2022-05-11 14:21:06 +09:00
David Rowley c90c16591c Fix some incorrect preprocessor tests in tuplesort specializations
697492434 added 3 new quicksort specialization functions for common
datatypes.

That commit was not very consistent in how it would determine if we're
compiling for 32-bit or 64-bit machines.  It would sometimes use
USE_FLOAT8_BYVAL and at other times check if SIZEOF_DATUM == 8.  This
could cause theoretical problems due to the way USE_FLOAT8_BYVAL is now
defined based on SIZEOF_VOID_P >= 8.  If pointers for some reason were
ever larger than 8-bytes then we'd end up doing 32-bit comparisons
mistakenly.  Let's just always check SIZEOF_DATUM >= 8.

It also seems that ssup_datum_signed_cmp is just never used on 32-bit
builds, so let's just ifdef that out to make sure we never accidentally
use that comparison function on such machines.  This also allows us to
ifdef out 1 of the 3 new specialization quicksort functions in 32-bit
builds which seems to shrink down the binary by over 4KB on my machine.

In passing, also add the missing DatumGetInt32() / DatumGetInt64() macros
in the comparison functions.

Discussion: https://postgr.es/m/CAApHDvqcQExRhtRa9hJrJB_5egs3SUfOcutP3m+3HO8A+fZTPA@mail.gmail.com
Reviewed-by: John Naylor
2022-05-11 11:38:13 +12:00
Peter Eisentraut 9700b250c5 Formatting and punctuation improvements in sample configuration files 2022-05-10 21:15:56 +02:00
Tom Lane fe20afaee8 Fix core dump in transformValuesClause when there are no columns.
The parser code that transformed VALUES from row-oriented to
column-oriented lists failed if there were zero columns.
You can't write that straightforwardly (though probably you
should be able to), but the case can be reached by expanding
a "tab.*" reference to a zero-column table.

Per bug #17477 from Wang Ke.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org
2022-05-09 14:15:37 -04:00
Tom Lane 29904f5f2f Revert "Disallow infinite endpoints in generate_series() for timestamps."
This reverts commit eafdf9de06
and its back-branch counterparts.  Corey Huinker pointed out that
we'd discussed this exact change back in 2016 and rejected it,
on the grounds that there's at least one usage pattern with LIMIT
where an infinite endpoint can usefully be used.  Perhaps that
argument needs to be re-litigated, but there's no time left before
our back-branch releases.  To keep our options open, restore the
status quo ante; if we do end up deciding to change things, waiting
one more quarter won't hurt anything.

Rather than just doing a straight revert, I added a new test case
demonstrating the usage with LIMIT.  That'll at least remind us of
the issue if we forget again.

Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us
Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com
2022-05-09 11:40:40 -04:00
Noah Misch 0abc1a059e In REFRESH MATERIALIZED VIEW, set user ID before running user code.
It intended to, but did not, achieve this.  Adopt the new standard of
setting user ID just after locking the relation.  Back-patch to v10 (all
supported versions).

Reviewed by Simon Riggs.  Reported by Alvaro Herrera.

Security: CVE-2022-1552
2022-05-09 08:35:08 -07:00
Noah Misch a117cebd63 Make relation-enumerating operations be security-restricted operations.
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects.  BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER.  An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser.  CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late.  This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).

Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.

Security: CVE-2022-1552
2022-05-09 08:35:08 -07:00
Michael Paquier 7863ee4def Fix control file update done in restartpoints still running after promotion
If a cluster is promoted (aka the control file shows a state different
than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still
processing, this function could miss an update of the control file for
"checkPoint" and "checkPointCopy" but still do the recycling and/or
removal of the past WAL segments, assuming that the to-be-updated LSN
values should be used as reference points for the cleanup.  This causes
a follow-up restart attempting crash recovery to fail with a PANIC on a
missing checkpoint record if the end-of-recovery checkpoint triggered by
the promotion did not complete while the cluster abruptly stopped or
crashed before the completion of this checkpoint.  The PANIC would be
caused by the redo LSN referred in the control file as located in a
segment already gone, recycled by the previous restartpoint with
"checkPoint" out-of-sync in the control file.

This commit fixes the update of the control file during restartpoints so
as "checkPoint" and "checkPointCopy" are updated even if the cluster has
been promoted while a restartpoint is running, to be on par with the set
of WAL segments actually recycled in the end of CreateRestartPoint().

This problem exists in all the stable branches.  However, commit
7ff23c6, by removing the last call of CreateCheckPoint() from the
startup process, has made this bug much easier to reason about as
concurrent checkpoints are not possible anymore.  No backpatch is done
yet, mostly out of caution from me as a point release is close by, but
we need to think harder about the case of concurrent checkpoints at
promotion if the bgwriter is not considered as running by the startup
process in ~v14, so this change is done only on HEAD for the moment.

Reported-by: Fujii Masao, Rui Zhao
Author: Kyotaro Horiguchi
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com
2022-05-09 08:39:59 +09:00
Thomas Munro e2f65f4255 Fix old-fd issues using global barriers everywhere.
Commits 4eb21763 and b74e94dc introduced a way to force every backend to
close all relation files, to fix an ancient Windows-only bug.

This commit extends that behavior to all operating systems and adds
a couple of extra barrier points, to fix a totally different class of
bug: the reuse of relfilenodes in scenarios that have no other kind of
cache invalidation to prevent file descriptor mix-ups.

In all releases, data corruption could occur when you moved a database
to another tablespace and then back again.  Despite that, no back-patch
for now as the infrastructure required is too new and invasive.  In
master only, since commit aa010514, it could also happen when using
CREATE DATABASE with a user-supplied OID or via pg_upgrade.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
2022-05-07 16:47:29 +12:00
Thomas Munro b74e94dc27 Rethink PROCSIGNAL_BARRIER_SMGRRELEASE.
With sufficiently bad luck, it was possible for IssuePendingWritebacks()
to reopen a file after we'd processed PROCSIGNAL_BARRIER_SMGRRELEASE and
before the file was unlinked by some other backend.  That left a small
hole in commit 4eb21763's plan to fix all spurious errors from DROP
TABLESPACE and similar on Windows.

Fix by closing md.c's segments, instead of just closing fd.c's
descriptors, and then teaching smgrwriteback() not to open files that
aren't already open.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
2022-05-07 16:32:10 +12:00
Robert Haas 701d918a42 Fix misleading comments about background worker registration.
Since 6bc8ef0b7f, the maximum number
of backends can't change as background workers are registered, but
these comments still reflect the way things worked prior to that.

Also, per recent discussion, some modules call SetConfigOption()
from _PG_init(). It's not entirely clear to me whether we want to
regard that as a fully supported operation, but since we know it's
a thing that happens, it at least deserves a mention in the comments,
so add that.

Nathan Bossart, reviewed by Anton A. Melnikov

Discussion: http://postgr.es/m/20220419154658.GA2487941@nathanxps13
2022-05-06 09:24:06 -04:00
Michael Paquier 59a32f0093 Fix typo in origin.c
Introduced in 5aa2350.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PsuWz6_7aCmivNU5FahgQxDUTQtc3+__XnWkBzQcfn43w@mail.gmail.com
2022-05-06 20:01:15 +09:00
Peter Eisentraut 7e367924e3 Update SQL features
Update a few items that have become supported or mostly supported but
weren't updated at the time.
2022-05-06 09:17:38 +02:00
Tom Lane c40ba5f318 Fix rowcount estimate for SubqueryScan that's under a Gather.
SubqueryScan was always getting labeled with a rowcount estimate
appropriate for non-parallel cases.  However, nodes that are
underneath a Gather should be treated as processing only one
worker's share of the rows, whether the particular node is explicitly
parallel-aware or not.  Most non-scan-level node types get this
right automatically because they base their rowcount estimate on
that of their input sub-Path(s).  But SubqueryScan didn't do that,
instead using the whole-relation rowcount estimate as if it were
a non-parallel-aware scan node.  If there is a parallel-aware node
below the SubqueryScan, this is wrong, and it results in inflating
the cost estimates for nodes above the SubqueryScan, which can cause
us to not choose a parallel plan, or choose a silly one --- as indeed
is visible in the one regression test whose results change with this
patch.  (Although that plan tree appears to contain no SubqueryScans,
there were some in it before setrefs.c deleted them.)

To fix, use path->subpath->rows not baserel->tuples as the number
of input tuples we'll process.  This requires estimating the quals'
selectivity afresh, which is slightly annoying; but it shouldn't
really add much cost thanks to the caching done in RestrictInfo.

This is pretty clearly a bug fix, but I'll refrain from back-patching
as people might not appreciate plan choices changing in stable branches.
The fact that it took us this long to identify the bug suggests that
it's not a major problem.

Per report from bucoo, though this is not his proposed patch.

Discussion: https://postgr.es/m/202204121457159307248@sohu.com
2022-05-04 14:44:40 -04:00
Peter Eisentraut dc2be6ed47 Remove JsonPathSpec typedef
It doesn't seem very useful, and it's a bit in the way of the planned
node support automation.

Discussion: https://www.postgresql.org/message-id/202204191140.3wsbevfhqmu3@alvherre.pgsql
2022-05-04 17:36:31 +02:00
Peter Eisentraut 2e77180d45 Fix incorrect format placeholders 2022-05-04 07:57:39 +02:00
Andres Freund 8f1537d10e Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().
The tests added in 9f8a050f68 failed nearly reliably on FreeBSD in CI, and
occasionally on the buildfarm. That turns out to be caused not by a bug in the
test, but by a longstanding bug in recovery conflict handling.

The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(),
executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad
idea, because the deadlock timeout handler (or a spurious latch set) could
have interrupted ProcWaitForSignal(). If unlucky that could cause a
self-deadlock on ProcArrayLock, if the deadlock check is in
SendRecoveryConflictWithBufferPin()->CancelDBBackends().

To fix, set a flag in StandbyTimeoutHandler(), and check the flag in
ResolveRecoveryConflictWithBufferPin().

Subsequently the recovery conflict tests will be backpatched.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-
2022-05-02 18:25:00 -07:00
Etsuro Fujita d89f97e83e Fix typo in comment. 2022-05-02 16:45:00 +09:00
Jeff Davis ed57cac84d pg_walinspect: fix case where flush LSN is in the middle of a record.
Instability in the test for pg_walinspect revealed that
pg_get_wal_records_info_till_end_of_wal(x) would try to decode all the
records with a start LSN earlier than the flush LSN, even though that
might include a partial record at the end of the range. In that case,
read_local_xlog_page_no_wait() would return NULL when it tried to read
past the flush LSN, which would be interpreted as an error by the
caller. That caused a test failure only on a BF animal that had been
restarted recently, but could be expected to happen in the wild quite
easily depending on the alignment of various parameters.

Fix by using private data in read_local_xlog_page_no_wait() to signal
end-of-wal to the caller, so that it can be properly distinguished
from a real error.

Discussion: https://postgr.es/m/Ymd/e5eeZMNAkrXo%40paquier.xyz
Discussion: https://postgr.es/m/111657.1650910309@sss.pgh.pa.us

Authors: Thomas Munro, Bharath Rupireddy.
2022-04-30 09:05:32 -07:00
Andrew Dunstan a79153b7a2 Claim SQL standard compliance for SQL/JSON features
Discussion: https://postgr.es/m/d03d809c-d0fb-fd6a-1476-d6dc18ec940e@dunslane.net
2022-04-29 09:01:05 -04:00
Andrew Dunstan 9c3d25e178 Fix JSON_OBJECTAGG uniquefying bug
Commit f4fb45d15c contained a bug in removing items with null values when
unique keys are required, where the leading items that are sorted
contained such values. Fix that and add a test for it.

Discussion: https://postgr.es/m/CAJA4AWQ_XbSmsNbW226UqNyRLJ+wb=iQkQMj77cQyoNkqtf=2Q@mail.gmail.com
2022-04-28 15:28:20 -04:00
Etsuro Fujita 5c854e7a2c Disable asynchronous execution if using gating Result nodes.
mark_async_capable_plan(), which is called from create_append_plan() to
determine whether subplans are async-capable, failed to take into
account that the given subplan created from a given subpath might
include a gating Result node if the subpath is a SubqueryScanPath or
ForeignPath, causing a segmentation fault there when the subplan created
from a SubqueryScanPath includes the Result node, or causing
ExecAsyncRequest() to throw an error about an unrecognized node type
when the subplan created from a ForeignPath includes the Result node,
because in the latter case the Result node was unintentionally
considered as async-capable, but we don't currently support executing
Result nodes asynchronously.  Fix by modifying mark_async_capable_plan()
to disable asynchronous execution in such cases.  Also, adjust code in
the ProjectionPath case in mark_async_capable_plan(), for consistency
with other cases, and adjust/improve comments there.

is_async_capable_path() added in commit 27e1f1456, which was rewritten
to mark_async_capable_plan() in a later commit, has the same issue,
causing the error at execution mentioned above, so back-patch to v14
where the aforesaid commit went in.

Per report from Justin Pryzby.

Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby.

Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
2022-04-28 15:15:00 +09:00
Michael Paquier 55b5686511 Revert recent changes with durable_rename_excl()
This reverts commits 2c902bb and ccfbd92.  Per buildfarm members
kestrel, rorqual and calliphoridae, the assertions checking that a TLI
history file should not exist when created by a WAL receiver have been
failing, and switching to durable_rename() over durable_rename_excl()
would cause the newest TLI history file to overwrite the existing one.
We need to think harder about such cases, so revert the new logic for
now.

Note that all the failures have been reported in the test
025_stuck_on_old_timeline.

Discussion: https://postgr.es/m/511362.1651116498@sss.pgh.pa.us
2022-04-28 13:08:16 +09:00
John Naylor e84f82ab5c Fix SQL syntax in comment in logical/worker.c
Euler Taveira

Disussion: https://www.postgresql.org/message-id/25f95189-eef8-43c4-9d7b-419b651963c8%40www.fastmail.com
2022-04-28 09:27:32 +07:00
Michael Paquier 2c902bbf19 Remove durable_rename_excl()
ccfbd92 has replaced all existing in-core callers of this function in
favor of durable_rename().  durable_rename_excl() is by nature unsafe on
crashes happening at the wrong time, so just remove it.

Author: Nathan Bossart
Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
2022-04-28 11:10:40 +09:00
Michael Paquier ccfbd9287d Replace existing durable_rename_excl() calls with durable_rename()
durable_rename_excl() attempts to avoid overwriting any existing files
by using link() and unlink(), falling back to rename() on some platforms
(e.g., Windows where link() followed by unlink() is not concurrent-safe,
see 909b449).  Most callers of durable_rename_excl() use it just in case
there is an existing file, but it happens that for all of them we never
expect a target file to exist (WAL segment recycling, creation of
timeline history file and basic_archive).

basic_archive used durable_rename_excl() to avoid overwriting an archive
concurrently created by another server.  Now, there is a stat() call to
avoid overwriting an existing archive a couple of lines above, so note
that this change opens a small TOCTOU window in this module between the
stat() call and durable_rename().

Furthermore, as mentioned in the top comment of durable_rename_excl(),
this routine can result in multiple hard links to the same file and data
corruption, with two or more links to the same file in pg_wal/ if a
crash happens before the unlink() call during WAL recycling.
Specifically, this would produce links to the same file for the current
WAL file and the next one because the half-recycled WAL file was
re-recycled during crash recovery of a follow-up cluster restart.

This change replaces all calls to durable_rename_excl() with
durable_rename().  This removes the protection against accidentally
overwriting an existing file, but some platforms are already living
without it, and all those code paths never expect an existing file (a
couple of assertions are added to check after that, in case).

This is a bug fix, but knowing the unlikeliness of the problem involving
one of more crashes at an exceptionally bad moment, no backpatch is
done.  This could be revisited in the future.

Author: Nathan Bossart
Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
2022-04-28 10:11:45 +09:00
Peter Eisentraut 755df30e48 Fix incorrect format placeholders 2022-04-27 09:49:10 +02:00
Peter Eisentraut 9ddf251f94 Handle NULL fields in WRITE_INDEX_ARRAY
Unlike existing WRITE_*_ARRAY macros, WRITE_INDEX_ARRAY needs to
handle the case that the field is NULL.  We already have the
convention to print NULL fields as "<>", so we do that here as well.
There is currently no corresponding read function for this, so reading
this back in is not implemented, but it could be if needed.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-LN%3DbF8f9eU2R94dJtF54DfDvBq%2BovqHnOQqbinYDrUw%40mail.gmail.com
2022-04-27 09:15:09 +02:00
Alvaro Herrera 0bd56172b2
Always pfree strings returned by GetDatabasePath
Several places didn't do it, and in many cases it didn't matter because
it would be a small allocation in a short-lived context; but other
places may accumulate a few (for example, in CreateDatabaseUsingFileCopy,
one per tablespace).  In most databases this is highly unlikely to be
very serious either, but it seems better to make the code consistent in
case there's future copy-and-paste.

The only case of actual concern seems to be the aforementioned routine,
which is new with commit 9c08aea6a3, so there's no need to backpatch.

As pointed out by Coverity.
2022-04-25 10:32:13 +02:00
Tom Lane f819020d40 Fix incautious CTE matching in rewriteSearchAndCycle().
This function looks for a reference to the recursive WITH CTE,
but it checked only the CTE name not ctelevelsup, so that it could
seize on a lower CTE that happened to have the same name.  This
would result in planner failures later, either weird errors such as
"could not find attribute 2 in subquery targetlist", or crashes
or assertion failures.  The code also merely Assert'ed that it found
a matching entry, which is not guaranteed at all by the parser.

Per bugs #17320 and #17318 from Zhiyong Wu.
Thanks to Kyotaro Horiguchi for investigation.

Discussion: https://postgr.es/m/17320-70e37868182512ab@postgresql.org
Discussion: https://postgr.es/m/17318-2eb65a3a611d2368@postgresql.org
2022-04-23 12:16:12 -04:00
David Rowley 99c754129d Fix performance regression in tuplesort specializations
697492434 added 3 new qsort specialization functions aimed to improve the
performance of sorting many of the common pass-by-value data types when
they're the leading or only sort key.

Unfortunately, that has caused a performance regression when sorting
datasets where many of the values being compared were equal.  What was
happening here was that we were falling back to the standard sort
comparison function to handle tiebreaks.  When the two given Datums
compared equally we would incur both the overhead of an indirect function
call to the standard comparer to perform the tiebreak and also the
standard comparer function would go and compare the leading key needlessly
all over again.

Here improve the situation in the 3 new comparison functions.  We now
return 0 directly when the two Datums compare equally and we're performing
a 1-key sort.

Here we don't do anything to help the multi-key sort case where the
leading key uses one of the sort specializations functions.  On testing
this case, even when the leading key's values are all equal, there
appeared to be no performance regression.  Let's leave it up to future
work to optimize that case so that the tiebreak function no longer
re-compares the leading key over again.

Another possible fix for this would have been to add 3 additional sort
specialization functions to handle single-key sorts for these
pass-by-value types.  The reason we didn't do that here is that we may
deem some other sort specialization to be more useful than single-key
sorts.  It may be impractical to have sort specialization functions for
every single combination of what may be useful and it was already decided
that further analysis into which ones are the most useful would be delayed
until the v16 cycle.  Let's not let this regression force our hand into
trying to make that decision for v15.

Author: David Rowley
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CA+hUKGJRbzaAOUtBUcjF5hLtaSHnJUqXmtiaLEoi53zeWSizeA@mail.gmail.com
2022-04-22 16:02:15 +12:00
Tom Lane 92e7a53752 Remove inadequate assertion check in CTE inlining.
inline_cte() expected to find exactly as many references to the
target CTE as its cterefcount indicates.  While that should be
accurate for the tree as emitted by the parser, there are some
optimizations that occur upstream of here that could falsify it,
notably removal of unused subquery output expressions.

Trying to make the accounting 100% accurate seems expensive and
doomed to future breakage.  It's not really worth it, because
all this code is protecting is downstream assumptions that every
referenced CTE has a plan.  Let's convert those assertions to
regular test-and-elog just in case there's some actual problem,
and then drop the failing assertion.

Per report from Tomas Vondra (thanks also to Richard Guo for
analysis).  Back-patch to v12 where the faulty code came in.

Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com
2022-04-21 17:58:52 -04:00
Tom Lane 2cb1272445 Rethink method for assigning OIDs to the template0 and postgres DBs.
Commit aa0105141 assigned fixed OIDs to template0 and postgres
in a very ad-hoc way.  Notably, instead of teaching Catalog.pm
about these OIDs, the unused_oids script was just hacked to
not show them as unused.  That's problematic since, for example,
duplicate_oids wouldn't report any future conflict.  Hence,
invent a macro DECLARE_OID_DEFINING_MACRO() that can be used to
define an OID that is known to Catalog.pm and will participate
in duplicate-detection as well as renumbering by renumber_oids.pl.
(We don't anticipate renumbering these particular OIDs, but we
might as well build out all the Catalog.pm infrastructure while
we're here.)

Another issue is that aa0105141 neglected to touch IsPinnedObject,
with the result that it now claimed template0 and postgres are
pinned.  The right thing to do there seems to be to teach it that
no database is pinned, since in fact DROP DATABASE doesn't check
for pinned-ness (and at least for these cases, that is an
intentional choice).  It's not clear whether this wrong answer
had any visible effect, but perhaps it could have resulted in
erroneous management of dependency entries.

In passing, rename the TemplateDbOid macro to Template1DbOid
to reduce confusion (likely we should have done that way back
when we invented template0, but we didn't), and rename the
OID macros for template0 and postgres to have a similar style.

There are no changes to postgres.bki here, so no need for a
catversion bump.

Discussion: https://postgr.es/m/2935358.1650479692@sss.pgh.pa.us
2022-04-21 16:23:15 -04:00
Tom Lane 40eba064b2 Use DECLARE_TOAST_WITH_MACRO() to simplify toast-table declarations.
This is needed so that renumber_oids.pl can handle renumbering
shared catalog declarations, which need to provide C macros for
the OIDs of the shared toast table and index.  The previous
method of writing a C macro separately was error-prone anyway.

Also teach renumber_oids.pl about DECLARE_UNIQUE_INDEX_PKEY,
as we missed doing when inventing that macro.

There are no changes to postgres.bki here, so no need for a
catversion bump.

Discussion: https://postgr.es/m/2995325.1650487527@sss.pgh.pa.us
2022-04-21 12:02:23 -04:00
Peter Geoghegan ba6af6aa0b vacuumlazy.c: MultiXactIds are MXIDs, not XMIDs.
Oversights in commits 0b018fab and f3c15cbe.
2022-04-20 18:29:02 -07:00
Peter Geoghegan 8ab0ebb9a8 Fix CLUSTER tuplesorts on abbreviated expressions.
CLUSTER sort won't use the datum1 SortTuple field when clustering
against an index whose leading key is an expression.  This makes it
unsafe to use the abbreviated keys optimization, which was missed by the
logic that sets up SortSupport state.  Affected tuplesorts output tuples
in a completely bogus order as a result (the wrong SortSupport based
comparator was used for the leading attribute).

This issue is similar to the bug fixed on the master branch by recent
commit cc58eecc5d.  But it's a far older issue, that dates back to the
introduction of the abbreviated keys optimization by commit 4ea51cdfe8.

Backpatch to all supported versions.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com
Backpatch: 10-
2022-04-20 17:17:43 -07:00
Tom Lane eafdf9de06 Disallow infinite endpoints in generate_series() for timestamps.
Such cases will lead to infinite loops, so they're of no practical
value.  The numeric variant of generate_series() already threw error
for this, so borrow its message wording.

Per report from Richard Wesley.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com
2022-04-20 18:08:23 -04:00
Alvaro Herrera e70813fbc4
set_deparse_plan: Reuse variable to appease Coverity
Coverity complains that dpns->outer_plan is deferenced (to obtain
->targetlist) when possibly NULL.  We can avoid this by using
dpns->outer_tlist instead, which was already obtained a few lines up.

The fact that we end up with
  dpns->inner_tlist = dpns->outer_tlist
is a bit suspicious-looking and maybe worthy of more investigation, but
I'll leave that for another day.

Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
2022-04-20 11:44:08 +02:00
Alvaro Herrera a87e759569
Move ModifyTableContext->lockmode to UpdateContext
Should have been done this way to start with, but I failed to notice
This way we avoid some pointless initialization, and better contains the
variable to exist in the scope where it is really used.

Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
2022-04-20 11:18:04 +02:00
Alvaro Herrera 3dcc6bf406
ExecModifyTable: use context.planSlot instead of planSlot
There's no reason to keep a separate local variable when we have a place
for it elsewhere.  This allows to simplify some code.

Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202204191345.qerjy3kxi3eb@alvherre.pgsql
2022-04-20 10:34:58 +02:00
Tom Lane 344a225cb9 Fix breakage in AlterFunction().
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one.  Carelessness in 7aea8e4f2.

(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)

Per report from Bryn Llewellyn.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
2022-04-19 23:03:59 -04:00
Michael Paquier 83cca409ed Remove duplicated word in comment of basebackup.c
Oversight in 39969e2.

Author: Martín Marqués
Discussion: https://postgr.es/m/CABeG9LviA01oHC5h=ksLUuhMyXxmZR_tftRq6q3341CMT=j=4g@mail.gmail.com
2022-04-20 11:05:34 +09:00
Peter Eisentraut f2a2bf66c8 Fix extract epoch from interval calculation
The new numeric code for extract epoch from interval accidentally
truncated the DAYS_PER_YEAR value to an integer, leading to results
that mismatched the floating-point interval_part calculations.

The commit a2da77cdb4 that introduced
this actually contains the regression test change that this reverts.
I suppose this was missed at the time.

Reported-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAAvxfHd5n%3D13NYA2q_tUq%3D3%3DSuWU-CufmTf-Ozj%3DfrEgt7pXwQ%40mail.gmail.com
2022-04-19 21:04:52 +02:00
Amit Kapila dd4ab6fd65 Fix the check to limit sync workers.
We don't allow to invoke more sync workers once we have reached the sync
worker limit per subscription. But the check to enforce this also doesn't
allow to launch an apply worker if it gets restarted.

This code was introduced by commit de43897122 but we caught the problem
only with the test added by recent commit c91f71b9dc which started failing
occasionally in the buildfarm.

As per buildfarm.
Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
	    https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com
2022-04-19 08:49:49 +05:30
Tom Lane 36d4efe779 Avoid invalid array reference in transformAlterTableStmt().
Don't try to look at the attidentity field of system attributes,
because they're not there in the TupleDescAttr array.  Sometimes
this is harmless because we accidentally pick up a zero, but
otherwise we'll report "no owned sequence found" from an attempt
to alter a system attribute.  (It seems possible that a SIGSEGV
could occur, too, though I've not seen it in testing.)

It's not in this function's charter to complain that you can't
alter a system column, so instead just hard-wire an assumption
that system attributes aren't identities.  I didn't bother with
a regression test because the appearance of the bug is very
erratic.

Per bug #17465 from Roman Zharkov.  Back-patch to all supported
branches.  (There's not actually a live bug before v12, because
before that get_attidentity() did the right thing anyway.
But for consistency I changed the test in the older branches too.)

Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org
2022-04-18 12:16:45 -04:00
Thomas Munro acf1dd4234 Don't retry restore_command while reading ahead.
Suppress further attempts to read ahead in the WAL if we run out of
data, until the records already decoded have been replayed.  This
restores the traditional behavior for continuous archive recovery, which
is to retry the failing restore_command only every 5 seconds.  With the
coding in 5dc0418f, we would start retrying every time through the
recovery loop when our WAL decoding window hit the end of the current
segment and we tried to look ahead into a not-yet-available next file.
That was very slow.

Also change the no_readahead_until mechanism to use <= rather than <,
which seems more useful.  Otherwise we'd either get one extra unwanted
retry of restore_command, or we'd need to add 1 to an LSN.

No change in behavior for regular streaming.  That was already limited
by the flushedUpto variable, which won't be updated until we replay what
we have already.

Reported by Andres Freund while analyzing the failure of a TAP test on
build farm animal skink (investigation ongoing but probably due to
otherwise unrelated timing bugs triggered by this slowness magnified by
valgrind).

Discussion: https://postgr.es/m/20220409005910.alw46xqmmgny2sgr%40alap3.anarazel.de
2022-04-17 10:50:19 +12:00
Andres Freund 4a736a161c pgstat: Use correct lock level in pgstat_drop_all_entries().
Previously we didn't, which lead to an assertion failure when resetting
partially loaded statistics. This was encountered on the buildfarm, for
as-of-yet unknown reasons.

Ttighten up a validity check when reading the stats file, verifying 'E'
signals the end of the file (rather than just stopping reading). That's then
used in a test appending to the stats file that crashed before the fix in
pgstat_drop_all_entries().

Reported by buildfarm animals mylodon and kestrel, via Tom Lane.

Discussion: https://postgr.es/m/1656446.1650043715@sss.pgh.pa.us
2022-04-16 14:44:58 -07:00
Tom Lane 9f4f0a0dad Fix incorrect logic in HaveRegisteredOrActiveSnapshot().
This function gave the wrong answer when there's more than one
RegisteredSnapshots entry, whether or not any of them is the
CatalogSnapshot.  This leads to assertion failure in some scenarios
involving fetching toasted data using a cursor.  (As per discussion,
I'm dubious that this is the right contract to be enforcing at all;
but it surely doesn't help to be enforcing it incorrectly.)

Fetching toasted data using a cursor is evidently under-tested,
so add a test case too.

Per report from Erik Rijkers.  This is new code, so no need for
back-patch.

Discussion: https://postgr.es/m/dc9dd229-ed30-6c62-4c41-d733ffff776b@xs4all.nl
2022-04-16 16:04:50 -04:00
Peter Geoghegan d3609dd254 Fix multi-table VACUUM VERBOSE accounting.
Per-backend global variables like VacuumPageHit are initialized once per
VACUUM command.  This was missed by commit 49c9d9fc, which unified
VACUUM VERBOSE and autovacuum logging.  As a result of that oversight,
incorrect values were shown when multiple relations were processed by a
single VACUUM VERBOSE command.

Relations that happened to be processed later on would show "buffer
usage:" values that incorrectly included buffer accesses made while
processing earlier unrelated relations.  The same accesses were counted
multiple times.

To fix, take initial values for the tracker variables at the start of
heap_vacuum_rel(), and report delta values later on.
2022-04-15 15:48:39 -07:00
Tom Lane 6fea65508a Tighten ComputeXidHorizons' handling of walsenders.
ComputeXidHorizons (nee GetOldestXmin) thought that it could identify
walsenders by checking for proc->databaseId == 0.  Perhaps that was
safe when the code was written, but it's been wrong at least since
autovacuum was invented.  Background processes that aren't connected
to any particular database, such as the autovacuum launcher and
logical replication launcher, look like that too.

This imprecision is harmful because when such a process advertises an
xmin, the result is to hold back dead-tuple cleanup in all databases,
though it'd be sufficient to hold it back in shared catalogs (which
are the only relations such a process can access).  Aside from being
generally inefficient, this has recently been seen to cause regression
test failures in the buildfarm, as a consequence of the logical
replication launcher's startup transaction preventing VACUUM from
marking pages of a user table as all-visible.

We only want that global hold-back effect for the case where a
walsender is advertising a hot standby feedback xmin.  Therefore,
invent a new PGPROC flag that says that a process' xmin should be
considered globally, and check that instead of using the incorrect
databaseId == 0 test.  Currently only a walsender sets that flag,
and only if it is not connected to any particular database.  (This is
for bug-compatibility with the undocumented behavior of the existing
code, namely that feedback sent by a client who has connected to a
particular database would not be applied globally.  I'm not sure this
is a great definition; however, such a client is capable of issuing
plain SQL commands, and I don't think we want xmins advertised for
such commands to be applied globally.  Perhaps this could do with
refinement later.)

While at it, I rewrote the comment in ComputeXidHorizons, and
re-ordered the commented-upon if-tests, to make them match up
for intelligibility's sake.

This is arguably a back-patchable bug fix, but given the lack of
complaints I think it prudent to let it age awhile in HEAD first.

Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us
2022-04-15 17:50:05 -04:00
Peter Geoghegan bdb71dbe80 VACUUM VERBOSE: Show dead items for an empty table.
Be consistent about the lines that VACUUM VERBOSE outputs by including
an "index scan not needed: " line for completely empty tables. This
makes the output more readable, especially with multiple distinct VACUUM
operations processed by the same VACUUM command.  It's also more
consistent; even empty tables can use the failsafe, which wasn't
reported in the standard way until now.

Follow-up to commit 6e20f460, which taught VACUUM VERBOSE to be more
consistent about reporting on scanned pages with empty tables.
2022-04-15 14:20:56 -07:00
Peter Geoghegan 357c8455e6 Adjust VACUUM's removable cutoff log message.
The age of OldestXmin (a.k.a. "removable cutoff") when VACUUM ends often
indicates the approximate number of XIDs consumed while VACUUM ran.
However, there is at least one important exception: the cutoff could be
held back by a snapshot that was acquired before our VACUUM even began.
Successive VACUUM operations may even use exactly the same old cutoff in
extreme cases involving long held snapshots.

The log messages that described how removable cutoff aged (which were
added by commit 872770fd) created the impression that we were reporting
on how VACUUM's usable cutoff advanced while VACUUM ran, which was
misleading in these extreme cases.  Fix by using a more general wording.

Per gripe from Tom Lane.

In passing, relocate related instrumentation code for clarity.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/1643035.1650035653@sss.pgh.pa.us
2022-04-15 13:21:43 -07:00
Andrew Dunstan f7a605f636 Small cleanups in SQL/JSON code
These are to keep Coverity happy. In one case remove a redundant NULL
check, and in another explicitly ignore a function result that is already
known.
2022-04-15 07:49:20 -04:00
Andres Freund 5cd1c40b3c pgstat: set timestamps of fixed-numbered stats after a crash.
When not loading stats at startup (i.e. pgstat_discard_stats() getting
called), reset timestamps of fixed numbered stats would be left at
0. Oversight in 5891c7a8ed.

Instead use pgstat_reset_after_failure() and add tests verifying that
fixed-numbered reset timestamps are set appropriately.

Reported-By: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwamFuaQHKdhcMt4Gbw5+Hca2UE741B8gOOXoA=TtAd2Yw@mail.gmail.com
2022-04-14 17:40:25 -07:00
Alvaro Herrera 3f19e176ae
Have CLUSTER ignore partitions not owned by caller
If a partitioned table has partitions owned by roles other than the
owner of the partitioned table, don't include them in the to-be-
clustered list.  This is similar to what VACUUM FULL does (except we do
it sooner, because there is no reason to postpone it).  Add a simple
test to verify that only owned partitions are clustered.

While at it, change memory context switch-and-back to occur once per
partition instead of outside of the loop.

Author: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
2022-04-14 22:11:06 +02:00
Andrew Dunstan 4cd8717af3 Improve a couple of sql/json error messages
Fix the grammar in two, and add a hint to one.
2022-04-14 10:26:29 -04:00
Andrew Dunstan fcdb35c32a Fix transformJsonBehavior
Commit 1a36bc9dba conained some logic that was a little opaque and
could have involved a NULL dereference, as complained about by Coverity.
Make the logic more transparent and in doing so avoid the NULL
dereference.
2022-04-14 09:24:22 -04:00
David Rowley a00fd066b1 Add missing spaces after single-line comments
Only 1 of 3 of these changes appear to be handled by pgindent. That change
is new to v15.  The remaining two appear to be left alone by pgindent. The
exact reason for that is not 100% clear to me.  It seems related to the
fact that it's a line that contains *only* a single line comment and no
actual code.  It does not seem worth investigating this in too much
detail.  In any case, these do not conform to our usual practices, so fix
them.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
2022-04-14 09:28:56 +12:00
Tom Lane 7b7ed046cb Prevent access to no-longer-pinned buffer in heapam_tuple_lock().
heap_fetch() used to have a "keep_buf" parameter that told it to return
ownership of the buffer pin to the caller after finding that the
requested tuple TID exists but is invisible to the specified snapshot.
This was thoughtlessly removed in commit 5db6df0c0, which broke
heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function
needs to do more accesses to the tuple even if it's invisible.  The net
effect is that we would continue to touch the page for a microsecond or
two after releasing pin on the buffer.  Usually no harm would result;
but if a different session decided to defragment the page concurrently,
we could see garbage data and mistakenly conclude that there's no newer
tuple version to chain up to.  (It's hard to say whether this has
happened in the field.  The bug was actually found thanks to a later
change that allowed valgrind to detect accesses to non-pinned buffers.)

The most reasonable way to fix this is to reintroduce keep_buf,
although I made it behave slightly differently: buffer ownership
is passed back only if there is a valid tuple at the requested TID.
In HEAD, we can just add the parameter back to heap_fetch().
To avoid an API break in the back branches, introduce an additional
function heap_fetch_extended() in those branches.

In HEAD there is an additional, less obvious API change: tuple->t_data
will be set to NULL in all cases where buffer ownership is not returned,
in particular when the tuple exists but fails the time qual (and
!keep_buf).  This is to defend against any other callers attempting to
access non-pinned buffers.  We concluded that making that change in back
branches would be more likely to introduce problems than cure any.

In passing, remove a comment about heap_fetch that was obsoleted by
9a8ee1dc6.

Per bug #17462 from Daniil Anisimov.  Back-patch to v12 where the bug
was introduced.

Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org
2022-04-13 13:35:07 -04:00
Alvaro Herrera 24d2b2680a
Remove extraneous blank lines before block-closing braces
These are useless and distracting.  We wouldn't have written the code
with them to begin with, so there's no reason to keep them.

Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
2022-04-13 19:16:02 +02:00
Alvaro Herrera ed0fbc8e5a
Release cache tuple when no longer needed
There was a small buglet in commit 52e4f0cd47 whereby a tuple acquired
from cache was not released, giving rise to WARNING messages; fix that.

While at it, restructure the code a bit on stylistic grounds.

Author: Hou zj <houzj.fnst@fujitsu.com>
Reported-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvKTyhTBtYCQsP6Ph7=o-oWRSX+v+PXXLXp81-o2bazig@mail.gmail.com
2022-04-13 18:19:38 +02:00
Andrew Dunstan 112fdb3528 Fix finalization for json_objectagg and friends
Commit f4fb45d15c misguidedly tried to free some state during aggregate
finalization for json_objectagg. This resulted in attempts to access
freed memory, especially when the function is used as a window function.
Commit 4eb9798879 attempted to ameliorate that, but in fact it should
just be ripped out, which is done here. Also add some regression tests
for json_objectagg in various flavors as a window function.

Original report from Jaime Casanova, diagnosis by Andres Freund.

Discussion: https://postgr.es/m/YkfeMNYRCGhySKyg@ahch-to
2022-04-13 10:37:43 -04:00
Peter Eisentraut a038679cd8 Fix incorrect format placeholders 2022-04-13 14:04:51 +02:00
Michael Paquier b940918dc8 Remove "recheck" argument from check_index_is_clusterable()
The last usage of this argument in this routine can be tracked down to
7e2f9062, aka 11 years ago.  Getting rid of this argument can also be an
advantage for extensions calling check_index_is_clusterable(), as it
removes any need to worry about the meaning of what a recheck would be
at this level.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411140609.GF26620@telsasoft.com
2022-04-13 15:32:35 +09:00
Robert Haas 7fc0e7de9f Revert the addition of GetMaxBackends() and related stuff.
This reverts commits 0147fc7, 4567596, aa64f23, and 5ecd018.
There is no longer agreement that introducing this function
was the right way to address the problem. The consensus now
seems to favor trying to make a correct value for MaxBackends
available to mdules executing their _PG_init() functions.

Nathan Bossart

Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud
2022-04-12 14:45:23 -04:00
Peter Eisentraut e7cc4a6e3d Use WRITE_ENUM_FIELD for enum field 2022-04-12 16:19:00 +02:00
Peter Eisentraut 51e8179405 Make node output prefix match node structure name
as done in e581360696
2022-04-12 16:18:01 +02:00
Alvaro Herrera 183c869e1c
adjust_partition_colnos mustn't be called if not needed
Add an assert to make this very explicit, as well as a code comment.
The former should silence Coverity complaining about this.

Introduced by 7103ebb7aa.

Reported-by: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqTTAOzXiYybab+1DQOb3ZUuK99=p_KD+yrRFhcDbd0jg@mail.gmail.com
2022-04-12 15:19:57 +02:00
Alvaro Herrera ce4f46fdc8
Change mechanism to set up source targetlist in MERGE
We were setting MERGE source subplan's targetlist by expanding the
individual attributes of the source relation completely, early in the
parse analysis phase.  This failed to work when the condition of an
action included a whole-row reference, causing setrefs.c to error out
with
  ERROR:  variable not found in subplan target lists
because at that point there is nothing to resolve the whole-row
reference with.  We can fix this by having preprocess_targetlist expand
the source targetlist for Vars required from the source rel by all
actions.  Moreover, by using this expansion mechanism we can do away
with the targetlist expansion in transformMergeStmt, which is good
because then we no longer pull in columns that aren't needed for
anything.

Add a test case for the problem.

While at it, remove some redundant code in preprocess_targetlist():
MERGE was doing separately what is already being done for UPDATE/DELETE,
so we can just rely on the latter and remove the former.  (The handling
of inherited rels was different for MERGE, but that was a no-longer-
necessary hack.)

Fix outdated, related comments for fix_join_expr also.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Joe Wildish <joe@lateraljoin.com>
Discussion: https://postgr.es/m/fab3b90a-914d-46a9-beb0-df011ee39ee5@www.fastmail.com
2022-04-12 09:29:39 +02:00
Michael Paquier a4b57543ac Rename backup_compression.{c,h} to compression.{c,h}
Compression option handling (level, algorithm or even workers) can be
used across several parts of the system and not only base backups.
Structures, objects and routines are renamed in consequence, to remove
the concept of base backups from this part of the code making this
change straight-forward.

pg_receivewal, that has gained support for LZ4 since babbbb5, will make
use of this infrastructure for its set of compression options, bringing
more consistency with pg_basebackup.  This cleanup needs to be done
before releasing a beta of 15.  pg_dump is a potential future target, as
well, and adding more compression options to it may happen in 16~.

Author: Michael Paquier
Reviewed-by: Robert Haas, Georgios Kokolatos
Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
2022-04-12 13:38:54 +09:00
Tom Lane bd037dc928 Make XLogRecGetBlockTag() throw error if there's no such block.
All but a few existing callers assume without checking that this
function succeeds.  While it probably will, that's a poor excuse for
not checking.  Let's make it return void and instead throw an error
if it doesn't find the block reference.  Callers that actually need
to handle the no-such-block case must now use the underlying function
XLogRecGetBlockTagExtended.

In addition to being a bit less error-prone, this should also serve
to suppress some Coverity complaints about XLogRecGetBlockRefInfo.

While at it, clean up some inconsistency about use of the
XLogRecHasBlockRef macro: make XLogRecGetBlockTagExtended use
that instead of open-coding the same condition, and avoid calling
XLogRecHasBlockRef twice in relevant code paths.  (That is,
calling XLogRecHasBlockRef followed by XLogRecGetBlockTag is now
deprecated: use XLogRecGetBlockTagExtended instead.)

Patch HEAD only; this doesn't seem to have enough value to consider
a back-branch API break.

Discussion: https://postgr.es/m/425039.1649701221@sss.pgh.pa.us
2022-04-11 17:43:53 -04:00
Peter Geoghegan 9debd12348 Remove comment about historic heap vacuuming issue.
Remove comment block about how heap page vacuuming used to set tuples
with storage to LP_UNUSED in a rare edge case that can no longer happen
following commit 8523492d4e.  The comments seem unnecessary now, since
it's now generally clear that heap vacuuming only applies to LP_DEAD
items from VACUUM's first heap pass following more recent work from
commits 12b5ade902 and 4f8d9d1217.
2022-04-11 14:20:46 -07:00
Tom Lane 9de692c101 Remove dead code in do_pg_backup_start().
As of commit 39969e2a1, no caller of do_pg_backup_start() passes NULL
for labelfile or tblspcmapfile, nor is it plausible that any would
do so in the future.  Remove the code that coped with that case,
as (a) it's dead and (b) it causes Coverity to bleat about possibly
leaked storage.

While here, do some janitorial work on the function's header comment.
2022-04-11 15:56:01 -04:00
Tom Lane 3c702b3ed1 Explicitly ignore guaranteed-true result from pgstat_lock_entry().
With nowait passed as false, pgstat_lock_entry() must return true
so there's no need to check its result.  Coverity seems unconvinced
of this, so whack it upside the head with a (void) cast.
2022-04-11 13:22:37 -04:00
Tom Lane 93fcf2d209 fgetc() returns int, not char.
This has no practical effect, since this code doesn't actually need to
distinguish EOF (-1) from \0377; but it silences a Coverity complaint.
2022-04-11 13:15:46 -04:00
David Rowley b0e5f02ddc Fix various typos and spelling mistakes in code comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
2022-04-11 20:49:41 +12:00
Michael Paquier 7597cc3083 Fix the dates of some copyright notices
0ad8032 and 4e34747 are at the origin of that.  Julien has found the one
in parse_jsontable.c, while I have spotted the rest.

Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/20220411060838.ftnzyvflpwu6f74w@jrouhaud
2022-04-11 16:36:25 +09:00
Peter Eisentraut 38abc39c81 Add missing serial commas 2022-04-09 16:15:01 +02:00
Robert Haas f37015a161 Rename delayChkpt to delayChkptFlags.
Before commit 412ad7a556, delayChkpt
was a Boolean. Now it's an integer. Extensions using it need to be
appropriately updated, so let's rename the field to make sure that
a hard compilation failure occurs.

Replacing delayChkpt with delayChkptFlags made a few comments extend
past 80 characters, so I reflowed them and changed some wording very
slightly.

The back-branches will need a different change to restore compatibility
with existing minor releases; this is just for master.

Per suggestion from Tom Lane.

Discussion: http://postgr.es/m/a7880f4d-1d74-582a-ada7-dad168d046d1@enterprisedb.com
2022-04-08 11:44:17 -04:00
Jeff Davis 12aaae5131 Check XLogRecHasBlockRef() before XLogRecHasBlockImage().
Trial fix of buildfarm failures on kestrel and tamandua.
2022-04-08 02:30:57 -07:00
Jeff Davis 2258e76f90 Add contrib/pg_walinspect.
Provides similar functionality to pg_waldump, but from a SQL interface
rather than a separate utility.

Author: Bharath Rupireddy
Reviewed-by: Greg Stark, Kyotaro Horiguchi, Andres Freund, Ashutosh Sharma, Nitin Jadhav, RKN Sai Krishna
Discussion: https://postgr.es/m/CALj2ACUGUYXsEQdKhEdsBzhGEyF3xggvLdD8C0VT72TNEfOiog%40mail.gmail.com
2022-04-08 00:26:44 -07:00
Peter Eisentraut 708007dced Remove error message hints mentioning configure options
These are usually not useful since users will use packaged
distributions and won't be interested in rebuilding their installation
from source.  Also, we have only used these kinds of hints for some
features and in some places, not consistently throughout.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
2022-04-08 07:41:55 +02:00
Michael Paquier efb0ef909f Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)
Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O
timing spent reading and writing shared and local buffers.  This commit
adds on top of that the I/O timing for temporary buffers in the output
of EXPLAIN (for spilled external sorts, hashes, materialization. etc).
This can be helpful for users in cases where the I/O related to
temporary buffers is the bottleneck.

Like its cousin, this information is available only when track_io_timing
is enabled.  Playing the patch, this is showing an extra overhead of up
to 1% even when using gettimeofday() as implementation for interval
timings, which is slightly within the usual range noise still that's
measurable.

Author: Masahiko Sawada
Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud,
Ranier Vilela
Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
2022-04-08 11:27:21 +09:00
Peter Geoghegan 10a8d13823 Truncate line pointer array during heap pruning.
Reclaim space from the line pointer array when heap pruning leaves
behind a contiguous group of LP_UNUSED items at the end of the array.
This happens during subsequent page defragmentation.  Certain kinds of
heap line pointer bloat are ameliorated by this new optimization.

Follow-up work to commit 3c3b8a4b26, which taught VACUUM to truncate the
line pointer array in about the same way during VACUUM's second pass
over the heap.  We now apply line pointer array truncation during both
the first and the second pass over the heap made by VACUUM.  We can also
perform line pointer array truncation during opportunistic pruning.

Matthias van de Meent, with small tweaks by me.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com
Discussion: https://postgr.es/m/CAEze2Wg36%2B4at2eWJNcYNiW2FJmht34x3YeX54ctUSs7kKoNcA%40mail.gmail.com
2022-04-07 15:42:12 -07:00
David Rowley 9d9c02ccd1 Teach planner and executor about monotonic window funcs
Window functions such as row_number() always return a value higher than
the previously returned value for tuples in any given window partition.

Traditionally queries such as;

SELECT * FROM (
   SELECT *, row_number() over (order by c) rn
   FROM t
) t WHERE rn <= 10;

were executed fairly inefficiently.  Neither the query planner nor the
executor knew that once rn made it to 11 that nothing further would match
the outer query's WHERE clause.  It would blindly continue until all
tuples were exhausted from the subquery.

Here we implement means to make the above execute more efficiently.

This is done by way of adding a pg_proc.prosupport function to various of
the built-in window functions and adding supporting code to allow the
support function to inform the planner if the window function is
monotonically increasing, monotonically decreasing, both or neither.  The
planner is then able to make use of that information and possibly allow
the executor to short-circuit execution by way of adding a "run condition"
to the WindowAgg to allow it to determine if some of its execution work
can be skipped.

This "run condition" is not like a normal filter.  These run conditions
are only built using quals comparing values to monotonic window functions.
For monotonic increasing functions, quals making use of the btree
operators for <, <= and = can be used (assuming the window function column
is on the left). You can see here that once such a condition becomes false
that a monotonic increasing function could never make it subsequently true
again.  For monotonically decreasing functions the >, >= and = btree
operators for the given type can be used for run conditions.

The best-case situation for this is when there is a single WindowAgg node
without a PARTITION BY clause.  Here when the run condition becomes false
the WindowAgg node can simply return NULL.  No more tuples will ever match
the run condition.  It's a little more complex when there is a PARTITION
BY clause.  In this case, we cannot return NULL as we must still process
other partitions.  To speed this case up we pull tuples from the outer
plan to check if they're from the same partition and simply discard them
if they are.  When we find a tuple belonging to another partition we start
processing as normal again until the run condition becomes false or we run
out of tuples to process.

When there are multiple WindowAgg nodes to evaluate then this complicates
the situation.  For intermediate WindowAggs we must ensure we always
return all tuples to the calling node.  Any filtering done could lead to
incorrect results in WindowAgg nodes above.  For all intermediate nodes,
we can still save some work when the run condition becomes false.  We've
no need to evaluate the WindowFuncs anymore.  Other WindowAgg nodes cannot
reference the value of these and these tuples will not appear in the final
result anyway.  The savings here are small in comparison to what can be
saved in the top-level WingowAgg, but still worthwhile.

Intermediate WindowAgg nodes never filter out tuples, but here we change
WindowAgg so that the top-level WindowAgg filters out tuples that don't
match the intermediate WindowAgg node's run condition.  Such filters
appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node.

Here we add prosupport functions to allow the above to work for;
row_number(), rank(), dense_rank(), count(*) and count(expr).  It appears
technically possible to do the same for min() and max(), however, it seems
unlikely to be useful enough, so that's not done here.

Bump catversion

Author: David Rowley
Reviewed-by: Andy Fan, Zhihong Yu
Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
2022-04-08 10:34:36 +12:00
Alvaro Herrera a90641eac2
Revert "Rewrite some RI code to avoid using SPI"
This reverts commit 99392cdd78.
We'd rather rewrite ri_triggers.c as a whole rather than piecemeal.

Discussion: https://postgr.es/m/E1ncXX2-000mFt-Pe@gemulon.postgresql.org
2022-04-07 23:42:13 +02:00
Alvaro Herrera 99392cdd78
Rewrite some RI code to avoid using SPI
Modify the subroutines called by RI trigger functions that want to check
if a given referenced value exists in the referenced relation to simply
scan the foreign key constraint's unique index, instead of using SPI to
execute
  SELECT 1 FROM referenced_relation WHERE ref_key = $1
This saves a lot of work, especially when inserting into or updating a
referencing relation.

This rewrite allows to fix a PK row visibility bug caused by a partition
descriptor hack which requires ActiveSnapshot to be set to come up with
the correct set of partitions for the RI query running under REPEATABLE
READ isolation.  We now set that snapshot indepedently of the snapshot
to be used by the PK index scan, so the two no longer interfere.  The
buggy output in src/test/isolation/expected/fk-snapshot.out of the
relevant test case added by commit 00cb86e75d has been corrected.
(The bug still exists in branch 14, however, but this fix is too
invasive to backpatch.)

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Li Japin <japinli@hotmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
2022-04-07 21:10:03 +02:00
Tomas Vondra 2c7ea57e56 Revert "Logical decoding of sequences"
This reverts a sequence of commits, implementing features related to
logical decoding and replication of sequences:

 - 0da92dc530
 - 80901b3291
 - b779d7d8fd
 - d5ed9da41d
 - a180c2b34d
 - 75b1521dae
 - 2d2232933b
 - 002c9dd97a
 - 05843b1aa4

The implementation has issues, mostly due to combining transactional and
non-transactional behavior of sequences. It's not clear how this could
be fixed, but it'll require reworking significant part of the patch.

Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
2022-04-07 20:06:36 +02:00
Peter Eisentraut 344d62fb9a Unlogged sequences
Add support for unlogged sequences.  Unlike for unlogged tables, this
is not a performance feature.  It allows sequences associated with
unlogged tables to be excluded from replication.

A new subcommand ALTER SEQUENCE ... SET LOGGED/UNLOGGED is added.

An identity/serial sequence now automatically gets and follows the
persistence level (logged/unlogged) of its owning table.  (The
sequences owned by temporary tables were already temporary through the
separate mechanism in RangeVarAdjustRelationPersistence().)  But you
can still change the persistence of an owned sequence separately.
Also, pg_dump and pg_upgrade preserve the persistence of existing
sequences.

Discussion: https://www.postgresql.org/message-id/flat/04e12818-2f98-257c-b926-2845d74ed04f%402ndquadrant.com
2022-04-07 16:18:00 +02:00
Daniel Gustafsson bab588cd5c Fix typo in xlogrecovery.c code comment
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACUoPtnReT=yAQMcWLtcCpk7p83xjeA8tiRX8Q0_sjh8kw@mail.gmail.com
2022-04-07 14:02:33 +02:00
Thomas Munro 5dc0418fab Prefetch data referenced by the WAL, take II.
Introduce a new GUC recovery_prefetch.  When enabled, look ahead in the
WAL and try to initiate asynchronous reading of referenced data blocks
that are not yet cached in our buffer pool.  For now, this is done with
posix_fadvise(), which has several caveats.  Since not all OSes have
that system call, "try" is provided so that it can be enabled where
available.  Better mechanisms for asynchronous I/O are possible in later
work.

Set to "try" for now for test coverage.  Default setting to be finalized
before release.

The GUC wal_decode_buffer_size limits the distance we can look ahead in
bytes of decoded data.

The existing GUC maintenance_io_concurrency is used to limit the number
of concurrent I/Os allowed, based on pessimistic heuristics used to
infer that I/Os have begun and completed.  We'll also not look more than
maintenance_io_concurrency * 4 block references ahead.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version)
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version)
Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version)
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version)
Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
2022-04-07 19:42:14 +12:00
Jeff Davis 9553b4115f Fix warning introduced in 5c279a6d35.
Change two macros to be static inline functions instead to keep the
data type consistent. This avoids a "comparison is always true"
warning that was occurring with -Wtype-limits. In the process, change
the names to look less like macros.

Discussion: https://postgr.es/m/20220407063505.njnnrmbn4sxqfsts@alap3.anarazel.de
2022-04-07 00:39:30 -07:00
Andres Freund ad401664b8 pgstat: add pg_stat_have_stats() test helper.
Will be used by tests committed subsequently.

Bumps catversion (this time for real, the one in 0f96965c65 got lost when
rebasing over 5c279a6d35).

Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aNxL1WegCa45r=VAViCLnpOU7uNC7bTtGw+=QAPyYivw@mail.gmail.com
2022-04-07 00:21:54 -07:00
Andres Freund 0f96965c65 pgstat: add pg_stat_force_next_flush(), use it to simplify tests.
In the stats collector days it was hard to write tests for the stats system,
because fundamentally delivery of stats messages over UDP was not
synchronous (nor guaranteed). Now we easily can force pending stats updates to
be flushed synchronously.

This moves stats.sql into a parallel group, there isn't a reason for it to run
in isolation anymore. And it may shake out some bugs.

Bumps catversion.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 23:35:56 -07:00
Andres Freund 5e07d3d6bd pgstat: fix small bug in pgstat_drop_relation().
Just after committing 5891c7a8ed, a test running with debug_discard_caches=1
failed locally...

pgstat_drop_relation() neither checked pgstat_should_count_relation() nor
called pgstat_prep_relation_pending(). With debug_discard_caches=1
rel->pgstat_info wasn't set up, leading pg_stat_get_xact_tuples_inserted()
spuriously still returning > 0 while in the transaction dropping the table.
2022-04-06 23:35:56 -07:00
Andres Freund 81ae9e6588 pgstat: prevent fix pgstat_reinit_entry() from zeroing out lwlock.
Zeroing out an lwlock in a normal build turns out to not trigger any alarms,
if nobody can use the lwlock at that moment (as the case here). But with
--disable-spinlocks --disable-atomics, the sema field needs to be initialized.

We probably should make sure that this fails on more common configurations as
well...

Per buildfarm animal rorqual
2022-04-06 23:35:56 -07:00
Andres Freund 3536b851ad Fix compilation with WAL_DEBUG.
Broke with 5c279a6d35. But looks like it had been half-broken since
70e81861fa, because 'rmid' didn't refer to the current record's rmid anymore,
but to rmid from "Initialize resource managers" - a constant.
2022-04-06 23:26:59 -07:00
Jeff Davis 5c279a6d35 Custom WAL Resource Managers.
Allow extensions to specify a new custom resource manager (rmgr),
which allows specialized WAL. This is meant to be used by a Table
Access Method or Index Access Method.

Prior to this commit, only Generic WAL was available, which offers
support for recovery and physical replication but not logical
replication.

Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund
Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
2022-04-06 23:06:46 -07:00
Michael Paquier 06f5295af6 Add single-item cache when looking at topmost XID of a subtrans XID
This change affects SubTransGetTopmostTransaction(), used to find the
topmost transaction ID of a given transaction ID.  The cache is able to
store one value, so as we can save the backend from unnecessary lookups
at pg_subtrans/ on repetitive calls of this routine.  There is a similar
practice in transam.c, for example.

Author: Simon Riggs
Reviewed-by: Andrey Borodin, Julien Rouhaud
Discussion: https://postgr.es/m/CANbhV-G8Co=yq4v4BkW7MJDqVt68K_8A48nAZ_+8UQS7LrwLEQ@mail.gmail.com
2022-04-07 14:34:37 +09:00
Andres Freund fbfe6910ec pgstat: move pgstat.c to utils/activity.
Now that pgstat is not related to postmaster anymore, src/backend/postmaster
is not a well fitting directory.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 1db4e5a4ee pgstat: rename STATS_COLLECTOR GUC group to STATS_CUMULATIVE.
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 6f0cf87872 pgstat: remove stats_temp_directory.
With stats now being stored in shared memory, the GUC isn't needed
anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are
kept, as pg_stat_statements (and some out-of-core extensions) store data in
it.

Docs will be updated in a subsequent commit, together with the other pending
docs updates due to shared memory stats.

Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 5891c7a8ed pgstat: store statistics in shared memory.
Previously the statistics collector received statistics updates via UDP and
shared statistics data by writing them out to temporary files regularly. These
files can reach tens of megabytes and are written out up to twice a
second. This has repeatedly prevented us from adding additional useful
statistics.

Now statistics are stored in shared memory. Statistics for variable-numbered
objects are stored in a dshash hashtable (backed by dynamic shared
memory). Fixed-numbered stats are stored in plain shared memory.

The header for pgstat.c contains an overview of the architecture.

The stats collector is not needed anymore, remove it.

By utilizing the transactional statistics drop infrastructure introduced in a
prior commit statistics entries cannot "leak" anymore. Previously leaked
statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On
systems with many small relations pgstat_vacuum_stat() could be quite
expensive.

Now that replicas drop statistics entries for dropped objects, it is not
necessary anymore to reset stats when starting from a cleanly shut down
replica.

Subsequent commits will perform some further code cleanup, adapt docs and add
tests.

Bumps PGSTAT_FILE_FORMAT_ID.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version)
Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version)
Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version)
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund be902e2651 pgstat: normalize function naming.
Most of pgstat uses pgstat_<verb>_<subject>() or just <verb>_<subject>(). But
not all (some introduced fairly recently by me). Rename ones that aren't
intentionally following a different scheme (e.g. AtEOXact_*).
2022-04-06 21:29:46 -07:00
Amit Kapila 79b716cfb7 Reorder subskiplsn in pg_subscription to avoid alignment issues.
The column 'subskiplsn' uses TYPALIGN_DOUBLE (which has 4 bytes alignment
on AIX) for storage. But the C Struct (Form_pg_subscription) has 8-byte
alignment for this field, so retrieving it from storage causes an
unaligned read.

To fix this, we rearranged the 'subskiplsn' column in the catalog so that
it naturally comes at an 8-byte boundary.

We have fixed a similar problem in commit f3b421da5f. This patch adds a
test to avoid a similar mistake in the future.

Reported-by: Noah Misch
Diagnosed-by: Noah Misch, Masahiko Sawada, Amit Kapila
Author: Masahiko Sawada
Reviewed-by: Noah Misch, Amit Kapila
Discussion: https://postgr.es/m/20220401074423.GC3682158@rfd.leadboat.com
	    https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2022-04-07 09:39:25 +05:30
Andres Freund e41aed674f pgstat: revise replication slot API in preparation for shared memory stats.
Previously the pgstat <-> replication slots API was done with on the basis of
names. However, the upcoming move to storing stats in shared memory makes it
more convenient to use a integer as key.

Change the replication slot functions to take the slot rather than the slot
name, and expose ReplicationSlotIndex() to compute the index of an replication
slot. Special handling will be required for restarts, as the index is not
stable across restarts. For now pgstat internally still uses names.

Rename pgstat_report_replslot_{create,drop}() to
pgstat_{create,drop}_replslot() to match the functions for other kinds of
stats.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 18:38:24 -07:00
Andres Freund 8b1dccd37c pgstat: scaffolding for transactional stats creation / drop.
One problematic part of the current statistics collector design is that there
is no reliable way of getting rid of statistics entries. Because of that
pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the
current database with the catalog contents and tries to drop now-superfluous
entries. That's quite expensive. What's worse, it doesn't work on physical
replicas, despite physical replicas collection statistics entries.

This commit introduces infrastructure to create / drop statistics entries
transactionally, together with the underlying catalog objects (functions,
relations, subscriptions). pgstat_xact.c maintains a list of stats entries
created / dropped transactionally in the current transaction. To ensure the
removal of statistics entries is durable dropped statistics entries are
included in commit / abort (and prepare) records, which also ensures that
stats entries are dropped on standbys.

Statistics entries created separately from creating the underlying catalog
object (e.g. when stats were previously lost due to an immediate restart)
are *not* WAL logged. However that can only happen outside of the transaction
creating the catalog object, so it does not lead to "leaked" statistics
entries.

For this to work, functions creating / dropping functions / relations /
subscriptions need to call into pgstat. For subscriptions this was already
done when dropping subscriptions, via pgstat_report_subscription_drop() (now
renamed to pgstat_drop_subscription()).

This commit does not actually drop stats yet, it just provides the
infrastructure. It is however a largely independent piece of infrastructure,
so committing it separately makes sense.

Bumps XLOG_PAGE_MAGIC.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 18:27:52 -07:00
Andres Freund 8fb580a35c pgstat: prepare APIs used by pgstatfuncs for shared memory stats.
With the introduction of PgStat_Kind PgStat_Single_Reset_Type,
PgStat_Shared_Reset_Target don't make sense anymore. Replace them with
PgStat_Kind.

Instead of having dedicated reset functions for different kinds of stats, use
two generic helper routines (one to reset all stats of a kind, one to reset
one stats entry).

A number of reset functions were named pgstat_reset_*_counter(), despite
affecting multiple counters. The generic helper routines get rid of
pgstat_reset_single_counter(), pgstat_reset_subscription_counter().

Rename pgstat_reset_slru_counter(), pgstat_reset_replslot_counter() to
pgstat_reset_slru(), pgstat_reset_replslot() respectively, and have them only
deal with a single SLRU/slot. Resetting all SLRUs/slots goes through the
generic pgstat_reset_of_kind().

Previously pg_stat_reset_replication_slot() used SearchNamedReplicationSlot()
to check if a slot exists. API wise it seems better to move that to
pgstat_replslot.c.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 17:56:19 -07:00
Andres Freund 8ea7963fc7 pgstat: add pgstat_copy_relation_stats().
Until now index_concurrently_swap() directly modified pgstat internal
datastructures. That will break with the introduction of shared memory
statistics and seems off architecturally.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 14:09:18 -07:00
Andres Freund cc96373cf3 pgstat: rename some pgstat_send_* functions to pgstat_report_*.
Only the pgstat_send_* functions that are called from outside pgstat*.c are
renamed (the rest will go away). This is done separately from the - quite
large - shared memory statistics patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 14:08:57 -07:00
Tom Lane dbafe127bb Suppress "variable 'pagesaving' set but not used" warning.
With asserts disabled, late-model clang notices that this variable
is incremented but never otherwise read.

Discussion: https://postgr.es/m/3171401.1649275153@sss.pgh.pa.us
2022-04-06 17:03:50 -04:00
Andres Freund bdbd3d9064 pgstat: stats collector references in comments.
Soon the stats collector will be no more, with statistics instead getting
stored in shared memory. There are a lot of references to the stats collector
in comments. This commit replaces most of these references with "cumulative
statistics system", with the remaining ones getting replaced as part of
subsequent commits.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
2022-04-06 13:56:06 -07:00
Andres Freund ab62a642d5 pgstat: move transactional code into pgstat_xact.c.
The transactional integration code is largely independent from the rest of
pgstat.c. Subsequent commits will add more related code.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 13:23:47 -07:00
Andres Freund c3e9b07936 pgstat: move pgstat_report_autovac() to pgstat_database.c.
I got the location wrong in 13619598f1. The name did make it sound like it
belonged in pgstat_relation.c...
2022-04-06 12:41:29 -07:00
Andres Freund 46a2d2499a dsm: allow use in single user mode.
It might seem pointless to allow use of dsm in single user mode, but otherwise
subsystems might need dedicated single user mode code paths.

Besides changing the assert, all that's needed is to make some windows code
assuming the presence of postmaster conditional.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com
2022-04-06 12:40:04 -07:00
Stephen Frost 39969e2a1e Remove exclusive backup mode
Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.

This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.

Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.net
https://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com
2022-04-06 14:41:03 -04:00
Tom Lane a0ffa885e4 Allow granting SET and ALTER SYSTEM privileges on GUC parameters.
This patch allows "PGC_SUSET" parameters to be set by non-superusers
if they have been explicitly granted the privilege to do so.
The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter
can also be granted.
Such privileges are cluster-wide, not per database.  They are tracked
in a new shared catalog, pg_parameter_acl.

Granting and revoking these new privileges works as one would expect.
One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege
--- one could wish that those were handled by a revocable grant to
PUBLIC, but they are not, because we couldn't make it robust enough
for GUCs defined by extensions.

Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas,
Joshua Brindle, and myself

Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com
2022-04-06 13:24:33 -04:00
Peter Eisentraut 01effb1304 Fix unsigned output format in SLRU error reporting
Avoid printing signed values as unsigned.  (No impact in practice
expected.)

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CALT9ZEHN7hWJo6MgJKqoDMGj%3DGOzQU50wTvOYZXDj7x%3DsUK-kw%40mail.gmail.com
2022-04-06 09:15:05 +02:00
Etsuro Fujita c2bb02bc2e Allow asynchronous execution in more cases.
In commit 27e1f1456, create_append_plan() only allowed the subplan
created from a given subpath to be executed asynchronously when it was
an async-capable ForeignPath.  To extend coverage, this patch handles
cases when the given subpath includes some other Path types as well that
can be omitted in the plan processing, such as a ProjectionPath directly
atop an async-capable ForeignPath, allowing asynchronous execution in
partitioned-scan/partitioned-join queries with non-Var tlist expressions
and more UNION queries.

Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and
Zhihong Yu.

Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru
2022-04-06 15:45:00 +09:00
Amit Kapila 2d09e44d30 Improve comments for row filtering and toast interaction in logical replication.
Reported-by: Antonin Houska
Author: Amit Kapila
Reviewed-by: Antonin Houska, Ajin Cherian
Discussion: https://postgr.es/m/84638.1649152255@antos
2022-04-06 08:20:40 +05:30
Andrew Dunstan fadb48b00e PLAN clauses for JSON_TABLE
These clauses allow the user to specify how data from nested paths are
joined, allowing considerable freedom in shaping the tabular output of
JSON_TABLE.

PLAN DEFAULT allows the user to specify the global strategies when
dealing with sibling or child nested paths. The is often sufficient to
achieve the necessary goal, and is considerably simpler than the full
PLAN clause, which allows the user to specify the strategy to be used
for each named nested path.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
2022-04-05 14:17:08 -04:00
Peter Geoghegan e83ebfe6d7 Have VACUUM warn on relfrozenxid "in the future".
Commits 74cf7d46 and a61daa14 fixed pg_upgrade bugs involving oversights
in how relfrozenxid or relminmxid are carried forward or initialized.
Corruption caused by bugs of this nature was ameliorated by commit
78db307bb2, which taught VACUUM to always overwrite existing invalid
relfrozenxid or relminmxid values that are apparently "in the future".

Extend that work now by showing a warning in the event of overwriting
either relfrozenxid or relminmxid due to an existing value that is "in
the future".  There is probably a decent chance that the sanity checks
added by commit 699bf7d05c will raise an error before VACUUM reaches
this point, but we shouldn't rely on that.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzmRZEzeGvLv8yDW0AbFmSvJjTziORqjVUrf74mL4GL0Ww@mail.gmail.com
2022-04-05 09:44:52 -07:00
Alvaro Herrera 297daa9d43
Refactor and cleanup runtime partition prune code a little
* Move the execution pruning initialization steps that are common
between both ExecInitAppend() and ExecInitMergeAppend() into a new
function ExecInitPartitionPruning() defined in execPartition.c.
Those steps include creation of a PartitionPruneState to be used for
all instances of pruning and determining the minimal set of child
subplans that need to be initialized by performing initial pruning if
needed, and finally adjusting the subplan_map arrays in the
PartitionPruneState to reflect the new set of subplans remaining
after initial pruning if it was indeed performed.
ExecCreatePartitionPruneState() is no longer exported out of
execPartition.c and has been renamed to CreatePartitionPruneState()
as a local sub-routine of ExecInitPartitionPruning().

* Likewise, ExecFindInitialMatchingSubPlans() that was in charge of
performing initial pruning no longer needs to be exported.  In fact,
since it would now have the same body as the more generally named
ExecFindMatchingSubPlans(), except differing in the value of
initial_prune passed to the common subroutine
find_matching_subplans_recurse(), it seems better to remove it and add
an initial_prune argument to ExecFindMatchingSubPlans().

* Add an ExprContext field to PartitionPruneContext to remove the
implicit assumption in the runtime pruning code that the ExprContext to
use to compute pruning expressions that need one can always rely on the
PlanState providing it.  A future patch will allow runtime pruning (at
least the initial pruning steps) to be performed without the
corresponding PlanState yet having been created, so this will help.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqEYCpEqh2LMDOp9mT+4-QoVe8HgFMKBjntEMCTZLpcCCA@mail.gmail.com
2022-04-05 11:46:48 +02:00
Andres Freund 909eebf27b dshash: revise sequential scan support.
The previous coding of dshash_seq_next(), on the first call, accessed
status->hash_table->size_log2 without holding a partition lock and without
guaranteeing that ensure_valid_bucket_pointers() had ever been called.

That oversight turns out to not have immediately visible effects, because
bucket 0 is always in partition 0, and ensure_valid_bucket_pointers() was
called after acquiring the partition lock.  However,
PARTITION_FOR_BUCKET_INDEX() with a size_log2 of 0 ends up triggering formally
undefined behaviour.

Simplify by accessing partition 0, without using PARTITION_FOR_BUCKET_INDEX().

While at it, remove dshash_get_current(), there is no convincing use
case. Also polish a few comments.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com
2022-04-04 14:32:52 -07:00
Andres Freund edadf8098f pgstat: consistent function comment formatting.
There was a wild mishmash of function comment formatting in pgstat, making it
hard to know what to use for any new function and hard to extend existing
comments (particularly due to randomly different forms of indentation).

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220329191727.mzzwbl7udhpq7pmf@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
2022-04-04 13:53:34 -07:00
Andrew Dunstan 4e34747c88 JSON_TABLE
This feature allows jsonb data to be treated as a table and thus used in
a FROM clause like other tabular data. Data can be selected from the
jsonb using jsonpath expressions, and hoisted out of nested structures
in the jsonb to form multiple rows, more or less like an outer join.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu (whose
name I previously misspelled), Himanshu Upadhyaya, Daniel Gustafsson,
Justin Pryzby.

Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
2022-04-04 16:03:47 -04:00
Peter Geoghegan c42a6fc41d vacuumlazy.c: Further consolidate resource allocation.
Move remaining VACUUM resource allocation and deallocation code from
lazy_scan_heap() to its caller, heap_vacuum_rel().  This finishes off
work started by commit 73f6ec3d.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzk3fNBa_S3Ngi+16GQiyJ=AmUu3oUY99syMDTMRxitfyQ@mail.gmail.com
2022-04-04 11:53:33 -07:00
Andrew Dunstan 4eb9798879 Avoid freeing objects during json aggregate finalization
Commit f4fb45d15c tried to free memory during aggregate finalization.
This cause issues, particularly when used as a window function, so stop
doing that.

Per complaint by Jaime Casanova and diagnosis by Andres Freund

Discussion: https://postgr.es/m/YkfeMNYRCGhySKyg@ahch-to
2022-04-04 11:03:49 -04:00
David Rowley 40af10b571 Use Generation memory contexts to store tuples in sorts
The general usage pattern when we store tuples in tuplesort.c is that
we store a series of tuples one by one then either perform a sort or spill
them to disk.  In the common case, there is no pfreeing of already stored
tuples.  For the common case since we do not individually pfree tuples, we
have very little need for aset.c memory allocation behavior which
maintains freelists and always rounds allocation sizes up to the next
power of 2 size.

Here we conditionally use generation.c contexts for storing tuples in
tuplesort.c when the sort will never be bounded.  Unfortunately, the
memory context to store tuples is already created by the time any calls
would be made to tuplesort_set_bound(), so here we add a new sort option
that allows callers to specify if they're going to need a bounded sort or
not.  We'll use a standard aset.c allocator when this sort option is not
set.

Extension authors must ensure that the TUPLESORT_ALLOWBOUNDED flag is
used when calling tuplesort_begin_* for any sorts that make a call to
tuplesort_set_bound().

Author: David Rowley
Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/CAApHDvoH4ASzsAOyHcxkuY01Qf++8JJ0paw+03dk+W25tQEcNQ@mail.gmail.com
2022-04-04 22:52:35 +12:00
David Rowley 77bae396df Adjust tuplesort API to have bitwise option flags
This replaces the bool flag for randomAccess.  An upcoming patch requires
adding another option, so instead of breaking the API for that, then
breaking it again one day if we add more options, let's just break it
once.  Any boolean options we add in the future will just make use of an
unused bit in the flags.

Any extensions making use of tuplesorts will need to update their code
to pass TUPLESORT_RANDOMACCESS instead of true for randomAccess.
TUPLESORT_NONE can be used for a set of empty options.

Author: David Rowley
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAApHDvoH4ASzsAOyHcxkuY01Qf%2B%2B8JJ0paw%2B03dk%2BW25tQEcNQ%40mail.gmail.com
2022-04-04 22:24:59 +12:00
David Rowley 1b0d9aa4f7 Improve the generation memory allocator
Here we make a series of improvements to the generation memory
allocator, namely:

1. Allow generation contexts to have a minimum, initial and maximum block
sizes. The standard allocator allows this already but when the generation
context was added, it only allowed fixed-sized blocks.  The problem with
fixed-sized blocks is that it's difficult to choose how large to make the
blocks.  If the chosen size is too small then we'd end up with a large
number of blocks and a large number of malloc calls. If the block size is
made too large, then memory is wasted.

2. Add support for "keeper" blocks.  This is a special block that is
allocated along with the context itself but is never freed.  Instead,
when the last chunk in the keeper block is freed, we simply mark the block
as empty to allow new allocations to make use of it.

3. Add facility to "recycle" newly empty blocks instead of freeing them
and having to later malloc an entire new block again.  We do this by
recording a single GenerationBlock which has become empty of any chunks.
When we run out of space in the current block, we check to see if there is
a "freeblock" and use that if it contains enough space for the allocation.

Author: David Rowley, Tomas Vondra
Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/d987fd54-01f8-0f73-af6c-519f799a0ab8@enterprisedb.com
2022-04-04 20:53:13 +12:00
Thomas Munro cc58eecc5d Fix tuplesort optimization for CLUSTER-on-expression.
When dispatching sort operations to specialized variants, commit
69749243 failed to handle the case where CLUSTER-sort decides not to
initialize datum1 and isnull1.  Fix by hoisting that decision up a level
and advertising whether datum1 can be relied on, in the Tuplesortstate
object.

Per reports from UBsan and Valgrind build farm animals, while running
the cluster.sql test.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAFBsxsF1TeK5Fic0M%2BTSJXzbKsY6aBqJGNj6ptURuB09ZF6k_w%40mail.gmail.com
2022-04-04 10:52:02 +12:00
Tom Lane 591e088dd5 Fix portability issues in datetime parsing.
datetime.c's parsing logic has assumed that strtod() will accept
a string that looks like ".", which it does in glibc, but not on
some less-common platforms such as AIX.  The result of this was
that datetime fields like "123." would be accepted on some platforms
but not others; which is a sufficiently odd case that it's not that
surprising we've heard no field complaints.  But commit e39f99046
extended that assumption to new places, and happened to add a test
case that exposed the platform dependency.  Remove this dependency
by special-casing situations without any digits after the decimal
point.

(Again, this is in part a pre-existing bug but I don't feel a
compulsion to back-patch.)

Also, rearrange e39f99046's changes in formatting.c to avoid a
Coverity complaint that we were copying an uninitialized field.

Discussion: https://postgr.es/m/1592893.1648969747@sss.pgh.pa.us
2022-04-03 17:04:33 -04:00
Peter Geoghegan f3c15cbe50 Generalize how VACUUM skips all-frozen pages.
Non-aggressive VACUUMs were at a gratuitous disadvantage (relative to
aggressive VACUUMs) around advancing relfrozenxid and relminmxid before
now.  The issue only came up when concurrent activity unset some heap
page's visibility map bit right as VACUUM was considering if the page
should get counted in frozenskipped_pages.  The non-aggressive case
would recheck the all-frozen bit at this point.  The aggressive case
reasoned that the page (a skippable page) must have at least been
all-frozen in the recent past, so skipping it won't make relfrozenxid
advancement unsafe (which is never okay for aggressive VACUUMs).

The recheck created a window for some other backend to confuse matters
for VACUUM.  If the page's VM bit turned out to be unset, VACUUM would
conclude that the page was _never_ all-frozen.  frozenskipped_pages was
not incremented, and yet VACUUM couldn't back out of skipping at this
late stage (it couldn't choose to scan the page instead).  This made it
unsafe to advance relfrozenxid later on.

Consistently avoid the issue by generalizing how we skip frozen pages
during aggressive VACUUMs: take the same approach when skipping any
skippable page range during aggressive and non-aggressive VACUUMs alike.
The new approach makes ranges (not individual pages) the fundamental
unit of skipping using the visibility map.  frozenskipped_pages is
replaced with a boolean flag that represents whether some skippable
range with one or more all-visible pages was actually skipped.

It is safe for VACUUM to treat a page as all-frozen provided it at least
had its all-frozen bit set after the OldestXmin cutoff was established.
VACUUM is only required to scan pages that might have XIDs < OldestXmin
(unfrozen XIDs) to be able to safely advance relfrozenxid.  Tuples
concurrently inserted on "skipped" pages can be thought of as equivalent
to tuples concurrently inserted on a block >= rel_pages.

It's possible that the issue this commit fixes hardly ever came up in
practice.  But we only had to be unlucky once to lose out on advancing
relfrozenxid -- a single affected heap page was enough to throw VACUUM
off.  That seems like something to avoid on general principle.  This is
similar to an issue fixed by commit 44fa8488, which taught vacuumlazy.c
to not give up on non-aggressive relfrozenxid advancement just because a
cleanup lock wasn't immediately available on some heap page.

Skipping an all-visible range is now explicitly structured as a choice
made by non-aggressive VACUUMs, by weighing known costs (scanning extra
skippable pages to freeze their tuples early) against known benefits
(advancing relfrozenxid early).  This works in essentially the same way
as it always has (don't skip ranges < SKIP_PAGES_THRESHOLD).  We could
do much better here in the future by considering other relevant factors.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wzn6bGJGfOy3zSTJicKLw99PHJeSOQBOViKjSCinaxUKDQ@mail.gmail.com
Discussion: https://postgr.es/m/CA%2BTgmoZiSOY6H7aadw5ZZGm7zYmfDzL6nwmL5V7GL4HgJgLF_w%40mail.gmail.com
2022-04-03 13:35:43 -07:00
Peter Geoghegan 0b018fabaa Set relfrozenxid to oldest extant XID seen by VACUUM.
When VACUUM set relfrozenxid before now, it set it to whatever value was
used to determine which tuples to freeze -- the FreezeLimit cutoff.
This approach was very naive.  The relfrozenxid invariant only requires
that new relfrozenxid values be <= the oldest extant XID remaining in
the table (at the point that the VACUUM operation ends), which in
general might be much more recent than FreezeLimit.

VACUUM now carefully tracks the oldest remaining XID/MultiXactId as it
goes (the oldest remaining values _after_ lazy_scan_prune processing).
The final values are set as the table's new relfrozenxid and new
relminmxid in pg_class at the end of each VACUUM.  The oldest XID might
come from a tuple's xmin, xmax, or xvac fields.  It might even come from
one of the table's remaining MultiXacts.

Final relfrozenxid values must still be >= FreezeLimit in an aggressive
VACUUM (FreezeLimit still acts as a lower bound on the final value that
aggressive VACUUM can set relfrozenxid to).  Since standard VACUUMs
still make no guarantees about advancing relfrozenxid, they might as
well set relfrozenxid to a value from well before FreezeLimit when the
opportunity presents itself.  In general standard VACUUMs may now set
relfrozenxid to any value > the original relfrozenxid and <= OldestXmin.

Credit for the general idea of using the oldest extant XID to set
pg_class.relfrozenxid at the end of VACUUM goes to Andres Freund.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkymFbz6D_vL+jmqSn_5q1wsFvFrE+37yLgL_Rkfd6Gzg@mail.gmail.com
2022-04-03 09:57:21 -07:00
Tom Lane e39f990467 Fix overflow hazards in interval input and output conversions.
DecodeInterval (interval input) was careless about integer-overflow
hazards, allowing bogus results to be obtained for sufficiently
large input values.  Also, since it initially converted the input
to a "struct tm", it was impossible to produce the full range of
representable interval values.

Meanwhile, EncodeInterval (interval output) and a few other
functions could suffer failures if asked to process sufficiently
large interval values, because they also relied on being able to
represent an interval in "struct tm" which is not designed to
handle that.

Fix all this stuff by introducing new struct types that are more
fit for purpose.

While this is clearly a bug fix, it's also an API break for any
code that's calling these functions directly.  So back-patching
doesn't seem wise, especially in view of the lack of field
complaints.

Joe Koshakow, editorialized a bit by me

Discussion: https://postgr.es/m/CAAvxfHff0JLYHwyBrtMx_=6wr=k2Xp+D+-X3vEhHjJYMj+mQcg@mail.gmail.com
2022-04-02 16:12:29 -04:00
Peter Geoghegan 14bf1e8313 vacuumlazy.c: Clean up variable declarations.
Move some of the heap_vacuum_rel() instrumentation related variables to
the scope where they're actually needed.  Also reorder some of the
variable declarations at the start of heap_vacuum_rel() so that related
variables appear together.
2022-04-02 10:33:21 -07:00
Joe Conway 9752436f04 Use has_privs_for_roles for predefined role checks: round 2
Similar to commit 6198420ad, replace is_member_of_role with
has_privs_for_role for predefined role access checks in recently
committed basebackup code. In passing fix a double-word error
in a nearby comment.

Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
2022-04-02 13:24:38 -04:00
Alvaro Herrera cfdd03f45e
Allow CLUSTER on partitioned tables
This is essentially the same as applying VACUUM FULL to a partitioned
table, which has been supported since commit 3c3bb99330 (March 2017).
While there's no great use case in applying CLUSTER to partitioned
tables, we don't have any strong reason not to allow it either.

For now, partitioned indexes cannot be marked clustered, so an index
must always be specified.

While at it, rename some variables that were RangeVars during the
development that led to 8bc717cb88 but never made it that way to the
source tree; there's no need to perpetuate names that have always been
more confusing than helpful.

Author: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/20201028003312.GU9241@telsasoft.com
Discussion: https://postgr.es/m/20200611153502.GT14879@telsasoft.com
2022-04-02 19:08:34 +02:00
John Naylor 6974924347 Specialize tuplesort routines for different kinds of abbreviated keys
Previously, the specialized tuplesort routine inlined handling for
reverse-sort and NULLs-ordering but called the datum comparator via a
pointer in the SortSupport struct parameter. Testing has showed that we
can get a useful performance gain by specializing datum comparison for
the different representations of abbreviated keys -- signed and unsigned
64-bit integers and signed 32-bit integers. Almost all abbreviatable data
types will benefit -- the only exception for now is numeric, since the
datum comparison is more complex. The performance gain depends on data
type and input distribution, but often falls in the range of 10-20% faster.

Thomas Munro

Reviewed by Peter Geoghegan, review and performance testing by me

Discussion:
https://www.postgresql.org/message-id/CA%2BhUKGKKYttZZk-JMRQSVak%3DCXSJ5fiwtirFf%3Dn%3DPAbumvn1Ww%40mail.gmail.com
2022-04-02 15:22:25 +07:00
Michael Paquier d16773cdc8 Add macros in hash and btree AMs to get the special area of their pages
This makes the code more consistent with SpGiST, GiST and GIN, that
already use this style, and the idea is to make easier the introduction
of more sanity checks for each of these AM-specific macros.  BRIN uses a
different set of macros to get a page's type and flags, so it has no
need for something similar.

Author: Matthias van de Meent
Discussion: https://postgr.es/m/CAEze2WjE3+tGO9Fs9+iZMU+z6mMZKo54W1Zt98WKqbEUHbHOBg@mail.gmail.com
2022-04-01 13:24:50 +09:00
Andrew Dunstan 9f91344223 Fix comments with "a expression" 2022-03-31 15:45:25 -04:00
Andrew Dunstan 49082c2cc3 RETURNING clause for JSON() and JSON_SCALAR()
This patch is extracted from a larger patch that allowed setting the
default returned value from these functions to json or jsonb. That had
problems, but this piece of it is fine. For these functions only json or
jsonb can be specified in the RETURNING clause.

Extracted from an original patch from Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-31 15:45:24 -04:00
Tom Lane f3dd9fe1dd Fix postgres_fdw to check shippability of sort clauses properly.
postgres_fdw would push ORDER BY clauses to the remote side without
verifying that the sort operator is safe to ship.  Moreover, it failed
to print a suitable USING clause if the sort operator isn't default
for the sort expression's type.  The net result of this is that the
remote sort might not have anywhere near the semantics we expect,
which'd be disastrous for locally-performed merge joins in particular.

We addressed similar issues in the context of ORDER BY within an
aggregate function call in commit 7012b132d, but failed to notice
that query-level ORDER BY was broken.  Thus, much of the necessary
logic already existed, but it requires refactoring to be usable
in both cases.

Back-patch to all supported branches.  In HEAD only, remove the
core code's copy of find_em_expr_for_rel, which is no longer used
and really should never have been pushed into equivclass.c in the
first place.

Ronan Dunklau, per report from David Rowley;
reviews by David Rowley, Ranier Vilela, and myself

Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
2022-03-31 14:29:48 -04:00
Amit Kapila 8f2e2bbf14 Raise a WARNING for missing publications.
When we create or alter a subscription to add publications raise a warning
for non-existent publications. We don't want to give an error here because
it is possible that users can later create the missing publications.

Author: Vignesh C
Reviewed-by: Bharath Rupireddy, Japin Li, Dilip Kumar, Euler Taveira, Ashutosh Sharma, Amit Kapila
Discussion: https://postgr.es/m/CALDaNm0f4YujGW+q-Di0CbZpnQKFFrXntikaQQKuEmGG0=Zw=Q@mail.gmail.com
2022-03-31 08:25:50 +05:30
Tomas Vondra db0d67db24 Optimize order of GROUP BY keys
When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may be heavily dependent on the order in which the keys are
compared when building the groups. Grouping does not imply any ordering,
so we're allowed to compare the keys in arbitrary order, and a Hash Agg
leverages this. But for Group Agg, we simply compared keys in the order
as specified in the query. This commit explores alternative ordering of
the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of
the keys, and leave the rest to the optimizer. But that might get very
expensive, so we try to pick only a couple interesting orderings based
on both local and global information.

When planning the grouping path, we explore statistics (number of
distinct values, cost of the comparison function) for the keys and
reorder them to minimize comparison costs. Intuitively, it may be better
to perform more expensive comparisons (for complex data types etc.)
last, because maybe the cheaper comparisons will be enough. Similarly,
the higher the cardinality of a key, the lower the probability we’ll
need to compare more keys. The patch generates and costs various
orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query,
some of which may not be known while planning the grouping. E.g. there
may be an explicit ORDER BY clause, or some other ordering-dependent
operation, higher up in the query, and using the same ordering may allow
using either incremental sort or even eliminate the sort entirely.

The patch generates orderings and picks those minimizing the comparison
cost (for various pathkeys), and then adds orderings that might be
useful for operations higher up in the plan (ORDER BY, etc.). Finally,
it always keeps the ordering specified in the query, on the assumption
the user might have additional insights.

This introduces a new GUC enable_group_by_reordering, so that the
optimization may be disabled if needed.

The original patch was proposed by Teodor Sigaev, and later improved and
reworked by Dmitry Dolgov. Reviews by a number of people, including me,
Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.

Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
2022-03-31 01:13:33 +02:00
Andrew Dunstan 606948b058 SQL JSON functions
This Patch introduces three SQL standard JSON functions:

JSON() (incorrectly mentioned in my commit message for f4fb45d15c)
JSON_SCALAR()
JSON_SERIALIZE()

JSON() produces json values from text, bytea, json or jsonb values, and
has facilitites for handling duplicate keys.
JSON_SCALAR() produces a json value from any scalar sql value, including
json and jsonb.
JSON_SERIALIZE() produces text or bytea from input which containis or
represents json or jsonb;

For the most part these functions don't add any significant new
capabilities, but they will be of use to users wanting standard
compliant JSON handling.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-30 16:30:37 -04:00
Peter Eisentraut 7ae1619bc5 Add range_agg with multirange inputs
range_agg for normal ranges already existed.  A lot of code can be
shared.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Chapman Flack <chap@anastigmatix.net>
Discussion: https://www.postgresql.org/message-id/flat/007ef255-35ef-fd26-679c-f97e7a7f30c2@illuminatedcomputing.com
2022-03-30 20:16:23 +02:00
Peter Eisentraut f453d684ec Change some internal error messages to elogs
Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Chapman Flack <chap@anastigmatix.net>
Discussion: https://www.postgresql.org/message-id/flat/007ef255-35ef-fd26-679c-f97e7a7f30c2@illuminatedcomputing.com
2022-03-30 17:53:54 +02:00
Robert Haas 51c0d186d9 Allow parallel zstd compression when taking a base backup.
libzstd allows transparent parallel compression just by setting
an option when creating the compression context, so permit that
for both client and server-side backup compression. To use this,
use something like pg_basebackup --compress WHERE-zstd:workers=N
where WHERE is "client" or "server" and N is an integer.

When compression is performed on the server side, this will spawn
threads inside the PostgreSQL backend. While there is almost no
PostgreSQL server code which is thread-safe, the threads here are used
internally by libzstd and touch only data structures controlled by
libzstd.

Patch by me, based in part on earlier work by Dipesh Pandit
and Jeevan Ladhe. Reviewed by Justin Pryzby.

Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com
2022-03-30 09:41:26 -04:00
Etsuro Fujita f505bec711 Fix typo in comment. 2022-03-30 19:00:00 +09:00
Peter Eisentraut 072132f04e Add header matching mode to COPY FROM
COPY FROM supports the HEADER option to silently discard the header
line from a CSV or text file.  It is possible to load by mistake a
file that matches the expected format, for example, if two text
columns have been swapped, resulting in garbage in the database.

This adds a new option value HEADER MATCH that checks the column names
in the header line against the actual column names and errors out if
they do not match.

Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
2022-03-30 09:02:31 +02:00
Amit Kapila d5a9d86d8f Skip empty transactions for logical replication.
The current logical replication behavior is to send every transaction to
subscriber even if the transaction is empty. This can happen because
transaction doesn't contain changes from the selected publications or all
the changes got filtered. It is a waste of CPU cycles and network
bandwidth to build/transmit these empty transactions.

This patch addresses the above problem by postponing the BEGIN message
until the first change is sent. While processing a COMMIT message, if
there was no other change for that transaction, do not send the COMMIT
message. This allows us to skip sending BEGIN/COMMIT messages for empty
transactions.

When skipping empty transactions in synchronous replication mode, we send
a keepalive message to avoid delaying such transactions.

Author: Ajin Cherian, Hou Zhijie, Euler Taveira
Reviewed-by: Peter Smith, Takamichi Osumi, Shi Yu, Masahiko Sawada, Greg Nancarrow, Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CAMkU=1yohp9-dv48FLoSPrMqYEyyS5ZWkaZGD41RJr10xiNo_Q@mail.gmail.com
2022-03-30 07:41:05 +05:30
Andrew Dunstan 1a36bc9dba SQL/JSON query functions
This introduces the SQL/JSON functions for querying JSON data using
jsonpath expressions. The functions are:

JSON_EXISTS()
JSON_QUERY()
JSON_VALUE()

All of these functions only operate on jsonb. The workaround for now is
to cast the argument to jsonb.

JSON_EXISTS() tests if the jsonpath expression applied to the jsonb
value yields any values. JSON_VALUE() must return a single value, and an
error occurs if it tries to return multiple values. JSON_QUERY() must
return a json object or array, and there are various WRAPPER options for
handling scalar or multi-value results. Both these functions have
options for handling EMPTY and ERROR conditions.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-29 16:57:13 -04:00
Robert Haas 9c08aea6a3 Add new block-by-block strategy for CREATE DATABASE.
Because this strategy logs changes on a block-by-block basis, it
avoids the need to checkpoint before and after the operation.
However, because it logs each changed block individually, it might
generate a lot of extra write-ahead logging if the template database
is large. Therefore, the older strategy remains available via a new
STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy
option to createdb.

Somewhat controversially, this patch assembles the list of relations
to be copied to the new database by reading the pg_class relation of
the template database. Cross-database access like this isn't normally
possible, but it can be made to work here because there can't be any
connections to the database being copied, nor can it contain any
in-doubt transactions. Even so, we have to use lower-level interfaces
than normal, since the table scan and relcache interfaces will not
work for a database to which we're not connected. The advantage of
this approach is that we do not need to rely on the filesystem to
determine what ought to be copied, but instead on PostgreSQL's own
knowledge of the database structure. This avoids, for example,
copying stray files that happen to be located in the source database
directory.

Dilip Kumar, with a fairly large number of cosmetic changes by me.
Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor,
Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian,
Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro
Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others.

Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com
2022-03-29 11:48:36 -04:00
Alvaro Herrera bf902c1393
Revert "Fix replay of create database records on standby"
This reverts commit 49d9cfc68b.  The approach taken by this patch has
problems, so we'll come up with a radically different fix.

Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
2022-03-29 15:36:21 +02:00
Robert Haas edea649afb Explain why the startup process can't cause a shortage of sinval slots.
Bharath Rupireddy, reviewed by Fujii Masao and Yura Sokolov.
Lightly edited by me.

Discussion: http://postgr.es/m/CALj2ACU=3_frMkDp9UUeuZoAMjaK1y0Z_q5RFNbGvwi8NM==AA@mail.gmail.com
2022-03-29 09:24:24 -04:00
Michael Paquier a2c84990be Add system view pg_ident_file_mappings
This view is similar to pg_hba_file_rules view, except that it is
associated with the parsing of pg_ident.conf.  Similarly to its cousin,
this view is useful to check via SQL if changes planned in pg_ident.conf
would work upon reload or restart, or to diagnose a previous failure.

Bumps catalog version.

Author: Julien Rouhaud
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
2022-03-29 10:15:48 +09:00
Andrew Dunstan 33a377608f IS JSON predicate
This patch intrdocuces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON as well as on the json and
jsonb types. Each test has an IS and IS NOT variant. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR
IS JSON  WITH | WITHOUT UNIQUE KEYS

These are mostly self-explanatory, but note that IS JSON WITHOUT UNIQUE
KEYS is true whenever IS JSON is true, and IS JSON WITH UNIQUE KEYS is
true whenever IS JSON is true except it IS JSON OBJECT is true and there
are duplicate keys (which is never the case when applied to jsonb values).

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-28 15:37:08 -04:00
Joe Conway 6198420ad8 Use has_privs_for_roles for predefined role checks
Generally if a role is granted membership to another role with NOINHERIT
they must use SET ROLE to access the privileges of that role, however
with predefined roles the membership and privilege is conflated. Fix that
by replacing is_member_of_role with has_privs_for_role for predefined
roles. Patch does not remove is_member_of_role from acl.h, but it does
add a warning not to use that function for privilege checking. Not
backpatched based on hackers list discussion.

Author: Joshua Brindle
Reviewed-by: Stephen Frost, Nathan Bossart, Joe Conway
Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
2022-03-28 15:10:04 -04:00
Robert Haas 79de9842ab Remove the ability of a role to administer itself.
Commit f9fd176461 effectively gave
every role ADMIN OPTION on itself. However, this appears to be
something that happened accidentally as a result of refactoring
work rather than an intentional decision. Almost a decade later,
it was discovered that this was a security vulnerability. As a
result, commit fea164a72a restricted
this implicit ADMIN OPTION privilege to be exercisable only when
the role being administered is the same as the session user and
when no security-restricted operation is in progress. That
commit also documented the existence of this implicit privilege
for what seems to be the first time.

The effect of the privilege is to allow a login role to grant
the privileges of that role, and optionally ADMIN OPTION on it,
to some other role. That's an unusual thing to do, because generally
membership is granted in roles used as groups, rather than roles
used as users. Therefore, it does not seem likely that removing
the privilege will break things for many PostgreSQL users.

However, it will make it easier to reason about the permissions
system. This is the only case where a user who has not been given any
special permission (superuser, or ADMIN OPTION on some role) can
modify role membership, so removing it makes things more consistent.
For example, if a superuser sets up role A and B and grants A to B
but no other privileges to anyone, she can now be sure that no one
else will be able to revoke that grant. Without this change, that
would have been true only if A was a non-login role.

Patch by me. Reviewed by Tom Lane and Stephen Frost.

Discussion: http://postgr.es/m/CA+Tgmoawdt03kbA+dNyBcNWJpRxu0f4X=69Y3+DkXXZqmwMDLg@mail.gmail.com
2022-03-28 13:38:13 -04:00
Robert Haas 61762426e6 Fix a few goofs in new backup compression code.
When we try to set the zstd compression level either on the client
or on the server, check for errors.

For any algorithm, on the client side, don't try to set the compression
level unless the user specified one. This was visibly broken for
zstd, which managed to set -1 rather than 0 in this case, but tidy
up the code for the other methods, too.

On the client side, if we fail to create a ZSTD_CCtx, exit after
reporting the error. Otherwise we'll dereference a null pointer.

Patch by me, reviewed by Dipesh Pandit.

Discussion: http://postgr.es/m/CA+TgmoZK3zLQUCGi1h4XZw4jHiAWtcACc+GsdJR1_Mc19jUjXA@mail.gmail.com
2022-03-28 12:19:05 -04:00
Tom Lane d22646922d Add public ruleutils.c entry point to deparse a Query.
This has no in-core callers but will be wanted by extensions.
It's just a thin wrapper around get_query_def, so it adds little code.

Also, fix get_from_clause_item() to force insertion of an alias
for a SUBQUERY RTE item.  This is irrelevant to existing uses because
RTE_SUBQUERY items made by the parser always have aliases already.
However, if one tried to use pg_get_querydef() to inspect a post-rewrite
Query, it could be an issue.  In any case, get_from_clause_item already
contained logic to force alias insertion for VALUES items, so the lack
of the same for SUBQUERY is a pretty clear oversight.

In passing, replace duplicated code for selection of pretty-print
options with a common macro.

Julien Rouhaud, reviewed by Pavel Stehule, Gilles Darold, and myself

Discussion: https://postgr.es/m/20210627041138.zklczwmu3ms4ufnk@nol
2022-03-28 11:19:37 -04:00
Alvaro Herrera 7103ebb7aa
Add support for MERGE SQL command
MERGE performs actions that modify rows in the target table using a
source table or query. MERGE provides a single SQL statement that can
conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise
require multiple PL statements.  For example,

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular tables, partitioned tables and inheritance
hierarchies, including column and row security enforcement, as well as
support for row and statement triggers and transition tables therein.

MERGE is optimized for OLTP and is parameterizable, though also useful
for large scale ETL/ELT. MERGE is not intended to be used in preference
to existing single SQL commands for INSERT, UPDATE or DELETE since there
is some overhead.  MERGE can be used from PL/pgSQL.

MERGE does not support targetting updatable views or foreign tables, and
RETURNING clauses are not allowed either.  These limitations are likely
fixable with sufficient effort.  Rewrite rules are also not supported,
but it's not clear that we'd want to support them.

Author: Pavan Deolasee <pavan.deolasee@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Simon Riggs <simon.riggs@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
2022-03-28 16:47:48 +02:00
Peter Eisentraut e26114c817 Make JSON path numeric literals more correct
Per ECMAScript standard (ECMA-262, referenced by SQL standard), the
syntax forms

.1
1.

should be allowed for decimal numeric literals, but the existing
implementation rejected them.

Also, by the same standard, reject trailing junk after numeric
literals.

Note that the ECMAScript standard for numeric literals is in respects
like these slightly different from the JSON standard, which might be
the original cause for this discrepancy.

A change is that this kind of syntax is now rejected:

    1.type()

This needs to be written as

    (1).type()

This is correct; normal JavaScript also does not accept this syntax.

We also need to fix up the jsonpath output function for this case.  We
put parentheses around numeric items if they are followed by another
path item.

Reviewed-by: Nikita Glukhov <n.gluhov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/50a828cc-0a00-7791-7883-2ed06dfb2dbb@enterprisedb.com
2022-03-28 11:11:39 +02:00
Andres Freund 91c0570a79 Don't fail for > 1 walsenders in 019_replslot_limit, add debug messages.
So far the first of the retries introduced in f28bf667f6 resolves the
issue. But I (Andres) am still suspicious that the start of the failures might
indicate a problem.

To reduce noise, stop reporting a failure if a retry resolves the problem. To
allow figuring out what causes the slow slot drop, add a few more debug
messages to ReplicationSlotDropPtr.

See also commit afdeff1052, fe0972ee5e and f28bf667f6.

Discussion: https://postgr.es/m/20220327213219.smdvfkq2fl74flow@alap3.anarazel.de
2022-03-27 22:35:42 -07:00
Tom Lane cc7401d5ca Fix up compiler warnings/errors from f4fb45d15.
Per early buildfarm returns.
2022-03-27 18:32:40 -04:00
Andrew Dunstan f4fb45d15c SQL/JSON constructors
This patch introduces the SQL/JSON standard constructors for JSON:

JSON()
JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

For the most part these functions provide facilities that mimic
existing json/jsonb functions. However, they also offer some useful
additional functionality. In addition to text input, the JSON() function
accepts bytea input, which it will decode and constuct a json value from.
The other functions provide useful options for handling duplicate keys
and null values.

This series of patches will be followed by a consolidated documentation
patch.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-27 17:03:34 -04:00
Andrew Dunstan f79b803dcc Common SQL/JSON clauses
This introduces some of the building blocks used by the SQL/JSON
constructor and query functions. Specifically, it provides node
executor and grammar support for the FORMAT JSON [ENCODING foo]
clause, and values decorated with it, and for the RETURNING clause.

The following SQL/JSON patches will leverage these.

Nikita Glukhov (who probably deserves an award for perseverance).

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-27 17:03:33 -04:00
Tom Lane c2d81ee902 Remove useless variable.
flatten_join_alias_vars_mutator counted attnums, but didn't
do anything with the count (no doubt it did at one time).

Noted by buildfarm member seawasp.
2022-03-27 15:05:22 -04:00
Tom Lane 0fb6954aa5 Fix breakage of get_ps_display() in the PS_USE_NONE case.
Commit 8c6d30f21 caused this function to fail to set *displen
in the PS_USE_NONE code path.  If the variable's previous value
had been negative, that'd lead to a memory clobber at some call
sites.  We'd managed not to notice due to very thin test coverage
of such configurations, but this appears to explain buildfarm member
lorikeet's recent struggles.

Credit to Andrew Dunstan for spotting the problem.  Back-patch
to v13 where the bug was introduced.

Discussion: https://postgr.es/m/136102.1648320427@sss.pgh.pa.us
2022-03-27 12:57:46 -04:00
Michael Paquier 411b91360f Fix comment in execParallel.c
0f61727 has made this comment incorrect.

Author: Julien Rouhaud
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/20220326160117.qtp5nkuku6cvhcby@jrouhaud
2022-03-27 18:22:22 +09:00
Tom Lane 979cd655c1 Suppress compiler warning in pub_collist_to_bitmapset().
A fair percentage of the buildfarm doesn't recognize that oldcxt
won't be used uninitialized; silence those warnings by initializing it.

While here, upgrade the function's thoroughly inadequate header comment.

Oversight in 923def9a5, per buildfarm.
2022-03-26 11:53:37 -04:00
Tomas Vondra 923def9a53 Allow specifying column lists for logical replication
This allows specifying an optional column list when adding a table to
logical replication. The column list may be specified after the table
name, enclosed in parentheses. Columns not included in this list are not
sent to the subscriber, allowing the schema on the subscriber to be a
subset of the publisher schema.

For UPDATE/DELETE publications, the column list needs to cover all
REPLICA IDENTITY columns. For INSERT publications, the column list is
arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if
the table uses REPLICA IDENTITY FULL, column list is not allowed.

The column list can contain only simple column references. Complex
expressions, function calls etc. are not allowed. This restriction could
be relaxed in the future.

During the initial table synchronization, only columns included in the
column list are copied to the subscriber. If the subscription has
several publications, containing the same table with different column
lists, columns specified in any of the lists will be copied.

This means all columns are replicated if the table has no column list
at all (which is treated as column list with all columns), or when of
the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that
matches the schema of the table).

For partitioned tables, publish_via_partition_root determines whether
the column list for the root or the leaf relation will be used. If the
parameter is 'false' (the default), the list defined for the leaf
relation is used. Otherwise, the column list for the root partition
will be used.

Psql commands \dRp+ and \d <table-name> now display any column lists.

Author: Tomas Vondra, Alvaro Herrera, Rahila Syed
Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed,
Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
2022-03-26 01:01:27 +01:00
Tomas Vondra 05843b1aa4 Minor improvements in sequence decoding code and docs
A couple minor comment improvements and code cleanups, based on
post-commit feedback to the sequence decoding patch.

Author: Amit Kapila, vignesh C
Discussion: https://postgr.es/m/aeb2ba8d-e6f4-5486-cc4c-0d4982c291cb@enterprisedb.com
2022-03-25 21:07:17 +01:00
Tomas Vondra 002c9dd97a Handle sequences in preprocess_pubobj_list
Commit 75b1521dae added support for logical replication of sequences,
including grammar changes, but it did not update preprocess_pubobj_list
accordingly. This can cause segfaults with "continuations", i.e. when
command specifies a list of objects:

  CREATE PUBLICATION p FOR SEQUENCE s1, s2;

Reported by Amit Kapila, patch by me.

Reported-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1JxDNKGBSNTyN-Xj2JRjzFo+ziSqJbjH==vuO0YF_CQrg@mail.gmail.com
2022-03-25 14:29:56 +01:00
Alvaro Herrera 49d9cfc68b
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the standby
would fail to recover in such a case.  However, the directories could be
legitimately missing.  Consider a sequence of WAL records as follows:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery.  If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.

Backpatch to 13, at least for now.  The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522, which appeared in 13.

A new TAP test file is added to verify the condition.  However, because
it depends on commit d6d317dbf6, it can only be added to branch
master.  I (Álvaro) manually verified that the code behaves as expected
in branch 14.  It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.

Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-03-25 13:16:21 +01:00
Peter Eisentraut 23119d51a1 Refactor DLSUFFIX handling
Move DLSUFFIX from makefiles into header files for all platforms.
Move the DLSUFFIX assignment from src/makefiles/ to src/templates/,
have configure read it, and then substitute it into Makefile.global
and pg_config.h.  This avoids the need for all makefile rules that
need it to locally set CPPFLAGS.  It also resolves an inconsistent
setup between the two Windows build systems.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/2f9861fb-8969-9005-7518-b8e60f2bead9@enterprisedb.com
2022-03-25 08:56:02 +01:00
Michael Paquier ad8759bea0 Fix typos in standby.c
xl_running_xacts exists, not xl_xact_running_xacts.

Author: Hou Zhijie
Discussion: https://postgr.es/m/OS0PR01MB57160D8B01097FFB5C175065941A9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2022-03-25 14:11:18 +09:00
Amit Kapila 3e67a5cac6 Remove some useless free calls.
These were introduced in recent commit 52e4f0cd47. We were trying to free
some transient space consumption and that too was not entirely correct and
complete. We don't need this partial freeing of memory as it will be
allocated just once for a query and will be freed at the end of the query.

Author: Zhihong Yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALNJ-vQORfQ=vicbKA_RmeGZGzm1y3WsEcZqXWi7qjN43Cz_vg@mail.gmail.com
2022-03-25 07:37:06 +05:30
Tom Lane ce95c54376 Fix pg_statio_all_tables view for multiple TOAST indexes.
A TOAST table can normally have only one index, but there are corner
cases where it has more; for example, transiently during REINDEX
CONCURRENTLY.  In such a case, the pg_statio_all_tables view produced
multiple rows for the owning table, one per TOAST index.  Refactor the
view to avoid that, instead summing the stats across all the indexes,
as we do for regular table indexes.

While this has been wrong for a long time, back-patching seems unwise
due to the difficulty of putting a system view change into back
branches.

Andrei Zubkov, tweaked a bit by me

Discussion: https://postgr.es/m/acefef4189706971fc475f912c1afdab1c48d627.camel@moonset.ru
2022-03-24 16:33:13 -04:00
Robert Haas 412ad7a556 Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.
If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
2022-03-24 14:52:28 -04:00
Tomas Vondra 75b1521dae Add decoding of sequences to built-in replication
This commit adds support for decoding of sequences to the built-in
replication (the infrastructure was added by commit 0da92dc530).

The syntax and behavior mostly mimics handling of tables, i.e. a
publication may be defined as FOR ALL SEQUENCES (replicating all
sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating
all sequences in a particular schema) or individual sequences.

To publish sequence modifications, the publication has to include
'sequence' action. The protocol is extended with a new message,
describing sequence increments.

A new system view pg_publication_sequences lists all the sequences
added to a publication, both directly and indirectly. Various psql
commands (\d and \dRp) are improved to also display publications
including a given sequence, or sequences included in a publication.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres
             Freund, Petr Jelinek
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-03-24 18:49:27 +01:00
Alvaro Herrera e27f4ee0a7
Change fastgetattr and heap_getattr to inline functions
They were macros previously, but recent callsite additions made Coverity
complain about one of the assertions being always true.  This change
could have been made a long time ago, but the Coverity complain broke
the inertia.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/202203241021.uts52sczx3al@alvherre.pgsql
2022-03-24 18:02:27 +01:00
Tom Lane 0bd7af082a Invent recursive_worktable_factor GUC to replace hard-wired constant.
Up to now, the planner estimated the size of a recursive query's
worktable as 10 times the size of the non-recursive term.  It's hard
to see how to do significantly better than that automatically, but
we can give users control over the multiplier to allow tuning for
specific use-cases.  The default behavior remains the same.

Simon Riggs

Discussion: https://postgr.es/m/CANbhV-EuaLm4H3g0+BSTYHEGxJj3Kht0R+rJ8vT57Dejnh=_nA@mail.gmail.com
2022-03-24 11:47:41 -04:00
Peter Eisentraut a47651447f Remove unnecessary translator comment
Discussion: https://www.postgresql.org/message-id/flat/CALj2ACUfJKTmK5v%3DvF%2BH2iLkqM9Yvjsp6iXaCqAks6gDpzZh6g%40mail.gmail.com
2022-03-24 14:07:38 +01:00
Michael Paquier d4781d8873 Refactor code related to pg_hba_file_rules() into new file
hba.c is growing big, and more contents are planned for it.  In order to
prepare for this future work, this commit moves all the code related to
the system function processing the contents of pg_hba.conf,
pg_hba_file_rules() to a new file called hbafuncs.c, which will be used
as the location for the SQL portion of the authentication file parsing.
While on it, HbaToken, the structure holding a string token lexed from a
configuration file related to authentication, is renamed to a more
generic AuthToken, as it gets used not only for pg_hba.conf, but also
for pg_ident.conf.  TokenizedLine is now named TokenizedAuthLine.

The size of hba.c is reduced by ~12%.

Author: Julien Rouhaud
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
2022-03-24 12:42:30 +09:00
Andres Freund 3ac7d02412 Don't try to translate NULL in GetConfigOptionByNum().
Noticed via -fsanitize=undefined. Introduced when a few columns in
GetConfigOptionByNum() / pg_settings started to be translated in 72be8c29a /
PG 12.

Backpatch to all affected branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 12-
2022-03-23 13:05:59 -07:00
Andres Freund 1c6bb380e5 Don't call fwrite() with len == 0 when writing out relcache init file.
Noticed via -fsanitize=undefined.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 13:05:25 -07:00
Robert Haas 591767150f pg_basebackup: Try to fix some failures on Windows.
Commit ffd53659c4 messed up the
mechanism that was being used to pass parameters to LogStreamerMain()
on Windows. It worked on Linux because only Windows was using threads.
Repair by moving the additional parameters added by that commit into
the 'logstreamer_param' struct.

Along the way, fix a compiler warning on builds without HAVE_LIBZ.

Discussion: http://postgr.es/m/CA+TgmoY5=AmWOtMj3v+cySP2rR=Bt6EGyF_joAq4CfczMddKtw@mail.gmail.com
2022-03-23 13:25:26 -04:00
Alvaro Herrera 9d92582abf
Fix "missing continuation record" after standby promotion
Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.

Backpatch to 10 (all stable versions).

Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com
2022-03-23 18:22:10 +01:00
Robert Haas 607e75e8f0 Unbreak the build.
Commit ffd53659c4 broke the build for
anyone not compiling with LZ4 and ZSTD enabled. Woops.
2022-03-23 10:22:54 -04:00
Robert Haas ffd53659c4 Replace BASE_BACKUP COMPRESSION_LEVEL option with COMPRESSION_DETAIL.
There are more compression parameters that can be specified than just
an integer compression level, so rename the new COMPRESSION_LEVEL
option to COMPRESSION_DETAIL before it gets released. Introduce a
flexible syntax for that option to allow arbitrary options to be
specified without needing to adjust the main replication grammar,
and common code to parse it that is shared between the client and
the server.

This commit doesn't actually add any new compression parameters,
so the only user-visible change is that you can now type something
like pg_basebackup --compress gzip:level=5 instead of writing just
pg_basebackup --compress gzip:5. However, it should make it easy to
add new options. If for example gzip starts offering fries, we can
support pg_basebackup --compress gzip:level=5,fries=true for the
benefit of users who want fries with that.

Along the way, this fixes a few things in pg_basebackup so that the
pg_basebackup can be used with a server-side compression algorithm
that pg_basebackup itself does not understand. For example,
pg_basebackup --compress server-lz4 could still succeed even if
only the server and not the client has LZ4 support, provided that
the other options to pg_basebackup don't require the client to
decompress the archive.

Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker.

Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com
2022-03-23 09:19:14 -04:00
Andrew Dunstan 1460fc5942 Revert "Common SQL/JSON clauses"
This reverts commit 865fe4d5df.

This has caused issues with a significant number of buildfarm members
2022-03-22 19:56:14 -04:00
Andrew Dunstan 865fe4d5df Common SQL/JSON clauses
This introduces some of the building blocks used by the SQL/JSON
constructor and query functions. Specifically, it provides node
executor and grammar support for the FORMAT JSON [ENCODING foo]
clause, and values decorated with it, and for the RETURNING clause.

The following SQL/JSON patches will leverage these.

Nikita Glukhov (who probably deserves an award for perseverance).

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup. Erik Rijkers, Zihong Yu and
Himanshu Upadhyaya.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-22 17:32:54 -04:00
Andres Freund ce8c978295 pgstat: fix function name in comment.
Introduced in 3fa17d3771.
2022-03-22 08:15:40 -07:00
Andrew Dunstan d11e84ea46 Add String object access hooks
This caters for cases where the access is to an object identified by
name rather than Oid.

The first user of these is the GUC access controls

Joshua Brindle and Mark Dilger

Discussion: https://postgr.es/m/47F87A0E-C0E5-43A6-89F6-D403F2B45175@enterprisedb.com
2022-03-22 10:28:31 -04:00
Tom Lane 29992a6a50 Revert "graceful shutdown" changes for Windows.
This reverts commits 6051857fc and ed52c3707 in HEAD (they were already
reverted in the back branches).  Further testing has shown that while
those changes do fix some things, they also break others; in particular,
it looks like walreceivers fail to detect walsender-initiated connection
close reliably if the walsender shuts down this way.  A proper fix for
this seems possible but won't be done in time for v15.

Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com
Discussion: https://postgr.es/m/CA+hUKGKkp2XkvSe9nG+bsgkXVKCdTeGSa_TR0Qx1jafc_oqCVA@mail.gmail.com
2022-03-22 10:19:15 -04:00
Dean Rasheed 7faa5fc84b Add support for security invoker views.
A security invoker view checks permissions for accessing its
underlying base relations using the privileges of the user of the
view, rather than the privileges of the view owner. Additionally, if
any of the base relations are tables with RLS enabled, the policies of
the user of the view are applied, rather than those of the view owner.

This allows views to be defined without giving away additional
privileges on the underlying base relations, and matches a similar
feature available in other database systems.

It also allows views to operate more naturally with RLS, without
affecting the assignments of policies to users.

Christoph Heiss, with some additional hacking by me. Reviewed by
Laurenz Albe and Wolfgang Walther.

Discussion: https://postgr.es/m/b66dd6d6-ad3e-c6f2-8b90-47be773da240%40cybertec.at
2022-03-22 10:28:10 +00:00
Amit Kapila 208c5d65bb Add ALTER SUBSCRIPTION ... SKIP.
This feature allows skipping the transaction on subscriber nodes.

If incoming change violates any constraint, logical replication stops
until it's resolved. Currently, users need to either manually resolve the
conflict by updating a subscriber-side database or by using function
pg_replication_origin_advance() to skip the conflicting transaction. This
commit introduces a simpler way to skip the conflicting transactions.

The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which allows the apply worker to skip the transaction finished at
specified LSN. The apply worker skips all data modification changes within
the transaction.

Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2022-03-22 07:11:19 +05:30
Andres Freund 315ae75e9b pgstat: reorder pgstat.[ch] contents.
Now that 13619598f1 has split pgstat up into multiple files it isn't quite as
hard to come up with a sensible order for pgstat.[ch]. Inconsistent naming
makes it still not quite right looking, but that's work for another commit.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-21 16:21:00 -07:00
Tom Lane 2591ee8ec4 Fix assorted missing logic for GroupingFunc nodes.
The planner needs to treat GroupingFunc like Aggref for many purposes,
in particular with respect to processing of the argument expressions,
which are not to be evaluated at runtime.  A few places hadn't gotten
that memo, notably including subselect.c's processing of outer-level
aggregates.  This resulted in assertion failures or wrong plans for
cases in which a GROUPING() construct references an outer aggregation
level.

Also fix missing special cases for GroupingFunc in cost_qual_eval
(resulting in wrong cost estimates for GROUPING(), although it's
not clear that that would affect plan shapes in practice) and in
ruleutils.c (resulting in excess parentheses in pretty-print mode).

Per bug #17088 from Yaoguang Chen.  Back-patch to all supported
branches.

Richard Guo, Tom Lane

Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org
2022-03-21 17:44:29 -04:00
Andres Freund 13619598f1 pgstat: split different types of stats into separate files.
pgstat.c is very long, and it's hard to find an order that makes sense and is
likely to be maintained over time. Splitting the different pieces into
separate files makes that a lot easier.

With a few exceptions, this commit just moves code around. Those exceptions
are:
- adding file headers for new files
- removing 'static' from functions
- adapting pgstat_assert_is_up() to work across TUs
- minor comment adjustments
git diff --color-moved=dimmed-zebra is very helpful separating code movement
from code changes.

The next commit in this series will reorder pgstat.[ch] contents to be a bit
more coherent.

Earlier revisions of this patch had "global" statistics (archiver, bgwriter,
checkpointer, replication slots, SLRU, WAL) in one file, because each seemed
small enough. However later commits will increase their size and their
aggregate size is not insubstantial. It also just seems easier to split each
type of statistic into its own file.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-21 12:02:25 -07:00
Tom Lane cb02fcb4c9 Fix bogus dependency handling for GENERATED expressions.
For GENERATED columns, we record all dependencies of the generation
expression as AUTO dependencies of the column itself.  This means
that the generated column is silently dropped if any dependency
is removed, even if CASCADE wasn't specified.  This is at least
a POLA violation, but I think it's actually based on a misreading
of the standard.  The standard does say that you can't drop a
dependent GENERATED column in RESTRICT mode; but that's buried down
in a subparagraph, on a different page from some pseudocode that
makes it look like an AUTO drop is being suggested.

Change this to be more like the way that we handle regular default
expressions, ie record the dependencies as NORMAL dependencies of
the pg_attrdef entry.  Also, make the pg_attrdef entry's dependency
on the column itself be INTERNAL not AUTO.  That has two effects:

* the column will go away, not just lose its default, if any
dependency of the expression is dropped with CASCADE.  So we
don't need any special mechanism to make that happen.

* it provides an additional cross-check preventing someone from
dropping the default expression without dropping the column.

catversion bump because of change in the contents of pg_depend
(which also requires a change in one information_schema view).

Per bug #17439 from Kevin Humphreys.  Although this is a longstanding
bug, it seems impractical to back-patch because of the need for
catalog contents changes.

Discussion: https://postgr.es/m/17439-7df4421197e928f0@postgresql.org
2022-03-21 14:58:49 -04:00
Tom Lane 17f3bc0928 Move pg_attrdef manipulation code into new file catalog/pg_attrdef.c.
This is a pure refactoring commit: there isn't (I hope) any functional
change.

StoreAttrDefault and RemoveAttrDefault[ById] are moved from heap.c,
reducing the size of that overly-large file by about 300 lines.
I took the opportunity to trim unused #includes from heap.c, too.

Two new functions for translating between a pg_attrdef OID and the
relid/attnum of the owning column are created by extracting ad-hoc
code from objectaddress.c.  This already removes one copy of said
code, and a follow-on bug fix will create more callers.

The only other function directly manipulating pg_attrdef is
AttrDefaultFetch.  I judged it was better to leave that in relcache.c,
since it shares special concerns about recursion and error handling
with the rest of that module.

Discussion: https://postgr.es/m/651168.1647451676@sss.pgh.pa.us
2022-03-21 14:38:23 -04:00
Tom Lane 7b6ec86532 Fix risk of deadlock failure while dropping a partitioned index.
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order.  This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes.  We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.

While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.

Back-patch to v11 where partitioned indexes were added.

Jimmy Yih, Gaurab Dey, Tom Lane

Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-03-21 12:22:13 -04:00
Tom Lane 1f8bc44868 Remove workarounds for avoiding [U]INT64_FORMAT in translatable strings.
Further code simplification along the same lines as d914eb347
and earlier patches.

Aleksander Alekseev, Japin Li

Discussion: https://postgr.es/m/CAJ7c6TMSKi3Xs8h5MP38XOnQQpBLazJvVxVfPn++roitDJcR7g@mail.gmail.com
2022-03-21 11:11:55 -04:00
Magnus Hagander c540d37157 Fix typo in file identification
Clearly a simple copy/paste mistake when the file was created.
2022-03-21 12:35:48 +02:00
Andres Freund d4ba8b51c7 pgstat: separate "xact level" handling out of relation specific functions.
This is in preparation of a later commit moving relation stats handling into
its own file.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-20 19:12:09 -07:00
Andres Freund bff258a273 pgstat: rename pgstat_initstats() to pgstat_relation_init().
The old name was overly generic. An upcoming commit moves relation stats
handling into its own file, making pgstat_initstats() look even more out of
place.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-20 19:12:09 -07:00
Andres Freund 8363102009 pgstat: introduce pgstat_relation_should_count().
A later commit will make the check more complicated than the
current (rel)->pgstat_info != NULL. It also just seems nicer to have a central
copy of the logic, even while still simple.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-20 19:12:09 -07:00
Alvaro Herrera 2d655a08d5
Blind fix for uninitialized memory bug in ba9a7e3921
Valgrind animal skink shows a crash in this new code.  I couldn't
reproduce the problem locally, but going by blind code inspection,
initializing insert_destrel should be sufficient to fix the problem.
2022-03-20 22:10:24 +01:00
Alvaro Herrera ba9a7e3921
Enforce foreign key correctly during cross-partition updates
When an update on a partitioned table referenced in foreign key
constraints causes a row to move from one partition to another,
the fact that the move is implemented as a delete followed by an insert
on the target partition causes the foreign key triggers to have
surprising behavior.  For example, a given foreign key's delete trigger
which implements the ON DELETE CASCADE clause of that key will delete
any referencing rows when triggered for that internal DELETE, although
it should not, because the referenced row is simply being moved from one
partition of the referenced root partitioned table into another, not
being deleted from it.

This commit teaches trigger.c to skip queuing such delete trigger events
on the leaf partitions in favor of an UPDATE event fired on the root
target relation.  Doing so is sensible because both the old and the new
tuple "logically" belong to the root relation.

The after trigger event queuing interface now allows passing the source
and the target partitions of a particular cross-partition update when
registering the update event for the root partitioned table.  Along with
the two ctids of the old and the new tuple, the after trigger event now
also stores the OIDs of those partitions. The tuples fetched from the
source and the target partitions are converted into the root table
format, if necessary, before they are passed to the trigger function.

The implementation currently has a limitation that only the foreign keys
pointing into the query's target relation are considered, not those of
its sub-partitioned partitions.  That seems like a reasonable
limitation, because it sounds rare to have distinct foreign keys
pointing to sub-partitioned partitions instead of to the root table.

This misbehavior stems from commit f56f8f8da6 (which added support for
foreign keys to reference partitioned tables) not paying sufficient
attention to commit 2f17844104 (which had introduced cross-partition
updates a year earlier).  Even though the former commit goes back to
Postgres 12, we're not backpatching this fix at this time for fear of
destabilizing things too much, and because there are a few ABI breaks in
it that we'd have to work around in older branches.  It also depends on
commit f4566345cf, which had its own share of backpatchability issues
as well.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Eduard Català <eduard.catala@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFvkBCmfwkQX_yBqv2Wz8ugUGiBDxum8=WvVbfU1TXaNg@mail.gmail.com
Discussion: https://postgr.es/m/CAL54xNZsLwEM1XCk5yW9EqaRzsZYHuWsHQkA2L5MOSKXAwviCQ@mail.gmail.com
2022-03-20 18:43:40 +01:00
Peter Eisentraut 3a671e1f7c Fix global ICU collations for ICU < 54
createdb() didn't check for collation attributes validity, which has
to be done explicitly on ICU < 54.  It also forgot to close the ICU collator
opened during the check which leaks some memory.

To fix both, add a new check_icu_locale() that does all the appropriate
verification and close the ICU collator.

initdb also had some partial check for ICU < 54.  To have consistent error
reporting across major ICU versions, and get rid of the need to include ucol.h,
remove the partial check there.  The backend will report an error if needed
during the post-boostrap iniitialization phase.

Author: Julien Rouhaud <julien.rouhaud@free.fr>
Discussion: https://www.postgresql.org/message-id/20220319041459.qqqiqh335sga5ezj@jrouhaud
2022-03-20 10:21:45 +01:00
Andres Freund 78f9506b38 pgstat: split out WAL handling from pgstat_{initialize,report_stat}.
A later commit will move the handling of the different kinds of stats into
separate files.  By splitting out WAL handling in this commit that later move
will just move code around without other changes.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-19 11:42:22 -07:00
Andres Freund 89c546c294 pgstat: split relation, database handling out of pgstat_report_stat().
pgstat_report_stat() handles several types of stats, yet relation stats have
so far been handled directly in pgstat_report_stat().

A later commit will move the handling of the different kinds of stats into
separate files.  By splitting out relation handling in this commit that later
move will just move code around without other changes.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-03-19 11:42:22 -07:00