Commit Graph

911 Commits

Author SHA1 Message Date
Robert Haas 6c1d5ba486 Update docs and error message for superuser_reserved_connections.
Commit ea92368cd1 made max_wal_senders
a separate pool of backends from max_connections, but the documentation
and error message for superuser_reserved_connections weren't updated
at the time, and as a result are somewhat misleading. Update.

This is arguably a back-patchable bug fix, but because it seems quite
minor, no back-patch.

Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
2023-01-20 15:23:04 -05:00
Tom Lane 09d465c397 Doc: fix a few oddly-spelled SGML ID attributes.
Avoid use of "_" in SGML IDs.  Awhile back that was actually
disallowed by the toolchain, as a consequence of which our convention
has been to use "-" instead.  Fix a couple of stragglers that are
particularly inconsistent with that convention and with related IDs.

This is just neatnik-ism, so no need for back-patch.

Discussion: https://postgr.es/m/769446.1673478332@sss.pgh.pa.us
2023-01-17 17:13:20 -05:00
Robert Haas e5b8a4c098 Add new GUC createrole_self_grant.
Can be set to the empty string, or to either or both of "set" or
"inherit". If set to a non-empty value, a non-superuser who creates
a role (necessarily by relying up the CREATEROLE privilege) will
grant that role back to themselves with the specified options.

This isn't a security feature, because the grant that this feature
triggers can also be performed explicitly. Instead, it's a user experience
feature. A superuser would necessarily inherit the privileges of any
created role and be able to access all such roles via SET ROLE;
with this patch, you can configure createrole_self_grant = 'set, inherit'
to provide a similar experience for a user who has CREATEROLE but not
SUPERUSER.

Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
2023-01-10 12:44:49 -05:00
Tom Lane 78ee60ed84 Doc: add XML ID attributes to <sectN> and <varlistentry> tags.
This doesn't have any external effect at the moment, but it
will allow adding useful link-discoverability features later.

Brar Piening, reviewed by Karl Pinc.

Discussion: https://postgr.es/m/CAB8KJ=jpuQU9QJe4+RgWENrK5g9jhoysMw2nvTN_esoOU0=a_w@mail.gmail.com
2023-01-09 15:08:24 -05:00
Amit Kapila 216a784829 Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.

In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode -  in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply to avoid deadlocks. We keep this parallel apply assigned till the
transaction commit is received and also wait for the worker to finish at
commit. This preserves commit ordering and avoid writing to and reading
from files in most cases. We still need to spill if there is no worker
available.

This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).

In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.

Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 07:52:45 +05:30
Peter Geoghegan 1de58df4fe Add page-level freezing to VACUUM.
Teach VACUUM to decide on whether or not to trigger freezing at the
level of whole heap pages.  Individual XIDs and MXIDs fields from tuple
headers now trigger freezing of whole pages, rather than independently
triggering freezing of each individual tuple header field.

Managing the cost of freezing over time now significantly influences
when and how VACUUM freezes.  The overall amount of WAL written is the
single most important freezing related cost, in general.  Freezing each
page's tuples together in batch allows VACUUM to take full advantage of
the freeze plan WAL deduplication optimization added by commit 9e540599.

Also teach VACUUM to trigger page-level freezing whenever it detects
that heap pruning generated an FPI.  We'll have already written a large
amount of WAL just to do that much, so it's very likely a good idea to
get freezing out of the way for the page early.  This only happens in
cases where it will directly lead to marking the page all-frozen in the
visibility map.

In most cases "freezing a page" removes all XIDs < OldestXmin, and all
MXIDs < OldestMxact.  It doesn't quite work that way in certain rare
cases involving MultiXacts, though.  It is convenient to define "freeze
the page" in a way that gives FreezeMultiXactId the leeway to put off
the work of processing an individual tuple's xmax whenever it happens to
be a MultiXactId that would require an expensive second pass to process
aggressively (allocating a new multi is especially worth avoiding here).
FreezeMultiXactId is eager when processing is cheap (as it usually is),
and lazy in the event of an individual multi that happens to require
expensive second pass processing.  This avoids regressions related to
processing of multis that page-level freezing might otherwise cause.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
2022-12-28 08:50:47 -08:00
Amit Kapila 5de94a041e Add 'logical_decoding_mode' GUC.
This enables streaming or serializing changes immediately in logical
decoding. This parameter is intended to be used to test logical decoding
and replication of large transactions for which otherwise we need to
generate the changes till logical_decoding_work_mem is reached.

This helps in reducing the timing of existing tests related to logical
replication of in-progress transactions and will help in writing tests for
for the upcoming feature for parallelly applying large in-progress
transactions.

Author: Shi yu
Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi
Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-12-26 08:58:16 +05:30
David Rowley 3226f47282 Add enable_presorted_aggregate GUC
1349d279 added query planner support to allow more efficient execution of
aggregate functions which have an ORDER BY or a DISTINCT clause.  Prior to
that commit, the planner would only request that the lower planner produce
a plan with the order required for the GROUP BY clause and it would be
left up to nodeAgg.c to perform the final sort of records within each
group so that the aggregate transition functions were called in the
correct order.  Now that the planner requests the lower planner produce a
plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind,
there is the possibility that the planner chooses a plan which could be
less efficient than what would have been produced before 1349d279.

While developing 1349d279, I had in mind that Incremental Sort would help
us in cases where an index exists only on the GROUP BY column(s).
Incremental Sort would just replace the implicit tuplesorts which are
being performed in nodeAgg.c.  However, because the planner has the
flexibility to instead choose a plan which just performs a full sort on
both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
potential for the planner to make a bad choice.  The costing for
Incremental Sort is not perfect as it assumes an even distribution of rows
to sort within each sort group.

Here we add an escape hatch in the form of the enable_presorted_aggregate
GUC.  This will allow users to get the pre-PG16 behavior in cases where
they have no other means to convince the query planner to produce a plan
which only sorts on the GROUP BY column(s).

Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com
2022-12-20 22:28:58 +13:00
Alvaro Herrera a8500750ca
Better document logical replication parameters
Add some cross-links between chapter "20. Server Parameters" and
"31. Logical Replication" regarding the available configuration
parameters, for easier navigation; and some more explanatory text too.

I (Álvaro) chose to duplicate max_replication_slots in Chapter 20,
because it has completely different meanings at each side of the
replication link.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: samay sharma <smilingsamay@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsESqpy7w3Y6cX98c255ZuCjvipkhKjy6hZBjOv4E6iJA@mail.gmail.com
2022-12-12 20:18:56 +01:00
Michael Paquier 8a476fda5e doc: Add missing <varlistentry> markups for developer GUCs
Missing such markups makes it impossible to create links back to these
GUCs, and all the other parameters have one already.

Author: Ian Lawrence Barwick
Discussion: https://postgr.es/m/CAB8KJ=jx=6dFB_EN3j0UkuvG3cPu5OmQiM-ZKRAz+fKvS+u8Ng@mail.gmail.com
Backpatch-through: 11
2022-12-05 11:23:08 +09:00
Bruce Momjian 66bc9d2d3e doc: add transaction processing chapter with internals info
This also adds references to this new chapter at relevant sections of
our documentation.  Previously much of these internal details were
exposed to users, but not explained.  This also updates RELEASE
SAVEPOINT.

Discussion: https://postgr.es/m/CANbhV-E_iy9fmrErxrCh8TZTyenpfo72Hf_XD2HLDppva4dUNA@mail.gmail.com

Author: Simon Riggs, Laurenz Albe

Reviewed-by: Bruce Momjian

Backpatch-through: 11
2022-11-29 20:49:52 -05:00
Thomas Munro cd4329d939 Remove promote_trigger_file.
Previously, an idle startup (recovery) process would wake up every 5
seconds to have a chance to poll for promote_trigger_file, even if that
GUC was not configured.  That promotion triggering mechanism was
effectively superseded by pg_ctl promote and pg_promote() a long time
ago.  There probably aren't many users left and it's very easy to change
to the modern mechanisms, so we agreed to remove the feature.

This is part of a campaign to reduce wakeups on idle systems.

Author: Simon Riggs <simon.riggs@enterprisedb.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com>
Discussion: https://postgr.es/m/CANbhV-FsjnzVOQGBpQ589%3DnWuL1Ex0Ykn74Nh1hEjp2usZSR5g%40mail.gmail.com
2022-11-29 12:08:38 +13:00
Daniel Gustafsson 1f059a4408 doc: Clarify unit of logging for log_temp_files
When the unit is omitted from log_temp_files the value is taken as kb,
but the logged value is also without unit but specified in bytes. This
could cause some confusion, so clarify in the documentation which unit
is used when logging.

Reported-by: iphatiey5eu2au6@pasms.be
Reviewed-by: Bruce Momjian <bruce@momjian.us>
Discussion: https://postgr.es/m/166859439833.632.13122583000472281400@wrigleys.postgresql.org
2022-11-28 11:10:01 +01:00
Tom Lane 51b5834cd5 Provide options for postmaster to kill child processes with SIGABRT.
The postmaster normally sends SIGQUIT to force-terminate its
child processes after a child crash or immediate-stop request.
If that doesn't result in child exit within a few seconds,
we follow it up with SIGKILL.  This patch provides GUC flags
that allow either of these signals to be replaced with SIGABRT.
On typically-configured Unix systems, that will result in a
core dump being produced for each such child.  This can be
useful for debugging problems, although it's not something you'd
want to have on in production due to the risk of disk space
bloat from lots of core files.

The old postmaster -T switch, which sent SIGSTOP in place of
SIGQUIT, is changed to be the same as send_abort_for_crash.
As far as I can tell from the code comments, the intent of
that switch was just to block things for long enough to force
core dumps manually, which seems like an unnecessary extra step.
(Maybe at the time, there was no way to get most kernels to
produce core files with per-PID names, requiring manual core
file renaming after each one.  But now it's surely the hard way.)

I also took the opportunity to remove the old postmaster -n
(skip shmem reinit) switch, which hasn't actually done anything
in decades, though the documentation still claimed it did.

Discussion: https://postgr.es/m/2251016.1668797294@sss.pgh.pa.us
2022-11-21 11:59:29 -05:00
Peter Eisentraut d627ce3b70 Disallow setting archive_library and archive_command at the same time
Setting archive_library and archive_command at the same time is now an
error.  Before, archive_library would take precedence over
archive_command.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/20220914222736.GA3042279%40nathanxps13
2022-11-15 10:03:47 +01:00
Michael Paquier f186c7c885 doc: Fix type of cursor_position in jsonlog table
This entry was listed as a "string", but it is a "number.  The other
fields are correctly described, on a second look.

Reported-by: Nuko Yokohama
Author: Tatsuo Ishii
Discussion: https://postgr.es/m/CAF3Gu1awoVoDP5d0_eN=cR=QkGVwH+OtFvwJkkc5cB_ZMWjyeA@mail.gmail.com
Backpatch-through: 15
2022-10-25 09:29:21 +09:00
Michael Paquier 9668c4a661 Rework shutdown callback of archiver modules
As currently designed, with a callback registered in a ERROR_CLEANUP
block, the shutdown callback would get called twice when updating
archive_library on SIGHUP, which is something that we want to avoid to
ease the life of extension writers.

Anyway, an ERROR in the archiver process is treated as a FATAL, stopping
it immediately, hence there is no need for a ERROR_CLEANUP block.
Instead of that, the shutdown callback is not called upon
before_shmem_exit(), giving to the modules the opportunity to do any
cleanup actions before the server shuts down its subsystems.

While on it, this commit adds some testing coverage for the shutdown
callback.  Neither shell_archive nor basic_archive have been using it,
and one is added to shell_archive, whose trigger is checked in a TAP
test through a shutdown sequence.

Author: Nathan Bossart, Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20221015221328.GB1821022@nathanxps13
Backpatch-through: 15
2022-10-19 14:06:56 +09:00
Bruce Momjian 23f3989d92 doc: clarify description for log_startup_progress_interval
Reported-by: Elena Indrupskaya

Discussion: https://postgr.es/m/0af43b49-1646-93d0-ccf1-bb3c635c8c6f@postgrespro.ru

Author: Elena Indrupskaya

Backpatch-through: 15
2022-10-05 15:53:40 -04:00
Tom Lane f4c7c410ee Revert "Optimize order of GROUP BY keys".
This reverts commit db0d67db24 and
several follow-on fixes.  The idea of making a cost-based choice
of the order of the sorting columns is not fundamentally unsound,
but it requires cost information and data statistics that we don't
really have.  For example, relying on procost to distinguish the
relative costs of different sort comparators is pretty pointless
so long as most such comparator functions are labeled with cost 1.0.
Moreover, estimating the number of comparisons done by Quicksort
requires more than just an estimate of the number of distinct values
in the input: you also need some idea of the sizes of the larger
groups, if you want an estimate that's good to better than a factor of
three or so.  That's data that's often unknown or not very reliable.
Worse, to arrive at estimates of the number of calls made to the
lower-order-column comparison functions, the code needs to make
estimates of the numbers of distinct values of multiple columns,
which are necessarily even less trustworthy than per-column stats.
Even if all the inputs are perfectly reliable, the cost algorithm
as-implemented cannot offer useful information about how to order
sorting columns beyond the point at which the average group size
is estimated to drop to 1.

Close inspection of the code added by db0d67db2 shows that there
are also multiple small bugs.  These could have been fixed, but
there's not much point if we don't trust the estimates to be
accurate in-principle.

Finally, the changes in cost_sort's behavior made for very large
changes (often a factor of 2 or so) in the cost estimates for all
sorting operations, not only those for multi-column GROUP BY.
That naturally changes plan choices in many situations, and there's
precious little evidence to show that the changes are for the better.
Given the above doubts about whether the new estimates are really
trustworthy, it's hard to summon much confidence that these changes
are better on the average.

Since we're hard up against the release deadline for v15, let's
revert these changes for now.  We can always try again later.

Note: in v15, I left T_PathKeyInfo in place in nodes.h even though
it's unreferenced.  Removing it would be an ABI break, and it seems
a bit late in the release cycle for that.

Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
2022-10-03 10:56:16 -04:00
Tom Lane 5f1048881d Doc: minor cleanups.
Improve a couple of things I noticed while working on v15
release notes.
2022-09-23 18:20:14 -04:00
Peter Eisentraut ba50834551 Restore archive_command documentation
Commit 5ef1eefd76, which added
archive_library, purged most mentions of archive_command from the
documentation.  This is inappropriate, since archive_command is still
a feature in use and users will want to see information about it.

This restores all the removed mentions and rephrases things so that
archive_command and archive_library are presented as alternatives of
each other.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/9366d634-a917-85a9-4991-b2a4859edaf9@enterprisedb.com
2022-09-22 07:35:00 -04:00
Tom Lane 31dcfae83c Use the terminology "WAL file" not "log file" more consistently.
Referring to the WAL as just "log" invites confusion with the
postmaster log, so avoid doing that in docs and error messages.
Also shorten "WAL segment file" to just "WAL file" in various
places.

Bharath Rupireddy, reviewed by Nathan Bossart and Kyotaro Horiguchi

Discussion: https://postgr.es/m/CALj2ACUeXa8tDPaiTLexBDMZ7hgvaN+RTb957-cn5qwv9zf-MQ@mail.gmail.com
2022-09-14 18:40:58 -04:00
Peter Eisentraut 111d954024 Small wording improvements 2022-09-14 22:56:55 +02:00
Bruce Momjian 115464bb5b doc: add section about heap-only tuples (HOT)
Reported-by: Jonathan S. Katz

Discussion: https://postgr.es/m/c59ffbd5-96ac-a5a5-a401-14f627ca1405@postgresql.org

Backpatch-through: 11
2022-08-12 15:05:13 -04:00
Bruce Momjian 50e088d6f2 doc: warn about security issues around log files
Reported-by: Simon Riggs

Discussion: https://postgr.es/m/CANP8+jJESuuXYq9Djvf-+tx2vY2OFLmfEuu+UvwHNJ1RT7iJCQ@mail.gmail.com

Author: Simon Riggs

Backpatch-through: 10
2022-08-12 12:02:21 -04:00
Bruce Momjian 4d807bbc4b doc: improve wal_level docs for the 'minimal' level
Reported-by: David G. Johnston

Discussion: https://postgr.es/m/CAKFQuwZ24UcfkoyLLSW3PMGQATomOcw1nuYFRuMev-NoOF+mYw@mail.gmail.com

Author: David G. Johnston

Backpatch-through: 14, partial to 13
2022-08-12 10:30:01 -04:00
Michael Paquier 0b039e3a84 Fix some inconsistencies with GUC categories
This commit addresses a few things around GUCs:
- The TCP-related parameters (the four tcp_keepalives_* and
client_connection_check_interval are listed in postgresql.conf.sample in
a subsection called "TCP settings" of "CONNECTIONS AND AUTHENTICATION",
but they did not have their own group name in guc.c.
- enable_group_by_reordering, stats_fetch_consistency and
recovery_prefetch had an inconsistent description, missing a dot at the
end.
- In postgresql.conf.sample, "Process title" should not have a section
of its own, but it should be a subsection of "REPORTING AND LOGGING".

This impacts the contents of pg_settings, which could be seen as a
compatibility break, so no backpatch is done.  This is similar to the
cleanup done in a55a984.

Author: Shinya Kato
Discussion: https://postgr.es/m/5e0c9c608624eafbba910c344282cb14@oss.nttdata.com
2022-08-09 20:01:44 +09:00
Alexander Korotkov f77ff08335 Document the ability to specify TableAM for pgbench
Upcoming custom Table Access Methods (TableAM) need benchmarking.  Despite
pgbench doesn't have an explicit option for TableAM specification, one can
specify it using PGOPTION environmental variable.  The present commit documents
this way to specify TableAM for pgbench.

Discussion: https://postgr.es/m/CAC77N6ih%3DLbhZQXV76grEsaVQkBL464Y2Foqq9o%3Df4UBfEOfEQ%40mail.gmail.com
Author: Michel Pelletier, Alexander Korotkov
Reviewed-by: Justin Pryzby, Mason Sharp, Michael Paquier
2022-07-20 15:49:37 +03:00
Thomas Munro 94ebf8117c Default to dynamic_shared_memory_type=sysv on Solaris.
POSIX shm_open() can sleep for a long time and fail spuriously because
of contention on an internal lock file on Solaris (and presumably
illumos).  Commit 389869af fixed the main problem with this, namely that
we could crash, but it's now clear that "posix" is not a good default.

Therefore, choose "sysv" at initdb time on Solaris and illumos.  Other
choices are still available by editing the postgresql.conf file.

Back-patch only to 15, because contention is much less likely further
back, and it doesn't seem like a good idea to change this in released
branches.  This should clear up the failures on build farm animal
margay.

Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
2022-07-02 16:23:39 +12:00
Peter Eisentraut 9f0b953457 doc: Clean up title case use 2022-06-22 14:24:48 +02:00
Michael Paquier 7bd4a9e990 doc: Do s/int/integer/ to describe the type of some GUC parameters
Three parameters have been using "int" rather than "integer" to describe
their type:
auth_delay.milliseconds
max_logical_replication_workers
pg_prewarm.autoprewarm_interval

This is inconsistent with any other integer GUCs listed in the docs
(148, as far as I can see).

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv6X5T-veN2abUDUvBxZm+SSm-9otfi3LZPGyOc6u6hiA@mail.gmail.com
2022-06-17 09:03:07 +09:00
Michael Paquier b3fb16e8bb doc: Reword description of roles able to view track_activities's info
The information generated when track_activities is accessible to
superusers, roles with the privileges of pg_read_all_stats, as well as
roles one has the privileges of.  The original text did not outline the
last point, while the change done in ac1ae47 was unclear about the
second point.

Per discussion with Nathan Bossart.

Discussion: https://postgr.es/m/20220521185743.GA886636@nathanxps13
Backpatch-through: 10
2022-05-30 10:50:21 +09:00
Michael Paquier ac1ae477f8 doc: Mention pg_read_all_stats in description of track_activities
The description of track_activities mentioned that it is visible to
superusers and that the information related to the current session can
be seen, without telling about pg_read_all_stats.  Roles that are
granted the privileges of pg_read_all_stats can also see this
information, so mention it in the docs.

Author: Ian Barwick
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAB8KJ=jhPyYFu-A5r-ZGP+Ax715mUKsMxAGcEQ9Cx_mBAmrPow@mail.gmail.com
Backpatch-through: 10
2022-05-21 19:05:47 +09:00
Peter Eisentraut 648aa6734f doc: Properly punctuate "etc." 2022-05-19 09:42:17 +02:00
Peter Eisentraut 826be1ffb2 doc: Add links to tables
Formal tables should generally have an xref in the text that points to
them.  Add them here.
2022-04-22 11:19:17 +02:00
David Rowley a59746d311 Docs: wording improvement for compute_query_id = regress
It's more accurate to say that the query identifier is not shown when
compute_query_id = regress rather than to say it is hidden.

This change (ebf6c5249) appeared in v14, so it makes sense to backpatch
this small adjustment to keep the documents consistent between v14 and
master.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Backpatch-through: 14, where compute_query_id = regress was added
2022-04-13 21:28:25 +12:00
Robert Haas ad385a494f docs: Note the recovery_min_apply_delay bloats pg_wal.
Those WAL files that we're waiting to apply have to be stored
somewhere.

Thom Brown

Discussion: http://postgr.es/m/CAA-aLv4SkJRK6GGcd0Axt8kt6_eWMEbtG7f8NJpFh+rNshtdNA@mail.gmail.com
2022-04-11 10:52:18 -04:00
David Rowley bba3c35b29 Docs: Fix various mistakes and typos
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
2022-04-11 20:48:48 +12:00
Andres Freund b3abca6810 pgstat: Update docs to match the shared memory stats reality.
This includes removing documentation for stats_temp_directory, adding
documentation for stats_fetch_consistency, rephrasing references to the stats
collector and documenting that starting a cleanly shut down standby will not
remove stats anymore. The latter point might require further wordsmithing, it
wasn't easy to adjust some of the existing content.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-By: Lukas Fittl <lukas@fittl.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-07 21:35:35 -07:00
Thomas Munro dafae9707a Fix recovery_prefetch docs.
Correct a typo and a couple of sentences that weren't updated to reflect
recent changes to the code.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220407125555.GC24419%40telsasoft.com
2022-04-08 13:43:37 +12:00
Thomas Munro 5dc0418fab Prefetch data referenced by the WAL, take II.
Introduce a new GUC recovery_prefetch.  When enabled, look ahead in the
WAL and try to initiate asynchronous reading of referenced data blocks
that are not yet cached in our buffer pool.  For now, this is done with
posix_fadvise(), which has several caveats.  Since not all OSes have
that system call, "try" is provided so that it can be enabled where
available.  Better mechanisms for asynchronous I/O are possible in later
work.

Set to "try" for now for test coverage.  Default setting to be finalized
before release.

The GUC wal_decode_buffer_size limits the distance we can look ahead in
bytes of decoded data.

The existing GUC maintenance_io_concurrency is used to limit the number
of concurrent I/Os allowed, based on pessimistic heuristics used to
infer that I/Os have begun and completed.  We'll also not look more than
maintenance_io_concurrency * 4 block references ahead.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version)
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version)
Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version)
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version)
Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
2022-04-07 19:42:14 +12:00
Jeff Davis 5c279a6d35 Custom WAL Resource Managers.
Allow extensions to specify a new custom resource manager (rmgr),
which allows specialized WAL. This is meant to be used by a Table
Access Method or Index Access Method.

Prior to this commit, only Generic WAL was available, which offers
support for recovery and physical replication but not logical
replication.

Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund
Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
2022-04-06 23:06:46 -07:00
Tom Lane a0ffa885e4 Allow granting SET and ALTER SYSTEM privileges on GUC parameters.
This patch allows "PGC_SUSET" parameters to be set by non-superusers
if they have been explicitly granted the privilege to do so.
The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter
can also be granted.
Such privileges are cluster-wide, not per database.  They are tracked
in a new shared catalog, pg_parameter_acl.

Granting and revoking these new privileges works as one would expect.
One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege
--- one could wish that those were handled by a revocable grant to
PUBLIC, but they are not, because we couldn't make it robust enough
for GUCs defined by extensions.

Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas,
Joshua Brindle, and myself

Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com
2022-04-06 13:24:33 -04:00
Tomas Vondra db0d67db24 Optimize order of GROUP BY keys
When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may be heavily dependent on the order in which the keys are
compared when building the groups. Grouping does not imply any ordering,
so we're allowed to compare the keys in arbitrary order, and a Hash Agg
leverages this. But for Group Agg, we simply compared keys in the order
as specified in the query. This commit explores alternative ordering of
the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of
the keys, and leave the rest to the optimizer. But that might get very
expensive, so we try to pick only a couple interesting orderings based
on both local and global information.

When planning the grouping path, we explore statistics (number of
distinct values, cost of the comparison function) for the keys and
reorder them to minimize comparison costs. Intuitively, it may be better
to perform more expensive comparisons (for complex data types etc.)
last, because maybe the cheaper comparisons will be enough. Similarly,
the higher the cardinality of a key, the lower the probability we’ll
need to compare more keys. The patch generates and costs various
orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query,
some of which may not be known while planning the grouping. E.g. there
may be an explicit ORDER BY clause, or some other ordering-dependent
operation, higher up in the query, and using the same ordering may allow
using either incremental sort or even eliminate the sort entirely.

The patch generates orderings and picks those minimizing the comparison
cost (for various pathkeys), and then adds orderings that might be
useful for operations higher up in the plan (ORDER BY, etc.). Finally,
it always keeps the ordering specified in the query, on the assumption
the user might have additional insights.

This introduces a new GUC enable_group_by_reordering, so that the
optimization may be disabled if needed.

The original patch was proposed by Teodor Sigaev, and later improved and
reworked by Dmitry Dolgov. Reviews by a number of people, including me,
Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.

Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
2022-03-31 01:13:33 +02:00
Daniel Gustafsson 860ea46ba7 doc: Clarify when SSL actually means TLS
SSL has become the de facto term to mean an end-to-end encrypted channel
regardless of protocol used, even though the SSL protocol is deprecated.
Clarify what we mean with SSL in our documentation, especially for new
users who might be looking for TLS.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/D4ABB281-6CFD-46C6-A4E0-8EC23A2977BC@yesql.se
2022-03-30 13:07:30 +02:00
Tom Lane 0bd7af082a Invent recursive_worktable_factor GUC to replace hard-wired constant.
Up to now, the planner estimated the size of a recursive query's
worktable as 10 times the size of the non-recursive term.  It's hard
to see how to do significantly better than that automatically, but
we can give users control over the multiplier to allow tuning for
specific use-cases.  The default behavior remains the same.

Simon Riggs

Discussion: https://postgr.es/m/CANbhV-EuaLm4H3g0+BSTYHEGxJj3Kht0R+rJ8vT57Dejnh=_nA@mail.gmail.com
2022-03-24 11:47:41 -04:00
Michael Paquier 6bdf1a1400 Fix collection of typos in the code and the documentation
Some words were duplicated while other places were grammatically
incorrect, including one variable name in the code.

Author: Otto Kekalainen, Justin Pryzby
Discussion: https://postgr.es/m/7DDBEFC5-09B6-4325-B942-B563D1A24BDC@amazon.com
2022-03-15 11:29:35 +09:00
Michael Paquier 8e375ea4a0 Bump XLOG_PAGE_MAGIC due to the addition of wal_compression=zstd
While on it, fix a thinko in the docs, introduced by the same commit.

Oversights in e953732.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20220311214900.GN28503@telsasoft.com
2022-03-12 09:39:13 +09:00
Michael Paquier 9198e63996 doc: Standardize capitalization of term "hot standby"/"Hot Standby"
"Hot Standby" was capitalized in a couple of places in the docs, as the
style primarily used when it was introduced, but this has not been much
respected across the years.  Per discussion, it is more natural for the
reader to use "hot standby" (aka lower-case only) when in the middle of
a sentence, and "Hot standby" (aka capitalized) in a title.  This commit
adjusts all the places in the docs to be consistent with this choice,
rather than applying one style or the other midway.

Author: Daniel Westermann
Reviewed-by: Kyotaro Horiguchi, Aleksander Alekseev, Robert Treat
Discussion: https://postgr.es/m/GVAP278MB093160025A779A1A5788D0EAD2039@GVAP278MB0931.CHEP278.PROD.OUTLOOK.COM
2022-03-11 15:16:21 +09:00
Michael Paquier e9537321a7 Add support for zstd with compression of full-page writes in WAL
wal_compression gains a new value, "zstd", to allow the compression of
full-page images using the compression method of the same name.

Compression is done using the default level recommended by the library,
as of ZSTD_CLEVEL_DEFAULT = 3.  Some benchmarking has shown that it
could make sense to use a level lower for the FPI compression, like 1 or
2, as the compression rate did not change much with a bit less CPU
consumed, but any tests done would only cover few scenarios so it is
hard to come to a clear conclusion.  Anyway, there is no reason to not
use the default level instead, which is the level recommended by the
library so it should be fine for most cases.

zstd outclasses easily pglz, and is better than LZ4 where one wants to
have more compression at the cost of extra CPU but both are good enough
in their own scenarios, so the choice between one or the other of these
comes to a study of the workload patterns and the schema involved,
mainly.

This commit relies heavily on 4035cd5, that reshaped the code creating
and restoring full-page writes to be aware of the compression type,
making this integration straight-forward.

This patch borrows some early work from Andrey Borodin, though the patch
got a complete rewrite.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220222231948.GJ9008@telsasoft.com
2022-03-11 12:18:53 +09:00