Commit Graph

21860 Commits

Author SHA1 Message Date
Tom Lane 6ee41a301e Fix mis-planning of repeated application of a projection.
create_projection_plan contains a hidden assumption (here made
explicit by an Assert) that a projection-capable Path will yield a
projection-capable Plan.  Unfortunately, that assumption is violated
only a few lines away, by create_projection_plan itself.  This means
that two stacked ProjectionPaths can yield an outcome where we try to
jam the upper path's tlist into a non-projection-capable child node,
resulting in an invalid plan.

There isn't any good reason to have stacked ProjectionPaths; indeed the
whole concept is faulty, since the set of Vars/Aggs/etc needed by the
upper one wouldn't necessarily be available in the output of the lower
one, nor could the lower one create such values if they weren't
available from its input.  Hence, we can fix this by adjusting
create_projection_path to strip any top-level ProjectionPath from the
subpath it's given.  (This amounts to saying "oh, we changed our
minds about what we need to project here".)

The test case added here only fails in v13 and HEAD; before that, we
don't attempt to shove the Sort into the parallel part of the plan,
for reasons that aren't entirely clear to me.  However, all the
directly-related code looks generally the same as far back as v11,
where the hazard was introduced (by d7c19e62a).  So I've got no faith
that the same type of bug doesn't exist in v11 and v12, given the
right test case.  Hence, back-patch the code changes, but not the
irrelevant test case, into those branches.

Per report from Bas Poot.

Discussion: https://postgr.es/m/534fca83789c4a378c7de379e9067d4f@politie.nl
2021-05-31 12:03:00 -04:00
Michael Paquier 12cc956664 Improve some error wording with multirange type parsing
Braces were referred in some error messages as only brackets (not curly
brackets or curly braces), which can be confusing as other types of
brackets could be used.

While on it, add one test to check after the case of junk characters
detected after a right brace.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20210514.153153.1814935914483287479.horikyota.ntt@gmail.com
2021-05-31 11:35:00 +09:00
Thomas Munro b1d6538903 Fix race condition when sharing tuple descriptors.
Parallel query processes that called BlessTupleDesc() for identical
tuple descriptors at the same moment could crash.  There was code to
handle that rare case, but it dereferenced a bogus DSA pointer.  Repair.

Back-patch to 11, where commit cc5f8136 added support for sharing tuple
descriptors in parallel queries.

Reported-by: Eric Thinnes <e.thinnes@gmx.de>
Discussion: https://postgr.es/m/99aaa2eb-e194-bf07-c29a-1a76b4f2bcf9%40gmx.de
2021-05-29 15:12:34 +12:00
Peter Geoghegan 9afdea9824 Fix VACUUM VERBOSE's LP_DEAD item pages output.
Oversight in commit 5100010e.
2021-05-27 17:09:16 -07:00
Tom Lane a4390abecf Reduce the range of OIDs reserved for genbki.pl.
Commit ab596105b increased FirstBootstrapObjectId from 12000 to 13000,
but we've had some push-back about that.  It's worrisome to reduce the
daylight between there and FirstNormalObjectId, because the number of
OIDs consumed during initdb for collation objects is hard to predict.

We can improve the situation by abandoning the assumption that these
OIDs must be globally unique.  It should be sufficient for them to be
unique per-catalog.  (Any code that's unhappy about that is broken
anyway, since no more than per-catalog uniqueness can be guaranteed
once the OID counter wraps around.)  With that change, the largest OID
assigned during genbki.pl (starting from a base of 10000) is a bit
under 11000.  This allows reverting FirstBootstrapObjectId to 12000
with reasonable confidence that that will be sufficient for many years
to come.

We are not, at this time, abandoning the expectation that
hand-assigned OIDs (below 10000) are globally unique.  Someday that'll
likely be necessary, but the need seems years away still.

This is late for v14, but it seems worth doing it now so that
downstream software doesn't have to deal with the consequences of
a change in FirstBootstrapObjectId.  In any case, we already
bought into forcing an initdb for beta2, so another catversion
bump won't hurt.

Discussion: https://postgr.es/m/1665197.1622065382@sss.pgh.pa.us
2021-05-27 15:55:08 -04:00
Tom Lane e6241d8e03 Rethink definition of pg_attribute.attcompression.
Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to
compress, use the current setting of default_toast_compression".
This allows '\0' to be a suitable default choice regardless of
datatype, greatly simplifying code paths that initialize tupledescs
and the like.  It seems like a more user-friendly approach as well,
because now the default compression choice doesn't migrate into table
definitions, meaning that changing default_toast_compression is
usually sufficient to flip an installation's behavior; one needn't
tediously issue per-column ALTER SET COMPRESSION commands.

Along the way, fix a few minor bugs and documentation issues
with the per-column-compression feature.  Adopt more robust
APIs for SetIndexStorageProperties and GetAttributeCompression.

Bump catversion because typical contents of attcompression will now
be different.  We could get away without doing that, but it seems
better to ensure v14 installations all agree on this.  (We already
forced initdb for beta2, anyway.)

Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us
2021-05-27 13:24:27 -04:00
Peter Eisentraut 388e75ad33 Replace run-time error check with assertion
The error message was checking that the structures returned from the
parser matched expectations.  That's something we usually use
assertions for, not a full user-facing error message.  So replace that
with an assertion (hidden inside lfirst_node()).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/452e9df8-ec89-e01b-b64a-8cc6ce830458%40enterprisedb.com
2021-05-27 09:54:14 +02:00
Michael Paquier 2941138e60 doc: Fix description of some GUCs in docs and postgresql.conf.sample
The following parameters have been imprecise, or incorrect, about their
description (PGC_POSTMASTER or PGC_SIGHUP):
- autovacuum_work_mem (docs, as of 9.6~)
- huge_page_size (docs, as of 14~)
- max_logical_replication_workers (docs, as of 10~)
- max_sync_workers_per_subscription (docs, as of 10~)
- min_dynamic_shared_memory (docs, as of 14~)
- recovery_init_sync_method (postgresql.conf.sample, as of 14~)
- remove_temp_files_after_crash (docs, as of 14~)
- restart_after_crash (docs, as of 9.6~)
- ssl_min_protocol_version (docs, as of 12~)
- ssl_max_protocol_version (docs, as of 12~)

This commit adjusts the description of all these parameters to be more
consistent with the practice used for the others.

Revewed-by: Justin Pryzby
Discussion: https://postgr.es/m/YK2ltuLpe+FbRXzA@paquier.xyz
Backpatch-through: 9.6
2021-05-27 14:57:28 +09:00
Amit Kapila 6f4bdf8152 Fix assertion during streaming of multi-insert toast changes.
While decoding the multi-insert WAL we can't clean the toast untill we get
the last insert of that WAL record. Now if we stream the changes before we
get the last change, the memory for toast chunks won't be released and we
expect the txn to have streamed all changes after streaming.  This
restriction is mainly to ensure the correctness of streamed transactions
and it doesn't seem worth uplifting such a restriction just to allow this
case because anyway we will stream the transaction once such an insert is
complete.

Previously we were using two different flags (one for toast tuples and
another for speculative inserts) to indicate partial changes. Now instead
we replaced both of them with a single flag to indicate partial changes.

Reported-by: Pavan Deolasee
Author: Dilip Kumar
Reviewed-by: Pavan Deolasee, Amit Kapila
Discussion: https://postgr.es/m/CABOikdN-_858zojYN-2tNcHiVTw-nhxPwoQS4quExeweQfG1Ug@mail.gmail.com
2021-05-27 07:59:43 +05:30
Michael Paquier 190fa5a00a Fix typo in heapam.c
Author: Hou Zhijie
Discussion: https://postgr.es/m/OS0PR01MB571612191738540B27A8DE5894249@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-05-26 19:53:03 +09:00
Tom Lane e30e3fdea8 Fix use of uninitialized variable in inline_function().
Commit e717a9a18 introduced a code path that bypassed the call of
get_expr_result_type, which is not good because we need its rettupdesc
result to pass to check_sql_fn_retval.  We'd failed to notice right
away because the code path in which check_sql_fn_retval uses that
argument is fairly hard to reach in this context.  It's not impossible
though, and in any case inline_function would have no business
assuming that check_sql_fn_retval doesn't need that value.

To fix, move get_expr_result_type out of the if-block, which in
turn requires moving the construction of the dummy FuncExpr
out of it.

Per report from Ranier Vilela.  (I'm bemused by the lack of any
compiler complaints...)

Discussion: https://postgr.es/m/CAEudQAqBqQpQ3HruWAGU_7WaMJ7tntpk0T8k_dVtNB46DqdBgw@mail.gmail.com
2021-05-25 12:55:55 -04:00
Peter Eisentraut 8673a37c85 postgresql.conf.sample: Make vertical spacing consistent 2021-05-25 11:49:54 +02:00
Michael Paquier fb0f5f0172 Fix memory leak when de-toasting compressed values in VACUUM FULL/CLUSTER
VACUUM FULL and CLUSTER can be used to enforce the use of the existing
compression method of a toastable column if a value currently stored is
compressed with a method that does not match the column's defined
method.  The code in charge of decompressing and recompressing toast
values at rewrite left around the detoasted values, causing an
accumulation of memory allocated in TopTransactionContext.

When processing large relations, this could cause the system to run out
of memory.  The detoasted values are not needed once their tuple is
rewritten, and this commit ensures that the necessary cleanup happens.

Issue introduced by bbe0a81d.  The comments of the area are reordered a
bit while on it.

Reported-by: Andres Freund
Analyzed-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Dilip Kumar
Discussion: https://postgr.es/m/20210521211929.pcehg6f23icwstdb@alap3.anarazel.de
2021-05-25 14:27:18 +09:00
Amit Kapila 0734b0e983 Improve docs and error messages for parallel vacuum.
The error messages, docs, and one of the options were using
'parallel degree' to indicate parallelism used by vacuum command. We
normally use 'parallel workers' at other places so change it for parallel
vacuum accordingly.

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com
2021-05-25 09:26:53 +05:30
Michael Paquier 01e6f1a842 Disallow SSL renegotiation
SSL renegotiation is already disabled as of 48d23c72, however this does
not prevent the server to comply with a client willing to use
renegotiation.  In the last couple of years, renegotiation had its set
of security issues and flaws (like the recent CVE-2021-3449), and it
could be possible to crash the backend with a client attempting
renegotiation.

This commit takes one extra step by disabling renegotiation in the
backend in the same way as SSL compression (f9264d15) or tickets
(97d3a0b0).  OpenSSL 1.1.0h has added an option named
SSL_OP_NO_RENEGOTIATION able to achieve that.  In older versions
there is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that
was undocumented, and could be set within the SSL object created when
the TLS connection opens, but I have decided not to use it, as it feels
trickier to rely on, and it is not official.  Note that this option is
not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
object are hidden to applications.

SSL renegotiation concerns protocols up to TLSv1.2.

Per original report from Robert Haas, with a patch based on a suggestion
by Andres Freund.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/YKZBXx7RhU74FlTE@paquier.xyz
Backpatch-through: 9.6
2021-05-25 10:10:09 +09:00
David Rowley cba5c70b95 Fix setrefs.c code for Result Cache nodes
Result Cache, added in 9eacee2e6 neglected to properly adjust the plan
references in setrefs.c.  This could lead to the following error during
EXPLAIN:

ERROR:  cannot decompile join alias var in plan tree

Fix that.

Bug: 17030
Reported-by: Hans Buschmann
Discussion: https://postgr.es/m/17030-5844aecae42fe223@postgresql.org
2021-05-25 12:50:22 +12:00
Peter Geoghegan c242baa4a8 Consider triggering VACUUM failsafe during scan.
The wraparound failsafe mechanism added by commit 1e55e7d1 handled the
one-pass strategy case (i.e. the "table has no indexes" case) by adding
a dedicated failsafe check.  This made up for the fact that the usual
one-pass checks inside lazy_vacuum_all_indexes() cannot ever be reached
during a one-pass strategy VACUUM.

This approach failed to account for two-pass VACUUMs that opt out of
index vacuuming up-front.  The INDEX_CLEANUP off case in the only case
that works like that.

Fix this by performing a failsafe check every 4GB during the first scan
of the heap, regardless of the details of the VACUUM.  This eliminates
the special case, and will make the failsafe trigger more reliably.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20210424002921.pb3t7h6frupdqnkp@alap3.anarazel.de
2021-05-24 17:14:02 -07:00
David Rowley 99c5852e20 Add missing NULL check when building Result Cache paths
Code added in 9e215378d to disable building of Result Cache paths when
not all join conditions are part of the parameterization of a unique
join failed to first check if the inner path's param_info was set before
checking the param_info's ppi_clauses.

Add a check for NULL values here and just bail on trying to build the
path if param_info is NULL. lateral_vars are not considered when
deciding if the join is unique, so we're not missing out on doing the
optimization when there are lateral_vars and no param_info.

Reported-by: Coverity, via Tom Lane
Discussion: https://postgr.es/m/457998.1621779290@sss.pgh.pa.us
2021-05-24 12:37:11 +12:00
Tom Lane f5024d8d7b Re-order pg_attribute columns to eliminate some padding space.
Now that attcompression is just a char, there's a lot of wasted
padding space after it.  Move it into the group of char-wide
columns to save a net of 4 bytes per pg_attribute entry.  While
we're at it, swap the order of attstorage and attalign to make for
a more logical grouping of these columns.

Also re-order actions in related code to match the new field ordering.

This patch also fixes one outright bug: equalTupleDescs() failed to
compare attcompression.  That could, for example, cause relcache
reload to fail to adopt a new value following a change.

Michael Paquier and Tom Lane, per a gripe from Andres Freund.

Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
2021-05-23 12:12:09 -04:00
Tom Lane bc2a389efb Be more verbose when the postmaster unexpectedly quits.
Emit a LOG message when the postmaster stops because of a failure in
the startup process.  There already is a similar message if we exit
for that reason during PM_STARTUP phase, so it seems inconsistent
that there was none if the startup process fails later on.

Also emit a LOG message when the postmaster stops after a crash
because restart_after_crash is disabled.  This seems potentially
helpful in case DBAs (or developers) forget that that's set.
Also, it was the only remaining place where the postmaster would
do an abnormal exit without any comment as to why.

In passing, remove an unreachable call of ExitPostmaster(0).

Discussion: https://postgr.es/m/194914.1621641288@sss.pgh.pa.us
2021-05-23 10:50:21 -04:00
Tom Lane b39630fd41 Fix access to no-longer-open relcache entry in logical-rep worker.
If we redirected a replicated tuple operation into a partition child
table, and then tried to fire AFTER triggers for that event, the
relation cache entry for the child table was already closed.  This has
no visible ill effects as long as the entry is still there and still
valid, but an unluckily-timed cache flush could result in a crash or
other misbehavior.

To fix, postpone the ExecCleanupTupleRouting call (which is what
closes the child table) until after we've fired triggers.  This
requires a bit of refactoring so that the cleanup function can
have access to the necessary state.

In HEAD, I took the opportunity to simplify some of worker.c's
function APIs based on use of the new ApplyExecutionData struct.
However, it doesn't seem safe/practical to back-patch that aspect,
at least not without a lot of analysis of possible interactions
with a04daa97a.

In passing, add an Assert to afterTriggerInvokeEvents to catch
such cases.  This seems worthwhile because we've grown a number
of fairly unstructured ways of calling AfterTriggerEndQuery.

Back-patch to v13, where worker.c grew the ability to deal with
partitioned target tables.

Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us
2021-05-22 21:25:29 -04:00
David Rowley 9e215378d7 Fix planner's use of Result Cache with unique joins
When the planner considered using a Result Cache node to cache results
from the inner side of a Nested Loop Join, it failed to consider that the
inner path's parameterization may not be the entire join condition.  If
the join was marked as inner_unique then we may accidentally put the cache
in singlerow mode.  This meant that entries would be marked as complete
after caching the first row.  That was wrong as if only part of the join
condition was parameterized then the uniqueness of the unique join was not
guaranteed at the Result Cache's level.  The uniqueness is only guaranteed
after Nested Loop applies the join filter.  If subsequent rows were found,
this would lead to:

ERROR: cache entry already complete

This could have been fixed by only putting the cache in singlerow mode if
the entire join condition was parameterized.  However, Nested Loop will
only read its inner side so far as the first matching row when the join is
unique, so that might mean we never get an opportunity to mark cache
entries as complete.  Since non-complete cache entries are useless for
subsequent lookups, we just don't bother considering a Result Cache path
in this case.

In passing, remove the XXX comment that claimed the above ERROR might be
better suited to be an Assert.  After there being an actual case which
triggered it, it seems better to keep it an ERROR.

Reported-by: David Christensen
Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com
2021-05-22 16:22:27 +12:00
Tom Lane 4b10074453 Disallow whole-row variables in GENERATED expressions.
This was previously allowed, but I think that was just an oversight.
It's a clear violation of the rule that a generated column cannot
depend on itself or other generated columns.  Moreover, because the
code was relying on the assumption that no such cross-references
exist, it was pretty easy to crash ALTER TABLE and perhaps other
places.  Even if you managed not to crash, you got quite unstable,
implementation-dependent results.

Per report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.

Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
2021-05-21 15:12:08 -04:00
Tom Lane 2b0ee126bb Fix usage of "tableoid" in GENERATED expressions.
We consider this supported (though I've got my doubts that it's a
good idea, because tableoid is not immutable).  However, several
code paths failed to fill the field in soon enough, causing such
a GENERATED expression to see zero or the wrong value.  This
occurred when ALTER TABLE adds a new GENERATED column to a table
with existing rows, and during regular INSERT or UPDATE on a
foreign table with GENERATED columns.

Noted during investigation of a report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.

Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
2021-05-21 15:02:06 -04:00
Tom Lane 84f5c2908d Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.
COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
The original implementation of intra-procedure transactions just
cavalierly did that, ignoring the fact that this left us executing in
a rather different environment than normal.  In particular, it turns
out that handling of toasted datums depends rather critically on there
being an outer ActiveSnapshot: otherwise, when SPI or the core
executor pop whatever snapshot they used and return, it's unsafe to
dereference any toasted datums that may appear in the query result.
It's possible to demonstrate "no known snapshots" and "missing chunk
number N for toast value" errors as a result of this oversight.

Historically this outer snapshot has been held by the Portal code,
and that seems like a good plan to preserve.  So add infrastructure
to pquery.c to allow re-establishing the Portal-owned snapshot if it's
not there anymore, and add enough bookkeeping support that we can tell
whether it is or not.

We can't, however, just re-establish the Portal snapshot as part of
COMMIT/ROLLBACK.  As in normal transaction start, acquiring the first
snapshot should wait until after SET and LOCK commands.  Hence, teach
spi.c about doing this at the right time.  (Note that this patch
doesn't fix the problem for any PLs that try to run intra-procedure
transactions without using SPI to execute SQL commands.)

This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
rename that to allow_nonatomic.

replication/logical/worker.c also needs some fixes, because it wasn't
careful to hold a snapshot open around AFTER trigger execution.
That code doesn't use a Portal, which I suspect someday we're gonna
have to fix.  But for now, just rearrange the order of operations.
This includes back-patching the recent addition of finish_estate()
to centralize the cleanup logic there.

This also back-patches commit 2ecfeda3e into v13, to improve the
test coverage for worker.c (it was that test that exposed that
worker.c's snapshot management is wrong).

Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
intra-procedure COMMIT was added.

Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
2021-05-21 14:03:59 -04:00
Amit Kapila 6d0eb38557 Fix deadlock for multiple replicating truncates of the same table.
While applying the truncate change, the logical apply worker acquires
RowExclusiveLock on the relation being truncated. This allowed truncate on
the relation at a time by two apply workers which lead to a deadlock. The
reason was that one of the workers after updating the pg_class tuple tries
to acquire SHARE lock on the relation and started to wait for the second
worker which has acquired RowExclusiveLock on the relation. And when the
second worker tries to update the pg_class tuple, it starts to wait for
the first worker which leads to a deadlock. Fix it by acquiring
AccessExclusiveLock on the relation before applying the truncate change as
we do for normal truncate operation.

Author: Peter Smith, test case by Haiying Tang
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com
2021-05-21 07:54:27 +05:30
Tom Lane f21fadafaf Avoid detoasting failure after COMMIT inside a plpgsql FOR loop.
exec_for_query() normally tries to prefetch a few rows at a time
from the query being iterated over, so as to reduce executor
entry/exit overhead.  Unfortunately this is unsafe if we have
COMMIT or ROLLBACK within the loop, because there might be
TOAST references in the data that we prefetched but haven't
yet examined.  Immediately after the COMMIT/ROLLBACK, we have
no snapshots in the session, meaning that VACUUM is at liberty
to remove recently-deleted TOAST rows.

This was originally reported as a case triggering the "no known
snapshots" error in init_toast_snapshot(), but even if you miss
hitting that, you can get "missing toast chunk", as illustrated
by the added isolation test case.

To fix, just disable prefetching in non-atomic contexts.  Maybe
there will be performance complaints prompting us to work harder
later, but it's not clear at the moment that this really costs
much, and I doubt we'd want to back-patch any complicated fix.

In passing, adjust that error message in init_toast_snapshot()
to be a little clearer about the likely cause of the problem.

Patch by me, based on earlier investigation by Konstantin Knizhnik.

Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
intra-procedure COMMIT was added.

Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
2021-05-20 18:32:37 -04:00
Fujii Masao 167bd48049 Make standby promotion reset the recovery pause state to 'not paused'.
If a promotion is triggered while recovery is paused, the paused state ends
and promotion continues. But previously in that case
pg_get_wal_replay_pause_state() returned 'paused' wrongly while a promotion
was ongoing.

This commit changes a standby promotion so that it marks the recovery
pause state as 'not paused' when it's triggered, to fix the issue.

Author: Fujii Masao
Reviewed-by: Dilip Kumar, Kyotaro Horiguchi
Discussion: https://postgr.es/m/f706876c-4894-0ba5-6f4d-79803eeea21b@oss.nttdata.com
2021-05-19 13:48:19 +09:00
Fujii Masao d8735b8b46 Fix issues in pg_stat_wal.
1) Previously there were both pgstat_send_wal() and pgstat_report_wal()
   in order to send WAL activity to the stats collector. With the former being
   used by wal writer, the latter by most other processes. They were a bit
   redundant and so this commit merges them into pgstat_send_wal() to
   simplify the code.

2) Previously WAL global statistics counters were calculated and then
   compared with zero-filled buffer in order to determine whether any WAL
   activity has happened since the last submission. These calculation and
   comparison were not cheap. This was regularly exercised even in read-only
   workloads. This commit fixes the issue by making some WAL activity
   counters directly be checked to determine if there's WAL activity stats
   to send.

3) Previously pgstat_report_stat() did not check if there's WAL activity
   stats to send as part of the "Don't expend a clock check if nothing to do"
   check at the top. It's probably rare to have pending WAL stats without
   also passing one of the other conditions, but for safely this commit
   changes pgstat_report_stats() so that it checks also some WAL activity
   counters at the top.

This commit also adds the comments about the design of WAL stats.

Reported-by: Andres Freund
Author: Masahiro Ikeda
Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi, Andres Freund, Fujii Masao
Discussion: https://postgr.es/m/20210324232224.vrfiij2rxxwqqjjb@alap3.anarazel.de
2021-05-19 11:38:34 +09:00
David Rowley 2ded19fa3a Fix typo and outdated information in README.barrier
README.barrier didn't seem to get the memo when atomics were added. Fix
that.

Author: Tatsuo Ishii, David Rowley
Discussion: https://postgr.es/m/20210516.211133.2159010194908437625.t-ishii%40sraoss.co.jp
Backpatch-through: 9.6, oldest supported release
2021-05-18 09:54:56 +12:00
Peter Eisentraut 6292b83074 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 9bbd9c3714d0c76daaa806588b1fbf744aa60496
2021-05-17 14:30:27 +02:00
Alvaro Herrera 354f32d01d
Unbreak EXEC_BACKEND build
Per buildfarm
2021-05-15 15:17:15 -04:00
Alvaro Herrera cafde58b33
Allow compute_query_id to be set to 'auto' and make it default
Allowing only on/off meant that all either all existing configuration
guides would become obsolete if we disabled it by default, or that we
would have to accept a performance loss in the default config if we
enabled it by default.  By allowing 'auto' as a middle ground, the
performance cost is only paid by those who enable pg_stat_statements and
similar modules.

I only edited the release notes to comment-out a paragraph that is now
factually wrong; further edits are probably needed to describe the
related change in more detail.

Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20210513002623.eugftm4nk2lvvks3@nol
2021-05-15 14:13:09 -04:00
Tom Lane 30d8bad494 Be more careful about barriers when releasing BackgroundWorkerSlots.
ForgetBackgroundWorker lacked any memory barrier at all, while
BackgroundWorkerStateChange had one but unaccountably did
additional manipulation of the slot after the barrier.  AFAICS,
the rule must be that the barrier is immediately before setting
or clearing slot->in_use.

It looks like back in 9.6 when ForgetBackgroundWorker was first
written, there might have been some case for not needing a
barrier there, but I'm not very convinced of that --- the fact
that the load of bgw_notify_pid is in the caller doesn't seem
to guarantee no memory ordering problem.  So patch 9.6 too.

It's likely that this doesn't fix any observable bug on Intel
hardware, but machines with weaker memory ordering rules could
have problems here.

Discussion: https://postgr.es/m/4046084.1620244003@sss.pgh.pa.us
2021-05-15 12:21:06 -04:00
Peter Geoghegan 8f72bbac3e Harden nbtree deduplication posting split code.
Add a defensive "can't happen" error to code that handles nbtree posting
list splits (promote an existing assertion).  This avoids a segfault in
the event of an insertion of a newitem that is somehow identical to an
existing non-pivot tuple in the index.  An nbtree index should never
have two index tuples with identical TIDs.

This scenario is not particular unlikely in the event of any kind of
corruption that leaves the index in an inconsistent state relative to
the heap relation that is indexed.  There are two known reports of
preventable hard crashes.  Doing nothing seems unacceptable given the
general expectation that nbtree will cope reasonably well with corrupt
data.

Discussion: https://postgr.es/m/CAH2-Wz=Jr_d-dOYEEmwz0-ifojVNWho01eAqewfQXgKfoe114w@mail.gmail.com
Backpatch: 13-, where nbtree deduplication was introduced.
2021-05-14 15:08:02 -07:00
Tom Lane c3c35a733c Prevent infinite insertion loops in spgdoinsert().
Formerly we just relied on operator classes that assert longValuesOK
to eventually shorten the leaf value enough to fit on an index page.
That fails since the introduction of INCLUDE-column support (commit
09c1c6ab4), because the INCLUDE columns might alone take up more
than a page, meaning no amount of leaf-datum compaction will get
the job done.  At least with spgtextproc.c, that leads to an infinite
loop, since spgtextproc.c won't throw an error for not being able
to shorten the leaf datum anymore.

To fix without breaking cases that would otherwise work, add logic
to spgdoinsert() to verify that the leaf tuple size is decreasing
after each "choose" step.  Some opclasses might not decrease the
size on every single cycle, and in any case, alignment roundoff
of the tuple size could obscure small gains.  Therefore, allow
up to 10 cycles without additional savings before throwing an
error.  (Perhaps this number will need adjustment, but it seems
quite generous right now.)

As long as we've developed this logic, let's back-patch it.
The back branches don't have INCLUDE columns to worry about, but
this seems like a good defense against possible bugs in operator
classes.  We already know that an infinite loop here is pretty
unpleasant, so having a defense seems to outweigh the risk of
breaking things.  (Note that spgtextproc.c is actually the only
known opclass with longValuesOK support, so that this is all moot
for known non-core opclasses anyway.)

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
2021-05-14 15:07:34 -04:00
Tom Lane eb7a6b9229 Fix query-cancel handling in spgdoinsert().
Knowing that a buggy opclass could cause an infinite insertion loop,
spgdoinsert() intended to allow its loop to be interrupted by query
cancel.  However, that never actually worked, because in iterations
after the first, we'd be holding buffer lock(s) which would cause
InterruptHoldoffCount to be positive, preventing servicing of the
interrupt.

To fix, check if an interrupt is pending, and if so fall out of
the insertion loop and service the interrupt after we've released
the buffers.  If it was indeed a query cancel, that's the end of
the matter.  If it was a non-canceling interrupt reason, make use
of the existing provision to retry the whole insertion.  (This isn't
as wasteful as it might seem, since any upper-level index tuples we
already created should be usable in the next attempt.)

While there's no known instance of such a bug in existing release
branches, it still seems like a good idea to back-patch this to
all supported branches, since the behavior is fairly nasty if a
loop does happen --- not only is it uncancelable, but it will
quickly consume memory to the point of an OOM failure.  In any
case, this code is certainly not working as intended.

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
2021-05-14 13:29:39 -04:00
Tom Lane e47f93f981 Refactor CHECK_FOR_INTERRUPTS() to add flexibility.
Split up CHECK_FOR_INTERRUPTS() to provide an additional macro
INTERRUPTS_PENDING_CONDITION(), which just tests whether an
interrupt is pending without attempting to service it.  This is
useful in situations where the caller knows that interrupts are
blocked, and would like to find out if it's worth the trouble
to unblock them.

Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether
CHECK_FOR_INTERRUPTS() can be relied on to clear the pending interrupt.

This commit doesn't actually add any uses of the new macros,
but a follow-on bug fix will do so.  Back-patch to all supported
branches to provide infrastructure for that fix.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20210513155351.GA7848@alvherre.pgsql
2021-05-14 13:29:39 -04:00
David Rowley 6cb93beddd Convert misleading while loop into an if condition
This seems to be leftover from ea15e1867 and from when we used to evaluate
SRFs at each node.

Since there is an unconditional "return" at the end of the loop body, only
1 loop is ever possible, so we can just change this into an if condition.

There is no actual bug being fixed here so no back-patch. It seems fine to
just fix this anomaly in master only.

Author: Greg Nancarrow
Discussion: https://postgr.es/m/CAJcOf-d7T1q0az-D8evWXnsuBZjigT04WkV5hCAOEJQZRWy28w@mail.gmail.com
2021-05-14 12:26:11 +12:00
Peter Geoghegan fbe9b80610 Fix autovacuum log output heap truncation issue.
The percentage of blocks from the table value reported by autovacuum log
output (following commit 5100010ee4) should never exceed 100% because
it describes the state of the table back when lazy_vacuum() was called.
The value could nevertheless exceed 100% in the event of heap relation
truncation.  We failed to compensate for how truncation affects
rel_pages.

Fix the faulty accounting by using the original rel_pages value instead
of the current/final rel_pages value.

Reported-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210423204306.5osfpkt2ggaedyvy@alap3.anarazel.de
2021-05-13 16:07:17 -07:00
Alvaro Herrera db16c65647
Rename the logical replication global "wrconn"
The worker.c global wrconn is only meant to be used by logical apply/
tablesync workers, but there are other variables with the same name. To
reduce future confusion rename the global from "wrconn" to
"LogRepWorkerWalRcvConn".

While this is just cosmetic, it seems better to backpatch it all the way
back to 10 where this code appeared, to avoid future backpatching
issues.

Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com
2021-05-12 19:13:54 -04:00
Tom Lane 7dde98728a Double-space commands in system_constraints.sql/system_functions.sql.
Previously, any error reported by the backend while reading
system_constraints.sql would report the entire file, not just the
particular command it was working on.  (Ask me how I know.)  Likewise,
there were chunks of system_functions.sql that would be read as one
command, which would be annoying if anything failed there.

The issue for system_constraints.sql is an oversight in commit
dfb75e478.  I didn't try to trace down where the poor formatting
in system_functions.sql started, but it's certainly contrary to
the advice at the head of that file.
2021-05-12 18:41:39 -04:00
Tom Lane def5b065ff Initial pgindent and pgperltidy run for v14.
Also "make reformat-dat-files".

The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
2021-05-12 13:14:10 -04:00
Michael Paquier e6ccd1ce16 Simplify one use of ScanKey in pg_subscription.c
The section of the code in charge of returning all the relations
associated to a subscription only need one ScanKey, but allocated two of
them.  This code was introduced as a copy-paste from a different area on
the same file by 7c4f524, making the result confusing to follow.

Author: Peter Smith
Reviewed-by: Tom Lane, Julien Rouhaud, Bharath Rupireddy
Discussion: https://postgr.es/m/CAHut+PsLKe+rN3FjchoJsd76rx2aMsFTB7CTFxRgUP05p=kcpQ@mail.gmail.com
2021-05-12 14:54:02 +09:00
Peter Eisentraut ec6e70c79f Refactor some error messages for easier translation 2021-05-12 07:42:51 +02:00
Etsuro Fujita a363bc6da9 Fix EXPLAIN ANALYZE for async-capable nodes.
EXPLAIN ANALYZE for an async-capable ForeignScan node associated with
postgres_fdw is done just by using instrumentation for ExecProcNode()
called from the node's callbacks, causing the following problems:

1) If the remote table to scan is empty, the node is incorrectly
   considered as "never executed" by the command even if the node is
   executed, as ExecProcNode() isn't called from the node's callbacks at
   all in that case.
2) The command fails to collect timings for things other than
   ExecProcNode() done in the node, such as creating a cursor for the
   node's remote query.

To fix these problems, add instrumentation for async-capable nodes, and
modify postgres_fdw accordingly.

My oversight in commit 27e1f1456.

While at it, update a comment for the AsyncRequest struct in execnodes.h
and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml
to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix
typos in comments in nodeAppend.c.

Per report from Andrey Lepikhov, though I didn't use his patch.

Reviewed-by: Andrey Lepikhov
Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
2021-05-12 14:00:00 +09:00
Fujii Masao d780d7c088 Change data type of counters in BufferUsage and WalUsage from long to int64.
Previously long was used as the data type for some counters in BufferUsage
and WalUsage. But long is only four byte, e.g., on Windows, and it's entirely
possible to wrap a four byte counter. For example, emitting more than
four billion WAL records in one transaction isn't actually particularly rare.

To avoid the overflows of those counters, this commit changes the data type
of them from long to int64.

Suggested-by: Andres Freund
Author: Masahiro Ikeda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20201221211650.k7b53tcnadrciqjo@alap3.anarazel.de
Discussion: https://postgr.es/m/af0964ac-7080-1984-dc23-513754987716@oss.nttdata.com
2021-05-12 09:56:34 +09:00
Andrew Dunstan 0bf62931ca
Tweak generation of Gen_dummy_probes.pl
Use a static prolog file instead of generating the prolog from the
existing perl script. Also, support generation of the file in a vpath
build.

Discussion: https://postgr.es/m/700620.1620662868@sss.pgh.pa.us
2021-05-11 20:02:02 -04:00
Peter Eisentraut 6d177e2813 Fix typo 2021-05-11 09:06:49 +02:00
Tom Lane 049e1e2edb Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists.
It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE
list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present.
If it happens, the ON CONFLICT UPDATE code path would end up storing
tuples that include the values of the extra resjunk columns.  That's
fairly harmless in the short run, but if new columns are added to
the table then the values would become accessible, possibly leading
to malfunctions if they don't match the datatypes of the new columns.

This had escaped notice through a confluence of missing sanity checks,
including

* There's no cross-check that a tuple presented to heap_insert or
heap_update matches the table rowtype.  While it's difficult to
check that fully at reasonable cost, we can easily add assertions
that there aren't too many columns.

* The output-column-assignment cases in execExprInterp.c lacked
any sanity checks on the output column numbers, which seems like
an oversight considering there are plenty of assertion checks on
input column numbers.  Add assertions there too.

* We failed to apply nodeModifyTable's ExecCheckPlanOutput() to
the ON CONFLICT UPDATE tlist.  That wouldn't have caught this
specific error, since that function is chartered to ignore resjunk
columns; but it sure seems like a bad omission now that we've seen
this bug.

In HEAD, the right way to fix this is to make the processing of
ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists
now do, that is don't add "SET x = x" entries, and use
ExecBuildUpdateProjection to evaluate the tlist and combine it with
old values of the not-set columns.  This adds a little complication
to ExecBuildUpdateProjection, but allows removal of a comparable
amount of now-dead code from the planner.

In the back branches, the most expedient solution seems to be to
(a) use an output slot for the ON CONFLICT UPDATE projection that
actually matches the target table, and then (b) invent a variant of
ExecBuildProjectionInfo that can be told to not store values resulting
from resjunk columns, so it doesn't try to store into nonexistent
columns of the output slot.  (We can't simply ignore the resjunk columns
altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.)
This works back to v10.  In 9.6, projections work much differently and
we can't cheaply give them such an option.  The 9.6 version of this
patch works by inserting a JunkFilter when it's necessary to get rid
of resjunk columns.

In addition, v11 and up have the reverse problem when trying to
perform ON CONFLICT UPDATE on a partitioned table.  Through a
further oversight, adjust_partition_tlist() discarded resjunk columns
when re-ordering the ON CONFLICT UPDATE tlist to match a partition.
This accidentally prevented the storing-bogus-tuples problem, but
at the cost that MULTIEXPR_SUBLINK cases didn't work, typically
crashing if more than one row has to be updated.  Fix by preserving
resjunk columns in that routine.  (I failed to resist the temptation
to add more assertions there too, and to do some minor code
beautification.)

Per report from Andres Freund.  Back-patch to all supported branches.

Security: CVE-2021-32028
2021-05-10 11:02:29 -04:00