Commit Graph

3934 Commits

Author SHA1 Message Date
Peter Geoghegan 180cf876d4 Remove ineffective heapam CHECK_FOR_INTERRUPTS().
Remove a CHECK_FOR_INTERRUPTS() call that could never actually handle an
interrupt.  We always have a heap page buffer lock at this point.
Having a useless CHECK_FOR_INTERRUPTS() call is harmless but misleading.

It is probably possible to work around the immediate problem by moving
the CHECK_FOR_INTERRUPTS() to before the heap page buffer lock is
acquired.  That isn't enough to make the function responsive to
interrupts, though.  The index AM caller will still hold an exclusive
buffer lock of its own.
2020-11-09 09:00:12 -08:00
Noah Misch 0c3185e963 In security-restricted operations, block enqueue of at-commit user code.
Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred
triggers within index expressions and materialized view queries.  An
attacker having permission to create non-temp objects in at least one
schema could execute arbitrary SQL functions under the identity of the
bootstrap superuser.  One can work around the vulnerability by disabling
autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE
INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW.  (Don't restore from
pg_dump, since it runs some of those commands.)  Plain VACUUM (without
FULL) is safe, and all commands are fine when a trusted user owns the
target object.  Performance may degrade quickly under this workaround,
however.  Back-patch to 9.5 (all supported versions).

Reviewed by Robert Haas.  Reported by Etienne Stalmans.

Security: CVE-2020-25695
2020-11-09 07:32:09 -08:00
Peter Geoghegan 5a2f154a2e Improve nbtree README's LP_DEAD section.
The description of how LP_DEAD bit setting by index scans works
following commit 2ed5b87f was rather unclear.  Clean that up a bit.

Also refer to LP_DEAD bit setting within _bt_check_unique() at the start
of the same section.  This mechanism may actually be more important than
the generic kill_prior_tuple mechanism that the section focuses on, so
it at least deserves to be mentioned in passing.
2020-11-07 18:51:12 -08:00
Alvaro Herrera 52eec1c53a
Message style improvements
* Avoid pointlessly highlighting that an index vacuum was executed by a
  parallel worker; user doesn't care.

* Don't give the impression that a non-concurrent reindex of an invalid
  index on a TOAST table would work, because it wouldn't.

* Add a "translator:" comment for a mysterious message.

Discussion: https://postgr.es/m/20201107034943.GA16596@alvherre.pgsql
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2020-11-07 19:33:43 -03:00
Tomas Vondra 7577dd8480 Properly detoast data in brin_form_tuple
brin_form_tuple failed to consider the values may be toasted, inserting
the toast pointer into the index. This may easily result in index
corruption, as the toast data may be deleted and cleaned up by vacuum.
The cleanup however does not care about indexes, leaving invalid toast
pointers behind, which triggers errors like this:

  ERROR:  missing chunk number 0 for toast value 16433 in pg_toast_16426

A less severe consequence are inconsistent failures due to the index row
being too large, depending on whether brin_form_tuple operated on plain
or toasted version of the row. For example

    CREATE TABLE t (val TEXT);
    INSERT INTO t VALUES ('... long value ...')
    CREATE INDEX idx ON t USING brin (val);

would likely succeed, as the row would likely include toast pointer.
Switching the order of INSERT and CREATE INDEX would likely fail:

    ERROR:  index row size 8712 exceeds maximum 8152 for index "idx"

because this happens before the row values are toasted.

The bug exists since PostgreSQL 9.5 where BRIN indexes were introduced.
So backpatch all the way back.

Author: Tomas Vondra
Reviewed-by: Alvaro Herrera
Backpatch-through: 9.5
Discussion: https://postgr.es/m/20201001184133.oq5uq75sb45pu3aw@development
Discussion: https://postgr.es/m/20201104010544.zexj52mlldagzowv%40development
2020-11-07 00:39:19 +01:00
Peter Geoghegan efc5dcfd8a Fix wal_consistency_checking nbtree bug.
wal_consistency_checking indicated an inconsistency in certain cases
involving nbtree page deletion.  The underlying issue is that there was
a minor difference between the page image produced after a REDO routine
ran and the corresponding page image following original execution.

This harmless inconsistency has been around forever.  We more or less
expect total consistency among even deleted nbtree pages these days,
though, so this won't do anymore.

To fix, tweak the REDO routine to match original execution.

Oversight in commit f47b5e13.
2020-11-05 15:01:40 -08:00
Peter Geoghegan 48e1291342 Fix nbtree cleanup-only VACUUM stats inaccuracies.
Logic for counting heap TIDs from posting list tuples (added by commit
0d861bbb) was faulty.  It didn't count any TIDs/index tuples in the
event of no callback being set.  This meant that we incorrectly counted
no index tuples in clean-up only VACUUMs, which could lead to
pg_class.reltuples being spuriously set to 0 in affected indexes.

To fix, go back to counting items from the page in cases where there is
no callback.  This approach isn't very accurate, but it works well
enough in practice while avoiding the expense of accessing every index
tuple during cleanup-only VACUUMs.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
https://postgr.es/m/20201023174451.69e358f1@firost
Backpatch: 13-, where nbtree deduplication was introduced
2020-11-04 18:42:27 -08:00
Thomas Munro c732c3f8c1 Fix unlinking of SLRU segments.
Commit dee663f7 intended to drop any queued up fsync requests before
unlinking segment files, but missed a code path.  Fix, by centralizing
the forget-and-unlink code into a single function.

Reported-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Discussion: https://postgr.es/m/20201104013205.icogbi773przyny5%40development
2020-11-05 13:49:49 +13:00
Fujii Masao 113d3591b8 Fix segmentation fault that commit ac22929a26 caused.
Commit ac22929a26 changed recoveryWakeupLatch so that it's reset to
NULL at the end of recovery. This change could cause a segmentation fault
in the buildfarm member 'elver'.

Previously the latch was reset to NULL after calling ShutdownWalRcv().
But there could be a window between ShutdownWalRcv() and the actual
exit of walreceiver. If walreceiver set the latch during that window,
the segmentation fault could happen.

To fix the issue, this commit changes walreceiver so that it sets
the latch only when the latch has not been reset to NULL yet.

Author: Fujii Masao
Discussion: https://postgr.es/m/5c1f8a85-747c-7bf9-241e-dd467d8a3586@iki.fi
2020-11-04 21:49:00 +09:00
Fujii Masao ac22929a26 Get rid of the dedicated latch for signaling the startup process.
This commit gets rid of the dedicated latch for signaling the startup
process in favor of using its procLatch,  since that comports better
with possible generic signal handlers using that latch.

Commit 1e53fe0e70 changed background processes so that they use standard
SIGHUP handler. Like that, this commit also makes the startup process use
standard SIGHUP handler to simplify the code.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACXPorUqePswDtOeM_s82v9RW32E1fYmOPZ5NuE+TWKj_A@mail.gmail.com
2020-11-04 16:43:43 +09:00
Peter Eisentraut dd26a0ad76 Use PG_GETARG_TRANSACTIONID where appropriate
Some places were using PG_GETARG_UINT32 where PG_GETARG_TRANSACTIONID
would be more appropriate.  (Of course, they are the same internally,
so there is no externally visible effect.)  To do that, export
PG_GETARG_TRANSACTIONID outside of xid.c.  We also export
PG_RETURN_TRANSACTIONID for symmetry, even though there are currently
no external users.

Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
2020-11-02 16:48:22 +01:00
Michael Paquier 8a15e735be Fix some grammar and typos in comments and docs
The documentation fixes are backpatched down to where they apply.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20201031020801.GD3080@telsasoft.com
Backpatch-through: 9.6
2020-11-02 15:14:41 +09:00
Noah Misch f90e80b913 Reproduce debug_query_string==NULL on parallel workers.
Certain background workers initiate parallel queries while
debug_query_string==NULL, at which point they attempted strlen(NULL) and
died to SIGSEGV.  Older debug_query_string observers allow NULL, so do
likewise in these newer ones.  Back-patch to v11, where commit
7de4a1bcc5 introduced the first of these.

Discussion: https://postgr.es/m/20201014022636.GA1962668@rfd.leadboat.com
2020-10-31 08:43:28 -07:00
Heikki Linnakangas 6f0bc5e1da Fix missing validation for the new GiST sortsupport functions.
Because of this, if you tried to create an operator family with the new
sortsupport function, you got an error:

ERROR:  support function number 11 is invalid for access method gist

We missed this in commit 16fa9b2b30 that added the sortsupport function,
because it only added sortsupport to a built-in operator family.

Author: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/3520A18A-5C38-4697-A2E3-F3BDE3496CD5%40yandex-team.ru
2020-10-30 19:30:19 +02:00
Andres Freund 94bc27b576 Centralize horizon determination for temp tables, fixing bug due to skew.
This fixes a bug in the edge case where, for a temp table, heap_page_prune()
can end up with a different horizon than heap_vacuum_rel(). Which can trigger
errors like "ERROR: cannot freeze committed xmax ...".

The bug was introduced due to interaction of a7212be8b9 "Set cutoff xmin more
aggressively when vacuuming a temporary table." with dc7420c2c9 "snapshot
scalability: Don't compute global horizons while building snapshots.".

The problem is caused by lazy_scan_heap() assuming that the only reason its
HeapTupleSatisfiesVacuum() call would return HEAPTUPLE_DEAD is if the tuple is
a HOT tuple, or if the tuple's inserting transaction has aborted since the
heap_page_prune() call. But after a7212be8b9 that was also possible in other
cases for temp tables, because heap_page_prune() uses a different visibility
test after dc7420c2c9.

The fix is fairly simple: Move the special case logic for temp tables from
vacuum_set_xid_limits() to the infrastructure introduced in dc7420c2c9. That
ensures that the horizon used for pruning is at least as aggressive as the one
used by lazy_scan_heap(). The concrete horizon used for temp tables is
slightly different than the logic in dc7420c2c9, but should always be as
aggressive as before (see comments).

A significant benefit to centralizing the logic procarray.c is that now the
more aggressive horizons for temp tables does not just apply to VACUUM but
also to e.g. HOT pruning and the nbtree killtuples logic.

Because isTopLevel is not needed by vacuum_set_xid_limits() anymore, I
undid the the related changes from a7212be8b9.

This commit also adds an isolation test ensuring that the more aggressive
vacuuming and pruning of temp tables keeps working.

Debugged-By: Amit Kapila <amit.kapila16@gmail.com>
Debugged-By: Tom Lane <tgl@sss.pgh.pa.us>
Debugged-By: Ashutosh Sharma <ashu.coek88@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20201014203103.72oke6hqywcyhx7s@alap3.anarazel.de
Discussion: https://postgr.es/m/20201015083735.derdzysdtqdvxshp@alap3.anarazel.de
2020-10-28 18:02:31 -07:00
Robert Haas 866e24d47d Extend amcheck to check heap pages.
Mark Dilger, reviewed by Peter Geoghegan, Andres Freund, Álvaro Herrera,
Michael Paquier, Amul Sul, and by me. Some last-minute cosmetic
revisions by me.

Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
2020-10-22 08:44:18 -04:00
Alvaro Herrera 93f84d59f8 Revert "Remove pointless HeapTupleHeaderIndicatesMovedPartitions calls"
This reverts commit 85adb5e91e.  It was not intended for commit just
yet.
2020-10-15 15:16:11 -03:00
Alvaro Herrera 85adb5e91e Remove pointless HeapTupleHeaderIndicatesMovedPartitions calls
Pavan Deolasee recently noted that a few of the
HeapTupleHeaderIndicatesMovedPartitions calls added by commit
5db6df0c01 are useless, since they are done after comparing t_self
with t_ctid.  But because t_self can never be set to the magical values
that indicate that the tuple moved partition, this can never succeed: if
the first test fails (so we know t_self equals t_ctid), necessarily the
second test will also fail.

So these checks can be removed and no harm is done.

Discussion: https://postgr.es/m/20200929164411.GA15497@alvherre.pgsql
2020-10-15 14:32:34 -03:00
David Rowley 110d81728a Fixup some appendStringInfo and appendPQExpBuffer calls
A number of places were using appendStringInfo() when they could have been
using appendStringInfoString() instead.  While there's no functionality
change there, it's just more efficient to use appendStringInfoString()
when no formatting is required.  Likewise for some
appendStringInfoString() calls which were just appending a single char.
We can just use appendStringInfoChar() for that.

Additionally, many places were using appendPQExpBuffer() when they could
have used appendPQExpBufferStr(). Change those too.

Patch by Zhijie Hou, but further searching by me found significantly more
places that deserved the same treatment.

Author: Zhijie Hou, David Rowley
Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local
2020-10-15 20:35:17 +13:00
Tom Lane 371668a838 Fix GiST buffering build to work when there are included columns.
gistRelocateBuildBuffersOnSplit did not get the memo about which
attribute count to use.  This could lead to a crash if there were
included columns and buffering build was chosen.  (Because there
are random page-split decisions elsewhere in GiST index build,
the crashes are not entirely deterministic.)

Back-patch to v12 where GiST gained support for included columns.

Pavel Borisov

Discussion: https://postgr.es/m/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay+r6HQ@mail.gmail.com
2020-10-12 18:01:34 -04:00
Tom Lane 78c0b6ed27 Re-allow testing of GiST buffered builds.
Commit 16fa9b2b3 broke the ability to reliably test GiST buffered builds,
because it caused sorted builds to be done instead if sortsupport is
available, regardless of any attempt to override that.  While a would-be
test case could try to work around that by choosing an opclass that has
no sortsupport function, coverage would be silently lost the moment
someone decides it'd be a good idea to add a sortsupport function.

Hence, rearrange the logic in gistbuild() so that if "buffering = on"
is specified in CREATE INDEX, we will use that method, sortsupport or no.

Also document the interaction between sorting and the buffering
parameter, as 16fa9b2b3 failed to do.

(Note that in fact we still lack any test coverage of buffered builds,
but this is a prerequisite to adding a non-fragile test.)

Discussion: https://postgr.es/m/3249980.1602532990@sss.pgh.pa.us
2020-10-12 17:09:50 -04:00
Michael Paquier b90b79e140 Fix typo in multixact.c
AtEOXact_MultiXact() was referenced in two places with an incorrect
routine name.

Author: Hou Zhijie
Discussion: https://postgr.es/m/1b41e9311e8f474cb5a360292f0b3cb1@G08CNEXMBPEKD05.g08.fujitsu.local
2020-10-08 14:06:12 +09:00
Michael Paquier 0a3c864c32 Fix compilation warning in xlog.c
Oversight in 9d0bd95.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20201006023802.qqfi6m5bw5y77zql@alap3.anarazel.de
2020-10-06 15:29:34 +09:00
Fujii Masao 8d9a935965 Add pg_stat_wal statistics view.
This view shows the statistics about WAL activity. Currently it has only
two columns: wal_buffers_full and stats_reset. wal_buffers_full column
indicates the number of times WAL data was written to the disk because
WAL buffers got full. This information is useful when tuning wal_buffers.
stats_reset column indicates the time at which these statistics were
last reset.

pg_stat_wal view is also the basic infrastructure to expose other
various statistics about WAL activity later.

Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format.

Bump catalog version.

Author: Masahiro Ikeda
Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao
Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
2020-10-02 10:17:11 +09:00
Michael Paquier 9d0bd95fa9 Add block information in error context of WAL REDO apply loop
Providing this information can be useful for example when diagnosing
problems related to recovery conflicts or for recovery issues without
having to go through the output generated by pg_waldump to get some
information about the blocks a WAL record works on.

The block information is printed in the same format as pg_waldump.  This
already existed in xlog.c for debugging purposes with -DWAL_DEBUG, so
adding the block information in the callback has required just a small
refactoring.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/c31e2cba-efda-762c-f4ad-5c25e5dac3d0@amazon.com
2020-10-02 09:31:50 +09:00
Heikki Linnakangas 265ea56785 Set right-links during sorted GiST index build.
This is not strictly necessary, as the right-links are only needed by
scans that are concurrent with page splits, and neither scans or page
splits can happen during sorted index build. But it seems like a good
idea to set them anyway, if we e.g. want to add a check to amcheck in
the future to verify that the chain of right-links is complete.

Author: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/4D68C21F-9FB9-41DA-B663-FDFC8D143788%40yandex-team.ru
2020-10-01 11:10:43 +03:00
Thomas Munro dee663f784 Defer flushing of SLRU files.
Previously, we called fsync() after writing out individual pg_xact,
pg_multixact and pg_commit_ts pages due to cache pressure, leading to
regular I/O stalls in user backends and recovery.  Collapse requests for
the same file into a single system call as part of the next checkpoint,
as we already did for relation files, using the infrastructure developed
by commit 3eb77eba.  This can cause a significant improvement to
recovery performance, especially when it's otherwise CPU-bound.

Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer
that it applies to all the SLRU mini-buffer-pools as well as the main
buffer pool.  Rearrange things so that data collected in CheckpointStats
includes SLRU activity.

Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions,
because they were redundant after the shutdown checkpoint that
immediately precedes them.  (I'm not sure if they were ever needed, but
they aren't now.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com>
Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
2020-09-25 19:00:15 +12:00
Peter Eisentraut c005eb00e7 Standardize the printf format for st_size
Existing code used various inconsistent ways to printf struct stat's
st_size member.  The type of that is off_t, which is in most cases a
signed 64-bit integer, so use the long long int format for it.
2020-09-24 21:04:21 +02:00
Thomas Munro aca74843e4 Fix missing fsync of SLRU directories.
Harmonize behavior by moving reponsibility for fsyncing directories down
into slru.c.  In 10 and later, only the multixact directories were
missed (see commit 1b02be21), and in older branches all SLRUs were
missed.

Back-patch to all supported releases.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
2020-09-24 10:39:52 +12:00
Tom Lane 9436041ed8 Copy editing: fix a bunch of misspellings and poor wording.
99% of this is docs, but also a couple of comments.  No code changes.

Justin Pryzby

Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com
2020-09-21 12:43:42 -04:00
Heikki Linnakangas c47a240fe6 Fix checksum calculation in the new sorting GiST build.
Since we're bypassing the buffer manager, we need to call
PageSetChecksumInplace() directly. As reported by Justin Pryzby.

In the passing, add RelationOpenSmgr() calls before all smgrwrite() and
smgrextend() calls. Tom added one before the first smgrextend() call in
commit c2bb287025, which seems to be enough, but let's play it safe and
do it before each one. That's how it's done in the similar code in
nbtsort.c, too.

Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com
2020-09-21 14:50:07 +03:00
Tom Lane c2bb287025 Fix new GIST build code to work under CLOBBER_CACHE_ALWAYS.
Can't say if this fixes *all* cases, but at least we get through
the "point" regression test now, which hyrax's last run did not.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23
2020-09-20 17:08:49 -04:00
Amit Kapila 0d32511eca Fix comments in heapam.c.
After commits 85f6b49c2c and 3ba59ccc89, we can allow parallel inserts
which was earlier not possible as parallel group members won't conflict
for relation extension and page lock.  In those commits, we forgot to
update comments at few places.

Author: Amit Kapila
Reviewed-by: Robert Haas and Dilip Kumar
Backpatch-through: 13
Discussion: https://postgr.es/m/CAFiTN-tMrQh5FFMPx5aWJ+1gi1H6JxktEhq5mDwCHgnEO5oBkA@mail.gmail.com
2020-09-18 09:50:44 +05:30
Amit Kapila b7f2dd959a Update parallel BTree scan state when the scan keys can't be satisfied.
For parallel btree scan to work for array of scan keys, it should reach
BTPARALLEL_DONE state once for every distinct combination of array keys.
This is required to ensure that the parallel workers don't try to seize
blocks at the same time for different scan keys. We missed to update this
state when we discovered that the scan keys can't be satisfied.

Author: James Hunter
Reviewed-by: Amit Kapila
Tested-by: Justin Pryzby
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
2020-09-17 16:11:48 +05:30
Heikki Linnakangas 16fa9b2b30 Add support for building GiST index by sorting.
This adds a new optional support function to the GiST access method:
sortsupport. If it is defined, the GiST index is built by sorting all data
to the order defined by the sortsupport's comparator function, and packing
the tuples in that order to GiST pages. This is similar to how B-tree
index build works, and is much faster than inserting the tuples one by
one. The resulting index is smaller too, because the pages are packed more
tightly, upto 'fillfactor'. The normal build method works by splitting
pages, which tends to lead to more wasted space.

The quality of the resulting index depends on how good the opclass-defined
sort order is. A good order preserves locality of the input data.

As the first user of this facility, add 'sortsupport' function to the
point_ops opclass. It sorts the points in Z-order (aka Morton Code), by
interleaving the bits of the X and Y coordinates.

Author: Andrey Borodin
Reviewed-by: Pavel Borisov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
2020-09-17 11:33:40 +03:00
David Rowley 10a5b35a00 Report resource usage at the end of recovery
Reporting this has been rather useful in some recent recovery speedup
work.  It also seems like something that will be useful to the average DBA
too.

Author: David Rowley
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CAApHDvqYVORiZxq2xPvP6_ndmmsTkvr6jSYv4UTNaFa5i1kd%3DQ%40mail.gmail.com
2020-09-16 11:25:46 +12:00
Peter Eisentraut 3e0242b24c Message fixes and style improvements 2020-09-14 06:42:30 +02:00
Alvaro Herrera 9f1cf97bb5
Print WAL logical message contents in pg_waldump
This helps debuggability when looking at WAL streams containing logical
messages.

Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAExHW5sWx49rKmXbg5H1Xc1t+nRv9PaYKQmgw82HPt6vWDVmDg@mail.gmail.com
2020-09-10 19:37:02 -03:00
Tom Lane a5cc4dab6d Yet more elimination of dead stores and useless initializations.
I'm not sure what tool Ranier was using, but the ones I contributed
were found by using a newer version of scan-build than I tried before.

Ranier Vilela and Tom Lane

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-05 13:17:32 -04:00
Alvaro Herrera f43e295f68
Report expected contrecord length on mismatch
When reading a WAL record fails to find continuation record(s) of the
proper length, report what it expects, for clarity.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200903212152.GA15319@alvherre.pgsql
2020-09-04 14:58:32 -04:00
Tom Lane 38a2d70329 Remove some more useless assignments.
Found with clang's scan-build tool.  It also whines about a lot of
other dead stores that we should *not* change IMO, either as a matter
of style or future-proofing.  But these places seem like clear
oversights.

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-04 14:32:19 -04:00
Bruce Momjian e36e936e0e remove redundant initializations
Reported-by: Ranier Vilela

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com

Author: Ranier Vilela

Backpatch-through: master
2020-09-03 22:57:35 -04:00
Michael Paquier 1d65416661 Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEM
When multiple relations are reindexed, a scan of pg_class is done first
to build the list of relations to work on.  However the REINDEX logic
has never checked if a relation listed still exists when beginning the
work on it, causing for example sudden cache lookup failures.

This commit adds safeguards against dropped relations for REINDEX,
similarly to VACUUM or CLUSTER where we try to open the relation,
ignoring it if it is missing.  A new option is added to the REINDEX
routines to control if a missed relation is OK to ignore or not.

An isolation test, based on REINDEX SCHEMA, is added for the concurrent
and non-concurrent cases.

Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz
2020-09-02 09:08:12 +09:00
Tom Lane a7212be8b9 Set cutoff xmin more aggressively when vacuuming a temporary table.
Since other sessions aren't allowed to look into a temporary table
of our own session, we do not need to worry about the global xmin
horizon when setting the vacuum XID cutoff.  Indeed, if we're not
inside a transaction block, we may set oldestXmin to be the next
XID, because there cannot be any in-doubt tuples in a temp table,
nor any tuples that are dead but still visible to some snapshot of
our transaction.  (VACUUM, of course, is never inside a transaction
block; but we need to test that because CLUSTER shares the same code.)

This approach allows us to always clean out a temp table completely
during VACUUM, independently of concurrent activity.  Aside from
being useful in its own right, that simplifies building reproducible
test cases.

Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us
2020-09-01 18:40:43 -04:00
Tom Lane 3d351d916b Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty".  In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated.  That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1.  If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant.  Still,
it is an API change.

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
2020-08-30 12:21:51 -04:00
Tom Lane 10564ee02c Fix code for re-finding scan position in a multicolumn GIN index.
collectMatchBitmap() needs to re-find the index tuple it was previously
looking at, after transiently dropping lock on the index page it's on.
The tuple should still exist and be at its prior position or somewhere
to the right of that, since ginvacuum never removes tuples but
concurrent insertions could add one.  However, there was a thinko in
that logic, to the effect of expecting any inserted tuples to have the
same index "attnum" as what we'd been scanning.  Since there's no
physical separation of tuples with different attnums, it's not terribly
hard to devise scenarios where this fails, leading to transient "lost
saved point in index" errors.  (While I've duplicated this with manual
testing, it seems impossible to make a reproducible test case with our
available testing technology.)

Fix by just continuing the scan when the attnum doesn't match.

While here, improve the error message used if we do fail, so that it
matches the wording used in btree for a similar case.

collectMatchBitmap()'s posting-tree code path was previously not
exercised at all by our regression tests.  While I can't make
a regression test that exhibits the bug, I can at least improve
the code coverage here, so do that.  The test case I made for this
is an extension of one added by 4b754d6c1, so it only works in
HEAD and v13; didn't seem worth trying hard to back-patch it.

Per bug #16595 from Jesse Kinkead.  This has been broken since
multicolumn capability was added to GIN (commit 27cb66fdf),
so back-patch to all supported branches.

Discussion: https://postgr.es/m/16595-633118be8eef9ce2@postgresql.org
2020-08-27 17:36:13 -04:00
Amit Kapila 7e453634bb Add additional information in the vacuum error context.
The additional information added will be an offset number for heap
operations. This information will help us in finding the exact tuple due
to which the error has occurred.

Author: Mahendra Singh Thalor and Amit Kapila
Reviewed-by: Sawada Masahiko, Justin Pryzby and Amit Kapila
Discussion: https://postgr.es/m/CAKYtNApK488TDF4bMbw+1QH8HJf9cxdNDXquhU50TK5iv_FtCQ@mail.gmail.com
2020-08-26 09:40:52 +05:30
Amit Kapila a3c66de6c5 Improve the vacuum error context phase information.
We were displaying the wrong phase information for 'info' message in the
index clean up phase because we were switching to the previous phase a bit
early. We were also not displaying context information for heap phase
unless the block number is valid which is fine for error cases but for
messages at 'info' or lower error level it appears to be inconsistent with
index phase information.

Reported-by: Sawada Masahiko
Author: Sawada Masahiko
Reviewed-by: Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CA+fd4k4HcbhPnCs7paRTw1K-AHin8y4xKomB9Ru0ATw0UeTy2w@mail.gmail.com
2020-08-24 08:16:19 +05:30
Andres Freund c62a0a49f3 Revert "Make vacuum a bit more verbose to debug BF failure."
This reverts commit 49967da65a.

Enough time has passed that we can be confident that 07f32fcd23
resolved the issue. Therefore we can remove the temporary debugging
aids.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/E1k7tGP-0005V0-5k@gemulon.postgresql.org
2020-08-20 12:59:00 -07:00
Heikki Linnakangas a28d731a11 Mark commit and abort WAL records with XLR_SPECIAL_REL_UPDATE.
If a commit or abort record includes "dropped relfilenodes", then replaying
the record will remove data files. That is surely a "special rel update",
but the records were not marked as such. Fix that, teach pg_rewind to
expect and ignore them, and add a test case to cover it.

It's always been like this, but no backporting for fear of breaking
existing applications. If an application parsed the WAL but was not
handling commit/abort records, it would stop working. That might be a good
thing if it really needed to handle the dropped rels, but it will be caught
when the application is updated to work with PostgreSQL v14 anyway.

Discussion: https://www.postgresql.org/message-id/07b33e2c-46a6-86a1-5f9e-a7da73fddb95%40iki.fi
Reviewed-by: Amit Kapila, Michael Paquier
2020-08-17 10:52:58 +03:00