Commit Graph

38025 Commits

Author SHA1 Message Date
Michael Paquier 8a4237908c Fix snapshot builds during promotion of hot standby node with 2PC
Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.

Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.

As SharedRecoveryState is the point to take into account to know if it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so as we can never finish with empty snapshots.

This exists since the introduction of hot standby, so backpatch all the
way down.  The window with incorrect snapshots is extremely small, but I
have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm
member fairywren.  Thomas Munro also found it independently.  Special
thanks to Andres Freund for taking the time to analyze this issue.

Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
2021-10-04 14:05:20 +09:00
Tom Lane a0558cfa39 Fix checking of query type in plpgsql's RETURN QUERY command.
Prior to v14, we insisted that the query in RETURN QUERY be of a type
that returns tuples.  (For instance, INSERT RETURNING was allowed,
but not plain INSERT.)  That happened indirectly because we opened a
cursor for the query, so spi.c checked SPI_is_cursor_plan().  As a
consequence, the error message wasn't terribly on-point, but at least
it was there.

Commit 2f48ede08 lost this detail.  Instead, plain RETURN QUERY
insisted that the query be a SELECT (by checking for SPI_OK_SELECT)
while RETURN QUERY EXECUTE failed to check the query type at all.
Neither of these changes was intended.

The only convenient place to check this in the EXECUTE case is inside
_SPI_execute_plan, because we haven't done parse analysis until then.
So we need to pass down a flag saying whether to enforce that the
query returns tuples.  Fortunately, we can squeeze another boolean
into struct SPIExecuteOptions without an ABI break, since there's
padding space there.  (It's unlikely that any extensions would
already be using this new struct, but preserving ABI in v14 seems
like a smart idea anyway.)

Within spi.c, it seemed like _SPI_execute_plan's parameter list
was already ridiculously long, and I didn't want to make it longer.
So I thought of passing SPIExecuteOptions down as-is, allowing that
parameter list to become much shorter.  This makes the patch a bit
more invasive than it might otherwise be, but it's all internal to
spi.c, so that seems fine.

Per report from Marc Bachmann.  Back-patch to v14 where the
faulty code came in.

Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com
2021-10-03 13:21:20 -04:00
Peter Geoghegan 2903f1404d Enable deduplication in system catalog indexes.
The "equality implies image equality" opclass infrastructure disallowed
deduplication in system catalog indexes and TOAST indexes before now.
That seemed like the right approach back when the infrastructure was
added by commit 612a1ab7, since ALTER INDEX cannot set deduplicate_items
to 'off' (due to an old implementation restriction).  But that decision
now seems arbitrary at best.  Remove special case handling implementing
this policy.

No catversion bump, since existing catalog indexes will still work.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=rYQHFaJ3WYBdK=xgwxKzaiGMSSrh-ZCREa-pS-7Zjew@mail.gmail.com
2021-10-02 17:12:59 -07:00
Tom Lane 9b8d68cc65 Update our mapping of Windows time zone names using CLDR info.
This corrects a bunch of entries in win32_tzmap[], and adds a few
new ones, based on the CLDR project's windowsZones.xml file.
Non-cosmetic changes fall into four main categories:

* Flat-out errors:

US/Aleutan doesn't exist
America/Salvador doesn't exist
Asia/Baku is wrong for Yerevan
Asia/Dhaka (Bangladesh) is wrong for Astana (Kazakhstan)
Europe/Bucharest is wrong for Chisinau
America/Mexico_City is wrong for Chetumal
America/Buenos_Aires is wrong for Cayenne
America/Caracas has its own zone, so poor fit for La Paz
US/Eastern is wrong for Haiti
US/Eastern is wrong for Indiana (East)
Asia/Karachi is wrong for Tashkent
Etc/UTC+12 doesn't exist
Signs of Etc/GMT zones were backwards

* Judgment calls:

(These changes follow CLDR's choices, except for the first one)

Use Europe/London for "Greenwich Standard Time", since that seems much
more likely than Africa/Casablanca to be what people will think that
zone name means.  CLDR has Atlantic/Reykjavik here, but that's no better.

Asia/Shanghai seems a better fit than Hong Kong for "China Standard
Time".

Europe/Sarajevo is now a link to Belgrade, ie "Central Europe Standard
Time"; so use Warsaw for "Central European Standard Time".

America/Sao_Paulo seems more representative than Araguaina for
"E. South America Standard Time".

Africa/Johannesburg seems more representative than Harare for
"South Africa Standard Time".

* New Windows zone names:

"Israel Standard Time"
"Kaliningrad Standard Time"
"Russia Time Zone N" for various N
"Singapore Standard Time"
"South Sudan Standard Time"
"W. Central Africa Standard Time"
"West Bank Standard Time"
"Yukon Standard Time"

Some of these replace older spellings, but I kept the older spellings
too in case our code runs on a machine with the older data.

* Replace aliases (tzdb Links) with underlying city-named zones:

(This tracks tzdb's longstanding practice, and reduces inconsistency
with the rest of the entries, as well as with CLDR.)

US/Alaska
Asia/Kuwait
Asia/Muscat
Canada/Atlantic
Australia/Canberra
Canada/Saskatchewan
US/Central
US/Eastern
US/Hawaii
US/Mountain
Canada/Newfoundland
US/Pacific

Back-patch to all supported branches, as is our usual practice for
time zone data updates.

Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
2021-10-02 16:05:42 -04:00
Tom Lane ad740067ae Re-alphabetize the win32_tzmap[] array.
The original intent seems to have been to sort case-insensitively
by the Windows zone name, but various changes over the years did
not get that memo.  This commit just moves a few entries to
restore exact alphabetic order, to ease comparison to the outputs
of processing scripts.

Back-patch to all supported branches, as is our usual practice for
time zone data updates.

Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
2021-10-02 16:05:10 -04:00
Andres Freund 795862c280 Reference test binary using TESTDIR in 001_libpq_pipeline.pl.
The previous approach didn't really work on windows, due to the PATH separator
being ';' not ':'. Instead of making the PATH change more complicated,
reference the binary using the TESTDIR environment.

Reported-By: Andres Freund <andres@anarazel.de>
Suggested-By: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20210930214040.odkdd42vknvzifm6@alap3.anarazel.de
Backpatch: 14-, where the test was introduced.
2021-10-01 15:30:16 -07:00
Alvaro Herrera c6bc655ee2
Error out if SKIP LOCKED and WITH TIES are both specified
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes
to rows returned to other sessions accessing the same row.  Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid for now, with the potential to fix in the future.

[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org

Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
2021-10-01 18:29:18 -03:00
Alvaro Herrera d186d233df
Remove unstable, unnecessary test; fix typo
Commit ff9f111bce added some test code that's unportable and doesn't
add meaningful coverage.  Remove it rather than try and get it to work
everywhere.

While at it, fix a typo in a log message added by the aforementioned
commit.

Backpatch to 14.

Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
2021-10-01 18:03:11 -03:00
Daniel Gustafsson 0ded7039fa Fix memory leak in pg_hmac
The intermittent h buffer was not freed, causing it to leak. Backpatch
through 14 where HMAC was refactored to the current API.

Author: Sergey Shinderuk <s.shinderuk@postgrespro.ru>
Discussion: https://postgr.es/m/af07e620-7e28-a742-4637-2bc44aa7c2be@postgrespro.ru
Backpatch-through: 14
2021-10-01 22:47:05 +02:00
Tom Lane 8c1144ba73 Avoid believing incomplete MCV-only stats in get_variable_range().
get_variable_range() would incautiously believe that statistics
containing only an MCV list are sufficient to derive a range estimate.
That's okay for an enum-like column that contains only MCVs, but
otherwise the estimate could be pretty bad.  Make it report that the
range is indeterminate unless the MCVs plus nullfrac account for
the whole table.

I don't think this needs a dedicated test case, since a quick code
coverage check verifies that the existing regression tests traverse
all the alternatives.  There is room to doubt that a future-proof
test case could be built anyway, given that the submitted example
accidentally doesn't fail before v11.

Per bug #17207 from Simon Perepelitsa.  Back-patch to v10.
In principle this has been broken all along, but I'm hesitant to
make such changes in 9.6, since if anyone is unhappy with 9.6.24's
behavior there will be no second chance to fix it.

Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
2021-10-01 14:59:35 -04:00
Tom Lane 7b5d4c29ed Fix Portal snapshot tracking to handle subtransactions properly.
Commit 84f5c2908 forgot to consider the possibility that
EnsurePortalSnapshotExists could run inside a subtransaction with
lifespan shorter than the Portal's.  In that case, the new active
snapshot would be popped at the end of the subtransaction, leaving
a dangling pointer in the Portal, with mayhem ensuing.

To fix, make sure the ActiveSnapshot stack entry is marked with
the same subtransaction nesting level as the associated Portal.
It's certainly safe to do so since we won't be here at all unless
the stack is empty; hence we can't create an out-of-order stack.

Let's also apply this logic in the case where PortalRunUtility
sets portalSnapshot, just to be sure that path can't cause similar
problems.  It's slightly less clear that that path can't create
an out-of-order stack, so add an assertion guarding it.

Report and patch by Bertrand Drouvot (with kibitzing by me).
Back-patch to v11, like the previous commit.

Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
2021-10-01 11:10:12 -04:00
David Rowley 16239c5fdf Ensure interleaved_parts field is always initialized
This field was recently added in db632fbca, however that commit missed one
place where it should have initialized the new field to NULL.  The missed
location is where the PartitionBoundInfo is created for partition-wise
join relations.  Technically there could be interleaved partitions in a
partition-wise join relation, but currently the only optimization we use
this field for only does so for base rels and other member rels.  So just
document that we don't populate this field for join rels.

Reported-by: Amit Langote
Author: Amit Langote, David Rowley
Reviewed-by: Amit Langote, David Rowley
Discussion: https://postgr.es/m/CA+HiwqE76Rps24kwHsd2Cr82Ua07tJC9t9reG0c7ScX9n_xrEA@mail.gmail.com
2021-10-01 15:09:49 +13:00
Tom Lane 20f8671ef6 Remove gratuitous environment dependency in 002_types.pl test.
Computing related timestamps by subtracting "N days" is sensitive
to the prevailing timezone, since we interpret that as "same local
time on the N'th prior day".  Even though the intervals in question
are only two to four days, through remarkable bad luck they managed
to cross the end of Ramadan in 2014, causing the test's output to
change if timezone is set to Africa/Casablanca.  (Maybe in other
Muslim areas as well; I didn't check.)  There's absolutely no reason
for this test to exercise interval subtraction, so just get rid of
that and use plain timestamptz constants representing the intended
values.

Per report from Andres Freund.  Back-patch to v10 where this test
script came in.

Discussion: https://postgr.es/m/20210930183641.7lh4jhvpipvromca@alap3.anarazel.de
2021-09-30 16:23:10 -04:00
Tom Lane b484ddf4d2 Treat ETIMEDOUT as indicating a non-recoverable connection failure.
Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS' list of "errnos that
identify hard failure of a previously-established network connection".
While one could imagine that this is sometimes recoverable, the same
could be said of other entries such as ENETDOWN.

In support of this, handle ETIMEDOUT on par with other socket errors
in relevant infrastructure, such as TranslateSocketError().
(I made a couple of cosmetic adjustments in TranslateSocketError(),
too.)  The code now assumes that ETIMEDOUT is defined everywhere,
which it should be given that POSIX has required it since SUSv2.

Perhaps this should be back-patched, but I'm hesitant to do so given
the lack of previous complaints, and the hazard that there's a small
ABI break on Windows from redefining the symbol.  Even if we decide
to do that, it'd be prudent to let this bake awhile in HEAD first.

Jelte Fennema

Discussion: https://postgr.es/m/AM5PR83MB01782BFF2978505F6D6C559AF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com
2021-09-30 14:16:08 -04:00
Alvaro Herrera d03bca4d70
Repair two portability oversights of new test
First, as pointed out by Tom Lane and Michael Paquier, I failed to
realize that Windows' PostgresNode needs an extra pg_hba.conf line
(added by PostgresNode->set_replication_conf, called internally by
->init() when 'allows_streaming=>1' is given -- but I purposefully
omitted that).  I think a good fix should be to have nodes with only
'has_archiving=>1' set up for replication too, but that's a bigger
discussion.  Fix it by calling ->set_replication_conf, which is not
unprecedented, as pointed out by Andrew Dunstan.

I also forgot to uncomment a ->finish() call for a pumpable IPC::Run
file descriptor.  Apparently this is innocuous in almost all platforms.

Backpatch to 14.  The older branches were added this file too, but not
this particular part of the test.

Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
Discussion: https://postgr.es/m/YVT7qwhR8JmC2kfz@paquier.xyz
2021-09-30 10:01:03 -03:00
Peter Eisentraut 14d755b000 psql: Add various tests
Add tests for psql features

- AUTOCOMMIT
- ON_ERROR_ROLLBACK
- ECHO errors

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/6954328d-96f2-77f7-735f-7ce493a40949%40enterprisedb.com
2021-09-29 23:17:10 +02:00
Alvaro Herrera ff9f111bce
Fix WAL replay in presence of an incomplete record
Physical replication always ships WAL segment files to replicas once
they are complete.  This is a problem if one WAL record is split across
a segment boundary and the primary server crashes before writing down
the segment with the next portion of the WAL record: WAL writing after
crash recovery would happily resume at the point where the broken record
started, overwriting that record ... but any standby or backup may have
already received a copy of that segment, and they are not rewinding.
This causes standbys to stop following the primary after the latter
crashes:
  LOG:  invalid contrecord length 7262 at A8/D9FFFBC8
because the standby is still trying to read the continuation record
(contrecord) for the original long WAL record, but it is not there and
it will never be.  A workaround is to stop the replica, delete the WAL
file, and restart it -- at which point a fresh copy is brought over from
the primary.  But that's pretty labor intensive, and I bet many users
would just give up and re-clone the standby instead.

A fix for this problem was already attempted in commit 515e3d84a0, but
it only addressed the case for the scenario of WAL archiving, so
streaming replication would still be a problem (as well as other things
such as taking a filesystem-level backup while the server is down after
having crashed), and it had performance scalability problems too; so it
had to be reverted.

This commit fixes the problem using an approach suggested by Andres
Freund, whereby the initial portion(s) of the split-up WAL record are
kept, and a special type of WAL record is written where the contrecord
was lost, so that WAL replay in the replica knows to skip the broken
parts.  With this approach, we can continue to stream/archive segment
files as soon as they are complete, and replay of the broken records
will proceed across the crash point without a hitch.

Because a new type of WAL record is added, users should be careful to
upgrade standbys first, primaries later. Otherwise they risk the standby
being unable to start if the primary happens to write such a record.

A new TAP test that exercises this is added, but the portability of it
is yet to be seen.

This has been wrong since the introduction of physical replication, so
backpatch all the way back.  In stable branches, keep the new
XLogReaderState members at the end of the struct, to avoid an ABI
break.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
2021-09-29 11:21:51 -03:00
Fujii Masao 2acb7cc6b5 pgbench: Fix handling of socket errors during benchmark.
Previously socket errors such as invalid socket or socket wait method failures
during benchmark caused pgbench to exit with status 0. Instead, errors during
the run should result in exit status 2.

Back-patch to v12 where pgbench started reporting exit status.

Original complaint and patch by Hayato Kuroda.

Author: Yugo Nagata, Fabien COELHO
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-09-29 21:48:52 +09:00
Fujii Masao d336747089 pgbench: Correct log level of message output when socket wait method fails.
The failure of socket wait method like "select()" doesn't terminate pgbench.
So the log level of error message when that failure happens should be ERROR.
But previously FATAL was used in that case.

Back-patch to v13 where pgbench started using common logging API.

Author: Yugo Nagata, Fabien COELHO
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/20210617005934.8bd37bf72efd5f1b38e6f482@sraoss.co.jp
2021-09-29 21:46:56 +09:00
Michael Paquier 070d2e19e4 Clarify use of "statistics objects" in the code
The code inconsistently used "statistic object" or "statistics" where
the correct term, as discussed, is actually "statistics object".  This
improves the state of the code to be more consistent.

While on it, fix an incorrect error message introduced in a4d75c8.  This
error should never happen, as the code states, but it would be
misleading.

Author: Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
Backpatch-through: 14
2021-09-29 15:29:38 +09:00
Peter Eisentraut 0b947c3101 Fix incorrect format placeholder 2021-09-29 08:12:23 +02:00
Michael Paquier 5b0b699f74 Refactor output file handling when forking syslogger under EXEC_BACKEND
A forked logging collector in EXEC_BACKEND builds passes down file
descriptors (or HANDLEs in WIN32) through a command for files to be
reopened (for stderr and csvlog).  Some of its logic was duplicated, and
this commit refactors the code with some wrapper routines for file
reopening after forking and fd grabbing when building the command for
the fork.

While on it, this simplifies a use of "long" in the code, introduced by
ab0ba6e to take care of a warning related to MinGW-W64 when mapping a
intptr_t to a printed value.  "long" is 32-bit long on Windows, and
interoperability of Win32 and Win64 ensures that handles are always
32-bit significant, so we can just use "int" for the same result.  This
also makes the new routines more symmetric.

This change makes easier the introduction of new log destinations in the
logging collector, and this is not the only piece of refactoring
planned.  I have tested this change with EXEC_BACKEND on linux, macos,
and of course MSVC (both Win32 and Win64), but not MinGW so the
buildfarm may have something to say here.

Author: Sehrope Sarkuni, Michael Paquier
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
2021-09-29 10:54:45 +09:00
Magnus Hagander 07f8a9e784 Properly schema-prefix reference to pg_catalog.pg_get_statisticsobjdef_columns
Author: Tatsuro Yamada
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/7ad8cd13-db5b-5cf6-8561-dccad1a934cb@nttcom.co.jp
2021-09-28 16:23:18 +02:00
Peter Eisentraut c3b011d991 Support amcheck of sequences
Sequences were left out of the list of relation kinds that
verify_heapam knew how to check, though it is fairly trivial to allow
them.  Doing that, and while at it, updating pg_amcheck to include
sequences in relations matched by table and relation patterns.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/81ad4757-92c1-4aa3-7bee-f609544837e3%40enterprisedb.com
2021-09-28 15:26:25 +02:00
Michael Paquier e767ddcd35 Fix typos and grammar in code comments
Several mistakes have piled in the code comments over the time,
including incorrect grammar, function names and simple typos.  This
commit takes care of a portion of these.

No backpatch is done as this is only cosmetic.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
2021-09-27 14:21:28 +09:00
Peter Geoghegan 895267a326 Remove unneeded nbtree latestRemovedXid comments.
Discussing the low level issue of nbtree VACUUM and recovery conflicts
in btvacuumpage() now seems inappropriate.  The same issue is discussed
in nbtxlog.h, as well as in a comment block above _bt_delitems_vacuum().

The comment block made more sense when it was part of a broader
discussion of nbtree VACUUM "pin scans".  These were removed by commit
9f83468b.
2021-09-26 20:25:14 -07:00
Thomas Munro e6a7600202 Track LLVM 14 API changes.
Only done on the master branch for now to fix build farm animal seawasp
(which tests bleeeding edge PostgreSQL with bleeding edge LLVM).  We can
back-patch a consolidated fix closer to LLVM 14's release, once its API
has stopped moving around.

Discussion: https://postgr.es/m/CA%2BhUKGL%3Dyg6qqgg6W6SAuvRQejditeoDNy-X3b9H_6Fnw8j5Wg%40mail.gmail.com
2021-09-27 10:53:20 +13:00
Tom Lane e94c1a55da Avoid unnecessary division in interval_cmp_value().
Splitting the time field into days and microseconds is pretty
useless when we're just going to recombine those values.
It's unclear if anyone will notice the speedup in real-world
cases, but a cycle shaved is a cycle earned.

Discussion: https://postgr.es/m/2629129.1632675713@sss.pgh.pa.us
2021-09-26 14:24:03 -04:00
Peter Geoghegan ce2a860533 Update obsolete nbtree deletion comments.
_bt_delitems_delete() is no longer the high-level entry point used by
index tuple deletion driven by index tuples whose LP_DEAD bits are set
(now called "simple index tuple deletion").  It became a lower level
routine that's only called by _bt_delitems_delete_check() following
commit d168b66682.
2021-09-25 15:05:56 -07:00
Peter Geoghegan c1a47dfe2e vacuumlazy.c: Remove obsolete 'onecall' comment.
Remove obsolete reference to lazy_vacuum()'s onecall argument.  The
function argument was removed by commit 3499df0dee.

Also remove adjoining comment block that introduces the wraparound
failsafe concept.  Talking about the failsafe here no longer makes
sense, since lazy_vacuum() (and related functions) are no longer the
only place where the failsafe might be triggered.  This has been the
case since commit c242baa4a8 taught VACUUM to consider triggering the
failsafe mechanism during its initial heap scan.
2021-09-25 10:22:53 -07:00
Peter Geoghegan 48064a8d33 nbtree README: Add note about latestRemovedXid.
Point out that index tuple deletion generally needs a latestRemovedXid
value for the deletion operation's WAL record.  This is bound to be the
most expensive part of the whole deletion operation now that it takes
place up front, during original execution.

This was arguably an oversight in commit 558a9165e0, which moved the
work required to generate these values from index deletion REDO routines
to original execution of index deletion operations.
2021-09-24 13:53:48 -07:00
Peter Eisentraut 73aa5e0caf Add missing $Test::Builder::Level settings
One of these was accidentally removed by c50624c.  The others are
added by analogy.

Discussion: https://www.postgresql.org/message-id/ae1143fb-455c-c80f-ed66-78d45bd93303@enterprisedb.com
2021-09-23 23:07:09 +02:00
John Naylor 88b0ae15bc Add exception for unicode_east_asian_fw_table.h to headerscheck also
Followup to a315b19cc
2021-09-23 15:30:25 -04:00
John Naylor a315b19cce Add exception for unicode_east_asian_fw_table.h to cpluspluscheck
unicode_east_asian_fw_table.h should not be compiled standalone, similarly
to unicode_combining_table.h, but cpluspluscheck did not get the memo.

Oversight in bab982161.

Per report from Tom Lane
2021-09-23 15:23:48 -04:00
Alexander Korotkov b92f9f7443 Split macros from visibilitymap.h into a separate header
That allows to include just visibilitymapdefs.h from file.c, and in turn,
remove include of postgres.h from relcache.h.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de
Author: Alexander Korotkov
Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera
Backpatch-through: 13
2021-09-23 19:59:03 +03:00
Tomas Vondra ad8a166ca8 Release memory allocated by dependency_degree
Calculating degree of a functional dependency may allocate a lot of
memory - we have released mot of the explicitly allocated memory, but
e.g. detoasted varlena values were left behind. That may be an issue,
because we consider a lot of dependencies (all combinations), and the
detoasting may happen for each one again.

Fixed by calling dependency_degree() in a dedicated context, and
resetting it after each call. We only need the calculated dependency
degree, so we don't need to copy anything.

Backpatch to PostgreSQL 10, where extended statistics were introduced.

Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
2021-09-23 18:13:36 +02:00
Tomas Vondra 83772cc78e Free memory after building each statistics object
Until now, all extended statistics on a given relation were built in the
same memory context, without resetting. Some of the memory was released
explicitly, but not all of it - for example memory allocated while
detoasting values is hard to free. This is how it worked since extended
statistics were introduced in PostgreSQL 10, but adding support for
extended stats on expressions made the issue somewhat worse as it
increases the number of statistics to build.

Fixed by adding a memory context which gets reset after building each
statistics object (all the statistics kinds included in it). Resetting
it after building each statistics kind would be even better, but it
would require more invasive changes and copying of results, making it
harder to backpatch.

Backpatch to PostgreSQL 10, where extended statistics were introduced.

Author: Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
2021-09-23 18:05:10 +02:00
Peter Geoghegan c7aeb775df Document issue with heapam line pointer truncation.
Checking that an offset number isn't past the end of a heap page's line
pointer array was just a defensive sanity check for HOT-chain traversal
code before commit 3c3b8a4b.  It's etrictly necessary now, though.  Add
comments that reference the issue to code in heapam that needs to get it
right.

Per suggestion from Alexander Lakhin.

Discussion: https://postgr.es/m/f76a292c-9170-1aef-91a0-59d9443b99a3@gmail.com
2021-09-22 19:21:36 -07:00
Peter Eisentraut f9ea296031 Make use of PG_INT64_MAX/PG_INT64_MIN
This code was written before those symbols were introduced, but now we
can simplify it.
2021-09-22 07:31:05 +02:00
Amit Kapila 4548c76738 Invalidate all partitions for a partitioned table in publication.
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache and the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a publication
which will prohibit Updates/Deletes on its partition without replica
identity even without any publication.

Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-22 08:00:54 +05:30
Amit Kapila 5e77625b26 Add parent table name in an error in reorderbuffer.c.
This can help in troubleshooting the cause of a particular error that can
occur during decoding.

Author: Jeremy Schneider
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com
2021-09-22 07:42:52 +05:30
Peter Geoghegan dd94c2852e Fix "single value strategy" index deletion issue.
It is not appropriate for deduplication to apply single value strategy
when triggered by a bottom-up index deletion pass.  This wastes cycles
because later bottom-up deletion passes will overinterpret older
duplicate tuples that deduplication actually just skipped over "by
design".  It also makes bottom-up deletion much less effective for low
cardinality indexes that happen to cross a meaningless "index has single
key value per leaf page" threshold.

To fix, slightly narrow the conditions under which deduplication's
single value strategy is considered.  We already avoided the strategy
for a unique index, since our high level goal must just be to buy time
for VACUUM to run (not to buy space).  We'll now also avoid it when we
just had a bottom-up pass that reported failure.  The two cases share
the same high level goal, and already overlapped significantly, so this
approach is quite natural.

Oversight in commit d168b666, which added bottom-up index deletion.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznaOvM+Gyj-JQ0X=JxoMDxctDTYjiEuETdAGbF5EUc3MA@mail.gmail.com
Backpatch: 14-, where bottom-up deletion was introduced.
2021-09-21 18:57:32 -07:00
Michael Paquier 1a9d802828 Fix some issues with TAP tests for postgres -C
This addresses two issues with the tests added in 0c39c292 for runtime
GUCs:
- Re-enable the test on Msys.  The test could fail because of \r\n
generated by Msys perl.  0d91c52a has taken care of this issue.
- Allow the test to run in the context of a privileged account.  CIs
running under privileged accounts would fail on permission failures, as
reported by Andres Freund.  This issue is fixed by wrapping the postgres
command within pg_ctl as the latter will take care of any permissions
needed.  The test checking a failure of postgres -C for a runtime
parameter with an instance running is removed, as pg_ctl produces an
unstable error code (no need for a CI to reproduce that).

Discussion: https://postgr.es/m/20210921032040.lyl4lcax37aedx2x@alap3.anarazel.de
2021-09-22 10:13:38 +09:00
Michael Paquier 0d91c52a8f Fix places in TestLib.pm in need of adaptation to the output of Msys perl
Contrary to the output of native perl, Msys perl generates outputs with
CRLFs characters.  There are already places in the TAP code where CRLFs
(\r\n) are automatically converted to LF (\n) on Msys, but we missed a
couple of places when running commands and using their output for
comparison, that would lead to failures.

This problem has been found thanks to the test added in 5adb067 using
TestLib::command_checks_all(), but after a closer look more code paths
were missing a filter.

This is backpatched all the way down to prevent any surprises if a new
test is introduced in stable branches.

Reviewed-by: Andrew Dunstan, Álvaro Herrera
Discussion: https://postgr.es/m/1252480.1631829409@sss.pgh.pa.us
Backpatch-through: 9.6
2021-09-22 08:42:42 +09:00
Tom Lane 4476bcb877 Fix misevaluation of STABLE parameters in CALL within plpgsql.
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot.  I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement.  That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list.  As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.

(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem.  There are no STABLE procedures.)

We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself.  Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.

Per bug #17199 from Alexander Nawratil.  Back-patch to v11,
like the previous patch.

Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
2021-09-21 19:06:53 -04:00
Alvaro Herrera ade24dab97
Document XLOG_INCLUDE_XID a little better
I noticed that commit 0bead9af48 left this flag undocumented in
XLogSetRecordFlags, which led me to discover that the flag doesn't
actually do what the one comment on it said it does.  Improve the
situation by adding some more comments.

Backpatch to 14, where the aforementioned commit appears.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202109212119.c3nhfp64t2ql@alvherre.pgsql
2021-09-21 19:47:53 -03:00
Michael Paquier 43c1c4f65e Introduce GUC shared_memory_size_in_huge_pages
This runtime-computed GUC shows the number of huge pages required
for the server's main shared memory area, taking advantage of the
work done in 0c39c29 and 0bd305e.  This is useful for users to estimate
the amount of huge pages required for a server as it becomes possible to
do an estimation without having to start the server and potentially
allocate a large chunk of shared memory.

The number of huge pages is calculated based on the existing GUC
huge_page_size if set, or by using the system's default by looking at
/proc/meminfo on Linux.  There is nothing new here as this commit reuses
the existing calculation methods, and just exposes this information
directly to the user.  The routine calculating the huge page size is
refactored to limit the number of files with platform-specific flags.

This new GUC's name was the most popular choice based on the discussion
done.  This is only supported on Linux.

I have taken the time to test the change on Linux, Windows and MacOS,
though for the last two ones large pages are not supported.  The first
one calculates correctly the number of pages depending on the existing
GUC huge_page_size or the system's default.

Thanks to Andres Freund, Robert Haas, Kyotaro Horiguchi, Tom Lane,
Justin Pryzby (and anybody forgotten here) for the discussion.

Author: Nathan Bossart
Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
2021-09-21 10:31:58 +09:00
Peter Geoghegan 5e6716cde5 Remove overzealous index deletion assertion.
A broken HOT chain is not an unexpected condition, even when the offset
number points past the end of the page's line pointer array.
heap_prune_chain() does not (and never has) treated this condition as
unexpected, so derivative code in heap_index_delete_tuples() shouldn't
do so either.

Oversight in commit 4228817449.

The assertion can probably only fail on Postgres 14 and master.  Earlier
releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the
line pointer array of heap pages.  Backpatch all the same, just to be
consistent.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org
Backpatch: 12-, just like commit 4228817449.
2021-09-20 14:26:25 -07:00
Andres Freund 6b9501660c pgstat: Prepare to use mechanism for truncated rels also for droppped rels.
The upcoming shared memory stats patch drops stats for dropped objects in a
transactional manner, rather than removing them later as part of vacuum. This
means that stats for DROP inside a transaction needs to handle aborted
(sub-)transactions similar to TRUNCATE: The stats up to the DROP should be
restored.

Rename the existing infrastructure in preparation.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
2021-09-20 14:02:48 -07:00
Andres Freund e1f958d759 pgstat: Split out relation stats handling from AtEO[Sub]Xact_PgStat() etc.
An upcoming patch will add additional work to these functions. To avoid the
functions getting too complicated / doing too many things at once, split out
sub-tasks into their own functions.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
2021-09-20 13:56:16 -07:00