From 68601985e699adeb267636fd19d3d6113554bd1f Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Wed, 6 Oct 2021 00:16:03 +0900 Subject: [PATCH] Make recovery report error message when invalid page header is found. Commit 0668719801 changed XLogPageRead() so that it validated the page header, if invalid page header was found reset the error message and retried reading the page, to fix the scenario where streaming standby got stuck at a continuation record. This change hid the error message about invalid page header, which would make it harder for users to investigate what the actual issue was found in WAL. To fix the issue, this commit makes XLogPageRead() report the error message when invalid page header is found. When not in standby mode, an invalid page header should cause recovery to end, not retry reading the page, so XLogPageRead() doesn't need to validate the page header for the retry. Instead, ReadPageInternal() should be responsible for the validation in that case. Therefore this commit changes XLogPageRead() so that if not in standby mode it doesn't validate the page header for the retry. Reported-by: Yugo Nagata Author: Yugo Nagata, Kyotaro Horiguchi Reviewed-by: Ranier Vilela, Fujii Masao Discussion: https://postgr.es/m/20210718045505.32f463ed6c227111038d8ae4@sraoss.co.jp --- src/backend/access/transam/xlog.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index eddb13d13a..26dcc00ac0 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -12423,7 +12423,7 @@ retry: /* * Check the page header immediately, so that we can retry immediately if - * it's not valid. This may seem unnecessary, because XLogReadRecord() + * it's not valid. This may seem unnecessary, because ReadPageInternal() * validates the page header anyway, and would propagate the failure up to * ReadRecord(), which would retry. However, there's a corner case with * continuation records, if a record is split across two pages such that @@ -12447,9 +12447,23 @@ retry: * * Validating the page header is cheap enough that doing it twice * shouldn't be a big deal from a performance point of view. + * + * When not in standby mode, an invalid page header should cause recovery + * to end, not retry reading the page, so we don't need to validate the + * page header here for the retry. Instead, ReadPageInternal() is + * responsible for the validation. */ - if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)) + if (StandbyMode && + !XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)) { + /* + * Emit this error right now then retry this page immediately. Use + * errmsg_internal() because the message was already translated. + */ + if (xlogreader->errormsg_buf[0]) + ereport(emode_for_corrupt_record(emode, EndRecPtr), + (errmsg_internal("%s", xlogreader->errormsg_buf))); + /* reset any error XLogReaderValidatePageHeader() might have set */ xlogreader->errormsg_buf[0] = '\0'; goto next_record_is_invalid;