Teach mdnblocks() not to create zero-length files.

It's entirely surprising that mdnblocks() has the side effect of
creating new files on disk, so let's make it not do that.  One
consequence of the old behavior is that, if running on a damaged
cluster that is missing a file, mdnblocks() can recreate the file
and allow a subsequent _mdfd_getseg() for a higher segment to succeed.
This happens because, while mdnblocks() stops when it finds a segment
that is shorter than 1GB, _mdfd_getseg() has no such check, and thus
the empty file created by mdnblocks() can allow it to continue its
traversal and find higher-numbered segments which remain.

It might be a good idea for _mdfd_getseg() to actually verify that
each segment it finds is exactly 1GB before proceeding to the next
one, but that would involve some additional system calls, so for
now I'm just doing this much.

Patch by me, per off-list analysis by Kevin Grittner and Rahila Syed.
Review by Andres Freund.
Robert Haas 2015-12-15 13:57:45 -05:00
parent 6150a1b08a
commit 049469e7e7
1 changed file with 7 additions and 9 deletions


@@ -848,17 +848,15 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
 		if (v->mdfd_chain == NULL)
 		{
 			/*
-			 * Because we pass O_CREAT, we will create the next segment (with
-			 * zero length) immediately, if the last segment is of length
-			 * RELSEG_SIZE. While perhaps not strictly necessary, this keeps
-			 * the logic simple.
+			 * We used to pass O_CREAT here, but that's has the disadvantage
+			 * that it might create a segment which has vanished through some
+			 * operating system misadventure. In such a case, creating the
+			 * segment here undermine _mdfd_getseg's attempts to notice and
+			 * report an error upon access to a missing segment.
 			 */
-			v->mdfd_chain = _mdfd_openseg(reln, forknum, segno, O_CREAT);
+			v->mdfd_chain = _mdfd_openseg(reln, forknum, segno, 0);
 			if (v->mdfd_chain == NULL)
-				ereport(ERROR,
-						(errcode_for_file_access(),
-						 errmsg("could not open file \"%s\": %m",
-								_mdfd_segpath(reln, forknum, segno))));
+				return segno * ((BlockNumber) RELSEG_SIZE);
 		}
 		v = v->mdfd_chain;
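
The failure mode described in the commit message can be reproduced outside PostgreSQL with plain POSIX calls. What follows is only a rough sketch under stated assumptions, not the md.c code: toy_segpath(), toy_nbytes(), toy_can_reach() and TOY_SEG_BYTES are hypothetical names invented for illustration, a "segment" is just a file of at most 16 bytes, and the naming scheme (base, base.1, base.2, ...) merely imitates the on-disk convention. toy_nbytes() plays the role of mdnblocks() and takes the extra open flag (O_CREAT for the old behavior, 0 for the new one); toy_can_reach() plays the role of a segment walk that, like the _mdfd_getseg() described above, does not check segment lengths.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical toy segment size in bytes; stands in for a full 1GB segment. */
#define TOY_SEG_BYTES 16L

/* Segment 0 is "<base>", later segments are "<base>.<segno>". */
static void
toy_segpath(char *buf, size_t len, const char *base, unsigned segno)
{
    if (segno == 0)
        snprintf(buf, len, "%s", base);
    else
        snprintf(buf, len, "%s.%u", base, segno);
}

/*
 * Toy counterpart of mdnblocks(): total size in bytes of the relation,
 * or -1 if segment 0 cannot be opened or fstat() fails.  extra_flags is
 * O_CREAT (old behavior) or 0 (new behavior) and is applied only to
 * segments after the first.  An oversized segment is simply treated as
 * full here, which is a simplification.
 */
static long
toy_nbytes(const char *base, int extra_flags)
{
    char        path[1024];
    unsigned    segno;

    assert(extra_flags == 0 || extra_flags == O_CREAT);

    for (segno = 0;; segno++)
    {
        struct stat st;
        int         fd;
        int         rc;

        toy_segpath(path, sizeof(path), base, segno);
        fd = open(path, O_RDWR | (segno > 0 ? extra_flags : 0), 0600);
        if (fd < 0)
        {
            /* No next segment: every earlier segment was full. */
            return (segno == 0) ? -1 : (long) segno * TOY_SEG_BYTES;
        }
        rc = fstat(fd, &st);
        close(fd);
        if (rc != 0)
            return -1;
        /* Stop at the first segment that is shorter than a full one. */
        if ((long) st.st_size < TOY_SEG_BYTES)
            return (long) segno * TOY_SEG_BYTES + (long) st.st_size;
    }
}

/*
 * Toy counterpart of a segment walk with no length check: returns 1 if
 * every segment from 0 through target can be opened, else 0.
 */
static int
toy_can_reach(const char *base, unsigned target)
{
    char        path[1024];
    unsigned    segno;

    for (segno = 0; segno <= target; segno++)
    {
        int         fd;

        toy_segpath(path, sizeof(path), base, segno);
        fd = open(path, O_RDWR);
        if (fd < 0)
            return 0;
        close(fd);
    }
    return 1;
}
```

With a full segment 0, a missing segment 1 and a surviving segment 2, toy_nbytes(base, 0) leaves the gap alone, so toy_can_reach(base, 2) keeps failing. Calling toy_nbytes(base, O_CREAT) instead creates an empty segment 1 as a side effect, after which toy_can_reach(base, 2) succeeds, which is the masking effect this commit removes.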