Commit ef281d5

zeitgeist87 authored and konis committed
nilfs-utils: Fix conflicting data buffer error
Under certain high-concurrency loads, NILFS2 can produce a segment that crashes the cleanerd process with a conflicting data buffer error. The segment is perfectly valid and the file system is not corrupted. However, the cleanerd process can no longer be started, so the file system eventually fills up and becomes unusable.

The reason for this crash is that a single logical segment can contain multiple partial segments. If a block is written in one partial segment and then immediately overwritten in another partial segment, both blocks have the same inode number, checkpoint number, and offset. However, the kernel uses these three numbers to uniquely identify a block. If the cleaner tries to clean two blocks that point to the exact same buffer_head in the kernel, the result is a conflicting data buffer error.

The solution is to detect these blocks and treat them as dead blocks. If vd_period.p_end is equal to the checkpoint number, the block was overwritten within the same logical segment. So it must be dead, and there is another block with the same ino, cno, and offset that is alive.

Signed-off-by: Andreas Rohner <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
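The collision described in the message can be illustrated with a minimal sketch. The struct and function names below are hypothetical stand-ins chosen for illustration and are not taken from the kernel or from nilfs-utils; the only thing assumed from the message is that a block is identified by its inode number, checkpoint number, and offset.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the (ino, cno, offset) triple that the
 * commit message says identifies a block. Not a real NILFS2 type.
 */
struct sketch_block_key {
	uint64_t ino;    /* inode number */
	uint64_t cno;    /* checkpoint number */
	uint64_t offset; /* block offset within the file */
};

/*
 * Two blocks collide when all three fields match. Per the commit
 * message, this happens when a block is written in one partial
 * segment and overwritten in another partial segment of the same
 * logical segment.
 */
static bool sketch_keys_collide(const struct sketch_block_key *a,
				const struct sketch_block_key *b)
{
	return a->ino == b->ino && a->cno == b->cno &&
	       a->offset == b->offset;
}
```

Under these assumptions, the original block and its overwrite compare equal even though they live at different disk locations, which is why the cleaner cannot process both as live blocks at once.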
1 parent 187ffc0 commit ef281d5

1 file changed, +13 -0 lines changed

lib/gc.c

Lines changed: 13 additions & 0 deletions
@@ -405,6 +405,19 @@ static int nilfs_vdesc_is_live(const struct nilfs_vdesc *vdesc,
 		return vdesc->vd_period.p_end == NILFS_CNO_MAX;
 	}
 
+	if (vdesc->vd_period.p_end == vdesc->vd_cno) {
+		/*
+		 * This block was overwritten in the same logical segment, but
+		 * in a different partial segment. Probably because of
+		 * fdatasync() or a flush to disk.
+		 * Without this check, gc will cause buffer confliction error
+		 * if both partial segments are cleaned at the same time.
+		 * In that case there will be two vdesc with the same ino,
+		 * cno and offset.
+		 */
+		return 0;
+	}
+
 	if (vdesc->vd_period.p_end == NILFS_CNO_MAX ||
 	    vdesc->vd_period.p_end > protect)
 		return 1;
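
To see the effect of the new check in isolation, here is a minimal, self-contained sketch of the two tests visible in this hunk. The type `sketch_vdesc`, the constant `SKETCH_CNO_MAX`, and the function `sketch_vdesc_is_live` are hypothetical simplifications, not the nilfs-utils definitions. The final `return 0` is also an assumption: the rest of the real `nilfs_vdesc_is_live()` is not shown in the diff and is omitted here.

```c
#include <stdint.h>

/* Assumed stand-in for NILFS_CNO_MAX ("block not yet overwritten"). */
#define SKETCH_CNO_MAX UINT64_MAX

/* Hypothetical, reduced descriptor: only the fields the hunk reads. */
struct sketch_vdesc {
	uint64_t cno;     /* checkpoint in which the block was written */
	uint64_t p_start; /* first checkpoint where the block is valid */
	uint64_t p_end;   /* checkpoint where the block was superseded */
};

static int sketch_vdesc_is_live(const struct sketch_vdesc *v,
				uint64_t protect)
{
	/* New check: overwritten within its own checkpoint, so dead. */
	if (v->p_end == v->cno)
		return 0;

	/* Existing check: never overwritten, or still protected. */
	if (v->p_end == SKETCH_CNO_MAX || v->p_end > protect)
		return 1;

	/* Simplification: later checks in the real function are omitted. */
	return 0;
}
```

The ordering matters in this sketch: a block whose `p_end` equals its own `cno` would otherwise pass the `p_end > protect` test whenever `protect` is lower than that checkpoint, and the cleaner would then see two live descriptors with the same ino, cno, and offset.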