Description
There seems to be an issue with header verification when issuing IOs to a relatively small range.
Using fio version:
[ec2-user@ip-172-31-30-241 ~]$ /usr/local/bin/fio --version
fio-3.23
The following job file:
[test]
direct=0
ioengine=libaio
#do_verify=1
verify=crc32c
verify_fatal=1
verify_dump=1
bwavgtime=1000
iopsavgtime=1000
rw=randrw
iodepth=2
bs_unaligned=1
bsrange=512b-1k
time_based=1
size=10m
verify_backlog=10
runtime=6h
serialize_overlap=1
filename=/test/barak.test
Debug log:
verify 19888 fill crc32c io_u 0x2631780, len 854
io 19888 queue: io_u 0x2631780: off=0x9bf600,len=0x356,ddir=1,file=/test/barak.test
io 19888 calling ->commit(), depth 2
io 19888 io_u_queued_complete: min=1
io 19888 getevents: 1
io 19888 complete: io_u 0x26325c0: off=0x31b200,len=0x200,ddir=0,file=/test/barak.test
io 19888 fill: io_u 0x26325c0: off=0x9bf600,len=0x243,ddir=1,file=/test/barak.test
io 19888 prep: io_u 0x26325c0: off=0x9bf600,len=0x243,ddir=1,file=/test/barak.test
io 19888 prep: io_u 0x26325c0: ret=0
verify 19888 fill random bytes len=579
verify 19888 fill crc32c io_u 0x26325c0, len 579
io 19888 queue: io_u 0x26325c0: off=0x9bf600,len=0x243,ddir=1,file=/test/barak.test
io 19888 complete: io_u 0x26325c0: off=0x9bf600,len=0x243,ddir=1,file=/test/barak.test
io 19888 prep: io_u 0x2631780: off=0x9bf600,len=0x356,ddir=0,file=/test/barak.test
io 19888 prep: io_u 0x2631780: ret=0
io 19888 queue: io_u 0x2631780: off=0x9bf600,len=0x356,ddir=0,file=/test/barak.test
io 19888 complete: io_u 0x2631780: off=0x9bf600,len=0x356,ddir=0,file=/test/barak.test
verify: bad header length 579, wanted 854 at file /dev/nvme6n1 offset 10221056, length 854 (requested block: offset=10221056, length=854)
hdr_fail data dumped as nvme6n1.10221056.hdr_fail
As can be seen in the debug log, fio issues two unaligned writes, one with len 854 and the other with len 579, to the same offset (0x9bf600).
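To make it easier to cross-check the hex values in the debug log against the decimal values in the verify error, here is a minimal sketch of a parser for the "io" debug lines. The helper name `parse_io_line` is hypothetical and not part of fio; it only assumes the `off=...,len=...,ddir=...` layout visible in the log above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of fio): extract offset, length and
 * data direction from one fio "io" debug line.
 * Returns 1 if all three fields were parsed, 0 otherwise.
 */
static int parse_io_line(const char *line, unsigned long long *off,
                         unsigned int *len, int *ddir)
{
    /* Skip ahead to the first "off=" field in the line. */
    const char *p = strstr(line, "off=");

    if (!p)
        return 0;

    /* %llx and %x accept an optional "0x" prefix. */
    return sscanf(p, "off=%llx,len=%x,ddir=%d", off, len, ddir) == 3;
}
```

Applied to the two queue lines with ddir=1, this gives offset 10221056 for both, with lengths 854 (0x356) and 579 (0x243), which are the same numbers that appear in the "bad header length 579, wanted 854" message.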
This means that the header len on disk is populated with the buflen of the most recent write, which is 579, while the verify read is still issued with the earlier length of 854. This is due to the following code when td->o.bs_unaligned=1:
static unsigned int get_hdr_inc(struct thread_data *td, struct io_u *io_u)
{
	unsigned int hdr_inc;

	/*
	 * If we use bs_unaligned, buflen can be larger than the verify
	 * interval (which just defaults to the smallest blocksize possible).
	 */
	hdr_inc = io_u->buflen;
	if (td->o.verify_interval && td->o.verify_interval <= io_u->buflen &&
	    !td->o.bs_unaligned)
		hdr_inc = td->o.verify_interval;

	return hdr_inc;
}
When bs_unaligned=1, the header len is set to the submitted buflen. However, on an overwrite of the same block, buflen may differ. This problem does not exist with bs_unaligned=0, as the header len is then fixed to the verify_interval the user sets in the config.
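The mismatch can be illustrated with a small standalone model. This is only a sketch under assumptions: the `sim_*` types and functions are hypothetical stand-ins, not fio's real structures, the "disk" stores a single header length per slot, and the verifier is assumed to expect the length given by the same rule as get_hdr_inc() quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for fio's options and io_u. */
struct sim_opts { unsigned int verify_interval; bool bs_unaligned; };
struct sim_io_u { unsigned long long offset; unsigned int buflen; };

#define SIM_SLOTS 16

/* One stored header length per slot: models "last writer wins". */
struct sim_disk { unsigned int hdr_len[SIM_SLOTS]; };

/* Same decision logic as get_hdr_inc() in the issue. */
static unsigned int sim_hdr_inc(const struct sim_opts *o,
                                const struct sim_io_u *io_u)
{
    unsigned int hdr_inc = io_u->buflen;

    if (o->verify_interval && o->verify_interval <= io_u->buflen &&
        !o->bs_unaligned)
        hdr_inc = o->verify_interval;
    return hdr_inc;
}

/* A write stamps the header length for this io_u into the slot. */
static void sim_write(struct sim_disk *d, const struct sim_opts *o,
                      unsigned int slot, const struct sim_io_u *io_u)
{
    assert(slot < SIM_SLOTS);
    d->hdr_len[slot] = sim_hdr_inc(o, io_u);
}

/* Assumed check: stored length must equal the length expected for io_u. */
static bool sim_verify_hdr_len(const struct sim_disk *d,
                               const struct sim_opts *o,
                               unsigned int slot,
                               const struct sim_io_u *io_u)
{
    assert(slot < SIM_SLOTS);
    return d->hdr_len[slot] == sim_hdr_inc(o, io_u);
}

/*
 * Replay the sequence from the debug log: write 854 bytes, overwrite the
 * same offset with 579 bytes, then verify using the first io_u.
 * Returns true if the header length check would pass.
 */
static bool sim_replay(bool bs_unaligned, unsigned int verify_interval)
{
    struct sim_opts o = { verify_interval, bs_unaligned };
    struct sim_io_u first = { 0x9bf600, 854 };
    struct sim_io_u second = { 0x9bf600, 579 };
    struct sim_disk d;

    memset(&d, 0, sizeof(d));
    sim_write(&d, &o, 0, &first);
    sim_write(&d, &o, 0, &second);
    return sim_verify_hdr_len(&d, &o, 0, &first);
}
```

In this model, `sim_replay(true, 512)` fails the check because the slot holds 579 while 854 is expected, whereas `sim_replay(false, 512)` passes because both writes stamp the fixed verify_interval of 512. The value 512 is an assumed interval chosen to match the smallest block size in the job file.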
I believe the straightforward way to fix this is to skip header verification when bs_unaligned=1. Please let me know if I'm missing something (e.g. my job file isn't valid). I've verified that the same workload works when bs_unaligned isn't set.