Tested on commit 6cb4e86
As I understand it, any open "large inline" file (an inline file that is too large to fit into the cache under the current settings) must be "outlined" (evicted into a CTZ list) whenever a commit is made to the directory the file resides in. However, this process doesn't always work, and the call that triggers the commit (here, creating another file in the same directory) then returns LFS_ERR_CORRUPT.
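To make the condition concrete, here is a minimal sketch of the "large inline" predicate as described above. The struct and function names are hypothetical and only model the description; they are not littlefs's actual types or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an open file (NOT littlefs's lfs_file_t). */
struct sketch_file {
    bool     is_inline; /* data is still stored inline in the directory metadata */
    uint32_t size;      /* current file size in bytes */
};

/* A "large inline" file is still inline but bigger than the configured cache,
 * so it would have to be outlined before the directory can be committed. */
static bool sketch_is_large_inline(const struct sketch_file *f,
                                   uint32_t cache_size) {
    assert(f != NULL);
    return f->is_inline && f->size > cache_size;
}
```

For example, a 600-byte inline file is "large inline" with a 512-byte cache but not with a 1024-byte cache, which is how an image written with one cache size can end up in this state on a device configured with a smaller one.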
To reproduce this issue I created the attached PoC. It can be run with argument 0, 1 or 2.
0 is a sanity check that should always pass. In this case we never create a situation that involves "large inline" files.
The test passes, printing "successful run".
1 leads to a situation where a "large inline" file is created. The file is then opened, and another file is created in the same directory. The creation fails with LFS_ERR_CORRUPT when it shouldn't.
The test fails, printing "error -84 line 97".
2 is similar to 1, but after the "large inline" file is opened, its position is moved to the end of the file with a seek, which somehow makes the outlining work correctly. To ensure that the outlining was successful, the file is reopened and verified at the end.
The test passes, printing "successful run".
The PoC uses the RAM block device from this repo (bd/lfs_rambd.c).
The PoC code: main.c
I managed to track the LFS_ERR_CORRUPT value down to lfs_bd_read. Below is the gdb backtrace from the moment this incorrect read happens:
#0 lfs_bd_read (lfs=lfs@entry=0x5555555613e0 <lfs>, pcache=pcache@entry=0x0, rcache=rcache@entry=0x7fffffffceb8, hint=4096, block=4294967294, off=0, buffer=0x7fffffffce6f, size=1) at lfs.c:51
#1 0x00005555555573b4 in lfs_file_flushedread (lfs=lfs@entry=0x5555555613e0 <lfs>, file=file@entry=0x7fffffffce70, buffer=buffer@entry=0x7fffffffce6f, size=size@entry=1) at lfs.c:3544
#2 0x000055555555885c in lfs_file_flush (lfs=lfs@entry=0x5555555613e0 <lfs>, file=file@entry=0x555555561360 <file>) at lfs.c:3380
#3 0x00005555555599df in lfs_dir_orphaningcommit (lfs=lfs@entry=0x5555555613e0 <lfs>, dir=0x5555555612ec <file2+12>, attrs=0x7fffffffd020, attrcount=3) at lfs.c:2422
#4 0x000055555555a24d in lfs_dir_commit (lfs=0x5555555613e0 <lfs>, dir=<optimised out>, attrs=<optimised out>, attrcount=<optimised out>) at lfs.c:2604
#5 0x000055555555a888 in lfs_file_opencfg_ (lfs=lfs@entry=0x5555555613e0 <lfs>, file=file@entry=0x5555555612e0 <file2>, path=<optimised out>, path@entry=0x55555555dd54 "/some_file", flags=flags@entry=256, cfg=cfg@entry=0x55555555dfb0 <defaults>) at lfs.c:3125
#6 0x000055555555b86e in lfs_file_open_ (flags=256, path=0x55555555dd54 "/some_file", file=0x5555555612e0 <file2>, lfs=0x5555555613e0 <lfs>) at lfs.c:3242
#7 0x00005555555553b6 in main (argc=<optimised out>, argv=0x7fffffffd1e8) at main.c:97
It appears that in lfs_file_flushedread the file->block member has the value LFS_BLOCK_INLINE (block=4294967294 in frame #0), which is then treated as if it were a valid block number and passed to lfs_bd_read.
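The following sketch shows why that block number cannot be read. It assumes a 32-bit block type and that the inline sentinel is the value -2 cast to that type, which matches the 4294967294 seen in the backtrace. It also assumes the read fails on a simple "block is beyond the device" bounds check. All names are hypothetical stand-ins, not the littlefs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: block numbers are 32-bit unsigned, as the backtrace suggests. */
typedef uint32_t sketch_block_t;

/* Assumption: sentinel values at the very top of the block number range. */
#define SKETCH_BLOCK_NULL   ((sketch_block_t)-1)
#define SKETCH_BLOCK_INLINE ((sketch_block_t)-2)

/* A block can only be read if it lies within the device. */
static bool sketch_block_is_readable(sketch_block_t block,
                                     sketch_block_t block_count) {
    return block < block_count;
}

/* Sanity check: the sentinel is exactly the value seen in gdb frame #0. */
static void sketch_sentinel_self_check(void) {
    assert(SKETCH_BLOCK_INLINE == 4294967294u);
}
```

Under these assumptions, a file whose block member still holds the inline sentinel can never pass the bounds check, whatever the device size. That is consistent with the read path reporting corruption rather than returning data.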
I understand that "large inline" files are considered somewhat anomalous, but if the code has provisions for handling them, they should work correctly. For our use case, we use external tools to create an LFS image which is then read on an embedded device. Setting the cache size explicitly is an option, but there is a risk of user error if the cache size gets out of sync between the host tools and the embedded device.
If this is fixed, I suggest adding a test for this condition to the test suite: currently the inline eviction code path (around lfs.c:2421) is never triggered during the tests.