Every source implementation uses map_while(Result::ok) to iterate over lines, which means any I/O error mid-read (corrupt file, disk failure, invalid UTF-8 in binary files) silently terminates the iterator instead of returning an error.
For example, in source/file.rs:
    reader
        .lines()
        .map_while(Result::ok)
        .filter(|line| !line.is_empty()),
Same pattern in stdin.rs, url.rs, aspell.rs, and seclists.rs.
If you accidentally point this at a file with some binary content mixed in (common with raw wordlists), the read stops at the first non-UTF-8 line and you get a partial build with no indication that anything went wrong. The build command reports however many words it managed to read as if that were the complete input.
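A minimal reproduction of the truncation, as a sketch rather than the project's actual code: `read_words_lossy` and `sample_input` are hypothetical names, and an in-memory `Cursor` stands in for a real file. The iterator chain is the one quoted from source/file.rs.

```rust
use std::io::{BufRead, Cursor};

/// Mirrors the quoted pattern: the first `Err` ends iteration and is discarded.
fn read_words_lossy<R: BufRead>(reader: R) -> Vec<String> {
    reader
        .lines()
        .map_while(Result::ok)
        .filter(|line| !line.is_empty())
        .collect()
}

/// Hypothetical input: two valid lines, one line of invalid UTF-8, one more valid line.
fn sample_input() -> Cursor<Vec<u8>> {
    Cursor::new(b"alpha\nbeta\n\xff\xfe\ngamma\n".to_vec())
}
```

Calling `read_words_lossy(sample_input())` yields only `alpha` and `beta`. The valid `gamma` line after the bad bytes is never read, and the caller gets no error.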
A better approach would be to either skip invalid lines with a warning counter, or propagate the error and let the user decide. The Source::words() return type already wraps Result, but the actual errors never surface through it.
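Both suggestions could look roughly like the sketch below. The function names are hypothetical, and the real Source::words() signature may differ, so treat this as an illustration of the two strategies rather than a drop-in patch. It relies on `BufRead::lines()` reporting invalid UTF-8 as an `io::Error` of kind `InvalidData`.

```rust
use std::io::{self, BufRead, ErrorKind};

/// Option 1: propagate the first error so the caller can decide what to do.
fn read_words_strict<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut words = Vec::new();
    for line in reader.lines() {
        let line = line?; // any read or decoding error aborts the whole read
        if !line.is_empty() {
            words.push(line);
        }
    }
    Ok(words)
}

/// Option 2: skip lines that are not valid UTF-8 and count them,
/// while still propagating every other kind of I/O error.
fn read_words_skipping<R: BufRead>(reader: R) -> io::Result<(Vec<String>, usize)> {
    let mut words = Vec::new();
    let mut skipped = 0;
    for line in reader.lines() {
        match line {
            Ok(line) if line.is_empty() => {}
            Ok(line) => words.push(line),
            // Invalid UTF-8 surfaces as InvalidData; the bad line has
            // already been consumed, so reading continues with the next one.
            Err(e) if e.kind() == ErrorKind::InvalidData => skipped += 1,
            Err(e) => return Err(e),
        }
    }
    Ok((words, skipped))
}
```

With the second variant, the build command could print the skipped count as a warning next to the word total, so a partial read is at least visible.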