Ensure chunk is not out of bounds, which would cause data corruption. #18620
tiehexue wants to merge 2 commits into
Should we be concerned that the zloop test failed in the GitHub checks? A specific test such as "ztest -G -VVVVV -K draid -m 0 -r 27 -D 9 -S 2 -R 2 -v 0 -a 9 -C special=random -s 512m -f /mnt/zloop/zloop-run -T 120 -P 60" regularly fails for different reasons in my local dev environment, on both the master branch and this PR. Sometimes the tests pass.
Signed-off-by: tiehexue <tiehexue@hotmail.com>
I force-pushed again to rerun zloop, and this time it succeeded.
I have doubts about productivity of this. We can't catch all possible bit flips.
Yes, but this bug has existed for decades. This PR does not add much overhead: it just replaces the comparison against CHAIN_END with one against l_chunk_count, and it even uses l_chunk_count in place of the counting macro, which should be good for performance. The cost is one new in-memory field.
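To make the idea concrete, here is a minimal sketch of a chain walk that validates each chunk index against a chunk count instead of relying only on the end-of-chain sentinel. It is not the code in this PR: the `sketch_*` names, the simplified chunk layout, and the sentinel value are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for CHAIN_END (value assumed). */
#define	SKETCH_CHAIN_END	0xffffu

/* Hypothetical, simplified chunk: only the link to the next chunk. */
typedef struct sketch_chunk {
	uint16_t next;
} sketch_chunk_t;

/*
 * Walk the chain that starts at `head`.
 * Returns the number of chunks visited, or -1 if an index falls outside
 * the table or the chain is longer than the table (which implies a cycle).
 */
static int
sketch_chain_length(const sketch_chunk_t *chunks, uint16_t chunk_count,
    uint16_t head)
{
	int visited = 0;
	uint16_t idx = head;

	while (idx != SKETCH_CHAIN_END) {
		if (idx >= chunk_count)
			return (-1);	/* corrupted index, e.g. a bit flip */
		if (visited >= chunk_count)
			return (-1);	/* more links than chunks: a cycle */
		visited++;
		idx = chunks[idx].next;
	}
	return (visited);
}
```

A flipped high bit in a `next` field produces an index that is neither the sentinel nor a valid chunk; a sentinel-only comparison would follow it, while the bound check rejects it.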
I have the same doubts. There are countless places in on-disk structures where faults cause horrible things to happen if they were written with a good checksum. Doing something about this particular case requires a reason to think it is common, such as a bug in older releases that enables it. I am not aware of one.
@ryao @amotin Hi, would you take a look at #18572? There I describe how I reproduced the bug and how to test this. @robn also mentioned that many similar bugs/issues have been reported, and that he had written a patch that was never merged. So I have to say more for your attention:
We can't catch all possible bit flips, but if we can efficiently detect and handle internally inconsistent on-disk state we should do so. We already do something similar with I'm not aware of any existing or previously fixed bug which would explain this, but this has been reported often enough over the years I think it's reasonable to include a check for it. |
behlendorf
left a comment
I'm still working my way through this and will pick it up tomorrow, but generally speaking we should return an error wherever possible instead of logging a debug message which will never be seen.
There was a 4-byte hole here before; now 2 bytes are left, checked with pahole. Signed-off-by: tiehexue <tiehexue@hotmail.com>
Thanks for your review. Adding a member to a structure makes me nervous, but luckily this is not an on-disk one. As for the debug message, I have two thoughts: 1) when something bad happens, the user will find that a directory cannot be listed and will check the debug messages after enabling them; 2) these are void functions, and I am not sure how to report errors other than by panicking. But a panic is not what I want; staying silent and keeping the system running as normally as possible may be better. Let me know if there is a better way.
Nope, because it is unpredictable. If we assume errors are possible there, then either make the functions return a status that will be verified, or panic. A silent return with an uninitialized buffer is asking for trouble that will be impossible to debug.
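As a rough illustration of the "return a status that will be verified" option, the sketch below turns a would-be void copy routine into one that reports a bad chain to its caller. All `sketch_*` names and the chunk layout are hypothetical, and `EIO` is only a stand-in for whatever error code the real code would choose.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_CHUNK_BYTES	4

/* Hypothetical array chunk: a few payload bytes plus a link. */
typedef struct sketch_array_chunk {
	uint8_t bytes[SKETCH_CHUNK_BYTES];
	uint16_t next;
} sketch_array_chunk_t;

/*
 * Copy `len` bytes out of the chain starting at `head`.
 * Returns 0 on success, or EIO if the chain ends early or points outside
 * the table. On error the contents of `buf` must not be used.
 */
static int
sketch_array_read(const sketch_array_chunk_t *chunks, uint16_t chunk_count,
    uint16_t head, uint8_t *buf, size_t len)
{
	uint16_t idx = head;
	size_t copied = 0;

	while (copied < len) {
		size_t n = len - copied;

		/* Rejects an early end-of-chain and an out-of-range index. */
		if (idx >= chunk_count)
			return (EIO);
		if (n > SKETCH_CHUNK_BYTES)
			n = SKETCH_CHUNK_BYTES;
		memcpy(buf + copied, chunks[idx].bytes, n);
		copied += n;
		idx = chunks[idx].next;
	}
	return (0);
}
```

The caller checks the return value before touching `buf`, so a corrupted chain surfaces as an error instead of a partially filled buffer being treated as valid.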
Motivation and Context
This PR is to fix #18572.
There is no bounds checking in production builds of zap_leaf.c, because ASSERT is a no-op there. So if some field is corrupted, e.g. by a memory hardware error or another software bug, the ZAP leaf silently becomes corrupted. Refer to #18572 for details.
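The difference between an assertion-only check and a real bounds check can be sketched as follows. This is an illustration, not code from zap_leaf.c: `SKETCH_ASSERT`, `SKETCH_DEBUG`, and both lookup helpers are hypothetical names that mimic an assertion macro compiled out of production builds.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical assertion macro: active only when SKETCH_DEBUG is defined. */
#ifdef SKETCH_DEBUG
#define	SKETCH_ASSERT(cond)	do { if (!(cond)) abort(); } while (0)
#else
#define	SKETCH_ASSERT(cond)	((void)0)
#endif

/*
 * Assertion-only lookup: without SKETCH_DEBUG the check disappears, so a
 * corrupted `idx` would be used to index past the end of the table.
 */
static const uint8_t *
sketch_lookup_assert_only(const uint8_t *chunks, uint16_t count, uint16_t idx)
{
	SKETCH_ASSERT(idx < count);
	(void) count;
	return (&chunks[idx]);
}

/* Checked lookup: the bound is enforced in every build. */
static const uint8_t *
sketch_lookup_checked(const uint8_t *chunks, uint16_t count, uint16_t idx)
{
	if (idx >= count)
		return (NULL);
	return (&chunks[idx]);
}
```

With a valid index both helpers behave the same; only the checked variant gives the caller a way to notice a corrupted index in a production build.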
Description
Assume that the root cause is a single bit flip in fields in CHAIN_END. This PR does the following to avoid and recover from data corruption:
How Has This Been Tested?
Tested in normal cases (creating and deleting).
I also used test code to create a corrupted directory and then ran "ls -la" and "cp" with the new code: both work, with no soft lockup. However, some data is still lost.
Types of changes
Checklist:
Signed-off-by.