[fix](bug) Resolve the crash issue during string hash computation #48580
What problem does this PR solve?
Problem Summary:
Problem
When handling aggregate queries, the BE server may crash. The stack trace shows that the crash occurs in the doris::CRC32Hash::operator() function when calculating the hash value of a StringRef object. This issue is more likely to be triggered when the size of the StringRef is not a multiple of 8 (e.g., 15 bytes).
Cause analysis
Through GDB debugging, we found:
3-1. The crash occurs during the hash table resize process, when calculating the hash value of a StringRef.
3-2. The size of the StringRef is 15 bytes, and the data pointer is valid.
doris::vectorized::UInt64 word = unaligned_load<doris::vectorized::UInt64>(end - 8);
When the size of the StringRef is 15, end - 8 equals data + 7, so the code reads 8 bytes starting at data+7. Although these bytes exist in memory, such an unaligned memory access may cause a crash on some architectures.
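To make the address arithmetic concrete, here is a minimal sketch of the tail read described above. It is not the Doris implementation: `load_unaligned_sketch`, `tail_word_offset`, and `load_tail_word` are hypothetical names, and the `std::memcpy`-based load is an assumption about how an unaligned load helper can be written, since the PR does not show how `unaligned_load` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an unaligned load helper. Copying into a local
// with std::memcpy lets the compiler pick a load that is legal for the
// target even when `address` is not 8-byte aligned.
template <typename T>
inline T load_unaligned_sketch(const void* address) {
    T value;
    std::memcpy(&value, address, sizeof(T));
    return value;
}

// Offset (from data) of the final 8-byte word, i.e. (data + size) - 8.
inline std::size_t tail_word_offset(std::size_t size) {
    assert(size >= 8);  // otherwise the read would start before data
    return size - 8;
}

// Reads the last 8 bytes of the range [data, data + size).
inline std::uint64_t load_tail_word(const char* data, std::size_t size) {
    assert(size >= 8);
    const char* end = data + size;
    return load_unaligned_sketch<std::uint64_t>(end - 8);
}
```

For a 15-byte string, `tail_word_offset(15)` is 7, so `load_tail_word` reads bytes 7 through 14. That window overlaps the first 8-byte word (bytes 0 through 7) by one byte and starts at an address that is generally not 8-byte aligned.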
GDB
The 8 bytes starting at data+7 span two memory lines, which may cause unaligned-access issues.
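Whether a given read spans two lines can be checked with simple integer arithmetic. The sketch below is illustrative only: `straddles_boundary` is a hypothetical helper, and the line width is passed in as a parameter because the PR does not state the width of the memory lines observed in GDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the byte range [address, address + length) touches more
// than one line of `line_size` bytes. `line_size` is an assumption supplied
// by the caller (for example 8, 16, or 64).
inline bool straddles_boundary(std::uintptr_t address, std::size_t length,
                               std::size_t line_size) {
    assert(line_size != 0);
    if (length == 0) {
        return false;  // an empty range touches no line at all
    }
    const std::uintptr_t first_line = address / line_size;
    const std::uintptr_t last_line = (address + length - 1) / line_size;
    return first_line != last_line;
}
```

With an 8-byte line and an aligned `data` pointer, an 8-byte read at offset 7 covers bytes 7 through 14, so it touches both the first and the second line, while a read at offset 0 or 8 stays within one.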
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)