Open
Description
Detail Bug Report
Summary
- Context: The `check_text_inner` function generates LSP diagnostic messages for detected typos, including the text range (start and end positions) where each typo appears.
- Bug: The end position of diagnostic ranges is calculated using byte length instead of UTF-16 code unit count.
- Actual vs. expected: When a typo contains multi-byte UTF-8 characters (e.g., "café"), the end position is calculated as `start_position + byte_length`, but LSP requires UTF-16 code units, so it should be `start_position + utf16_length`.
- Impact: Diagnostic ranges are incorrect for typos containing non-ASCII characters, causing misaligned highlighting and incorrect code action ranges in editors.
Code with bug
// crates/typos-lsp/src/lsp.rs:449-455
crate::typos::check_str(buffer, tokenizer, dict, ignore)
    .map(|(typo, line_num, line_pos)| {
        Diagnostic {
            range: Range::new(
                Position::new(line_num as u32, line_pos as u32),
                Position::new(line_num as u32, (line_pos + typo.typo.len()) as u32), // <-- BUG 🔴 using byte length instead of UTF-16 length
            ),

Logical proof
- LSP positions must be in UTF-16 code units, and this server advertises `UTF16` position encoding.
- `line_pos` is computed as a UTF-16 code unit count: `let line_pos = before_typo.chars().map(char::len_utf16).sum();`
- `typo.typo` is a `str`/`Cow<str>`; `str::len()` returns byte length, not UTF-16 units.
- The code adds UTF-16 `line_pos` to UTF-8 byte length (`typo.typo.len()`), mixing units, which is incorrect whenever the typo includes non-ASCII characters.
- Example: "café" has byte length 5 but UTF-16 length 4. If the typo starts at UTF-16 position 10, the buggy end is 15 instead of 14, leading to a one-unit overshoot in the diagnostic range.
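
The arithmetic in the example above can be checked in isolation. The following is a minimal sketch, not code from the repository: `utf16_units`, `buggy_end`, and `expected_end` are hypothetical helper names introduced here only to contrast the two calculations.

```rust
/// Number of UTF-16 code units needed to encode `s`.
/// (Hypothetical helper, not part of typos-lsp.)
fn utf16_units(s: &str) -> usize {
    s.chars().map(char::len_utf16).sum()
}

/// End column as the reported code computes it: UTF-16 start plus UTF-8 byte length.
fn buggy_end(start_utf16: usize, typo: &str) -> usize {
    start_utf16 + typo.len()
}

/// End column with both operands measured in UTF-16 code units.
fn expected_end(start_utf16: usize, typo: &str) -> usize {
    start_utf16 + utf16_units(typo)
}
```

With the typo "caf\u{e9}" ("café" with a precomposed é) starting at UTF-16 position 10, `buggy_end` yields 15 and `expected_end` yields 14. For pure-ASCII typos the two functions agree, which is probably why the mismatch goes unnoticed in most cases.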
Recommended fix
Replace the byte-length addition with a UTF-16 code unit count:
Position::new(
    line_num as u32,
    (line_pos + typo.typo.chars().map(|c| c.len_utf16()).sum::<usize>()) as u32 // <-- FIX 🟢 count UTF-16 code units
)
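
If the same conversion is needed in more than one place, it could be factored into a small helper. This is only a sketch under assumptions: `utf16_len` and `end_column` are hypothetical names, not existing functions in the crate. It uses the standard library's `str::encode_utf16`, which should give the same count as summing `char::len_utf16`.

```rust
/// Length of `s` in UTF-16 code units (hypothetical helper).
fn utf16_len(s: &str) -> usize {
    // encode_utf16 yields one u16 per code unit, so characters outside
    // the Basic Multilingual Plane count as two (a surrogate pair).
    s.encode_utf16().count()
}

/// End column of a diagnostic range, assuming `line_pos` is already
/// expressed in UTF-16 code units (hypothetical helper).
fn end_column(line_pos: usize, typo: &str) -> u32 {
    (line_pos + utf16_len(typo)) as u32
}
```

A regression test for the fix might cover three cases: an ASCII typo, a typo with a two-byte UTF-8 character such as é, and a character outside the Basic Multilingual Plane such as U+1F600, which takes 4 bytes in UTF-8 but 2 code units in UTF-16.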