-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Fix duplicated definitions #14573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix duplicated definitions #14573
Conversation
I used an auxiliar order |
helix-term/src/commands/lsp.rs
Outdated
if location.offset_encoding > existing.offset_encoding { | ||
*existing = location; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handling of offset encoding here is not correct. Offset encoding controls how the character
offset in a lsp::Position
(within lsp::Range
, within Location
) should be interpreted: either as a byte offset (UTF-8), UTF-16 code unit offset or character offset (UTF-32). We can't know how a Location
using one offset encoding compares to a Location
using another offset encoding until we read the file's contents.
Instead let's not attempt to deduplicate locations using different offset encodings, i.e. only deduplicate when the full Location
is equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we depuplicate only when the full Location
is equal, this does not solve the problem in #14551, since the OffsetEncoding
is the only difference between the Location
s given by the different LSPs. If I remove the curent behavior, the user won't feel any difference between before and after the fix, because they will still be duplicated in the picker.
Also, when actually following the definitions, I couldn't find a difference between cursor or buffer position in either encoding while testing.
How do you think we should handle that? Are there two known LSPs that provide different results when following Location
s where the OffsetEncoding
is different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, considering this, I used the highest OffsetEncoding
as the default, but we could just as well use the first one that appears, if no difference is actually detectable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You wouldn't notice the difference on ASCII text. For an example, if a line contains a character like 🏴 (U+1F3F4
) then the byte offset after the '🏴' would be 4 for UTF-8, 2 for UTF-16 and 1 for UTF-32. So if the contents of a line are not all ASCII then you can't know that two lsp::Range
s with different offset encodings are equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So, the duplicates are expected behavior, then? If the whole Location
should be taken into account, I can use a IndexSet<Location>
and simplify the function even further.
Is this PR still useful or should I original issue be closed concluding that it's expected behavior?
Fixes #14551.
Deduplicates definitions' locations before rendering. When a collision happens for the locations using their URIs and ranges as keys, the one with the highest offset encoding is used (chosen because
OffsetEncoding::Utf16
is the highest in the enum, and also the default)