Rework `TextEdit` arrow navigation to handle Unicode graphemes #5812
Preview available at https://egui-pr-preview.github.io/pr/5812-unicode-grapheme-navigation
I did a quick check, and this increases the .wasm size by ~50 kB, which I think is acceptable (it's because of the tables here: https://github.com/unicode-rs/unicode-segmentation/blob/master/src/tables.rs)
Thank you for working on this!
Does this fully close #62?
Please add some unit tests for this feature so that we know it works, and that it won't break again 🙏
I just reworked the word splitting because I found out it completely fell apart around emojis. Then I saw the
The new Unicode implementation may be useful if it is used at a larger scale in the future (not just for word splitting in text edit). But currently, the local-only effect of the dependency may not be worth what it brings compared to allowing non-ASCII characters in the existing implementation.
This may end up covering the same ground as #5784.
That's true, though I think this can be finalized and merged and later replaced by #5784.
@valadaptive do you think merging this PR will help or hinder your parley work?
I think I'm going to need to redo it from scratch anyway, so go ahead and merge this.
# Conflicts:
#	crates/egui/src/text_selection/text_cursor_state.rs
#	crates/egui/src/widgets/text_edit/text_buffer.rs
Previously, navigating text in `TextEdit` with Ctrl + left/right arrow would jump inside words that contained combining characters (i.e. diacritics). This PR introduces a new dependency on `unicode-segmentation` to handle grapheme segmentation. The new implementation ignores whitespace and other separators such as `-` (dash) between words, but respects `_` (underscore).
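
To illustrate the failure mode described above, here is a minimal, std-only sketch. It is not code from egui or from this PR: `is_ascii_word_char`, `is_word_char_with_marks`, and `next_word_end` are hypothetical names, and the second predicate only knows about the Combining Diacritical Marks block (U+0300 to U+036F), whereas a crate such as `unicode-segmentation` handles grapheme clusters in general.

```rust
/// Hypothetical ASCII-only word-character test (an assumption for
/// illustration, not code taken from egui).
fn is_ascii_word_char(c: char) -> bool {
    c.is_ascii_alphanumeric()
}

/// Hypothetical variant that also accepts combining diacritical marks
/// (U+0300..=U+036F only, so it is far from a full Unicode solution).
fn is_word_char_with_marks(c: char) -> bool {
    c.is_alphanumeric() || ('\u{0300}'..='\u{036F}').contains(&c)
}

/// Returns the byte index where a "jump to end of next word" starting at
/// byte `start` would stop: skip non-word characters, then skip the run of
/// word characters that follows. `start` must lie on a char boundary.
fn next_word_end(text: &str, start: usize, is_word_char: fn(char) -> bool) -> usize {
    let mut chars = text[start..].char_indices().peekable();

    // Skip any separators in front of the word.
    while let Some(&(_, c)) = chars.peek() {
        if is_word_char(c) {
            break;
        }
        chars.next();
    }

    // Skip the word itself.
    while let Some(&(_, c)) = chars.peek() {
        if !is_word_char(c) {
            break;
        }
        chars.next();
    }

    // The next unread character marks the stop position.
    match chars.peek() {
        Some(&(i, _)) => start + i,
        None => text.len(),
    }
}
```

With the text `"cafe\u{301} noir"` (an `e` followed by a combining acute accent), the ASCII-only predicate stops at byte 4, between the `e` and its accent, which is the "jump inside a word" behavior. The mark-aware predicate stops at byte 6, after the whole cluster.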