feat(core)!: more explicit handling of case-sensitivity in dictionaries #2630
Draft: 86xsk wants to merge 95 commits into Automattic:master from 86xsk:fix-dict-casing2
+1,692
−1,653
95 commits
008c534 (86xsk) test(core): add failing test
a5f4a1b (86xsk) test(comments): don't expect 'lin' to be marked as a spelling error
5face23 (86xsk) test(core): move tests
fe37ba6 (86xsk) test(core): don't expect `SpellCheck` to mark capitalization issues
6f02acd (86xsk) deps(core): add `indexmap`
03b1f5b (86xsk) feat(core)!: more explicit handling of case-sensitivity in dictionaries
ce666c5 (86xsk) chore: update snapshots
4443a4e (86xsk) Partially revert "fix(core): PR getting flagged as 'misspelled' (#2476)"
e982f23 (86xsk) test(core): merge tests and add test
5688aee (86xsk) Merge branch 'master' into fix-dict-casing2
0a1e2d4 (86xsk) test(core): move test
cf9a90a (86xsk) fix(core): fix logic in `OrthographicConsistency`
007df6e (86xsk) test(core): add failing test
7518350 (86xsk) fix(core): allow all case-variants in `OrthographicConsistency`
3c9d54e (86xsk) test(core): remove Lego -> LEGO test in `OrthographicConsistency`
8b426d9 (86xsk) chore: update snapshots
3381fba (86xsk) test(core): add test
e11a2d6 (86xsk) test(core): fix incorrect test expectation
b23f652 (86xsk) refactor(core): appease Clippy
3f068bb (86xsk) feat(core): support multiple `derived_from`
18ba296 (86xsk) perf(core): reduce Vec cloning
aeba563 (86xsk) refactor(core): reuse code from similar function
f30cfff (86xsk) Merge branch 'master' into fix-dict-casing2
32ce68c (86xsk) refactor(core): remove dead code
5bd11c1 (86xsk) Merge branch 'master' into fix-dict-casing2
a9d3f75 (86xsk) Merge branch 'master' into fix-dict-casing2
a4709d6 (86xsk) Merge branch 'master' into fix-dict-casing2
028a39b (86xsk) Merge branch 'master' into fix-dict-casing2
0a8a93a (86xsk) fix(core): suggest "need" for "ned"
bcfea8f (86xsk) fix(core): make `SpellCheck` case-sensitive again
57c8562 (86xsk) Revert "test(comments): don't expect 'lin' to be marked as a spelling…
9313f00 (86xsk) Revert "test(core): don't expect `SpellCheck` to mark capitalization …
11842a3 (86xsk) refactor(core): split word ID structs into separate files
34462b8 (86xsk) docs(core): fix grammar
1a53a39 (86xsk) refactor(core): make `WordIdPair` `pub(crate)`
1053d6f (86xsk) style(core): fix whitespace in `dictionary.rs`
a77437d (86xsk) perf(core): add early exits for URL lexing
7f93814 (86xsk) perf(core): early exit in `lex_email_address`
e7d9c85 (86xsk) refactor(core): simplify code
acde3e5 (86xsk) style(core): reorder imports
a5196da (86xsk) perf(core): cache `WordSet` in `ModalVerb`
8c64ce3 (86xsk) refactor(core): replace `ModalVerb::init` function
092c5b4 (86xsk) Merge branch 'master' into fix-dict-casing2
8a1e0da (86xsk) Merge branch 'master' into fix-dict-casing2
c72cb68 (86xsk) Merge branch 'master' into fix-dict-casing2
f6b6fa4 (86xsk) refactor(core): default impls for `Dictionary` str fns
9ba8fce (86xsk) refactor(core)!: return `WordMapEntry` from `Dictionary`
db7e0e2 (86xsk) refactor(core): avoid unnecessary cloning
44aea1e (86xsk) refactor(core): rename `get_correct_capitalization_of`
d30db60 (86xsk) refactor(core): default impl for `get_correct_capitalizations_of`
ce349b2 (86xsk) refactor(core): de-Arc `MutableDictionary::curated`
2d4c501 (86xsk) refactor(core): de-Arc `FstDictionary::curated`
68449db (86xsk) refactor: take argument by value instead of mut ref
85f04d7 (86xsk) refactor!: remove pointless `Box` in `CollapseIdentifiers::new`
63368ef (86xsk) refactor!: don't refcount/`thread_local!` read-only statics
ea05ae2 (86xsk) perf: use `dyn` in place of `impl`
823d21e (86xsk) docs(core): add documentation for `WordMap`
602cc10 (86xsk) refactor(core): move curated dictionary init to `word_map`
777de5d (86xsk) refactor(core): remove pointless Arc in `FstDictionary`
4144cae (86xsk) refactor(core): rename `word_map` to `fst_map` in `FstDictionary`
1fc8691 (86xsk) style(core): rearrange lines
2c5673a (86xsk) refactor(core): remove unused argument/member
fd71cab (86xsk) refactor(core): impl `Dictionary` for `WordMap`
df628c4 (86xsk) refactor: remove redundant `self::` in paths
7bde37d (86xsk) feat(core): add `WordMap::curated`
e772f86 (86xsk) refactor(core): `WordMap` instead of `FstDictionary` in `MergeableWords`
aba2cd8 (86xsk) perf(core): avoid conversion between string and char array
2046a6f (86xsk) perf(core): specialize `get_word_metadata_combined` for `WordMap`
2e75187 (86xsk) feat(core): create `WordMap::is_empty`
f75ac16 (86xsk) refactor(core): use `WordMap` in more places
76069db (86xsk) refactor(core): fix inconsistent casing
7a8b43b (86xsk) feat(core): create `Dictionary::get_word_map`
3a9c4bb (86xsk) refactor(core)!: create `CommonDictFuncs`
516110f (86xsk) refactor(core): avoid generics and use `WordMap` in more places
0605d95 (86xsk) refactor(core): fix warning by removing pointless borrow
5770756 (86xsk) refactor(core): remove `MutableDictionary`; alias as `WordMap`
727ebde (86xsk) feat(core)!: create `WordMap::to_fst`
3e14c4a (86xsk) refactor(core): `impl Extend<WordMapEntry> for WordMap`
99cb6d1 (86xsk) refactor(core): move `WordMapEntry` to its own module
d969a77 (86xsk) refactor(core): absorb `MutableDictionary` functions into `WordMap`
1831ccf (86xsk) refactor(core): clean up code in `FstDictionary`
48df22f (86xsk) feat(core): add std trait impls for `WordMap`
6d7e5ff (86xsk) perf(core): avoid storing duplicated data in `FstDictionary`
9b84c61 (86xsk) perf(core)!: change `FstDictionary::new` to take `WordMap`
cb69409 (86xsk) refactor(core): remove unused import
0f18056 (86xsk) style(core): run `cargo fmt`
207a041 (86xsk) Merge branch 'master' into fix-dict-casing2
9493c16 (86xsk) Merge branch 'fix-dict-casing2' into fix-dict-casing2-refactor-dictio…
e541ea3 (86xsk) refactor(core): remove pointless borrow
9f05228 (86xsk) fix(core): make certain statics `thread_local` again
6fa0123 (86xsk) refactor(core): simplify code
a5c31df (86xsk) refactor(core): `Lrc<[char]>` instead of `Lrc<Vec<char>>` in `Document`
63aaaa3 (86xsk) perf(core): optimizations in `LintGroup`
ccc72e6 (86xsk) Merge branch 'master' into fix-dict-casing2
c2d915b (86xsk) fix(wasm): use `FstDictionary::curated()`
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| use std::iter::Extend; | ||
| use std::slice::Iter; | ||
| | ||
| use serde::{Deserialize, Serialize}; | ||
| | ||
| use crate::spell::CanonicalWordId; | ||
| | ||
| /// A container for storing word IDs that a word is considered to be derived from. | ||
| #[derive(Debug, Default, Clone, PartialEq, Eq, Serialize, Deserialize, PartialOrd, Hash)] | ||
| pub struct DerivedFrom { | ||
| inner: Vec<CanonicalWordId>, | ||
| } | ||
| | ||
| impl DerivedFrom { | ||
| /// Insert another word ID, if it's not already contained in the list. | ||
| /// | ||
| /// If it is already contained in the list, it's quietly ignored. | ||
| pub fn insert(&mut self, id: CanonicalWordId) { | ||
| if !self.contains(id) { | ||
| self.inner.push(id); | ||
| } | ||
| } | ||
| | ||
| /// Is the list empty? In other words, does this word have no known words it's derived from? | ||
| pub fn is_empty(&self) -> bool { | ||
| self.inner.is_empty() | ||
| } | ||
| | ||
| /// Is this word derived from the word represented by `id`? | ||
| pub fn contains(&self, id: CanonicalWordId) -> bool { | ||
| self.inner.contains(&id) | ||
| } | ||
| | ||
| /// Create a new `DerivedFrom` containing a single initial word ID. | ||
| pub fn from_canonical_word_id(word_id: CanonicalWordId) -> Self { | ||
| Self { | ||
| inner: vec![word_id], | ||
| } | ||
| } | ||
| | ||
| /// Get an iterator over the contained [`CanonicalWordId`]s. | ||
| pub fn iter(&self) -> Iter<'_, CanonicalWordId> { | ||
| self.inner.iter() | ||
| } | ||
| } | ||
| | ||
| impl Extend<CanonicalWordId> for DerivedFrom { | ||
| fn extend<T: IntoIterator<Item = CanonicalWordId>>(&mut self, iter: T) { | ||
| // Add further word IDs, skipping any that already exist. | ||
| // This is intended to emulate the behavior of a `HashSet`. | ||
| iter.into_iter().for_each(|canonical_word_id| { | ||
| self.insert(canonical_word_id); | ||
| }); | ||
| } | ||
| } | ||
| | ||
| impl<'a> Extend<&'a CanonicalWordId> for DerivedFrom { | ||
| fn extend<T: IntoIterator<Item = &'a CanonicalWordId>>(&mut self, iter: T) { | ||
| // Add further word IDs, skipping any that already exist. | ||
| // This is intended to emulate the behavior of a `HashSet`. | ||
| iter.into_iter().copied().for_each(|canonical_word_id| { | ||
| self.insert(canonical_word_id); | ||
| }); | ||
| } | ||
| } | ||
| | ||
| impl IntoIterator for DerivedFrom { | ||
| type Item = CanonicalWordId; | ||
| | ||
| type IntoIter = std::vec::IntoIter<Self::Item>; | ||
| | ||
| fn into_iter(self) -> Self::IntoIter { | ||
| self.inner.into_iter() | ||
| } | ||
| } |
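The set-like behavior of `DerivedFrom` in the diff above can be tried out in isolation. The following is a minimal sketch, not the PR's code: `CanonicalWordId` here is a hypothetical `u64` newtype standing in for `crate::spell::CanonicalWordId` (whose definition is not shown in this diff), the serde derives are omitted, and `demo_ids` is a helper invented purely for illustration.

```rust
/// Hypothetical stand-in for `crate::spell::CanonicalWordId`.
/// The real type's definition is not part of this diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CanonicalWordId(pub u64);

/// Trimmed copy of `DerivedFrom` from the diff (serde derives omitted).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct DerivedFrom {
    inner: Vec<CanonicalWordId>,
}

impl DerivedFrom {
    /// Push `id` only if it is not already present.
    pub fn insert(&mut self, id: CanonicalWordId) {
        if !self.inner.contains(&id) {
            self.inner.push(id);
        }
    }

    /// Iterate over the stored IDs in insertion order.
    pub fn iter(&self) -> std::slice::Iter<'_, CanonicalWordId> {
        self.inner.iter()
    }
}

impl Extend<CanonicalWordId> for DerivedFrom {
    fn extend<T: IntoIterator<Item = CanonicalWordId>>(&mut self, iter: T) {
        iter.into_iter().for_each(|id| self.insert(id));
    }
}

impl<'a> Extend<&'a CanonicalWordId> for DerivedFrom {
    fn extend<T: IntoIterator<Item = &'a CanonicalWordId>>(&mut self, iter: T) {
        iter.into_iter().copied().for_each(|id| self.insert(id));
    }
}

/// Illustrative helper (not in the PR): mixes `insert` and both `extend`
/// flavors, then returns the raw IDs that ended up stored.
pub fn demo_ids() -> Vec<u64> {
    let mut derived = DerivedFrom::default();
    derived.insert(CanonicalWordId(7));
    derived.insert(CanonicalWordId(7)); // duplicate, quietly ignored
    derived.extend([CanonicalWordId(3), CanonicalWordId(7)]); // by value
    let more = vec![CanonicalWordId(3), CanonicalWordId(9)];
    derived.extend(&more); // by reference
    derived.iter().map(|id| id.0).collect()
}
```

Calling `demo_ids()` yields `[7, 3, 9]`: duplicates are dropped and first-insertion order is kept, which a plain `HashSet` would not guarantee. The `contains` check is linear in the list length, which is presumably acceptable when a word is derived from only a handful of other words.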