fix: preserve uncased character positions#736
Conversation
📝 WalkthroughWalkthroughRefactors Alphabet.placeAllWithCaseBeforeAllWithOtherCase to a single-pass stable partition and adds unit tests validating uncased-character position retention and locale-sorted alphabet ordering constraints. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
utils/alphabet.ts (1)
250-256: Clean fix — stable partition correctly preserves uncased positions.The new implementation cleanly expresses the intent:
keepretains uppercase + uncased characters in order (whencaseToComeFirst === 'uppercase'),movecollects lowercase. SinceuppercaseCharacterCodePointis only set when the character itself has an uppercase mapping (i.e. it is lowercase),isOtherCasecorrectly identifies the characters to move to the end, and the outlier-cased-character bug described in the PR (e.g. U+0345 creating a distant slot that gets refilled with an uppercase letter) is avoided because the relative positions of non-moved characters are preserved byfilter.♻️ Optional: single-pass partition
Minor readability/perf nit — avoid calling
isOtherCasetwice per character:- let isOtherCase = (character: Character): boolean => - caseToComeFirst === 'uppercase' ? - character.uppercaseCharacterCodePoint !== undefined - : character.lowercaseCharacterCodePoint !== undefined - let keep = this.characters.filter(character => !isOtherCase(character)) - let move = this.characters.filter(character => isOtherCase(character)) - this.characters = [...keep, ...move] + let otherCaseKey = + caseToComeFirst === 'uppercase' ? + 'uppercaseCharacterCodePoint' + : 'lowercaseCharacterCodePoint' + let keep: Character[] = [] + let move: Character[] = [] + for (let character of this.characters) { + ;(character[otherCaseKey] === undefined ? keep : move).push(character) + } + this.characters = [...keep, ...move]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/alphabet.ts` around lines 250 - 256, The current implementation calls isOtherCase twice per character via two filters (keep and move), which is a minor perf/readability issue; replace the double-filter with a single-pass stable partition: iterate once over this.characters, call isOtherCase(character) a single time and push each character into either keep or move accordingly, then set this.characters = [...keep, ...move]; keep references to caseToComeFirst and isOtherCase so the same selection logic and preservation of uncased positions remain intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@utils/alphabet.ts`:
- Around line 250-256: The current implementation calls isOtherCase twice per
character via two filters (keep and move), which is a minor perf/readability
issue; replace the double-filter with a single-pass stable partition: iterate
once over this.characters, call isOtherCase(character) a single time and push
each character into either keep or move accordingly, then set this.characters =
[...keep, ...move]; keep references to caseToComeFirst and isOtherCase so the
same selection logic and preservation of uncased positions remain intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 03f49245-0079-4156-b680-2c65e7ed25c4
📒 Files selected for processing (3)
test/utils/alphabet.test.tstest/utils/compare/compare-by-custom-sort.test.tsutils/alphabet.ts
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #736 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 224 224
Lines 4402 4400 -2
Branches 1375 1375
=========================================
- Hits 4402 4400 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
azat-io
left a comment
There was a problem hiding this comment.
Thank you for your PR!
Looks good to me. Could you fix ESLint and Prettier errors?
The previous implementation of placeAllWithCaseBeforeAllWithOtherCase collected every cased-character slot across the whole alphabet and refilled them uppercase-first. This meant a single outlier cased character — e.g. the Greek combining mark ͅ(U+0345), which has a case mapping to Ι and lands at an early position under en-US locale collation — created a "cased slot" thousands of positions ahead of the Latin block. The refill then placed 'A' into that distant slot, putting uppercase letters ahead of digits and breaking the expected ordering (e.g. base0A sorting before base09). Rewrite the method as a stable partition: characters in the "other" case move to the end; all other characters (including uncased ones like digits and punctuation) keep their relative positions. This preserves the documented uppercase-first / lowercase-first semantics on purely cased alphabets while leaving uncased characters untouched. Add regression tests covering the partition semantics, the recommended alphabet under en-US locale, and a realistic base16 palette sort that reproduces the original user-facing failure. Signed-off-by: Aaron Campbell <aaron@monkey.org>
8c91778 to
1dcb734
Compare
Yes sorry, done. I also updated the code to retain a single-pass implementation based on a suggestion from the bot here. |
|
@azat-io Thanks for the approval! It’s not merged yet, but perhaps you batch them prior to release (I’m new here). Love this project by the way, it was a lifesaver to sort everything alphabetically in my project after the LLM just placed everything randomly during heavy dev haha. |
|
Released in v5.9.0. |
Description
The previous implementation of placeAllWithCaseBeforeAllWithOtherCase collected every cased-character slot across the whole alphabet and refilled them uppercase-first. This meant a single outlier cased character — e.g. the Greek combining mark ͅ(U+0345), which has a case mapping to Ι and lands at an early position under en-US locale collation — created a "cased slot" thousands of positions ahead of the Latin block. The refill then placed 'A' into that distant slot, putting uppercase letters ahead of digits and breaking the expected ordering (e.g. base0A sorting before base09).
Rewrite the method as a stable partition: characters in the "other" case move to the end; all other characters (including uncased ones like digits and punctuation) keep their relative positions. This preserves the documented uppercase-first / lowercase-first semantics on purely cased alphabets while leaving uncased characters untouched.
Add regression tests covering the partition semantics, the recommended alphabet under en-US locale, and a realistic base16 palette sort that reproduces the original user-facing failure.
Reviewer notes
Here is a concrete demonstration of what I ran into:
i.e. it was incorrectly re-ordering:
base00, ... base09, base0A, base0B, ...to
base0A, base00, ..., base09, base0B, ...which is clearly not what someone would want.
Pre-merge checklist