fix: improve Unicode width calculation for emoji alignment #563
It seems that the

	// checkAsianCharacter checks if the character is an Asian character (character of 2 width)
	func checkAsianCharacter(r rune) bool {
		if unicode.Is(unicode.Han, r) || // CJK characters
			unicode.Is(unicode.Hangul, r) || // Korean Hangul characters
			(r >= 0x3130 && r <= 0x318F) || // Hangul Compatibility Jamo (ㄱ-ㅎ, ㅏ-ㅣ)
			(r >= 0x1100 && r <= 0x11FF) || // Korean Hangul Jamo block
			(r >= 0x3200 && r <= 0x32FF) || // Enclosed CJK Letters and Months
			unicode.Is(unicode.Hiragana, r) || // Japanese Hiragana characters
			unicode.Is(unicode.Katakana, r) { // Japanese Katakana characters
			return true
		}
		return false
	}

	// containsComplexUnicode checks if string contains emoji or complex Unicode
	func containsComplexUnicode(s string) bool {
		for _, r := range s {
			// Check for emoji ranges
			if (r >= 0x1F600 && r <= 0x1F64F) || // Emoticons
				(r >= 0x1F300 && r <= 0x1F5FF) || // Misc Symbols and Pictographs
				(r >= 0x1F680 && r <= 0x1F6FF) || // Transport and Map Symbols
				(r >= 0x1F700 && r <= 0x1F77F) || // Alchemical Symbols
				(r >= 0x2600 && r <= 0x26FF) || // Miscellaneous Symbols
				(r >= 0x2700 && r <= 0x27BF) || // Dingbats
				(r >= 0x23E9 && r <= 0x23FA) || // Symbols like ⏰
				checkAsianCharacter(r) ||
				r > 0x3000 { // Other wide characters
				return true
			}
		}
		return false
	}

Thank you.
Improved Unicode width calculation for Korean and Japanese characters by adding dedicated checkAsianCharacter helper function.

Changes:
- Add checkAsianCharacter() with comprehensive Korean/Japanese ranges:
  * Korean Hangul (unicode.Hangul)
  * Korean Hangul Jamo (0x1100-0x11FF)
  * Korean Hangul Compatibility Jamo (0x3130-0x318F)
  * Enclosed CJK Letters (0x3200-0x32FF)
  * Japanese Hiragana (unicode.Hiragana)
  * Japanese Katakana (unicode.Katakana)
- Add Miscellaneous Technical emoji range (0x2300-0x23FF) for clock symbols and similar emoji
- Add comprehensive tests for Korean/Japanese character detection
- Add TestCheckAsianCharacter for validating the helper function

Credit: Implementation based on iblea's code review suggestion on PR charmbracelet#563

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
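The commit above mentions a TestCheckAsianCharacter test without showing it. As a rough idea of what such a check could look like, here is a minimal table-driven sketch using only the standard library. The helper is copied from the review comment; the sample runes and case names are illustrative assumptions, not the PR's actual test data.

```go
package main

import (
	"fmt"
	"unicode"
)

// checkAsianCharacter mirrors the helper quoted in the review comment.
func checkAsianCharacter(r rune) bool {
	return unicode.Is(unicode.Han, r) ||
		unicode.Is(unicode.Hangul, r) ||
		(r >= 0x3130 && r <= 0x318F) || // Hangul Compatibility Jamo
		(r >= 0x1100 && r <= 0x11FF) || // Hangul Jamo
		(r >= 0x3200 && r <= 0x32FF) || // Enclosed CJK Letters and Months
		unicode.Is(unicode.Hiragana, r) ||
		unicode.Is(unicode.Katakana, r)
}

func main() {
	// Hypothetical sample cases chosen for illustration only.
	cases := []struct {
		name string
		r    rune
		want bool
	}{
		{"Hangul syllable", '한', true},
		{"Hangul compatibility jamo", 'ㄱ', true},
		{"Hiragana", 'あ', true},
		{"Katakana", 'カ', true},
		{"Han ideograph", '漢', true},
		{"ASCII letter", 'A', false},
		{"Emoji", '😀', false},
	}
	for _, c := range cases {
		got := checkAsianCharacter(c.r)
		fmt.Printf("%-26s U+%04X got=%v want=%v\n", c.name, c.r, got, c.want)
	}
}
```

In a real test file the same table would presumably drive t.Run subtests and fail on any mismatch between got and want.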
Per iblea's suggestion in PR charmbracelet#563, added comprehensive Korean and Japanese character detection with checkAsianCharacter() helper function covering:
- Korean Hangul (unicode.Hangul)
- Korean Jamo ranges (0x1100-0x11FF, 0x3130-0x318F)
- Japanese Hiragana and Katakana (unicode.Hiragana, unicode.Katakana)
- Enclosed CJK Letters (0x3200-0x32FF)

Key insight discovered during implementation: ansi.StringWidth already handles CJK characters correctly, so we only need the runewidth fallback for emoji and special symbols. This keeps table rendering consistent while improving emoji support.

Changes:
- Simplified stringWidth() to always use fallback for emoji
- Removed CJK from containsComplexUnicode() detection
- Updated tests to reflect that CJK is handled by ansi.StringWidth
- All tests pass including table width constraints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
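The control flow this commit describes (use the primary measurer by default, switch to the fallback only when an emoji is detected) can be sketched with the standard library alone. This is not the PR's code: runeCountWidth and emojiAwareWidth are hypothetical stand-ins for the roles that ansi.StringWidth and go-runewidth play in the PR, and the emoji ranges are a subset of those quoted in the review comment.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isEmoji covers a few of the emoji blocks listed in the review comment.
func isEmoji(r rune) bool {
	return (r >= 0x1F600 && r <= 0x1F64F) || // Emoticons
		(r >= 0x1F300 && r <= 0x1F5FF) || // Misc Symbols and Pictographs
		(r >= 0x1F680 && r <= 0x1F6FF) || // Transport and Map Symbols
		(r >= 0x2600 && r <= 0x26FF) || // Miscellaneous Symbols
		(r >= 0x2700 && r <= 0x27BF) // Dingbats
}

// containsEmoji reports whether any rune in s is in an emoji block.
func containsEmoji(s string) bool {
	for _, r := range s {
		if isEmoji(r) {
			return true
		}
	}
	return false
}

// runeCountWidth is a hypothetical stand-in for the primary measurer:
// it counts every rune as one cell.
func runeCountWidth(s string) int { return utf8.RuneCountInString(s) }

// emojiAwareWidth is a hypothetical stand-in for the fallback measurer:
// it counts emoji as two cells and everything else as one.
func emojiAwareWidth(s string) int {
	w := 0
	for _, r := range s {
		if isEmoji(r) {
			w += 2
		} else {
			w++
		}
	}
	return w
}

// stringWidth uses the fallback only for strings that contain emoji.
func stringWidth(s string) int {
	if containsEmoji(s) {
		return emojiAwareWidth(s)
	}
	return runeCountWidth(s)
}

func main() {
	fmt.Println(stringWidth("Test"))  // 4: no emoji, primary path
	fmt.Println(stringWidth("😀 ok")) // 5: emoji (2) + space + "ok"
}
```

Keeping the detection separate from both measurers means strings without emoji never pay for the fallback, which matches the performance note in the PR description.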
- Add fallback calculation using go-runewidth for better emoji support
- Smart detection of complex Unicode characters (emoji, CJK, etc.)
- Maintain existing ANSI sequence handling for compatibility
- Add comprehensive test suite covering emoji and Unicode edge cases
- Performance optimized: fallback only triggers for problematic strings

Fixes layout misalignment issues when using emoji/Unicode in TUI boxes.

Before: emoji boxes had incorrect dimensions causing visual artifacts
After: consistent alignment across ASCII and Unicode content

Closes #XXX
Force-pushed from 3f2fc96 to 1731e9c.
Thanks @iblea for the excellent suggestion! 👍 I've implemented the
Key finding during implementation: All tests pass ✅ including table width constraints. The PR is now rebased on latest master with the updated |
The runewidth package is now directly used in size.go for fallback width calculation, so it should be a direct dependency, not indirect.
Port of the Unicode width improvements to v2 branch, addressing Korean character rendering issues reported in opencode project (sst/opencode#2013).

Changes:
- Add comprehensive Korean/Japanese character detection via checkAsianCharacter()
  - Korean Hangul (unicode.Hangul) + Jamo ranges
  - Japanese Hiragana & Katakana
  - Enclosed CJK Letters (0x3200-0x32FF)
- Implement emoji-specific width calculation fallback using go-runewidth
  - Detect emoji ranges (Emoticons, Symbols, Dingbats, etc.)
  - Use runewidth for accurate emoji width when detected
  - ansi.StringWidth already handles CJK correctly
- Add comprehensive Unicode width tests
  - Test emoji width calculation
  - Test CJK character detection
  - Test Korean/Japanese character identification

This should help resolve Korean character disappearing issues in terminal emulators like WezTerm and Ghostty.

Related: charmbracelet#563, sst/opencode#2013

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
For my own curiosity, is simply using It also offers a |
@kolkov I don't see any problems with the current v2 implementation. Using this example below on Apple Terminal:
	package main

	import "github.com/charmbracelet/lipgloss/v2"

	func main() {
		box1 := lipgloss.NewStyle().Border(lipgloss.NormalBorder()).Width(15).Padding(0, 1)
		box2 := lipgloss.NewStyle().Border(lipgloss.NormalBorder()).Width(25).Padding(0, 1)

		txt1 := "[*] ASCII"
		txt2 := "Test"
		lin1 := "👨🏾🌾 Emoji"
		lin2 := txt2

		view := lipgloss.JoinHorizontal(lipgloss.Left,
			box1.Render(
				lipgloss.JoinVertical(lipgloss.Top,
					txt1,
					txt2,
				),
			),
			box2.Render(
				lipgloss.JoinVertical(lipgloss.Top,
					lin1,
					lin2,
				),
			),
		)

		lipgloss.Println(view)
	}


fix: improve Unicode width calculation for emoji alignment
Summary
Fixes emoji and Unicode width calculation issues that cause box alignment problems in TUI applications. This resolves layout misalignment when mixing ASCII and Unicode content in lipgloss-styled components.
Problem
The existing width calculation using ansi.StringWidth() incorrectly handles:

This causes boxes and layouts to appear misaligned when they contain Unicode content.
Changes
Core Implementation
- stringWidth() function with smart Unicode detection
- mattn/go-runewidth for accurate width calculation
Key Functions Added
Dependencies Added
Testing
size_emoji_test.go
Test Coverage
Performance Impact
Backward Compatibility
Visual Results
Before (Broken):
After (Fixed):
Use Cases Improved
Implementation Details
The fix uses a two-stage approach:
- ansi.StringWidth() for ANSI sequences
- go-runewidth for accuracy
Smart detection triggers fallback only when:
Migration Guide
No migration required - this is a drop-in improvement.
Existing code continues to work exactly as before, but now with correct Unicode width calculations.
Related Issues
Closes #562
Testing Instructions
Screenshots
[Include before/after screenshots of TUI applications showing the alignment fix]
Impact: Fixes critical layout issues affecting international users and modern TUI applications worldwide.
Risk: Very low - preserves all existing functionality with targeted Unicode improvements.
Review Focus: Unicode edge cases, performance with large strings, ANSI sequence preservation.