Ignoring case for unicode character leads to unexpected match #2200

Answered by BurntSushi
t-b asked this question in Q&A
There are two levels to "why" here. The first level is "because that's how Unicode defines its case folding tables." Here:

$ curl -LO https://www.unicode.org/Public/zipped/14.0.0/UCD.zip
$ unzip UCD.zip
$ rg '03A9' CaseFolding.txt
332:03A9; C; 03C9; # GREEK CAPITAL LETTER OMEGA
$ rg '2126' CaseFolding.txt
959:2126; C; 03C9; # OHM SIGN
$ rg '03C9' CaseFolding.txt
332:03A9; C; 03C9; # GREEK CAPITAL LETTER OMEGA
949:1FF3; F; 03C9 03B9; # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
951:1FF6; F; 03C9 0342; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI
952:1FF7; F; 03C9 0342 03B9; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
957:1FFC; F; 03C9 03B9; # GREEK CAPITAL LETTER OMEGA WIT…

Answer selected by t-b