When working with Unicode (that is, almost any text), you need to keep many things in mind. Here are a few of them:
- Invisible characters
  Invisible characters can behave differently on different devices, browsers, and fonts. They are usually invisible, but they still take up space:

      "឴" != "";
      "_឴_" != "__";
- Combining characters and cursed strings
  How a combining character is displayed depends on many factors. Combining characters often render strangely and can break the interface and styles.
- Surrogate pairs and normalization
  https://en.wikipedia.org/wiki/Unicode_equivalence
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize

      "_._".normalize(); // "_._"
      "_._".normalize("NFC"); // "_._"
      "_._".normalize("NFD"); // "_._"
      "_._".normalize("NFKC"); // "_._"
      "_._".normalize("NFKD"); // "_._"

      const name1 = "\u0041\u006d\u00e9\u006c\u0069\u0065";
      const name2 = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065";
      name1 != name2; // "Amélie" != "Amélie"
      name1.length != name2.length;

      const name1NFC = name1.normalize("NFC");
      const name2NFC = name2.normalize("NFC");
      name1NFC == name2NFC; // "Amélie" == "Amélie"
      name1NFC.length == name2NFC.length;
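
To make the surrogate pair and combining character points above a bit more concrete, here is a minimal sketch. The helper names `countCodePoints` and `countCombiningMarks` are hypothetical and only for illustration; they are not part of any editor API.

```javascript
// Hypothetical helper: counts code points instead of UTF-16 code units.
function countCodePoints(str) {
  // String iteration walks code points, so a surrogate pair counts once.
  return [...str].length;
}

// Hypothetical helper: counts combining marks (Unicode category M)
// after decomposing the string with NFD.
function countCombiningMarks(str) {
  return (str.normalize("NFD").match(/\p{M}/gu) || []).length;
}

const emoji = "\u{1F600}"; // one code point, stored as a surrogate pair
console.log(emoji.length);           // 2
console.log(countCodePoints(emoji)); // 1

const cursed = "Z\u0351\u036B\u0343"; // "Z" with three stacked combining marks
console.log(cursed.length);               // 4
console.log(countCombiningMarks(cursed)); // 3
```

A count like this could be one way to flag strings that stack an unusual number of marks, though where to draw the line is a judgment call.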
Everything seems to be fine with this in the editor now.
I suggest:
- highlight invisible characters
- automatically normalize and decode all strings when pasting or formatting
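
As a rough sketch of what the two suggestions could look like, assuming a paste or format hook exists that hands the editor a plain string: `revealInvisible` and `normalizePasted` are hypothetical names, and the `INVISIBLE` pattern covers only a small, non-exhaustive sample of invisible code points.

```javascript
// Assumption: a small, non-exhaustive sample of invisible code points
// (soft hyphen, Khmer inherent vowels, zero-width characters and
// direction marks, word joiner, byte order mark).
const INVISIBLE = /[\u00AD\u17B4\u17B5\u200B-\u200F\u2060\uFEFF]/g;

// Hypothetical helper: replaces each invisible character with a visible
// <U+XXXX> marker so it can be highlighted.
function revealInvisible(text) {
  return text.replace(INVISIBLE, (ch) =>
    "<U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0") + ">"
  );
}

// Hypothetical helper: normalizes pasted text to NFC so that equivalent
// sequences compare equal.
function normalizePasted(text) {
  return text.normalize("NFC");
}

console.log(revealInvisible("_\u17B4_")); // "_<U+17B4>_"
console.log(normalizePasted("Ame\u0301lie") === "Am\u00e9lie"); // true
```

NFC is used here only because it matches the example in the description; whether NFC or another form is the right default for the editor is an open question.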