
Unicode and invisible characters #409

Open
@AlexRMU

Samples

When working with Unicode (that is, with almost any text), there are many things to keep in mind. Here are a few of them:

  • Invisible characters
    Invisible characters can behave differently across devices, browsers, and fonts. They usually render as nothing, but they are still part of the string, so they affect comparisons and length.

    "឴" != "";
    "_឴_" != "__";

    This is how VS Code highlights them:
    image

  • Combining characters and cursed strings
    How combining characters are displayed depends on many factors. They often render strangely and can break the interface and styles.

    This is how they are currently displayed in the editor:
    image

    This is how they are displayed in VS Code:
    image

  • Surrogate pairs and normalization
    https://en.wikipedia.org/wiki/Unicode_equivalence
    https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize

    "_._".normalize(); // "_._"
    "_._".normalize("NFC"); // "_._"
    "_._".normalize("NFD"); // "_._"
    "_._".normalize("NFKC"); // "_._"
    "_._".normalize("NFKD"); // "_._"
    
    const name1 = "\u0041\u006d\u00e9\u006c\u0069\u0065";
    const name2 = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065";
    name1 != name2; // "Amélie" != "Amélie"
    name1.length != name2.length; // 6 != 7
    
    const name1NFC = name1.normalize("NFC");
    const name2NFC = name2.normalize("NFC");
    name1NFC == name2NFC; // "Amélie" == "Amélie"
    name1NFC.length == name2NFC.length; // 6 == 6

    Before and after formatting:
    image
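
The list above mentions surrogate pairs without showing one, so here is a minimal sketch of how the same string can have three different "lengths". The helper names `countCodePoints` and `countGraphemes` are hypothetical and not part of the editor. `Intl.Segmenter` is assumed to be available, which requires a reasonably recent browser or Node.js version.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane,
// so UTF-16 stores it as a surrogate pair (two code units).
const emoji = "\u{1F600}";
emoji.length; // 2 (UTF-16 code units)
emoji.charCodeAt(0).toString(16); // "d83d" (high surrogate)
emoji.charCodeAt(1).toString(16); // "de00" (low surrogate)

// Hypothetical helper: count code points instead of code units.
// String iteration walks code points, so a surrogate pair counts once.
function countCodePoints(text) {
  return Array.from(text).length;
}

// Hypothetical helper: count user-perceived characters (grapheme clusters).
// A base letter followed by combining marks counts as one grapheme.
function countGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text)).length;
}

// Same decomposed spelling as name2 above: "e" followed by U+0301.
const decomposed = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065";
decomposed.length; // 7 code units
countCodePoints(decomposed); // 7 code points
countGraphemes(decomposed); // 6 graphemes
countCodePoints(emoji); // 1
```

Counting graphemes is usually what a cursor or a column indicator should follow, while `length` reflects storage and is what the comparisons in the samples above observe.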


Everything seems to be fine with this in the editor now.
I suggest:

  • highlight invisible characters
  • automatically normalize and decode all strings when pasting or formatting
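
A rough sketch of what the two suggestions could look like, not a description of how the editor works. The character list in `INVISIBLE` is a small hand-picked assumption (soft hyphen, two Khmer inherent vowels, zero-width and directional marks, word joiner, BOM) and is far from complete. `findInvisibleCharacters` and `normalizePastedText` are hypothetical names. Only normalization is covered; what "decode" should mean is left open.

```javascript
// Assumption: an illustrative, incomplete set of characters that usually render as nothing.
const INVISIBLE = /[\u00AD\u17B4\u17B5\u200B-\u200F\u2060\uFEFF]/gu;

// Hypothetical helper: report where invisible characters occur,
// so a UI could highlight those positions.
function findInvisibleCharacters(text) {
  return Array.from(text.matchAll(INVISIBLE), (match) => ({
    index: match.index, // offset in UTF-16 code units
    codePoint:
      "U+" + match[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
  }));
}

// Hypothetical helper: bring pasted or formatted text to one canonical form.
// NFC is chosen here as an assumption; it composes "e" + U+0301 into U+00E9.
function normalizePastedText(text) {
  return text.normalize("NFC");
}

const pasted = "_\u17B4_";
findInvisibleCharacters(pasted); // [{ index: 1, codePoint: "U+17B4" }]
normalizePastedText("Ame\u0301lie") === "Am\u00e9lie"; // true
```

Normalizing silently changes the user's bytes, so it may be safer to apply it only on an explicit format action, or to make it optional, rather than on every paste.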

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed)
