
Custom Unicode Keystrokes

ThioJoe edited this page Sep 9, 2024 · 9 revisions

Question: "Unicode? What does that have to do with keyboard keys? 🤔"

Answer:

  • Yes, the Windows API also allows you to specify Unicode "code points" (basically hex codes) to send Unicode characters as if they were text input

Technical Answer:

  • Specifically, the KEYBDINPUT data type, used with the SendInput function, has a flag to pass in Unicode code points
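As a rough illustration of what that flag expects, here is a minimal sketch that plans the key events for one code point without actually calling SendInput. The flag values are the documented KEYEVENTF_UNICODE and KEYEVENTF_KEYUP constants. The helper names (utf16_code_units, plan_unicode_events) are made up for this example, and the press/release ordering is an assumption, not necessarily what the app does. The wScan field is only 16 bits wide, so code points above U+FFFF have to be split into a UTF-16 surrogate pair.

```python
KEYEVENTF_KEYUP = 0x0002    # KEYBDINPUT flag: the key is being released
KEYEVENTF_UNICODE = 0x0004  # KEYBDINPUT flag: wScan carries a UTF-16 code unit


def utf16_code_units(code_point):
    """Split one code point into the 16-bit units that fit in wScan."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a valid Unicode code point")
    if code_point < 0x10000:
        return [code_point]
    # Above U+FFFF: encode as a high/low surrogate pair.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


def plan_unicode_events(code_point):
    """Return (wScan, dwFlags) pairs: a press and a release per code unit."""
    events = []
    for unit in utf16_code_units(code_point):
        events.append((unit, KEYEVENTF_UNICODE))
        events.append((unit, KEYEVENTF_UNICODE | KEYEVENTF_KEYUP))
    return events
```

For U+1F600 this plans four events (two code units, each pressed and released), while U+0041 needs only two.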

Why Might This Be Useful?

  • I don't really know, I just saw it was supported by the Windows API so I added the functionality to the app
  • Possible ideas:
    • You want to enter Unicode characters into something that won't allow you to paste them, but would let you type them (like if you had a foreign keyboard that had that character on it)
    • Testing input of foreign characters if you have an app that takes keyboard input
    • Testing out complex / combined emojis (read below for more info)

How Unicode Codepoints Work

Each Unicode character is assigned a unique number called a code point. These code points are typically represented in hexadecimal format, prefixed with "U+". For example:

  • U+0041 represents the Latin capital letter "A"
  • U+03A9 represents the Greek capital letter "Ω" (Omega)
  • U+1F600 represents the "Grinning Face" emoji 😀
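To make the notation concrete, here is a small sketch that converts between the hex digits and the character itself. The function names are made up for this example and have nothing to do with the app's own code.

```python
def char_from_hex(hex_text):
    """Turn hex digits such as '03A9' into the character they identify."""
    return chr(int(hex_text, 16))


def notation_for(char):
    """Turn a single character back into U+ notation (at least 4 hex digits)."""
    return f"U+{ord(char):04X}"


# Escapes are used so the example does not depend on the file's encoding.
assert char_from_hex("0041") == "A"
assert char_from_hex("03A9") == "\u03a9"          # Greek capital letter Omega
assert notation_for("\U0001F600") == "U+1F600"    # Grinning Face
```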

Fun fact: In Windows you can use Alt + Numpad keys to enter Unicode characters manually:

  • For example: Hold Alt and type 0160 with the numpad, and it will type out a "non-breaking space", which looks like a space but isn't!
    • Note: You have to use the decimal version of the codepoint. 0160 is the decimal version of the code point while U+00A0 is the typical hex representation
    • Also keep in mind that you have to actually type 0160 and not just 160. With the leading zero, Windows looks the number up in the system's ANSI code page (Windows-1252 on most Western systems, which matches Unicode for 160 through 255). Without it, Windows uses the older OEM code page (such as code page 437) and prints a different character.
  • So using this app could be easier since you can just copy and paste the U+ codes and use those directly
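If you do want to work out the digits for the Alt method, the hex-to-decimal conversion is simple. This is a minimal sketch with a made-up helper name. It only does the number conversion and does not check whether Windows will accept a given Alt code.

```python
def alt_code_digits(code_point_text):
    """Convert hex text such as 'U+00A0' into zero-padded decimal digits."""
    cleaned = code_point_text.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    # Pad to at least four digits so the result keeps its leading zero.
    return f"{int(cleaned, 16):04d}"


print(alt_code_digits("U+00A0"))  # 0160
```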

Complex Unicode Characters Using "ZWJ" And Modifiers

  • Some characters, particularly certain emojis, are composed of multiple code points joined together using a special character called a "Zero-Width Joiner" (ZWJ), represented by the hex code 0x200D.

    • Here is a larger list of examples, though still not exhaustive
  • In other cases, emojis may use a single "modifier" character, instead of a ZWJ code point plus additional code points.

  • ZWJ example: The "head shaking horizontally" emoji ( 🙂‍↔️ ) is actually a string of four code points of other Unicode symbols:

    • U+1F642 (Slightly Smiling Face 🙂)
    • U+200D (Zero Width Joiner)
    • U+2194 (Left Right Arrow ↔)
    • U+FE0F (Variation Selector 16)
  • Modifier example: The "light skin tone handshake" emoji (🤝🏻) consists of:

    • U+1F91D (Regular "handshake" emoji 🤝)
    • U+1F3FB (Light skin modifier. Technically called Emoji Modifier Fitzpatrick Type-1-2, referring to the Fitzpatrick skin tone scale)
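Both examples can be reproduced by simply concatenating the code points in order. The sketch below builds the two sequences and lists what each one is made of, using Python's standard unicodedata module. The helper names are made up for this example, and whether the result shows up as a single emoji depends on the font and app displaying it.

```python
import unicodedata

HEAD_SHAKE = [0x1F642, 0x200D, 0x2194, 0xFE0F]  # ZWJ sequence
HANDSHAKE_LIGHT = [0x1F91D, 0x1F3FB]            # base emoji + skin tone modifier


def build_sequence(code_points):
    """Join code points, in order, into one string."""
    return "".join(chr(cp) for cp in code_points)


def describe(text):
    """List each code point in the string with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


for line in describe(build_sequence(HEAD_SHAKE)):
    print(line)
```

Even when it renders as one emoji, the string still contains every individual code point, which is why each one has to be sent.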

📝 Using Unicode Codepoints in the App

  • For typical usage, enter the Unicode code point in one of the following formats:

    • With "U+" prefix: U+1F600 (It is OK if the U+ part is in the text box, it will automatically be removed/ignored)
    • Without prefix: 1F600
  • For complex characters consisting of multiple emojis combined via Zero-Width-Joiner, ideally separate the multiple code points with one of these formats:

    • By Using Spaces between them:
      • 1F468 200D 1F33E
    • By keeping the U+ prefix for each. Either of these would work:
      • U+1F468 U+200D U+1F33E
      • U+1F468U+200DU+1F33E
  • Edge Cases: If you don't separate the codepoints by spaces or U+, it should still work if ALL codepoints are of the same length

    • Example: Notice with 1F468 200D 1F33E, the ZWJ (200D) is only 4 characters, while the others are 5
      • Combining them as-is would NOT work: 1F468200D1F33E (The app can't figure out where one code starts and another ends)
      • But adding a leading 0 to the ZWJ (so 0200D) to make them all 5 characters, then combining, SHOULD work: 1F4680200D1F33E
        • If spaces or U+ aren't present the app will try to parse it by checking if the string length is divisible by 5, then 4
        • It is possible for this to lead to incorrect parsing, so it is still best to keep the codepoints separated
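The parsing rules above could be sketched roughly like this. It is only a guess at the described behavior, not the app's actual code: the function name is made up, and raising an error when the length fits neither 5 nor 4 is an assumption.

```python
import re


def split_code_points(text):
    """Parse the input formats described above into a list of code points."""
    cleaned = text.strip().upper()
    if re.search(r"U\+|\s", cleaned):
        # Separated input: split on any run of "U+" prefixes and whitespace.
        tokens = [t for t in re.split(r"(?:U\+|\s)+", cleaned) if t]
    else:
        # Unseparated input: try equal chunks of 5 characters, then 4.
        for width in (5, 4):
            if len(cleaned) % width == 0:
                tokens = [cleaned[i:i + width]
                          for i in range(0, len(cleaned), width)]
                break
        else:
            raise ValueError("cannot tell where one code point ends")
    return [int(token, 16) for token in tokens]


print([f"{cp:X}" for cp in split_code_points("1F4680200D1F33E")])
```

With these rules, 1F468200D1F33E (14 characters) is rejected, while five ZWJ codes run together (20 characters) are wrongly read as four 5-character codes. That is the kind of incorrect parsing that separators avoid.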