Use emoji presentation for emoji found in normal text #1107

Open
@gnprice

This is a follow-up to:

That issue was narrowed to cover situations where we explicitly know we're working with an emoji. Still open are situations where a literal emoji character appears as part of some user-generated text:

  • TextNode in message content, corresponding to a text element in the HTML. As far as I know, there's no longer a way to generate a message with emoji in a text node, though there used to be: compare this test message from yesterday, where the only emoji is in an emoji span, with the message from 2020 it's quoting, where the same Markdown source produced some literal emoji following an emoji span.
    • … Oh, here's one way: write a literal emoji character inside a code span or code block. See these example messages.
  • Topics, channel names, users' names, organization names, and other places where more-or-less-arbitrary plain text appears.

Like #1104, this doesn't affect most emoji — newer emoji, including those added in the emoji boom, have only emoji presentations. Probably ❤ U+2764 HEAVY BLACK HEART, aka :heart: in Zulip, is the most conspicuous affected example: currently we show it as ❤︎ (the text presentation), but we should instead show ❤️ (the emoji presentation).

Implementation

Probably the way to control this is through the ordering of font choices: put an emoji font first, before plain-text fonts. The tricky part is that we'll need a reduced emoji font (at least compared to Noto Color Emoji): one that doesn't have glyphs for characters like U+0020 SPACE and U+0030..0039 DIGIT ZERO..DIGIT NINE, which are perfectly normal characters for ordinary text and should still be shown in text presentation.

For details, see #1104 (comment) and #1104 (comment).
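
To make the font-ordering idea concrete, here is a simplified model of per-character font fallback. It is only a sketch: real text layout works on shaped clusters rather than single code points, and the coverage sets and names below (`TEXT_FONT`, `FULL_EMOJI_FONT`, `REDUCED_EMOJI_FONT`, `pick_font`, `render_plan`) are hypothetical, chosen for illustration rather than taken from any actual font or from the app's code.

```python
# Simplified model: each character is drawn with the first font in the
# list whose coverage includes its code point. Real layout engines are
# more involved, but the effect of font ordering is the same in spirit.

def pick_font(code_point, fonts):
    """Return the name of the first font covering code_point, or None."""
    for name, coverage in fonts:
        if code_point in coverage:
            return name
    return None


def render_plan(text, fonts):
    """Pair each character of text with the font that would draw it."""
    return [(ch, pick_font(ord(ch), fonts)) for ch in text]


# Hypothetical coverage sets (assumptions, not real font data).
# The text font has a text-presentation glyph for U+2764.
TEXT_FONT = ("text", set(range(0x20, 0x7F)) | {0x2764})
# A full emoji font also covers SPACE and the ASCII digits.
FULL_EMOJI_FONT = ("emoji", {0x20, *range(0x30, 0x3A), 0x2764, 0x1F600})
# A reduced emoji font drops those ordinary-text characters.
REDUCED_EMOJI_FONT = ("emoji", {0x2764, 0x1F600})

if __name__ == "__main__":
    sample = "1 \u2764"
    # Text font first: the heart gets the text presentation.
    print(render_plan(sample, [TEXT_FONT, FULL_EMOJI_FONT]))
    # Full emoji font first: the digit and space are taken by the emoji font.
    print(render_plan(sample, [FULL_EMOJI_FONT, TEXT_FONT]))
    # Reduced emoji font first: only the heart comes from the emoji font.
    print(render_plan(sample, [REDUCED_EMOJI_FONT, TEXT_FONT]))
```

In this model, only the last ordering gives both desired results at once: digits and spaces stay in the text font, while the heart comes from the emoji font.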

One further thought beyond those comments: we need a principled way to draw the line between characters like U+0020 SPACE, which should stay in text presentation despite appearing in Noto Color Emoji, and characters like U+2764 HEAVY BLACK HEART, which should get the emoji presentation. One candidate is to use the code point's Unicode General_Category value:

  • for code points in category So (Symbol, Other) that appear in the emoji font, assume they should get the emoji presentation;
  • for code points in any other category, assume they shouldn't, even if they're in the emoji font.

That rule gives the right answer for U+0020 SPACE, U+2764 HEAVY BLACK HEART, and all the other examples I looked at. More study would be needed to validate the rule before running with it.
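
As a rough sketch of that candidate rule, the following uses Python's standard `unicodedata` module to look up General_Category. The function names and the sample coverage set are hypothetical; a real check would read the code points from the emoji font's character map, and the rule itself is still unvalidated, as noted above.

```python
import unicodedata


def wants_emoji_presentation(code_point, emoji_font_coverage):
    """Candidate rule: emoji presentation only for code points that are in
    the emoji font AND have General_Category So (Symbol, Other)."""
    return (
        code_point in emoji_font_coverage
        and unicodedata.category(chr(code_point)) == "So"
    )


def reduced_coverage(emoji_font_coverage):
    """Code points a reduced emoji font would keep under the candidate rule."""
    return {
        cp for cp in emoji_font_coverage
        if wants_emoji_presentation(cp, emoji_font_coverage)
    }


# Hypothetical sample of an emoji font's coverage (an assumption for
# illustration): SPACE, ASCII digits, HEAVY BLACK HEART, GRINNING FACE.
SAMPLE_COVERAGE = {0x20, *range(0x30, 0x3A), 0x2764, 0x1F600}

if __name__ == "__main__":
    for cp in sorted(SAMPLE_COVERAGE):
        category = unicodedata.category(chr(cp))
        keep = wants_emoji_presentation(cp, SAMPLE_COVERAGE)
        print(f"U+{cp:04X} {category} emoji_presentation={keep}")
```

Running a check like this over the full set of code points in the emoji font, and reviewing the results by hand, would be one way to do the further study the rule needs.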

Labels: a-content (Parsing and rendering Zulip HTML content, notably message contents)