Document intended emoji shortcode parsing behavior for clients

There is a lot of nuance in how custom emoji shortcodes are parsed, and as far as I can tell none of it is documented anywhere. As a result, I don’t think most clients behave consistently here, and as a client dev I have no idea what the intended behavior is, if any. The API docs should specify how shortcodes are intended to be parsed.

For example, say we have a custom emoji with the shortcode `:blobcat:`…
- Should shortcodes inside various HTML tags be replaced? My gut would say shortcodes inside formatting tags like `<em>:blobcat:</em>` should be replaced, but what about `<a href="example.com">this is a link :blobcat:</a>`? Or `<code>:blobcat:</code>`? Standard HTML behavior would suggest all instances of the shortcode *should* be replaced unless they’re escaped somehow, but Mastodon doesn’t have a documented way of escaping these shortcodes, does it?
- If a shortcode spans across a formatting change, say the first half is italicized and the second half is not, should it be replaced with the custom emoji? (E.g. the HTML might look something like `<em>:blob</em>cat:`) The docs say the shortcodes are “plain text shortcodes”, but that doesn’t indicate whether clients should match against the plain text form before or after parsing HTML tags.
- As noted in mastodon/mastodon#7364, there are some arcane rules about replacing multiple consecutive shortcodes to prevent things like IPv6 addresses from being unintentionally replaced with custom emoji. Client developers need to know what the intended behavior is here!
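
To make the questions above concrete, here is a minimal Python sketch of one *possible* interpretation, not the documented or intended Mastodon behavior. It assumes shortcodes are matched per text node after the HTML is parsed, that text inside `<code>` and `<pre>` is left alone, and that a shortcode is `:name:` with letters, digits, and underscores. The `replace_shortcodes` function, the `ShortcodeReplacer` class, the regex, and the emoji URL are all hypothetical. It does not attempt the consecutive-shortcode rules discussed in mastodon/mastodon#7364.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumption: a shortcode is ":name:" made of letters, digits, and underscores.
SHORTCODE = re.compile(r":([A-Za-z0-9_]+):")


class ShortcodeReplacer(HTMLParser):
    """Replaces shortcodes one text node at a time, skipping SKIP_TAGS."""

    SKIP_TAGS = {"code", "pre"}  # assumption: treat code-like tags as literal

    def __init__(self, emojis):
        super().__init__(convert_charrefs=True)
        self.emojis = emojis  # shortcode name -> image URL
        self.out = []
        self.skip_depth = 0

    def _open_tag(self, tag, attrs):
        # Re-serialize the tag with its attributes escaped.
        rendered = "".join(f' {k}="{escape(v or "")}"' for k, v in attrs)
        return f"<{tag}{rendered}>"

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        self.out.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> get no separate end tag.
        self.out.append(self._open_tag(tag, attrs))

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        text = escape(data, quote=False)
        if self.skip_depth == 0:
            text = SHORTCODE.sub(self._emoji_img, text)
        self.out.append(text)

    def _emoji_img(self, match):
        url = self.emojis.get(match.group(1))
        if url is None:
            return match.group(0)  # unknown shortcode: leave the text as-is
        return f'<img class="emoji" alt="{match.group(0)}" src="{escape(url)}">'


def replace_shortcodes(html, emojis):
    parser = ShortcodeReplacer(emojis)
    parser.feed(html)
    parser.close()  # flush any trailing text node
    return "".join(parser.out)


emojis = {"blobcat": "https://example.com/blobcat.png"}
for sample in (
    "<em>:blobcat:</em>",
    '<a href="example.com">this is a link :blobcat:</a>',
    "<code>:blobcat:</code>",
    "<em>:blob</em>cat:",
):
    print(replace_shortcodes(sample, emojis))
```

Under these assumptions the `<em>` and `<a>` cases are replaced, the `<code>` case is left as text, and `<em>:blob</em>cat:` is never matched because the shortcode is split across two text nodes. A client that strips tags first and matches against the flattened plain text would treat that last case differently, which is exactly the ambiguity the docs need to resolve.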

Document intended emoji shortcode parsing behavior for clients #1850
