fix(telegram): escape stray * and _ instead of stripping (URLs with _ get mangled) #2338

Open
chiptoe-svg wants to merge 1 commit into nanocoai:channels from chiptoe-svg:fix/telegram-underscore-mangling

@chiptoe-svg

Summary

When an outbound message has an odd count of * or _, the legacy-Markdown sanitizer in src/channels/telegram-markdown-sanitize.ts removes every occurrence of those characters to make Telegram's parser happy. This silently corrupts URLs whose path contains an underscore.

Concrete repro (caught in the wild):

  • Agent posts http://host/group_name/page/
  • Outgoing text has 1 underscore (odd) → sanitizer strips it
  • Telegram delivers http://host/groupname/page/
  • User clicks the link, gets a 404, blames the agent
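
The repro can be checked in isolation with a minimal sketch of the pre-fix branch. The function name `stripOddDelimiters` is hypothetical, and the counting is assumed to be a plain character count; only the condition and the `replace` call are taken from the lines this PR removes.

```typescript
// Hypothetical stand-in for the pre-fix branch of the sanitizer.
// Assumption: delimiters are counted as raw characters in the text.
function stripOddDelimiters(text: string): string {
  const starCount = (text.match(/\*/g) || []).length;
  const underCount = (text.match(/_/g) || []).length;
  if (starCount % 2 !== 0 || underCount % 2 !== 0) {
    // Old behaviour: drop every * and _ when either count is odd.
    return text.replace(/[*_]/g, '');
  }
  return text;
}

// One underscore (odd), so it is stripped and the path changes.
console.log(stripOddDelimiters('See http://host/group_name/page/'));
// prints "See http://host/groupname/page/"
```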

This bites any deployment whose group folder convention uses underscores (student_01, team_alpha, etc.) — exactly the shape /add-classroom already encourages.

Fix

Backslash-escape stray * / _ instead of stripping them. Telegram's legacy Markdown renders \_ and \* as literal characters, so URLs survive verbatim. Even-balanced messages still pass through untouched, so legitimate _italic_ and *bold* rendering is preserved.

- if (starCount % 2 !== 0 || underCount % 2 !== 0) {
-   text = text.replace(/[*_]/g, '');
- }
+ if (starCount % 2 !== 0) text = text.replace(/\*/g, '\\*');
+ if (underCount % 2 !== 0) text = text.replace(/_/g, '\\_');
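
For trying the new branch outside the repo, here is a minimal standalone sketch. The wrapper function `escapeOddDelimiters` and the counting code are assumptions for illustration; the two `replace` calls mirror the added lines above. The real sanitizer also handles code spans, brackets, and bullets, none of which is modelled here.

```typescript
// Hypothetical wrapper around the two lines added in this PR.
// Assumption: delimiters are counted as raw characters in the text.
function escapeOddDelimiters(text: string): string {
  const starCount = (text.match(/\*/g) || []).length;
  const underCount = (text.match(/_/g) || []).length;
  // Each delimiter kind is handled independently.
  if (starCount % 2 !== 0) text = text.replace(/\*/g, '\\*');
  if (underCount % 2 !== 0) text = text.replace(/_/g, '\\_');
  return text;
}

// Odd underscore count: the underscore is escaped, not removed.
console.log(escapeOddDelimiters('http://host/group_name/page/'));
// prints "http://host/group\_name/page/"

// Balanced delimiters pass through unchanged.
console.log(escapeOddDelimiters('a *b* c _d_ e'));
// prints "a *b* c _d_ e"
```

One consequence of splitting the condition in two: a stray `*` no longer affects balanced `_italic_` pairs in the same message, and vice versa.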

Three-line code change in the sanitizer plus tests:

  • Two existing tests (strips formatting chars on odd delimiter count …) are renamed to escapes formatting chars … and updated with the new expected output.
  • One new regression test for the URL-with-underscore case.

Test plan

  • pnpm exec vitest run src/channels/telegram-markdown-sanitize.test.ts — 16/16 pass.
  • Even-balanced inputs ('a *b* c _d_ e', links like [docs](https://example.com)) still untouched.
  • Code-block handling (```...``` and `inline`) still untouched — those segments are placeholdered before delimiter counting and restored after.
  • Pre-existing tests for bracket balancing, list-bullet rewriting, horizontal-rule flattening, code preservation all pass unchanged.
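
The placeholder step mentioned in the test plan can be sketched roughly as follows. This is an assumption about the general shape, not the code in src/channels/telegram-markdown-sanitize.ts: the function name `sanitizeOutsideCode`, the NUL-delimited placeholder token, and the code-span regex are all hypothetical.

```typescript
// Hypothetical sketch: protect code spans, escape odd delimiters, restore.
function sanitizeOutsideCode(text: string): string {
  const saved: string[] = [];
  // Match fenced blocks first, then single-line inline code spans.
  const codePattern = /```[\s\S]*?```|`[^`\n]*`/g;
  let out = text.replace(codePattern, (segment) => {
    saved.push(segment);
    // The token contains no * or _, so it cannot skew the counts.
    return `\u0000${saved.length - 1}\u0000`;
  });

  // Count and escape only what is left outside code.
  const starCount = (out.match(/\*/g) || []).length;
  const underCount = (out.match(/_/g) || []).length;
  if (starCount % 2 !== 0) out = out.replace(/\*/g, '\\*');
  if (underCount % 2 !== 0) out = out.replace(/_/g, '\\_');

  // Put the original code segments back verbatim.
  return out.replace(/\u0000(\d+)\u0000/g, (_m, index: string) => saved[Number(index)]);
}

// The underscore inside the inline code span is neither counted nor escaped.
console.log(sanitizeOutsideCode('run `my_cmd` on a_b'));
// prints "run `my_cmd` on a\_b"
```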

Risk

Behaviour change is small and intentional: messages that previously had stray formatting chars silently removed will now show those chars literally (with a leading backslash). For human-readable text the result is more honest (you see the * or _ you wrote). For URLs the result is correct (they work). I don't think this breaks any existing user expectation, but flagging in case a maintainer disagrees.

🤖 Generated with Claude Code

Bug: when an outbound message had an odd count of `*` or `_`, the legacy-
Markdown sanitizer dropped EVERY occurrence of those chars to keep
Telegram's parser happy. That silently mangled URLs whose path contained
an underscore — e.g. `http://host/group_name/page/` became
`http://host/groupname/page/` after sanitize, and the user got a 404 on
a link they couldn't have typoed (they clicked it from the message).

This bites anyone whose group folder convention uses underscores
(`student_01`, `team_alpha`, etc.) the moment an agent posts a URL into
that group's filespace — exactly the convention the classroom skill
already encourages.

Fix: backslash-escape stray `*` / `_` instead of dropping them.
Telegram's legacy Markdown renders `\_` as a literal underscore, so URLs
survive verbatim. Even-balanced messages still pass through untouched,
so legitimate `_italic_` and `*bold*` rendering is preserved.

Updates the two tests that asserted the strip behaviour to assert the
escape behaviour, and adds a regression test for the URL-with-underscore
case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>