fix(telegram): escape stray * and _ instead of stripping (URLs with _ get mangled) #2338
Open
chiptoe-svg wants to merge 1 commit into
Bug: when an outbound message had an odd count of `*` or `_`, the legacy-Markdown sanitizer dropped EVERY occurrence of those chars to keep Telegram's parser happy. That silently mangled URLs whose path contained an underscore: for example, `http://host/group_name/page/` became `http://host/groupname/page/` after sanitize, and the user got a 404 on a link they couldn't have typoed (they clicked it from the message).

This bites anyone whose group folder convention uses underscores (`student_01`, `team_alpha`, etc.) the moment an agent posts a URL into that group's filespace, which is exactly the convention the classroom skill already encourages.

Fix: backslash-escape stray `*` / `_` instead of dropping them. Telegram's legacy Markdown renders `\_` as a literal underscore, so URLs survive verbatim. Even-balanced messages still pass through untouched, so legitimate `_italic_` and `*bold*` rendering is preserved.

Updates the two tests that asserted the strip behaviour to assert the escape behaviour, and adds a regression test for the URL-with-underscore case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
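A minimal sketch of the escape-instead-of-strip behaviour described in the commit message. This is an illustration only, not the code in the sanitizer: the function name `escapeStrayDelimiters` is hypothetical, and it assumes the input contains no already-escaped delimiters and no code segments.

```typescript
// Hypothetical sketch: not the actual sanitizer implementation.
// Assumes no pre-escaped delimiters and no code spans in the input.
function escapeStrayDelimiters(text: string): string {
  let out = text;
  for (const ch of ["*", "_"]) {
    // Number of occurrences of the delimiter in the message.
    const count = out.split(ch).length - 1;
    if (count % 2 === 1) {
      // Odd count: escape every occurrence instead of removing it.
      out = out.split(ch).join("\\" + ch);
    }
  }
  return out;
}

// One underscore (odd count): escaped, not dropped.
console.log(escapeStrayDelimiters("http://host/group_name/page/"));
// Balanced delimiters: passed through unchanged.
console.log(escapeStrayDelimiters("a *b* c _d_ e"));
```

Under these assumptions the URL keeps its underscore as `\_`, while the balanced message is returned untouched.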
This was referenced May 8, 2026
Summary
When an outbound message has an odd count of `*` or `_`, the legacy-Markdown sanitizer in `src/channels/telegram-markdown-sanitize.ts` removes every occurrence of those characters to make Telegram's parser happy. This silently corrupts URLs whose path contains an underscore.

Concrete repro (caught in the wild): `http://host/group_name/page/` became `http://host/groupname/page/`.

This bites any deployment whose group folder convention uses underscores (`student_01`, `team_alpha`, etc.), which is exactly the shape `/add-classroom` already encourages.

Fix
Backslash-escape stray `*` / `_` instead of stripping them. Telegram's legacy Markdown renders `\_` and `\*` as literal characters, so URLs survive verbatim. Even-balanced messages still pass through untouched, so legitimate `_italic_` and `*bold*` rendering is preserved.

Three-line code change in the sanitizer plus tests:

- Existing tests (`strips formatting chars on odd delimiter count …`) updated to `escapes formatting chars …` with the new expected output.

Test plan
- `pnpm exec vitest run src/channels/telegram-markdown-sanitize.test.ts`: 16/16 pass.
- Even-balanced input (`'a *b* c _d_ e'`, links like `[docs](https://example.com)`) still untouched.
- Code segments (```...``` and `inline`) still untouched: those segments are placeholdered before delimiter counting and restored after.

Risk
Behaviour change is small and intentional: messages that previously had stray formatting chars silently removed will now show those chars literally (with a leading backslash). For human-readable text the result is more honest (you see the `*` or `_` you wrote). For URLs the result is correct (they work). I don't think this breaks any existing user expectation, but flagging in case a maintainer disagrees.

🤖 Generated with Claude Code
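The test plan notes that code segments are placeholdered before delimiter counting and restored afterwards. A rough sketch of that idea follows; the names (`sanitizeOutsideCode`, `escapeOddDelimiters`), the NUL-based placeholder format, and the regular expression are all assumptions made for illustration and may differ from what the sanitizer actually does.

```typescript
// Hypothetical sketch: names and placeholder format are assumptions,
// not the code in src/channels/telegram-markdown-sanitize.ts.

const FENCE = "`".repeat(3);
// Matches fenced blocks first, then single-backtick inline spans.
const CODE_SEGMENT = new RegExp(
  FENCE + "[\\s\\S]*?" + FENCE + "|`[^`\\n]*`",
  "g",
);

// Escape every `*` or `_` when that character occurs an odd number of times.
function escapeOddDelimiters(text: string): string {
  let out = text;
  for (const ch of ["*", "_"]) {
    const count = out.split(ch).length - 1;
    if (count % 2 === 1) out = out.split(ch).join("\\" + ch);
  }
  return out;
}

function sanitizeOutsideCode(text: string): string {
  const segments: string[] = [];
  // 1. Swap each code segment for an index placeholder so its
  //    contents are not counted or escaped.
  const masked = text.replace(CODE_SEGMENT, (match) => {
    segments.push(match);
    return "\u0000" + (segments.length - 1) + "\u0000";
  });
  // 2. Count and escape delimiters in the remaining text only.
  const escaped = escapeOddDelimiters(masked);
  // 3. Put the original code segments back verbatim.
  return escaped.replace(/\u0000(\d+)\u0000/g, (_m, i: string) => segments[Number(i)]);
}

console.log(sanitizeOutsideCode("run `a_b` on x_y"));
```

In this sketch the underscore inside the inline code span is neither counted nor escaped, so only the stray underscore in `x_y` gains a backslash.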