Skip to content

fix: remove HTML entity unescaping from tool content processing#12265

Draft
roomote-v0[bot] wants to merge 1 commit intomainfrom
fix/remove-html-unescaping-from-tools
Draft

fix: remove HTML entity unescaping from tool content processing#12265
roomote-v0[bot] wants to merge 1 commit intomainfrom
fix/remove-html-unescaping-from-tools

Conversation

@roomote-v0
Copy link
Copy Markdown
Contributor

@roomote-v0 roomote-v0 Bot commented May 3, 2026

Related GitHub Issue

Closes: #12264

Description

This PR attempts to address Issue #12264. Feedback and guidance are welcome.

Root cause: The unescapeHtmlEntities() function was called unconditionally on tool content for non-Claude models in ApplyDiffTool.ts, WriteToFileTool.ts, and ExecuteCommandTool.ts. This was originally added to handle XML-based tool calls where HTML entities were encoding artifacts. However, since Roo Code now exclusively uses native JSON tool calls, HTML entities in parsed content are intentional literal strings (e.g., Go code doing HTML escaping like strings.ReplaceAll(htmlText, "&", "&")) rather than encoding artifacts.

Changes:

  • Removed unescapeHtmlEntities() call from ApplyDiffTool.ts
  • Removed unescapeHtmlEntities() call from WriteToFileTool.ts
  • Removed unescapeHtmlEntities() call from ExecuteCommandTool.ts
  • Updated tests: removed obsolete HTML unescaping tests, added a test verifying HTML entities are preserved in content
  • Kept unescapeHtmlEntities available in text-normalization.ts (no deletion) since it may still be useful for other contexts

Test Procedure

  • Ran npx vitest run for all three affected test files (executeCommandTool.spec.ts, writeToFileTool.spec.ts, text-normalization.spec.ts) -- all 49 tests pass
  • Full lint and type-check passes via pre-push hooks (turbo lint and turbo check-types)
  • To manually verify: use apply_diff or write_to_file with code containing HTML entities (e.g., &, <, >) and confirm they are preserved as-is

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes.
  • Documentation Impact: No documentation updates needed -- this is a bug fix removing incorrect behavior.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Documentation Updates

No documentation updates needed. This is a bug fix that removes incorrect HTML entity unescaping from tool content processing.

Interactively review PR in Roo Code Cloud

Removes unescapeHtmlEntities() calls from ApplyDiffTool, WriteToFileTool,
and ExecuteCommandTool. Since all tool calls now use native JSON parsing,
HTML entities in parsed content are intentional (e.g. Go code doing HTML
escaping) rather than encoding artifacts. The lossy unescaping was causing
apply_diff failures and file content corruption when working with code
that legitimately contains HTML entities.

Fixes #12264
@rkfg
Copy link
Copy Markdown

rkfg commented May 3, 2026

Works very well now, apply_diff executed correctly on the first try extracting duplicated code with HTML entities!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] apply_diff uses too aggressive HTML unescaping preventing searching and changing some code

2 participants