[BUG] apply_diff uses too aggressive HTML unescaping preventing searching and changing some code #12264

@rkfg

Problem (one or two sentences)

Sequences like &amp;, &lt;, &gt; are transparently replaced with &, <, >, both in the search and in the replace part of the diff. Because of that, changing or writing HTML-escaping code becomes impossible.

Context (who is affected and when)

Simple HTML escaping in Go becomes impossible to write: strings.ReplaceAll(htmlText, "&", "&amp;") becomes strings.ReplaceAll(htmlText, "&", "&"), which is a no-op.

Reproduction steps

Here's the tool use stanza:

{
        "type": "tool_use",
        "id": "E2nCKFzolIQ5bNGNWFmwc9b7aSZdQmgI",
        "name": "apply_diff",
        "input": {
          "path": "search/search.go",
          "diff": "<<<<<<< SEARCH\n:start_line:336\n-------\n\tif r.Text != \"\" {\n\t\ttruncated := truncate(r.Text, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n%s\\n\", truncated)\n\t\t// Escape HTML special chars\n\t\thtmlText := truncated\n\t\thtmlText = strings.ReplaceAll(htmlText, \"&\", \"&amp;\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \"<\", \"&lt;\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \">\", \"&gt;\")\n\t\tfmt.Fprintf(&htmlLine, \"<br>%s<br>\", htmlText)\n\t}\n\tif r.ImageDesc != \"\" {\n\t\ttruncated := truncate(r.ImageDesc, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n\\U0001f574\\ufe0f %s\", truncated)\n\t\tfmt.Fprintf(&htmlLine, \"<br>\\U0001f574\\ufe0f %s\", truncated)\n\t}\n=======\n\tif r.Text != \"\" {\n\t\ttruncated := truncate(r.Text, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n%s\\n\", truncated)\n\t\t// Escape HTML special chars and convert newlines to <br>\n\t\thtmlText := truncated\n\t\thtmlText = strings.ReplaceAll(htmlText, \"&\", \"&amp;\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \"<\", \"&lt;\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \">\", \"&gt;\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \"\\n\", \"<br>\")\n\t\tfmt.Fprintf(&htmlLine, \"<br>%s<br>\", htmlText)\n\t}\n\tif r.ImageDesc != \"\" {\n\t\ttruncated := truncate(r.ImageDesc, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n\\U0001f574\\ufe0f %s\", truncated)\n\t\t// Escape HTML special chars and convert newlines to <br>\n\t\thtmlDesc := truncated\n\t\thtmlDesc = strings.ReplaceAll(htmlDesc, \"&\", \"&amp;\")\n\t\thtmlDesc = strings.ReplaceAll(htmlDesc, \"<\", \"&lt;\")\n\t\thtmlDesc = strings.ReplaceAll(htmlDesc, \">\", \"&gt;\")\n\t\thtmlDesc = strings.ReplaceAll(htmlDesc, \"\\n\", \"<br>\")\n\t\tfmt.Fprintf(&htmlLine, \"<br>\\U0001f574\\ufe0f %s\", htmlDesc)\n\t}\n>>>>>>> REPLACE"
        }
      }

The model's request is correct here: JSON string decoding does not touch HTML entities, so &amp;, &lt; and &gt; should reach the tool literally. However, Roo errors out with:

{
        "type": "tool_result",
        "tool_use_id": "E2nCKFzolIQ5bNGNWFmwc9b7aSZdQmgI",
        "content": "<error_details>\nNo sufficiently similar match found at line: 336 (97% similar, needs 100%)\n\nDebug Info:\n- Similarity Score: 97%\n- Required Threshold: 100%\n- Search Range: starting at line 336\n- Tried both standard and aggressive line number stripping\n- Tip: Use the read_file tool to get the latest content of the file before attempting to use the apply_diff tool again, as the file content may have changed\n\nSearch Content:\n\tif r.Text != \"\" {\n\t\ttruncated := truncate(r.Text, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n%s\\n\", truncated)\n\t\t// Escape HTML special chars\n\t\thtmlText := truncated\n\t\thtmlText = strings.ReplaceAll(htmlText, \"&\", \"&\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \"<\", \"<\")\n\t\thtmlText = strings.ReplaceAll(htmlText, \">\", \">\")\n\t\tfmt.Fprintf(&htmlLine, \"<br>%s<br>\", htmlText)\n\t}\n\tif r.ImageDesc != \"\" {\n\t\ttruncated := truncate(r.ImageDesc, 200)\n\t\tfmt.Fprintf(&textLine, \"\\n\\U0001f574\\ufe0f %s\", truncated)\n\t\tfmt.Fprintf(&htmlLine, \"<br>\\U0001f574\\ufe0f %s\", truncated)\n\t}\n\nBest Match Found:\n336 | \tif r.Text != \"\" {\n337 | \t\ttruncated := truncate(r.Text, 200)\n338 | \t\tfmt.Fprintf(&textLine, \"\\n%s\\n\", truncated)\n339 | \t\t// Escape HTML special chars\n340 | \t\thtmlText := truncated\n341 | \t\thtmlText = strings.ReplaceAll(htmlText, \"&\", \"&amp;\")\n342 | \t\thtmlText = strings.ReplaceAll(htmlText, \"<\", \"&lt;\")\n343 | \t\thtmlText = strings.ReplaceAll(htmlText, \">\", \"&gt;\")\n344 | \t\tfmt.Fprintf(&htmlLine, \"<br>%s<br>\", htmlText)\n345 | \t}\n346 | \tif r.ImageDesc != \"\" {\n347 | \t\ttruncated := truncate(r.ImageDesc, 200)

Clearly, Roo unescaped the entities in the request and therefore couldn't find a match. The same happens when the model patches a file or writes one from scratch: in both cases HTML entities are unescaped when they shouldn't be.

Expected result

apply_diff matches the search block literally, with HTML entities left intact, and applies the change.

Actual result

No match is found and the invocation fails.

Variations tried (optional)

After that, the model tried rewriting the whole file, and the bug manifested itself there as well: the entities came out unescaped.

[Image] Instead of adding just one line, it mangled the other lines too, and I now have to change them back manually.

App Version

Version: 3.53.0 (ad25634) (plus a couple of other apply_diff related PRs from here)

API Provider (optional)

OpenAI Compatible

Model Used (optional)

Qwen 3.6 35B APEX

Roo Code Task Links (optional)

No response

Relevant logs or errors (optional)

Labels

bug (Something isn't working)
