
fix: strip images and badges from README before passing to Claude#13

Draft
LeoRoccoBreedt wants to merge 3 commits into main from fix-remove-images-from-readme

LeoRoccoBreedt commented May 22, 2026

Resolves #11

Summary

  • README content fetched from GitHub often contains Markdown image tags
    (![alt](url)) and HTML <img> tags for badges and screenshots.
    Claude receives these as plain text URLs it cannot fetch, adding noise
    with no value.
  • fetch_readme now strips all Markdown images, HTML img tags, empty
    link remnants left by badge stripping, and collapses excess blank
    lines before truncating to 3000 chars.
  • Removing the images and badges also stops them from being logged to
    Opik, where they are not needed.
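
For reviewers who want a concrete picture of the stripping steps listed above, here is a minimal sketch. It is not the code from this PR: the function name `clean_readme`, the constant names, and the exact regular expressions are assumptions made for illustration. Only the order of the steps and the 3000-character limit come from the summary.

```python
import re

# Hypothetical names and patterns; the PR's actual implementation may differ.
MAX_README_CHARS = 3000  # truncation limit stated in the summary

# Markdown image: ![alt](url)
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# HTML image tag: <img ...> or <img ... />
HTML_IMG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Empty link left behind when a linked badge loses its image: [](url)
EMPTY_LINK = re.compile(r"\[\s*\]\([^)]*\)")
# Three or more consecutive newlines (more than one blank line)
EXTRA_BLANKS = re.compile(r"\n{3,}")


def clean_readme(text: str, limit: int = MAX_README_CHARS) -> str:
    """Strip image markup, tidy blank lines, then truncate."""
    text = MD_IMAGE.sub("", text)
    text = HTML_IMG.sub("", text)
    text = EMPTY_LINK.sub("", text)
    text = EXTRA_BLANKS.sub("\n\n", text)
    return text.strip()[:limit]


sample = (
    "# Title\n\n"
    "[![build](https://example.com/badge.svg)](https://example.com/ci)\n\n\n\n"
    'Some text <img src="a.png" alt="a"> here.\n'
)
print(clean_readme(sample))
```

Stripping the image first and the empty link second matters in this sketch: a linked badge such as `[![build](...)](...)` only becomes an empty `[](...)` remnant once its inner image is gone.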

Test plan

  • Run Scout against a repo whose README contains badge images and
    confirm the Opik trace shows the repo context block with no
    ![...](...) or <img> content
  • Run Scout against a repo with no images and confirm output is
    unchanged
  • All 38 unit tests pass
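
The first check in the test plan could also be automated along the following lines. This is a sketch under assumptions: the helper name `find_image_markup` and the pattern are hypothetical and are not part of the PR or its test suite. It only illustrates scanning a repo context string for leftover Markdown or HTML image markup.

```python
import re
from typing import List

# Hypothetical helper; matches ![alt](url) or <img ...> anywhere in the text.
IMAGE_MARKUP = re.compile(r"!\[[^\]]*\]\([^)]*\)|<img\b[^>]*>", re.IGNORECASE)


def find_image_markup(context: str) -> List[str]:
    """Return every image-like fragment still present in the context."""
    return IMAGE_MARKUP.findall(context)


# An empty result means the context is free of image markup.
leftovers = find_image_markup("Install with pip.\n\nUsage notes follow.")
print(leftovers)
```

An empty list for the repo context block would correspond to the expected result in the Opik trace.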

Markdown image tags and HTML img elements in READMEs are passed as dead
URL strings that Claude cannot fetch or render. Strip them at fetch time
to reduce noise in the repo context sent to the model and logged to Opik.

Development

Successfully merging this pull request may close these issues.

Remove image content from repo context being uploaded to Opik
