fix(tools): read_url third-party disclosure + cache opt-out by Teerapat-Vatpitak · Pull Request #122 · HKUDS/Vibe-Trading

Teerapat-Vatpitak · 2026-05-16T05:00:22Z

Summary

Disclose the read_url third-party dependency (r.jina.ai) in the skill doc, add a no_cache opt-out, and sanitize upstream errors so the vendor name and response body are not leaked to the caller (kept in server logs).

Why

read_url silently routed URLs through a third party and could return stale cached snapshots with no disclosure or opt-out.

Changes

src/skills/web-reader/SKILL.md, src/tools/web_reader_tool.py

Test Plan

pytest --ignore=agent/tests/e2e_backtest -q green; requests mocked
Asserts default path byte-identical + opt-out header behavior

Checklist

No protected-area changes
No hardcoded values
Follows CONTRIBUTING.md

Follow-up: anchor the cached-marker detection to the header region; verify the bypass header against current Jina docs.

read_url silently sent the full URL (incl. query string) to the third-party Jina Reader (r.jina.ai), could serve a stale cache with no opt-out, and HTTP/exception errors leaked the vendor name and response body (P11). Disclose the dependency, caching/staleness, and the bash-fallback caveat in the web-reader SKILL.md; add an opt-in no_cache flag (x-no-cache header — default path byte-identical); surface a stale snapshot via an additive `cached: true`; sanitize HTTP/exception errors to a generic message (detail kept in the server log only). A local direct-fetch fallback needs SSRF hardening first and is a deliberate out-of-scope follow-up. (cherry picked from commit eaf91128370dd9800195961a55baf5b123162012)

Restore the response body / exception text in read_url error returns so operators can diagnose a remote failure (HTTP 451, transport errors, etc.) without grepping logs. Keeps the new logger.warning lines from the parent commit intact — both paths now surface debug info. Adjusts the two affected regression tests to assert the debug info is included rather than scrubbed.

warren618 · 2026-05-16T06:57:18Z

Merged — thanks.

One tweak on top before squash: rolled back the HTTP/exc error sanitization in web_reader_tool.py (kept your logger.warning). Vibe-Trading runs as a local stdio MCP, so the caller is the user's own agent — scrubbing the body / exc text just removes debug signal. Tests updated to match.

) - SKILL.md now explicitly notes that read_url forwards the full URL to the third-party Jina Reader (r.jina.ai), and warns against passing credentials or internal URLs. - New optional no_cache=True parameter sends x-no-cache to bypass Jina's cache when freshness matters. - Responses that contain Jina's cached-snapshot marker now surface a cached: true flag. - Error paths keep the upstream HTTP body and exception text in the returned error string (and also log them via logger.warning) so operators can diagnose remote failures without grepping logs.

Teerapat-Vatpitak and others added 2 commits May 16, 2026 10:53

warren618 marked this pull request as ready for review May 16, 2026 06:50

warren618 merged commit 1b547d2 into HKUDS:main May 16, 2026
1 check passed

Teerapat-Vatpitak deleted the pr/readurl-privacy branch May 17, 2026 17:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(tools): read_url third-party disclosure + cache opt-out#122

fix(tools): read_url third-party disclosure + cache opt-out#122
warren618 merged 2 commits into
HKUDS:mainfrom
Teerapat-Vatpitak:pr/readurl-privacy

Teerapat-Vatpitak commented May 16, 2026

Uh oh!

Uh oh!

warren618 commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Teerapat-Vatpitak commented May 16, 2026

Summary

Why

Changes

Test Plan

Checklist

Uh oh!

Uh oh!

warren618 commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants