fix(managed_agents): Fix console image, missing fixture, and summary blurb #526
…mage
- registry.yaml: rewrite the SRE incident responder description with the on-call-flow framing.
- sre_incident_responder.ipynb: replace the attachment: image ref with a self-closed <img> pointing at the raw.githubusercontent URL (matches data_analyst_agent / slack_data_bot). The platform's MDX renderer doesn't resolve attachment: URIs, so the screenshot was rendering as a broken image. This also drops the 280KB base64 blob from the notebook.
…kbook
The SRE incident responder notebook mounts logs/checkout-svc.log as a session resource, but the fixture didn't make it into the public repo during the initial managed_agents import (the root .gitignore has *.log). Force-add the file so readers can run the notebook end-to-end.
Notebook Changes
This PR modifies the following notebooks: 📓
PR Review
Recommendation: APPROVE
Summary
Three well-scoped housekeeping fixes: restores a missing fixture file, replaces a broken attachment: image URI with an external URL that actually renders (consistent with existing patterns in data_analyst_agent and slack_data_bot), and sharpens the registry description with outcome-oriented language.
Actionable Feedback (1 item)
- .gitignore: Consider adding a negation rule to make the forced fixture exception explicit and prevent a repeat regression:

    *.log
    !managed_agents/example_data/**/*.log

  Without it, a future git add -A or pre-commit cleanup tool could silently drop this fixture again. Non-blocking, but it self-documents the intent for future contributors.
Detailed Review
Log fixture (managed_agents/example_data/sre/logs/checkout-svc.log)
Realistic and pedagogically sound. The OOM crash loop arc is coherent: startup → pricing cache warm → gradual heap pressure → GC pause → allocation failure → OOMKill (exit 137) → restart loop → CrashLoopBackOff. Internal consistency checks out — heap percentages match raw MiB values (101/128 = 79%, 118/128 = 92%), restartCount increments correctly (1→2→3→4), and exit 137 is the canonical OOMKill code. Timestamps span ~6 minutes and are internally consistent.
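The consistency checks above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not part of the PR: the helper names are hypothetical, the 128 MiB limit is taken from the ratios quoted in the review, and the "exit code minus 128" convention for signal-terminated processes is the usual shell and container-runtime behavior on Linux.

```python
from typing import Optional

SIGKILL = 9  # signal number on Linux; the kernel OOM killer sends SIGKILL


def heap_percent(used_mib: int, limit_mib: int) -> int:
    """Return heap usage as a whole-number percentage of the limit."""
    return round(100 * used_mib / limit_mib)


def signal_from_exit_code(exit_code: int) -> Optional[int]:
    """Decode the terminating signal from an exit code above 128.

    Returns None when the code does not follow the 128 + signal convention.
    """
    return exit_code - 128 if exit_code > 128 else None


# Values quoted in the review: 101/128 MiB and 118/128 MiB, exit code 137.
print(heap_percent(101, 128))                 # 79
print(heap_percent(118, 128))                 # 92
print(signal_from_exit_code(137) == SIGKILL)  # True
```

Both percentages round to the figures logged in the fixture, and 137 decodes to signal 9, which is why it reads as an OOMKill.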
The .gitignore has *.log, so this was correctly force-added. The fixture is now tracked. The only concern is the lack of a gitignore negation rule (see above).
Image reference (sre_incident_responder.ipynb)
Using raw.githubusercontent.com is the correct fix and matches the exact pattern used by data_analyst_agent.ipynb and slack_data_bot.ipynb (same URL structure, same width="700" attribute). The attachment: URI approach was always broken on the platform's MDX renderer. Dropping the 280KB base64 blob also improves notebook file size. console_session.png is tracked in git at the referenced path. The dependency on the path remaining stable is a real but accepted tradeoff in this codebase.
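For readers who want to apply the same fix to other notebooks, the swap can be sketched as a small script over the notebook JSON. This is an illustration under stated assumptions rather than the change made in this PR: the function name, the `url_for` callback, and the placeholder host in the usage lines are hypothetical, and the regex only covers the plain Markdown form `![alt](attachment:name)`.

```python
import re
from typing import Callable

# Matches Markdown images that point at a cell attachment: ![alt](attachment:name)
ATTACHMENT_IMG = re.compile(r"!\[([^\]]*)\]\(attachment:([^)\s]+)\)")


def replace_attachment_refs(
    notebook: dict, url_for: Callable[[str], str], width: int = 700
) -> int:
    """Rewrite attachment: image refs in markdown cells as self-closed <img> tags.

    `url_for` maps an attachment file name to the external URL to use.
    Mutates `notebook` in place and returns the number of refs replaced.
    """
    replaced = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        was_list = isinstance(source, list)
        text = "".join(source) if was_list else source

        def to_img(match: "re.Match[str]") -> str:
            alt, name = match.group(1), match.group(2)
            return f'<img src="{url_for(name)}" alt="{alt}" width="{width}"/>'

        new_text, count = ATTACHMENT_IMG.subn(to_img, text)
        if count:
            # Keep the original shape of "source" (list of lines or one string).
            cell["source"] = new_text.splitlines(keepends=True) if was_list else new_text
            # Drop the embedded base64 payload now that nothing references it.
            cell.pop("attachments", None)
            replaced += count
    return replaced


# Usage with an in-memory notebook and a placeholder (hypothetical) host.
nb = {
    "cells": [
        {
            "cell_type": "markdown",
            "source": ["Intro\n", "![console](attachment:console_session.png)\n"],
            "attachments": {"console_session.png": {"image/png": "AAAA"}},
        }
    ]
}
count = replace_attachment_refs(nb, lambda name: "https://example.invalid/" + name)
print(count, nb["cells"][0]["source"][1].strip())
```

Removing the `attachments` key is what drops the base64 blob; the sketch removes it for any cell it rewrites, so a cell that mixes rewritten and still-needed attachments would need extra care.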
Registry description (registry.yaml)
Clearly superior to the old version. Switches from implementation framing ("A webhook-triggered responder... with a custom Skill") to outcome framing ("Wire Claude into your on-call flow..."). Concise, action-oriented, and accurate. Date bump to 2026-04-10 reflects the actual fix date.
Security
No concerns. No API keys, no new dependencies, no user-facing input handling.
Positive Notes
- The log fixture is genuinely realistic — timestamp consistency, correct exit codes, and matching heap percentages show careful construction.
- The image fix correctly identifies the root cause (MDX renderer limitation) and applies the established repo pattern rather than inventing a new approach.
- Removing the 280KB base64 blob is a meaningful file-size improvement with no downside.
…se-agent-fixes
Title is self-explanatory =)