[Iris] Add babysitting MCP server #5042

Merged: dlwh merged 3 commits into main from agent/20260421-fix-5018 on Apr 22, 2026

dlwh (Member) commented Apr 22, 2026

Add a marin-mcp-babysitter entry point with resident Iris controller/log clients, structured job/task/log/worker/process/profile tools, Zephyr progress parsing, and diagnosis signals. Also documents MCP-assisted babysitting in the job monitoring skill after live smoke validation.

Part of #5018

dlwh added the agent-generated label (Created by automation/agent) Apr 22, 2026
claude (Bot, Contributor) commented Apr 22, 2026

Claude finished @dlwh's task in 8m 14s.


Code review

  • Check PR status and whether review is needed
  • Gather CLAUDE.md/AGENTS.md files relevant to changed files
  • Read PR changes and summarize
  • Run 4 parallel review agents (2 AGENTS.md compliance, 2 bug-finding)
  • Validate each flagged issue via subagents
  • Filter high-signal findings
  • Post inline comments and summary

Found 2 high-signal issues, posted as inline comments:

  1. _jobs_with_prefix uses raw startswith (inline comment): a tree rooted at /alice/train pulls string-prefix siblings such as /alice/train-v2/... into nodes, even though the later JobName.is_ancestor_of check correctly excludes them from children. Consumers iterating nodes directly see jobs that don't belong to the tree. Suggest filtering in _jobs_with_prefix (or before inserting into nodes) using the JobName hierarchy.

  2. --iris-token CLI flag leaks the bearer token via ps and /proc and bypasses the Iris token-store pattern (inline comment): every other Iris CLI goes through create_client_token_provider → load_token(cluster_name) / load_any_token() (from iris login). Recommend using the same lookup path here instead of a --token flag.
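
To illustrate the first finding, here is a minimal sketch of why a raw string prefix over-matches and how a component-wise comparison avoids it. The helper is_in_tree is hypothetical and uses pathlib as a stand-in; it is not the Iris JobName API, whose is_ancestor_of method is only known here by name from the review.

```python
from pathlib import PurePosixPath


def is_in_tree(root: str, candidate: str) -> bool:
    """Return True if candidate is root itself or a descendant of root.

    Hypothetical stand-in for a hierarchy-aware check: it compares whole
    path components, so "/alice/train-v2" is not treated as living under
    "/alice/train".
    """
    root_parts = PurePosixPath(root).parts
    candidate_parts = PurePosixPath(candidate).parts
    return candidate_parts[: len(root_parts)] == root_parts


# Illustrative job names modeled on the ones in the review comment.
jobs = [
    "/alice/train",
    "/alice/train/eval",
    "/alice/train-v2/eval",
]

# Raw string prefix: also matches the sibling tree /alice/train-v2.
raw_prefix = [job for job in jobs if job.startswith("/alice/train")]

# Component-wise check: keeps only the root and its real descendants.
by_component = [job for job in jobs if is_in_tree("/alice/train", job)]
```

In this sketch raw_prefix contains all three names, while by_component drops /alice/train-v2/eval, which is the filtering the review suggests applying before jobs are inserted into nodes.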

Investigated and dismissed (to save reviewer time): tautological-test framing of test_classify_diagnosis_* (it does assert behavior per input pattern, just doesn't cover severity/evidence/escalation_hint); multi-line parsing in zephyr_stage_progress (coordinator emits one progress record per line); theoretical orphaning in job_tree (no job-deletion path in the controller DB today); _s/_ms suffix complaints (rule only bans the bare _s abbreviation); _profile_type "duplication" (the iris version is inline in a Click command, not an extractable helper — no parallel implementation to violate).
· Branch: agent/20260421-fix-5018
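
For the second finding above, a rough sketch of the general idea: read the bearer token from a private file instead of accepting it on the command line, where it would show up in ps output and /proc/<pid>/cmdline. The function load_token_from_file and the file layout are assumptions made for illustration; this is not the Iris token store or its create_client_token_provider path.

```python
import stat
from pathlib import Path
from typing import Optional


def load_token_from_file(token_file: Path) -> Optional[str]:
    """Read a bearer token from a file rather than from argv.

    Hypothetical helper: returns None when the file is missing or empty,
    and refuses files that are accessible to group or other users.
    """
    if not token_file.is_file():
        return None

    # Only the permission bits matter here (e.g. 0o600).
    mode = stat.S_IMODE(token_file.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{token_file} must not be group/world accessible")

    token = token_file.read_text(encoding="utf-8").strip()
    return token or None
```

A server entry point could call a helper like this at startup, so the secret never appears in the process arguments. The permission check assumes POSIX file modes.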

dlwh mentioned this pull request Apr 22, 2026
4 tasks

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9824376a31

Comment thread lib/marin/src/marin/mcp/babysitter.py Outdated
Comment thread lib/marin/src/marin/mcp/babysitter.py Outdated
Comment thread lib/marin/src/marin/mcp/babysitter.py
Comment thread lib/marin/src/marin/mcp/babysitter.py Outdated
dlwh (Member, Author) commented Apr 22, 2026

🤖 Live MCP smoke test passed on marin-dev. I submitted /dlwh/mcp-babysitter-smoke-20260422-1017, ran marin-mcp-babysitter over streamable HTTP, and verified iris_job_summary plus iris_tail_logs through the MCP endpoint. The job printed mcp-babysitter-smoke tick=<n> roughly every 30s through tick=59, emitted mcp-babysitter-smoke done, and finished with state=succeeded, failure_count=0, preemption_count=0, and no error.
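
A check like the one described could be scripted roughly as follows. This is only a sketch: the line formats are taken from the comment above, summarize_smoke_log is a hypothetical helper, and it works on already-fetched log lines rather than calling iris_tail_logs or any other MCP tool.

```python
import re
from typing import Iterable, Optional, TypedDict

# Line formats as quoted in the smoke-test comment.
TICK_RE = re.compile(r"mcp-babysitter-smoke tick=(\d+)")
DONE_MARKER = "mcp-babysitter-smoke done"


class SmokeSummary(TypedDict):
    last_tick: Optional[int]
    tick_count: int
    done: bool


def summarize_smoke_log(lines: Iterable[str]) -> SmokeSummary:
    """Collect tick numbers and the completion marker from log lines."""
    ticks: list[int] = []
    done = False
    for line in lines:
        match = TICK_RE.search(line)
        if match:
            ticks.append(int(match.group(1)))
        elif DONE_MARKER in line:
            done = True
    return {
        "last_tick": max(ticks) if ticks else None,
        "tick_count": len(ticks),
        "done": done,
    }


# Made-up sample lines in the quoted format.
sample = [
    "mcp-babysitter-smoke tick=58",
    "mcp-babysitter-smoke tick=59",
    "mcp-babysitter-smoke done",
]
summary = summarize_smoke_log(sample)
```

For the sample lines, summary reports last_tick 59 and done True, matching the outcome described for the live run.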

yonromai (Contributor) left a comment

Haven't done MCP in ages, but LGTM!

dlwh merged commit c05d339 into main Apr 22, 2026
42 checks passed
dlwh deleted the agent/20260421-fix-5018 branch April 22, 2026 18:51
def worker_status_to_json(worker: controller_pb2.Controller.WorkerHealthStatus) -> dict[str, Any]:
    """Serialize Iris worker health into stable JSON."""
    metadata = worker.metadata
    return {
Collaborator

dlwh (Member, Author) commented Apr 22, 2026

🤖 Follow-up PR for the protobuf json_format review nit: #5071

Labels

agent-generated Created by automation/agent

3 participants