-**RHAIENG-4752** — We prototyped tool call tracing using `mlflow autolog claude`. Every tool Claude Code calls (Write, Read, Edit, Bash, AskUserQuestion, etc.) is captured as a span in MLflow with the tool name, input parameters, output/result, and latency. Tested across three backends with a real coding task — Vertex AI produced 15 spans, vLLM and OGX produced 8 each. MLflow integration works end-to-end. The stop-hook fires after the session so there is no latency impact.
+**Tool call tracing** — We prototyped tool call tracing using `mlflow autolog claude`. Every tool Claude Code calls (Write, Read, Edit, Bash, AskUserQuestion, etc.) is captured as a span in MLflow with the tool name, input parameters, output/result, and latency. Tested across three backends with a real coding task — Vertex AI produced 15 spans, vLLM and OGX produced 8 each. MLflow integration works end-to-end. The stop-hook fires after the session so there is no latency impact.
 
-**RHAIENG-4753** — On top of the tool call spans, each trace also captures higher-level session metrics: session ID, total duration, input/output token counts, and the full tool call sequence as a waterfall. This answers "what did the agent do and how much did it cost?" for any session. Validated with a complete multi-turn coding task ("build me a tetris game") across all three backends.
+**Session-level metrics** — On top of the tool call spans, each trace also captures higher-level session metrics: session ID, total duration, input/output token counts, and the full tool call sequence as a waterfall. This answers "what did the agent do and how much did it cost?" for any session. Validated with a complete multi-turn coding task ("build me a tetris game") across all three backends.
 
 As you can see in the results below.
 
@@ -170,7 +170,7 @@ Each span captures: tool name, input parameters, output/result, and per-span lat
@@ -180,7 +180,7 @@ MLflow integration works. This guide documents how to hook Claude Code, OGX, and
 
 The following must already be running on the cluster:
 
-- Claude Code container deployed (see [PR #92](https://github.com/red-hat-data-services/agentic-starter-kits/pull/92))
+- Claude Code container deployed (see [agents/claude/claude_agent](https://github.com/red-hat-data-services/agentic-starter-kits/tree/main/agents/claude/claude_agent))
 - OGX deployed and serving a model
 - MLflow instance running via the ODH/RHOAI operator with a workspace matching your namespace
 
@@ -193,7 +193,7 @@ The ODH build of MLflow uses the Red Hat fork which includes the `kubernetes-nam