feat: calculate API costs for agent steps#1697
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds per-message LLM cost tracking via new Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Executor Client
participant Executor as agentstep.Executor
participant Converter as Message Converter
participant Agent as Agent Loop
Client->>Executor: SetContext(contextMessages)
activate Executor
Note over Executor: Store contextMessages
deactivate Executor
Client->>Executor: Run()
activate Executor
Note over Executor: Initialize savedMessages from contextMessages
Executor->>Converter: contextToLLMHistory(contextMessages)
activate Converter
Converter-->>Executor: llm.Message[] (filtered history)
deactivate Converter
Executor->>Agent: Loop with History
activate Agent
Agent->>Agent: Execute agent step
Agent->>Executor: RecordMessage(agentMsg)
deactivate Agent
activate Executor
Executor->>Converter: convertMessage(agentMsg)
activate Converter
Converter-->>Executor: exec.LLMMessage[] (with metadata, Cost)
deactivate Converter
Note over Executor: Append to savedMessages
deactivate Executor
deactivate Executor
Client->>Executor: GetMessages()
activate Executor
Executor-->>Client: savedMessages (with Cost metadata)
deactivate Executor
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/runtime/runner.go (1)
511-538:⚠️ Potential issue | 🟠 Major
setupChatMessagesnow issues handler reads for every step with dependencies, not just LLM-capable ones.The removed
isSupportedByLLMguard was the right gate to prevent unnecessaryReadStepMessagescalls for shell/script/sub-DAG steps. Without it, every step that has dependencies triggers oneReadStepMessagescall per dependency — wasted I/O against whatever persistence backend backsChatMessagesHandler.Correctness is preserved (non-LLM executors don't implement
ChatMessageHandler, soSetChatMessages/GetChatMessagesare no-ops), but performance can degrade at scale. The intended change (agent step support) only requires extending the guard to cover both executor types.🛡️ Proposed fix — restore a two-type guard
func (r *Runner) setupChatMessages(ctx context.Context, node *Node) { if r.messagesHandler == nil { return } step := node.Step() + executorType := step.ExecutorConfig.Type + if executorType != core.ExecutorTypeChat && executorType != core.ExecutorTypeAgent { + return + } if len(step.Depends) == 0 { return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/runner.go` around lines 511 - 538, The function setupChatMessages currently calls messagesHandler.ReadStepMessages for every dependency; restore the previous type guard so we only read messages for dependencies whose executor supports LLM/chat (i.e., use the isSupportedByLLM check or an equivalent that also returns true for agent-capable executors) before invoking messagesHandler.ReadStepMessages; specifically, inside setupChatMessages, for each dep from node.Step().Depends call the support-check (the same predicate used previously, e.g., isSupportedByLLM(dep) or its extended variant) and skip ReadStepMessages when false, leaving the subsequent DeduplicateSystemMessages and node.SetChatMessages behavior unchanged.
🧹 Nitpick comments (4)
internal/runtime/builtin/agentstep/executor_test.go (1)
21-25: Nit:ctxis misleading as a variable name for a[]exec.LLMMessageslice.
ctxconventionally refers tocontext.Context. Renaming tomsgswould avoid confusion.♻️ Proposed rename
- ctx := []exec.LLMMessage{ + msgs := []exec.LLMMessage{ {Role: exec.RoleSystem, Content: "be helpful"}, {Role: exec.RoleUser, Content: "hello"}, } - e.SetContext(ctx) - assert.Equal(t, ctx, e.contextMessages) + e.SetContext(msgs) + assert.Equal(t, msgs, e.contextMessages)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/builtin/agentstep/executor_test.go` around lines 21 - 25, Rename the misleading local variable ctx (type []exec.LLMMessage) to msgs to avoid confusing it with context.Context; update its declaration and the call e.SetContext(ctx) to use msgs and e.SetContext(msgs), and ensure any other references in the test that use ctx are updated accordingly (symbols: ctx -> msgs, type exec.LLMMessage, method e.SetContext).internal/core/exec/messages_test.go (1)
95-96: Redundantassert.InDeltaafter full-structassert.Equal.
assert.Equal(t, meta, decoded)on line 95 already validatesdecoded.Costexactly (the struct is compared field-by-field). Theassert.InDeltaon line 96 is a no-op guard that adds noise.♻️ Proposed simplification
assert.Equal(t, meta, decoded) - assert.InDelta(t, 0.0042, decoded.Cost, 1e-9)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/exec/messages_test.go` around lines 95 - 96, The test is redundantly asserting both assert.Equal(t, meta, decoded) and assert.InDelta(t, 0.0042, decoded.Cost, 1e-9); remove the no-op assert.InDelta call (the struct equality already checks decoded.Cost) or, if you intended to allow a tolerance for Cost, replace the struct equality with field-by-field checks; update the test by deleting the assert.InDelta line referencing decoded.Cost and keep the assert.Equal(t, meta, decoded) (or alternatively remove assert.Equal and assert the float tolerance on decoded.Cost if variability is required).internal/runtime/builtin/agentstep/convert.go (1)
16-23: User messageContentis silently dropped whenToolResultsis non-empty.If an agent produces a
MessageTypeUserwith bothContentandToolResults, the content is discarded. This is likely intentional (tool-result messages don't carry prose), but adding a comment would prevent a future maintainer from treating it as a bug:💡 Suggested clarifying comment
case agent.MessageTypeUser: + // When ToolResults are present they are the message payload; Content is ignored. if len(msg.ToolResults) > 0 { return convertToolResultMessages(msg) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/builtin/agentstep/convert.go` around lines 16 - 23, The branch handling agent.MessageTypeUser currently returns convertToolResultMessages(msg) when msg.ToolResults is non-empty, silently dropping msg.Content; add a short clarifying comment above that case (or inline) stating that when ToolResults are present the message is treated as a tool-result message and any human-readable Content is intentionally ignored/unused, referencing MessageTypeUser, msg.ToolResults, msg.Content, and convertToolResultMessages to make the intent explicit for future maintainers while leaving the behavior unchanged.internal/runtime/builtin/agentstep/testing.go (1)
1-69:testing.gocompiles into production binaries — consider a build constraint.Because this file doesn't end in
_test.go,MockExecutor,MockExecutorType, andRegisterMockExecutorsare included in every production build. Including test-only files in the production build can increase binary size and introduce unnecessary dependencies. A build tag keeps them out:♻️ Add a build constraint to exclude from non-test builds
+//go:build dagutest + package agentstepThen pass
-tags dagutestin CI test runs (go test -tags dagutest ./...). If the project already has a global test tag convention, use that instead.The current placement is intentional (cross-package test helper), so this is non-blocking if the project prefers keeping test helpers in-package without build tags.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/builtin/agentstep/testing.go` around lines 1 - 69, The file defines test-only symbols (MockExecutor, MockExecutorType, RegisterMockExecutors) that are being compiled into production; add a build constraint to exclude this file from non-test builds (e.g., add a top-of-file build tag like //go:build dagutest and the legacy // +build dagutest) so it only compiles when tests opt-in, and update CI/test invocations to pass -tags dagutest (or your existing test tag) when running go test; ensure the tag appears before the package declaration and that you document the tag usage for test runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runtime/builtin/agentstep/convert_test.go`:
- Around line 83-89: Add an assertion verifying the second tool call's arguments
are intact: check m.ToolCalls[1].Function.Arguments equals the expected JSON
string (e.g., {"path":"/tmp/test"}) to ensure no truncation; locate the
assertions around m.ToolCalls[1] in convert_test.go (use the existing assertions
on m.ToolCalls[1].ID and .Function.Name as anchors) and add the Arguments
assertion immediately after them.
- Around line 163-165: Add an assertion that the error message has no metadata:
in the test where you assert result length, role and content (the slice result
and result[0]), also require.Nil(t, result[0].Metadata) (or assert.Nil) to
ensure MessageTypeError produces nil Metadata and catch regressions that attach
cost/provider metadata to error messages.
In `@internal/runtime/builtin/agentstep/testing.go`:
- Line 1: The new source file testing.go in package agentstep is missing the
required GPL v3 license header; run the repository tooling (make addlicense) to
prepend the standard GPL v3 header to testing.go (and any other new files), or
manually add the exact GPL v3 header used across the project to the top of
testing.go so it matches other files in package agentstep.
In `@internal/runtime/runner_test.go`:
- Around line 3067-3078: The test "HandlerNotCalledForAgentStepWithNoMessages"
should explicitly assert that the mock messages handler had zero write calls to
make intent clear and prevent regressions: after using newMockMessagesHandler(),
setupRunner(withMessagesHandler(handler)) and calling plan.assertRun(t,
core.Failed), add an assertion that handler.writeCalls == 0 (or the test
framework's equivalent check) to confirm saveChatMessages was not invoked
(saveChatMessages is guarded by NodeSucceeded) — update the test function
containing agentStep("agent_fail") accordingly.
In `@rfcs/021-llm-cost-tracking.md`:
- Around line 138-147: The permission table and handler disagree: the handler
currently uses requireManagerOrAbove but the table lists
Operator/Viewer/Developer as "Own costs only"; either tighten the table or relax
the middleware. Pick one fix: (A) If non-managers should access their own costs,
replace requireManagerOrAbove with requireAuthenticated (or equivalent) and in
the handler enforce userId == req.user.id (or force req.params.userId =
req.user.id) so only own-costs are returned; or (B) if only managers/admins may
access, remove the Operator/Viewer/Developer rows from the permission table and
keep requireManagerOrAbove. Update the doc text to match the chosen behavior and
reference requireManagerOrAbove and the userId parameter enforcement in the
handler to ensure consistency.
- Around line 96-132: The AgentCostEntry OpenAPI schema lacks the "source"
property referenced in the RFC prose; add a "source" field to the AgentCostEntry
schema (type: string) with a description and restrict values to the three
allowed sources ("agent_chat", "chat_step", "agent_step") so API responses match
the documented breakdown, and ensure any code/path that emits AgentCostEntry
(e.g., the GetCostSummary implementation in internal/agent/api.go and the
handler in internal/service/frontend/api/v1/agent_cost.go) populates this field
accordingly; alternatively, if you intentionally removed source, remove the
prose that lists the three source values to keep spec and documentation
consistent.
- Around line 74-76: The fenced code block containing the HTTP endpoint
declaration "GET /api/v1/agent/cost-summary?month=YYYY-MM&userId=optional" is
missing a language specifier; update the markdown to use a fenced block with an
"http" language tag (i.e., replace the triple backticks around that GET line
with ```http ... ```) so lint rule MD040 is satisfied and the snippet is
properly highlighted.
In `@rfcs/022-agentstep-cost-persistence.md`:
- Around line 55-59: The RecordMessage snippet incorrectly appends the slice
returned by convertMessage directly to e.savedMessages; change the append call
to spread the slice returned by convertMessage(msg, modelCfg) (i.e., use the ...
operator) so that append(e.savedMessages, convertMessage(msg, modelCfg)...)
compiles; update the RFC example to match the real implementation referring to
RecordMessage, convertMessage, e.savedMessages, append and modelCfg.
- Around line 89-96: The RFC table incorrectly lists a change to
internal/runtime/builtin/chat/executor.go (populate Cost in
createResponseMetadata) but that file isn't modified in this PR; either add the
actual change to populate the Cost field in the createResponseMetadata
implementation in chat/executor.go, or update the RFC table to remove or mark
that row as deferred/out-of-scope (or move it to RFC 021) so the table matches
the PR contents and does not reference a non-existent change.
---
Outside diff comments:
In `@internal/runtime/runner.go`:
- Around line 511-538: The function setupChatMessages currently calls
messagesHandler.ReadStepMessages for every dependency; restore the previous type
guard so we only read messages for dependencies whose executor supports LLM/chat
(i.e., use the isSupportedByLLM check or an equivalent that also returns true
for agent-capable executors) before invoking messagesHandler.ReadStepMessages;
specifically, inside setupChatMessages, for each dep from node.Step().Depends
call the support-check (the same predicate used previously, e.g.,
isSupportedByLLM(dep) or its extended variant) and skip ReadStepMessages when
false, leaving the subsequent DeduplicateSystemMessages and node.SetChatMessages
behavior unchanged.
---
Nitpick comments:
In `@internal/core/exec/messages_test.go`:
- Around line 95-96: The test is redundantly asserting both assert.Equal(t,
meta, decoded) and assert.InDelta(t, 0.0042, decoded.Cost, 1e-9); remove the
no-op assert.InDelta call (the struct equality already checks decoded.Cost) or,
if you intended to allow a tolerance for Cost, replace the struct equality with
field-by-field checks; update the test by deleting the assert.InDelta line
referencing decoded.Cost and keep the assert.Equal(t, meta, decoded) (or
alternatively remove assert.Equal and assert the float tolerance on decoded.Cost
if variability is required).
In `@internal/runtime/builtin/agentstep/convert.go`:
- Around line 16-23: The branch handling agent.MessageTypeUser currently returns
convertToolResultMessages(msg) when msg.ToolResults is non-empty, silently
dropping msg.Content; add a short clarifying comment above that case (or inline)
stating that when ToolResults are present the message is treated as a
tool-result message and any human-readable Content is intentionally
ignored/unused, referencing MessageTypeUser, msg.ToolResults, msg.Content, and
convertToolResultMessages to make the intent explicit for future maintainers
while leaving the behavior unchanged.
In `@internal/runtime/builtin/agentstep/executor_test.go`:
- Around line 21-25: Rename the misleading local variable ctx (type
[]exec.LLMMessage) to msgs to avoid confusing it with context.Context; update
its declaration and the call e.SetContext(ctx) to use msgs and
e.SetContext(msgs), and ensure any other references in the test that use ctx are
updated accordingly (symbols: ctx -> msgs, type exec.LLMMessage, method
e.SetContext).
In `@internal/runtime/builtin/agentstep/testing.go`:
- Around line 1-69: The file defines test-only symbols (MockExecutor,
MockExecutorType, RegisterMockExecutors) that are being compiled into
production; add a build constraint to exclude this file from non-test builds
(e.g., add a top-of-file build tag like //go:build dagutest and the legacy //
+build dagutest) so it only compiles when tests opt-in, and update CI/test
invocations to pass -tags dagutest (or your existing test tag) when running go
test; ensure the tag appears before the package declaration and that you
document the tag usage for test runs.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1697 +/- ##
==========================================
- Coverage 69.95% 69.77% -0.19%
==========================================
Files 372 374 +2
Lines 41330 41445 +115
==========================================
+ Hits 28914 28919 +5
Misses 10089 10089
- Partials 2327 2437 +110
... and 9 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary by CodeRabbit
New Features
Tests