What instrumentation is missing
When the Anthropic Messages API is called with extended thinking enabled (thinking: {type: "enabled", budget_tokens: N}), the response usage object includes an output_tokens_details sub-object containing thinking_tokens — the number of tokens consumed by the model's internal reasoning. The Anthropic tracer's parseUsageTokens function never records this, so completion_reasoning_tokens (the standard Braintrust metric for reasoning/thinking token costs) is always absent from Anthropic extended-thinking spans.
Why it's dropped
In trace/contrib/anthropic/traceanthropic.go (lines 114–158), parseUsageTokens iterates over the usage map and only processes values that pass internal.ToInt64:
    func parseUsageTokens(usage map[string]interface{}) map[string]int64 {
        for k, v := range usage {
            if ok, i := internal.ToInt64(v); ok { // only handles scalars
                switch k {
                case "input_tokens": ...
                case "output_tokens": metrics["completion_tokens"] = i
                // ...
                default: metrics[k] = i
                }
            }
            // nested objects silently skipped; output_tokens_details falls here
        }
    }
output_tokens_details is a nested object ({"thinking_tokens": N}), not a scalar, so ToInt64 returns false for it and the entire sub-object is ignored. No completion_reasoning_tokens (or thinking_tokens) metric is ever recorded.
Upstream source
The Anthropic Messages API documentation defines the usage response object as:
    "usage": {
      "input_tokens": 45,
      "output_tokens": 170,
      "output_tokens_details": {
        "thinking_tokens": 120
      }
    }
output_tokens_details.thinking_tokens is the count of tokens used for the model's extended thinking reasoning, separate from the visible answer tokens. SDK type support was added in anthropic-sdk-go v1.46.0 (May 28, 2026).
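For illustration, the field can be read straight from the raw JSON without the newer SDK type. In this sketch the `usage` and `outputTokensDetails` structs and the `thinkingTokens` helper are hypothetical local definitions, not the SDK's types; the pointer field is an assumed way to tell "field absent" apart from "zero thinking tokens".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// outputTokensDetails and usage are hypothetical local types that follow
// the JSON shape shown above; they are not the anthropic-sdk-go types.
type outputTokensDetails struct {
	ThinkingTokens int64 `json:"thinking_tokens"`
}

type usage struct {
	InputTokens  int64 `json:"input_tokens"`
	OutputTokens int64 `json:"output_tokens"`
	// A pointer stays nil when the response omits the sub-object.
	OutputTokensDetails *outputTokensDetails `json:"output_tokens_details"`
}

// thinkingTokens returns the thinking-token count and whether the payload
// carried output_tokens_details at all.
func thinkingTokens(raw []byte) (int64, bool, error) {
	var u usage
	if err := json.Unmarshal(raw, &u); err != nil {
		return 0, false, err
	}
	if u.OutputTokensDetails == nil {
		return 0, false, nil
	}
	return u.OutputTokensDetails.ThinkingTokens, true, nil
}

func main() {
	raw := []byte(`{"input_tokens":45,"output_tokens":170,"output_tokens_details":{"thinking_tokens":120}}`)
	n, ok, err := thinkingTokens(raw)
	fmt.Println(n, ok, err) // prints: 120 true <nil>
}
```

Payloads recorded without the sub-object decode cleanly and report `false`, so a caller can skip the metric instead of emitting a misleading zero.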
Parity gap
Two other integrations in this repo already capture equivalent reasoning/thinking tokens:
| Integration | API field | Braintrust metric |
| --- | --- | --- |
| OpenAI (traceopenai.go:136–145) | completion_tokens_details.reasoning_tokens | completion_reasoning_tokens |
| Google GenAI (generatecontent.go:360) | thoughtsTokenCount | completion_reasoning_tokens |
| Anthropic | output_tokens_details.thinking_tokens | ❌ never recorded |
The OpenAI parseUsageTokens handles *_tokens_details sub-objects explicitly:
    if strings.HasSuffix(k, "_tokens_details") {
        prefix := translateMetricPrefix(strings.TrimSuffix(k, "_tokens_details"))
        if details, ok := v.(map[string]interface{}); ok {
            for kd, vd := range details {
                if ok, i := internal.ToInt64(vd); ok {
                    metrics[prefix+"_"+kd] = i
                }
            }
        }
    }
The Anthropic version has no equivalent branch.
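One possible shape for an equivalent branch is sketched below. It is a sketch under stated assumptions, not the repo's implementation: `toInt64` stands in for `internal.ToInt64`, the `output` to `completion` prefix translation and the `detailRenames` table are hypothetical, and the handling of `input_tokens` (elided in the excerpt above) is not reproduced. Note that copying the OpenAI branch unchanged would emit a key ending in `_thinking_tokens`, because it names metrics by the detail key, so an explicit rename is assumed here to land on `completion_reasoning_tokens`.

```go
package main

import (
	"fmt"
	"strings"
)

// toInt64 is a hypothetical stand-in for internal.ToInt64.
func toInt64(v interface{}) (bool, int64) {
	switch n := v.(type) {
	case float64:
		return true, int64(n)
	case int64:
		return true, n
	}
	return false, 0
}

// detailRenames is an assumed mapping from Anthropic detail keys to the
// metric suffix used by the other integrations.
var detailRenames = map[string]string{"thinking_tokens": "reasoning_tokens"}

// parseUsageTokens is a sketch: scalar handling as in the excerpt, plus a
// branch that flattens *_tokens_details sub-objects.
func parseUsageTokens(usage map[string]interface{}) map[string]int64 {
	metrics := make(map[string]int64)
	for k, v := range usage {
		if strings.HasSuffix(k, "_tokens_details") {
			details, ok := v.(map[string]interface{})
			if !ok {
				continue // not an object: nothing to flatten
			}
			prefix := strings.TrimSuffix(k, "_tokens_details")
			if prefix == "output" {
				prefix = "completion" // assumed, to match completion_tokens
			}
			for kd, vd := range details {
				if ok, i := toInt64(vd); ok {
					if renamed, found := detailRenames[kd]; found {
						kd = renamed
					}
					metrics[prefix+"_"+kd] = i
				}
			}
			continue
		}
		if ok, i := toInt64(v); ok {
			switch k {
			case "output_tokens":
				metrics["completion_tokens"] = i
			default:
				metrics[k] = i
			}
		}
	}
	return metrics
}

func main() {
	m := parseUsageTokens(map[string]interface{}{
		"input_tokens":          float64(45),
		"output_tokens":         float64(170),
		"output_tokens_details": map[string]interface{}{"thinking_tokens": float64(120)},
	})
	fmt.Println(m["completion_tokens"], m["completion_reasoning_tokens"]) // prints: 170 120
}
```

Keeping the rename in a small table rather than hard-coding one key leaves room for other detail fields to pass through under their own names.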
Braintrust docs status
Supported: Braintrust's advanced tracing docs (https://www.braintrust.dev/docs/instrument/advanced-tracing) list the standard LLM metrics, including token counts. The completion_reasoning_tokens metric is already captured by the OpenAI and GenAI integrations, which establishes a cross-provider convention that extended-thinking token costs are metered separately.
Local repo files inspected
trace/contrib/anthropic/traceanthropic.go, parseUsageTokens() (lines 114–158): only processes int64 scalars; output_tokens_details is silently dropped
trace/contrib/openai/traceopenai.go, parseUsageTokens() (lines 136–145): reference implementation that handles *_tokens_details sub-objects and emits completion_reasoning_tokens
trace/contrib/genai/generatecontent.go: explicitly maps thoughtsTokenCount → completion_reasoning_tokens
trace/contrib/anthropic/testdata/cassettes/TestStreamingWithThinking.yaml: VCR cassette of extended-thinking streaming, recorded before the output_tokens_details field was added to the API
trace/contrib/anthropic/go.mod: anthropic-sdk-go v1.23.0 (the SDK type for OutputTokensDetails is not yet present, but the middleware parses raw JSON, and the API already returns this field for thinking-capable models)