feat(metrics): per-call-site cache hit rate tracking#60

Merged: Siddhant-K-code merged 2 commits into main from feat/47-callsite-hit-rate on May 2, 2026

@Siddhant-K-code (Owner)

Closes #47

What

Adds CallSiteTracker to pkg/metrics. Records Anthropic API usage per call site and surfaces the worst performers first, making it easy to identify which agents or code paths are getting poor cache hit rates.

Changes

pkg/metrics/callsite.go (new)

  • CallSiteTracker: thread-safe tracker backed by sync.RWMutex
  • Record(callSite, UsageRecord): accumulates CacheCreationTokens, CacheReadTokens, UncachedInputTokens, OutputTokens; increments CacheHitRequests when cache_read_input_tokens > 0
  • HitRate(): cache_read / (cache_read + cache_creation + uncached_input)
  • WriteEfficiency(): cache_read / cache_creation; values < 1.0 mean fewer tokens were read from the cache than were written to it, which usually indicates cache writes expiring before they are read
  • RequestHitRate(): fraction of requests with at least one cache read token
  • AllStats(): all call sites sorted by hit rate ascending (worst first)
  • Summary(): human-readable table of call site, hit%, efficiency, request count
  • Reset / ResetAll for test isolation
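
To make the three ratios above concrete, here is a minimal worked sketch. It is not the code from pkg/metrics/callsite.go: the type name callSiteStats is hypothetical, the field names follow the list above, and returning 0 when a denominator is zero is an assumption about how empty call sites might be handled.

```go
package main

import "fmt"

// callSiteStats is a hypothetical stand-in for the per-call-site stats type.
// Field names follow the PR description; the type name is assumed.
type callSiteStats struct {
	CacheCreationTokens int64
	CacheReadTokens     int64
	UncachedInputTokens int64
	TotalRequests       int64
	CacheHitRequests    int64
}

// HitRate is cache_read / (cache_read + cache_creation + uncached_input).
// Returning 0 for an empty denominator is an assumption.
func (s callSiteStats) HitRate() float64 {
	total := s.CacheReadTokens + s.CacheCreationTokens + s.UncachedInputTokens
	if total == 0 {
		return 0
	}
	return float64(s.CacheReadTokens) / float64(total)
}

// WriteEfficiency is cache_read / cache_creation.
// Returning 0 when nothing was written is an assumption.
func (s callSiteStats) WriteEfficiency() float64 {
	if s.CacheCreationTokens == 0 {
		return 0
	}
	return float64(s.CacheReadTokens) / float64(s.CacheCreationTokens)
}

// RequestHitRate is the fraction of requests with at least one cache read token.
func (s callSiteStats) RequestHitRate() float64 {
	if s.TotalRequests == 0 {
		return 0
	}
	return float64(s.CacheHitRequests) / float64(s.TotalRequests)
}

func main() {
	// Illustrative numbers only: 6000 tokens read, 2000 written, 2000 uncached.
	s := callSiteStats{
		CacheCreationTokens: 2000,
		CacheReadTokens:     6000,
		UncachedInputTokens: 2000,
		TotalRequests:       4,
		CacheHitRequests:    3,
	}
	fmt.Printf("hit=%.0f%% eff=%.1fx req_hit=%.0f%%\n",
		s.HitRate()*100, s.WriteEfficiency(), s.RequestHitRate()*100)
}
```

With these illustrative numbers the token hit rate is 6000/10000, the write efficiency is 6000/2000, and the request hit rate is 3/4, so a call site can have a healthy efficiency while still paying for a sizeable uncached share.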

pkg/metrics/callsite_test.go (new) — 7 tests

Usage

tracker := metrics.NewCallSiteTracker()

tracker.Record("agent/planner.go:84", metrics.UsageRecord{
    CacheCreationInputTokens: resp.Usage.CacheCreationInputTokens,
    CacheReadInputTokens:     resp.Usage.CacheReadInputTokens,
    InputTokens:              resp.Usage.InputTokens,
})

// Worst performers first
for _, s := range tracker.AllStats() {
    fmt.Printf("%-40s hit=%.0f%%  eff=%.1fx  reqs=%d\n",
        s.CallSite, s.HitRate()*100, s.WriteEfficiency(), s.TotalRequests)
}
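
The usage above passes the call site as a hard-coded "file:line" string, which can drift as code moves. One possible alternative, sketched here under stated assumptions, is to derive the key at runtime. The callSite helper below is hypothetical and not part of this PR; it relies only on the standard library's runtime.Caller.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// callSite is a hypothetical helper (not part of the PR). It returns a
// "dir/file.go:line" key for the frame `skip` levels above its caller.
func callSite(skip int) string {
	// skip+1 so that skip=0 refers to the function that called callSite.
	_, file, line, ok := runtime.Caller(skip + 1)
	if !ok {
		return "unknown:0"
	}
	// Keep only the last directory and file name to get a short, stable key.
	short := filepath.Join(filepath.Base(filepath.Dir(file)), filepath.Base(file))
	return fmt.Sprintf("%s:%d", filepath.ToSlash(short), line)
}

func main() {
	// Prints the directory, file, and line of this call.
	fmt.Println(callSite(0))
}
```

The resulting string could then be passed as the first argument to tracker.Record. The trade-off is that keys change whenever lines shift, so stats from different builds may not line up.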

Siddhant-K-code and others added 2 commits May 2, 2026 14:05
Add CallSiteTracker to pkg/metrics. Records Anthropic API usage per
call site and exposes HitRate, WriteEfficiency, RequestHitRate.

- Record(callSite, UsageRecord): accumulates token counters and request
  counts; marks CacheHitRequests when cache_read_input_tokens > 0
- Stats(callSite): snapshot for a single call site
- AllStats(): all call sites sorted by hit rate ascending (worst first)
- Reset / ResetAll for test isolation
- Summary(): human-readable table of call site, hit%, efficiency, reqs

Thread-safe via sync.RWMutex.

Co-authored-by: Ona <no-reply@ona.com>
@Siddhant-K-code Siddhant-K-code merged commit 7ca5e8a into main May 2, 2026
2 checks passed
@Siddhant-K-code Siddhant-K-code deleted the feat/47-callsite-hit-rate branch May 2, 2026 14:29
Linked issue: [Feature] Prompt cache hit rate tracking per call site