ci: Add benchstat-based performance regression detection #489

@gyliu513

Problem

We have Go benchmarks (make bench) but they are not integrated into CI. There is currently no automated way to detect performance regressions introduced by a PR. Maintainers must run benchmarks manually or rely on real cluster testing to validate performance, which slows down the development cycle.

Current state

  • make bench only covers pkg/preprocessing/chat_completions/ and pkg/tokenization/
  • 12 Benchmark* functions exist across 4 files, but several are not included in make bench
  • No benchstat, no performance comparison tooling, no CI benchmark job

Proposal

Add a new GitHub Actions workflow (.github/workflows/ci-bench.yaml) that compares benchmark results between the PR branch and main using benchstat:

  1. Run go test -bench=. -benchmem -count=5 on the PR branch → new.txt
  2. Checkout main, run the same command → old.txt
  3. Run benchstat old.txt new.txt and post the comparison as a PR comment
  4. (Optional, future) Fail CI if any benchmark regresses beyond a configurable threshold
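As a rough illustration of the optional step 4, the sketch below reads two `go test -bench` text outputs, takes the median ns/op per benchmark, and lists benchmarks whose median grew by more than a threshold. It is not how benchstat works internally (benchstat also applies a significance test); the function names `parseNsPerOp`, `median`, and `regressions`, the 10% threshold, and the sample numbers are all hypothetical and only there to make the example runnable.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseNsPerOp collects the ns/op samples per benchmark name from the
// text output of `go test -bench`. Result lines look like:
//   BenchmarkX-8   1000   1234 ns/op   56 B/op   2 allocs/op
func parseNsPerOp(output string) map[string][]float64 {
	samples := make(map[string][]float64)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") {
			continue
		}
		for i := 3; i < len(fields); i++ {
			if fields[i] != "ns/op" {
				continue
			}
			if v, err := strconv.ParseFloat(fields[i-1], 64); err == nil {
				samples[fields[0]] = append(samples[fields[0]], v)
			}
			break
		}
	}
	return samples
}

// median returns the median of xs, or 0 for an empty slice.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// regressions returns, in sorted order, the benchmarks present in both
// outputs whose median ns/op increased by more than threshold
// (0.10 means 10%).
func regressions(oldOut, newOut string, threshold float64) []string {
	oldS, newS := parseNsPerOp(oldOut), parseNsPerOp(newOut)
	var out []string
	for name, o := range oldS {
		n, ok := newS[name]
		if !ok {
			continue
		}
		om, nm := median(o), median(n)
		if om > 0 && (nm-om)/om > threshold {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Invented sample lines standing in for old.txt and new.txt;
	// these are not measurements from this repository.
	oldOut := "BenchmarkRenderChat-8 1000 1050 ns/op 512 B/op 4 allocs/op\n" +
		"BenchmarkRenderChat-8 1000 1000 ns/op 512 B/op 4 allocs/op\n" +
		"BenchmarkRenderChat-8 1000 1100 ns/op 512 B/op 4 allocs/op\n"
	newOut := "BenchmarkRenderChat-8 1000 1300 ns/op 640 B/op 6 allocs/op\n" +
		"BenchmarkRenderChat-8 1000 1250 ns/op 640 B/op 6 allocs/op\n" +
		"BenchmarkRenderChat-8 1000 1350 ns/op 640 B/op 6 allocs/op\n"
	for _, name := range regressions(oldOut, newOut, 0.10) {
		fmt.Println("regressed:", name)
	}
}
```

A CI job built on this idea would read old.txt and new.txt from disk and exit non-zero when the returned list is non-empty. Comparing medians alone is sensitive to runner noise, so the threshold would likely need to be generous on shared CI machines.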

Benchmarks to track

Package                                Key benchmarks
pkg/preprocessing/chat_completions/    BenchmarkRenderChat, BenchmarkRender, BenchmarkGetOrCreateTokenizerKey
pkg/tokenization/                      BenchmarkAsyncTokenizationStress, BenchmarkSyncTokenizationStress
pkg/kvevents/                          BenchmarkZMQSubscriber_Throughput
pkg/kvcache/kvblock/                   (see "New benchmarks" below)

All benchmarks should be run with -benchmem so that bytes and allocations per operation (B/op, allocs/op) are reported, which lets us catch regressions in GC pressure as well as in latency.
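To make the allocation metrics concrete, here is a minimal sketch that runs two benchmarks programmatically with `testing.Benchmark` and prints the same per-operation figures that -benchmem adds to the output. The functions `joinNaive`, `joinBuilder`, and `benchJoin` are hypothetical stand-ins and are not taken from this repository; `b.ReportAllocs()` enables allocation reporting for a single benchmark, which is the per-benchmark equivalent of the flag.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinNaive concatenates with +=, allocating a new string on most iterations.
func joinNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder uses strings.Builder, which grows its buffer geometrically
// and therefore allocates far less often.
func joinBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// sink keeps the compiler from discarding the benchmarked result.
var sink string

// benchJoin wraps a join function in a benchmark body over 64 short strings.
func benchJoin(join func([]string) string) func(b *testing.B) {
	parts := make([]string, 64)
	for i := range parts {
		parts[i] = "token"
	}
	return func(b *testing.B) {
		b.ReportAllocs() // same effect as -benchmem, for this benchmark only
		for i := 0; i < b.N; i++ {
			sink = join(parts)
		}
	}
}

func main() {
	testing.Init() // registers testing flags when running outside `go test`
	cases := []struct {
		name string
		join func([]string) string
	}{
		{"naive", joinNaive},
		{"builder", joinBuilder},
	}
	for _, c := range cases {
		r := testing.Benchmark(benchJoin(c.join))
		fmt.Printf("%-8s %d ns/op  %d B/op  %d allocs/op\n",
			c.name, r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp())
	}
}
```

Two implementations can have similar ns/op in a short run while differing widely in allocs/op, which is the kind of change a time-only comparison could miss.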

CI performance impact

  • Runs as a separate workflow (ci-bench.yaml), in parallel with the existing ci-test and ci-lint workflows, so it does not slow them down
  • Can be configured as a non-required check so it never blocks PR merges
  • Can be scoped to only trigger on changes to pkg/**/*.go, go.mod, go.sum

@vMaroon @sagearc any comments on this new CI workflow?
