Add API benchmark latency suite #51

Open
deep-thought-tools wants to merge 3 commits into SecureBananaLabs:main from deep-thought-tools:benchmark-api-latency-suite

deep-thought-tools commented May 17, 2026

Benchmark API latency suite

/claim #30

Summary

  • Adds a reproducible API benchmark suite powered by autocannon.
  • Covers the current /api/* route surface with endpoint-specific GET, JSON POST, multipart upload, and authenticated admin requests.
  • Reports p50, p95, p99 latency, average/peak RPS, throughput, error rate, status-code distribution, and TTFB.
  • Writes machine-readable JSON and Markdown summaries under benchmarks/results/.
  • Adds benchmark documentation, a sample environment file, reviewable thresholds, and a smoke mode that fails the run when a threshold is exceeded.
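
For readers unfamiliar with the p50/p95/p99 columns, the sketch below shows one common way to derive percentiles from raw latency samples (the nearest-rank method). It is only an illustration: in this suite the percentiles are reported by autocannon, whose internal method may differ, and the `percentile` helper and sample values here are hypothetical.

```javascript
// Nearest-rank percentile: sort the samples, then take the value at
// 1-based rank ceil(p / 100 * n).
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical latency samples in milliseconds (not taken from this PR's run).
const latenciesMs = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13];

const latencySummary = {
  p50: percentile(latenciesMs, 50),
  p95: percentile(latenciesMs, 95),
  p99: percentile(latenciesMs, 99),
};

console.log(latencySummary);
```

With only ten samples, p95 and p99 both land on the slowest request, which is one reason short smoke runs give noisy tail latencies.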

Demo

Short demo video: https://github.com/deep-thought-tools/bug-bounty/blob/benchmark-api-latency-suite/benchmarks/demo/pr-51-demo.webm

Validation

npm test
BENCHMARK_DURATION_SECONDS=2 BENCHMARK_CONNECTIONS=4 npm run benchmark
BENCHMARK_DURATION_SECONDS=1 BENCHMARK_CONNECTIONS=2 BENCHMARK_FAIL_ON_THRESHOLD=true npm run benchmark
git diff --check

Local validation on 2026-05-17 covered 20 endpoints with 0% benchmark errors. The threshold gate uses benchmarks/thresholds.json and fails when p99 latency or error rate exceeds configured limits.
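
A minimal sketch of how such a threshold gate could work is shown below. The shape of the thresholds object (`maxP99Ms`, `maxErrorPercent`), the result fields, and the `findThresholdViolations` helper are assumptions for illustration and may not match benchmarks/thresholds.json or the runner in this PR. The two sample rows reuse p99 values from the smoke run; the limits are made up.

```javascript
// Hypothetical gate: collect a message for every endpoint whose p99 latency
// or error rate is strictly above the configured limit.
function findThresholdViolations(results, thresholds) {
  const violations = [];
  for (const r of results) {
    if (r.p99Ms > thresholds.maxP99Ms) {
      violations.push(`${r.endpoint}: p99 ${r.p99Ms} ms exceeds ${thresholds.maxP99Ms} ms`);
    }
    if (r.errorPercent > thresholds.maxErrorPercent) {
      violations.push(`${r.endpoint}: error rate ${r.errorPercent}% exceeds ${thresholds.maxErrorPercent}%`);
    }
  }
  return violations;
}

// Assumed threshold shape and values, for illustration only.
const thresholds = { maxP99Ms: 12, maxErrorPercent: 1 };

const results = [
  { endpoint: 'POST /api/auth/register', p99Ms: 13, errorPercent: 0 },
  { endpoint: 'GET /api/users', p99Ms: 4, errorPercent: 0 },
];

const violations = findThresholdViolations(results, thresholds);
violations.forEach((message) => console.error(message));

// Only fail the process when the gate is explicitly enabled.
if (violations.length > 0 && process.env.BENCHMARK_FAIL_ON_THRESHOLD === 'true') {
  process.exitCode = 1;
}
```

Setting `process.exitCode` rather than calling `process.exit()` lets pending writes of the JSON and Markdown summaries finish before the process ends.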

Smoke benchmark markdown summary

API Benchmark Results

  • Base URL: http://127.0.0.1:38563
  • Duration per endpoint: 1s
  • Connections: 2
  • Pipelining: 1
  • Generated at: 2026-05-17T10:25:09.858Z

| Endpoint | Requests | p50 (ms) | p95 (ms) | p99 (ms) | Avg RPS | Peak RPS | Error % | TTFB (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| POST /api/auth/register | 550 | 2 | 10 | 13 | 550 | 550 | 0 | 14.87 |
| POST /api/auth/login | 1032 | 1 | 5 | 7 | 1032 | 1032 | 0 | 15.02 |
| GET /api/auth/oauth/github/callback | 1492 | 0 | 4 | 8 | 1492 | 1492 | 0 | 9.56 |
| POST /api/auth/refresh | 1185 | 1 | 7 | 10 | 1185 | 1185 | 0 | 7.05 |
| GET /api/users | 2825 | 0 | 3 | 4 | 2825 | 2825 | 0 | 6.35 |
| POST /api/users | 2236 | 0 | 3 | 5 | 2237 | 2236 | 0 | 5.88 |
| GET /api/jobs | 2654 | 0 | 3 | 4 | 2655 | 2654 | 0 | 4.42 |
| POST /api/jobs | 2191 | 0 | 4 | 6 | 2191 | 2191 | 0 | 10.49 |
| GET /api/proposals | 3175 | 0 | 1 | 3 | 3175 | 3175 | 0 | 3.69 |
| POST /api/proposals | 1927 | 0 | 3 | 7 | 1927 | 1927 | 0 | 11.45 |
| POST /api/payments | 2428 | 0 | 3 | 5 | 2429 | 2428 | 0 | 4.01 |
| GET /api/reviews | 3528 | 0 | 0 | 4 | 3529 | 3528 | 0 | 4.13 |
| POST /api/reviews | 2930 | 0 | 2 | 5 | 2931 | 2930 | 0 | 20 |
| GET /api/messages | 4402 | 0 | 0 | 1 | 4402 | 4402 | 0 | 4.47 |
| POST /api/messages | 2930 | 0 | 2 | 4 | 2931 | 2930 | 0 | 12.01 |
| GET /api/notifications | 4402 | 0 | 0 | 2 | 4402 | 4402 | 0 | 2.77 |
| POST /api/notifications | 2942 | 0 | 1 | 4 | 2943 | 2942 | 0 | 4.51 |
| POST /api/uploads | 971 | 1 | 7 | 10 | 971 | 971 | 0 | 20.37 |
| GET /api/search?q=api%20benchmark | 2291 | 0 | 4 | 8 | 2291 | 2291 | 0 | 6.9 |
| GET /api/admin/metrics | 1645 | 0 | 4 | 6 | 1645 | 1645 | 0 | 9.59 |

Benchmark Environment

Hardware

  • CPU model & core count: Intel Core i5-8365U @ 1.60GHz, 2 vCPU
  • RAM: 5.3 GiB total, about 3.3 GiB available during benchmark
  • Storage type: VM-backed disk
  • Network interface: loopback
  • Machine type: local VM
  • OS & version: Ubuntu/OpenClawOS, Linux 6.14.0-37-generic x86_64

Runtime

  • Node.js version: v24.15.0
  • Resource limits: no explicit benchmark resource cap configured
  • Other significant processes running during benchmark: normal agent/session processes only

If submitted by or with an AI agent

  • Agent or tool name: OpenClaw/Codex, account deep-thought-tools
  • Underlying model and version: GPT-5.5
  • Inference provider: OpenAI
  • Orchestration framework: OpenClaw session with delegated subagents for surrounding research, main agent for implementation
  • Execution mode: human-initiated, AI-assisted implementation, human-supervised external auth
  • Shell/tool access during execution: yes
  • Internet access during execution: yes
  • Benchmark commands run by: agent directly in the local checkout
  • Known constraints/sandboxing: local VM environment; pushing GitHub workflow files requires an additional workflow token scope that the current CLI auth does not have

Notes

  • Local benchmark runs set BENCHMARK_MODE=true so the API rate limiter does not dominate benchmark output. The flag is only used by the benchmark runner and should not be enabled in production.
  • The existing API test script was adjusted from node --test src/tests to node --test src/tests/*.test.js so Node's test runner actually discovers the current test files.
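
One way to keep a flag like BENCHMARK_MODE from leaking into production is to refuse it outright when the process is running in production. The sketch below is an assumption about how such a guard could look, not the code in this PR: the `shouldBypassRateLimit` helper is hypothetical, and using `NODE_ENV` to detect production is an assumed convention.

```javascript
// Hypothetical guard: returns true when the rate limiter may be skipped.
// Throws if BENCHMARK_MODE is combined with a production environment.
function shouldBypassRateLimit(env) {
  const benchmarkMode = env.BENCHMARK_MODE === 'true';
  const production = env.NODE_ENV === 'production'; // assumed convention
  if (benchmarkMode && production) {
    throw new Error('BENCHMARK_MODE must not be enabled in production');
  }
  return benchmarkMode;
}

// Example: a local benchmark run with the flag set.
const bypass = shouldBypassRateLimit({ BENCHMARK_MODE: 'true', NODE_ENV: 'test' });
console.log(`rate limiter bypassed: ${bypass}`);
```

Failing loudly at startup is a deliberate choice here: silently ignoring the flag in production would hide a misconfiguration.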

github-actions Bot added a commit that referenced this pull request May 17, 2026
@deep-thought-tools deep-thought-tools marked this pull request as ready for review May 17, 2026 10:06