Add API benchmark suite #499

Open
justtryingtohelp46 wants to merge 1 commit into SecureBananaLabs:main from justtryingtohelp46:codex/api-benchmark-suite-30

justtryingtohelp46 commented May 20, 2026

Closes #30

/claim #30

Summary

  • Added a dependency-free benchmark runner under benchmarks/.
  • Covered every route mounted under /api/* plus the public /health probe: 21 endpoints total.
  • Added realistic synthetic JSON and multipart payloads for write routes.
  • Added generated benchmark tokens for local protected-route benchmarking and env-driven token support for deployed targets.
  • Added npm run benchmark and npm run benchmark:smoke.
  • Added .env.benchmark.example, benchmarks/thresholds.json, committed JSON/Markdown result output, and a GitHub Actions smoke benchmark.
  • Added short demo video at demos/api-benchmark-suite-demo.mp4.
  • Fixed the API test script glob so npm test can run the existing test file directly.
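
The PR description does not show the runner's code, so the following is only a minimal sketch of how a dependency-free runner might time one request using Node's built-in `node:http` module. The helper name `timeRequest` is hypothetical, and TTFB is approximated here as the moment the response headers arrive, which may differ from how the actual runner defines it.

```javascript
const http = require('node:http');
const { performance } = require('node:perf_hooks');

// Hypothetical helper (not taken from the PR): times a single GET request.
//   ttfbMs    = time from request start until the response headers arrive
//   latencyMs = time from request start until the full body has been read
function timeRequest(url) {
  return new Promise((resolve, reject) => {
    const start = performance.now();
    // agent: false uses a one-off connection, so no keep-alive socket lingers.
    const req = http.get(url, { agent: false }, (res) => {
      const ttfbMs = performance.now() - start;
      res.on('data', () => {}); // drain the body; its contents are not needed
      res.on('end', () => {
        resolve({
          status: res.statusCode,
          ttfbMs,
          latencyMs: performance.now() - start,
        });
      });
      res.on('error', reject);
    });
    req.on('error', reject);
  });
}
```

Against a locally running app, a call such as `timeRequest('http://127.0.0.1:<port>/health')` would resolve to an object with `status`, `ttfbMs`, and `latencyMs`. The port is whatever the local server listens on, which this description does not specify.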

Benchmark result summary

Committed result files:

  • benchmarks/results/latest.json
  • benchmarks/results/latest.md

Latest local full run:

  • Run: 2026-05-20T17-15-35-095Z
  • Mode: full
  • Target: local-loopback
  • Endpoints: 21
  • Requests per endpoint: 8
  • Total measured requests: 168
  • Error rate: 0%
  • Max p99 latency: 3.11 ms
  • Max p99 TTFB: 3.09 ms
  • Peak endpoint RPS: 4666.76
  • Thresholds: passed
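
As an illustration of how figures like these can be derived from raw samples, here is a small sketch that computes nearest-rank percentiles, RPS, and error rate, then checks them against limits. The function names and the threshold keys `maxErrorRate` and `maxP99Ms` are assumptions made for this example; the PR does not state which percentile method the runner uses or what schema benchmarks/thresholds.json follows.

```javascript
// Nearest-rank percentile over an ascending-sorted array of numbers.
function percentile(sorted, p) {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// samples: [{ ok: boolean, latencyMs: number }], wallTimeMs: elapsed wall time.
function summarize(samples, wallTimeMs) {
  const ok = samples.filter((s) => s.ok);
  const latencies = ok.map((s) => s.latencyMs).sort((a, b) => a - b);
  return {
    requests: samples.length,
    errorRate: samples.length ? (samples.length - ok.length) / samples.length : 0,
    rps: wallTimeMs > 0 ? (samples.length * 1000) / wallTimeMs : 0,
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    p99: percentile(latencies, 99),
  };
}

// Returns a list of human-readable failures; an empty list means "passed".
// The threshold keys are hypothetical, not the PR's actual schema.
function checkThresholds(summary, thresholds) {
  const failures = [];
  if (summary.errorRate > thresholds.maxErrorRate) {
    failures.push(`error rate ${summary.errorRate} > ${thresholds.maxErrorRate}`);
  }
  if (summary.p99 > thresholds.maxP99Ms) {
    failures.push(`p99 ${summary.p99} ms > ${thresholds.maxP99Ms} ms`);
  }
  return failures;
}
```

Under the nearest-rank method, eight samples per endpoint put p95 and p99 both at the slowest sample, so a run of this size behaves more like a smoke check than a tail-latency measurement.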

Benchmark environment

Hardware

  • CPU model & core count: Apple M1 Max, 10 cores
  • RAM total & available during benchmark: 32 GB total. Immediately after the benchmark, vm_stat showed 4,957 free pages and 28,828 speculative pages at 16 KiB/page (~528 MiB free or speculative); macOS was otherwise managing memory normally through compression and caching
  • Storage type: Apple internal SSD / APFS
  • Network interface: loopback (127.0.0.1)
  • Machine type: local workstation
  • OS & version: macOS 15.7.1 (24G231), arm64
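
The ~528 MiB figure in the RAM entry above follows from the vm_stat page counts. A short worked check, using only the numbers quoted there (the helper name `pagesToMiB` is just for illustration):

```javascript
const KIB_PER_MIB = 1024;

// Converts a page count to MiB for a given page size in KiB.
function pagesToMiB(pages, pageSizeKiB) {
  return (pages * pageSizeKiB) / KIB_PER_MIB;
}

const freePages = 4957;         // from the vm_stat output quoted above
const speculativePages = 28828; // from the vm_stat output quoted above
const freeOrSpeculativeMiB = pagesToMiB(freePages + speculativePages, 16);

console.log(freeOrSpeculativeMiB.toFixed(1)); // prints "527.9"
```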

Runtime

  • Node.js version: v26.0.0
  • npm version: 11.12.1
  • Resource limits: none; no Docker/cgroup cap
  • Other significant processes: normal local workstation background processes

AI-agent disclosure

  • Agent/tool name: OpenAI Codex desktop app
  • Underlying model/version: GPT-5-based Codex model; exact backend build not exposed
  • Inference provider: OpenAI
  • Orchestration framework: Codex desktop workflow, no LangChain/AutoGen/custom multi-agent framework
  • Execution mode: human-initiated; the AI agent implemented the changes and ran shell commands directly
  • Shell/tool access: yes
  • Internet access: yes
  • Benchmark commands run by: the AI agent directly
  • Known constraints/sandboxing: the benchmark ran on the local workstation against the local Express app over loopback; no container resource limits were applied; the full benchmark's default settings keep traffic below the repo's API rate limit.

Validation

  • npm test
  • npm run benchmark:smoke
  • npm run benchmark
  • ffprobe demos/api-benchmark-suite-demo.mp4
  • git diff HEAD --check

github-actions Bot added a commit that referenced this pull request May 20, 2026

Development

Successfully merging this pull request may close these issues.

Benchmark APIs with p50, p95, p99 latency, RPS, error rate and TTFB
