Add API benchmark suite #55

Open: vornicx wants to merge 1 commit into SecureBananaLabs:main from vornicx:codex/benchmark-api-suite

vornicx commented May 17, 2026

Closes #30.

Summary

  • Adds a dependency-free Node benchmark runner under /benchmarks.
  • Covers every /api route plus /health as a baseline, including auth-protected admin metrics and multipart uploads.
  • Reports p50, p95, p99 latency, sustained/peak RPS, error rate, status codes, and TTFB.
  • Adds configurable thresholds, .env.benchmark.example, persisted JSON/Markdown reports, and a CI smoke benchmark gate.
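
To make the reported metrics concrete, here is a minimal sketch of how p50/p95/p99 latency, sustained RPS, and error rate could be derived from raw samples. It is not taken from the PR's runner: the `percentile` and `summarize` helpers, the sample shape (`latencyMs`, `ok`), and the choice of the nearest-rank percentile method are all assumptions made for illustration. Peak RPS and TTFB are left out because they need per-request timestamps.

```javascript
// Hypothetical helper (not from the PR): nearest-rank percentile
// over an array that is already sorted in ascending order.
function percentile(sortedMs, p) {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

// Hypothetical sample shape: { latencyMs: number, ok: boolean }.
// durationSec is the wall-clock length of the measurement window.
function summarize(samples, durationSec) {
  const sorted = samples.map((s) => s.latencyMs).sort((a, b) => a - b);
  const errors = samples.filter((s) => !s.ok).length;
  return {
    requests: samples.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    sustainedRps: samples.length / durationSec,
    errorRate: samples.length === 0 ? 0 : errors / samples.length,
  };
}

// Usage with made-up samples.
const demoSummary = summarize(
  [
    { latencyMs: 12, ok: true },
    { latencyMs: 8, ok: true },
    { latencyMs: 30, ok: false },
  ],
  1.5
);
console.log(demoSummary);
```

Nearest-rank always returns a latency that was actually observed, which keeps small smoke runs easy to reason about; an interpolating method would be an equally reasonable choice.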

Validation

  • npm test
  • npm run benchmark:smoke
  • npm run benchmark

Latest full local run: 21 endpoints, 525 requests, 0% error rate, gate passed, worst p99 of 26.61 ms (GET /health).
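
The "gate passed" result implies that measured metrics are compared against configured thresholds. The following is only a sketch of that idea: the threshold names and values, the result shape, and the `evaluateGate` function are hypothetical, since the PR's actual configuration keys are not shown here. Only the 26.61 ms p99 for GET /health comes from the run reported above.

```javascript
// Hypothetical threshold names and values, chosen for illustration only.
const exampleThresholds = { p99Ms: 250, errorRate: 0.01 };

// Returns a list of human-readable violations.
// An empty list means every endpoint is within its thresholds.
function evaluateGate(results, thresholds) {
  const violations = [];
  for (const r of results) {
    if (r.p99Ms > thresholds.p99Ms) {
      violations.push(`${r.endpoint}: p99 ${r.p99Ms} ms exceeds ${thresholds.p99Ms} ms`);
    }
    if (r.errorRate > thresholds.errorRate) {
      violations.push(`${r.endpoint}: error rate ${r.errorRate} exceeds ${thresholds.errorRate}`);
    }
  }
  return violations;
}

const gateViolations = evaluateGate(
  [{ endpoint: "GET /health", p99Ms: 26.61, errorRate: 0 }],
  exampleThresholds
);

// In CI, a non-zero exit code is what would fail the job.
if (gateViolations.length > 0) {
  console.error(gateViolations.join("\n"));
  process.exitCode = 1;
}
```

Returning the violations rather than a boolean lets the same function feed both the CI exit code and a persisted report.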

Notes

Local benchmark mode can bypass API rate limiting so that results reflect endpoint behavior rather than limiter behavior. Staging and production runs can instead set BENCHMARK_TARGET_HOST and BENCHMARK_AUTH_TOKEN.
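
As a rough illustration of the local versus remote split described above, the sketch below resolves a benchmark target from environment variables. Only BENCHMARK_TARGET_HOST and BENCHMARK_AUTH_TOKEN are named in this PR; the `loadTarget` function, the returned field names, the fallback URL and port, and the use of a Bearer scheme are assumptions.

```javascript
// Hypothetical loader, not the PR's implementation.
function loadTarget(env = process.env) {
  const remoteHost = env.BENCHMARK_TARGET_HOST;
  const headers = {};
  if (env.BENCHMARK_AUTH_TOKEN) {
    // Assumption: the API accepts the token as a Bearer credential.
    headers.Authorization = `Bearer ${env.BENCHMARK_AUTH_TOKEN}`;
  }
  return {
    // Assumption: a placeholder loopback URL stands in for the local ephemeral server.
    baseUrl: remoteHost || "http://127.0.0.1:3000",
    headers,
    // Only a local run is allowed to bypass the rate limiter.
    bypassRateLimit: !remoteHost,
  };
}

console.log(loadTarget({}));
```

Tying the bypass flag to the absence of a remote host means a staging or production run cannot skip the limiter by accident.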

Benchmark Environment

Hardware

  • CPU model & core count: AMD Ryzen 5 5600H with Radeon Graphics, 12 logical cores as reported by Node/os.cpus()
  • RAM (total & available during benchmark): 7,522 MB total / 1,013 MB free in the latest full benchmark report
  • Storage type (SSD / NVMe / HDD): not reported from the benchmark runner; local Windows workstation storage
  • Network interface (Ethernet / WiFi / loopback): loopback (127.0.0.1, local ephemeral API server)
  • Machine type (local workstation / cloud VM / CI runner - include instance type if cloud): local Windows workstation
  • OS & version: Windows_NT 10.0.26200 x64

Runtime

  • Node.js version (or relevant runtime): v22.22.3 captured by the benchmark report
  • Any resource limits applied (Docker memory cap, cgroup limits, etc.): none known; not running inside Docker
  • Other significant processes running during benchmark (yes / no - if yes, describe): yes; normal local desktop/Codex app processes
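
Most of the hardware and runtime fields above can be read from Node's built-in `os` module and `process.version`. The sketch below shows one way to capture them; the `captureEnvironment` function and its field names are hypothetical and may differ from what the benchmark report actually records. Storage type is omitted because Node does not expose it, which matches the "not reported" note above.

```javascript
const os = require("node:os");

// Hypothetical snapshot of the machine the benchmark runs on.
function captureEnvironment() {
  const cpus = os.cpus(); // one entry per logical core
  const toMb = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    cpuModel: cpus.length > 0 ? cpus[0].model : "unknown",
    logicalCores: cpus.length,
    totalMemMb: toMb(os.totalmem()),
    freeMemMb: toMb(os.freemem()),
    os: `${os.type()} ${os.release()} ${os.arch()}`,
    node: process.version,
  };
}

console.log(captureEnvironment());
```

Free memory changes from moment to moment, so taking the snapshot at the start of the run and storing it with the results keeps the report reproducible.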

If submitted by or with an AI agent

  • Agent or tool name (e.g. Claude Code, Devin, Copilot Workspace, AutoGPT): OpenAI Codex
  • Underlying model and version (e.g. claude-sonnet-4-5, gpt-4o - if known): GPT-5-based Codex coding agent
  • Inference provider (e.g. Anthropic, OpenAI, Azure, self-hosted): OpenAI
  • Orchestration framework if any (e.g. LangChain, AutoGen, custom): Codex desktop app/tools
  • Execution mode (fully autonomous / human-supervised / human-initiated per step): human-initiated, agent-executed with approval prompts for network/GitHub actions
  • Did the agent have shell/tool access during execution (yes / no): yes
  • Did the agent have internet access during execution (yes / no): yes, through approved tool/network actions
  • Were benchmark commands run by the agent directly or handed off to the human to run: run directly by the agent
  • Any known agent constraints or sandboxing that may have affected execution: workspace-write sandbox with restricted network access that requires approval. A full npm install initially failed with a network reset during the Prisma postinstall step; dependencies were then repaired with npm install --ignore-scripts. The benchmark itself does not depend on Prisma engines.

Development

Successfully merging this pull request may close these issues.

Benchmark APIs with p50, p95, p99 latency, RPS, error rate and TTFB
