Skip to content

feat: cross-domain performance baseline harness#144

Merged
nadilas merged 2 commits intomainfrom
feat/implement-cross-domain-performance-baseline-measurement-kux6rtdynfmfbuctf2eck184
Mar 15, 2026
Merged

feat: cross-domain performance baseline harness#144
nadilas merged 2 commits intomainfrom
feat/implement-cross-domain-performance-baseline-measurement-kux6rtdynfmfbuctf2eck184

Conversation

@tervezo-ai
Copy link
Copy Markdown
Contributor

@tervezo-ai tervezo-ai bot commented Mar 15, 2026

Summary

Implements US-710: Cross-domain performance baseline (#94).

  • Benchmark harness (test-infra/bench-harness/bench-harness.sh) — sends 100 sequential + 100 concurrent HTTP requests per test app (T1–T5), collects per-request metrics via curl -w (latency, DNS, DB proxy time, response size), computes p50/p95/p99 percentile breakdowns, and outputs structured JSON to test-results/performance-baseline.json
  • Percentile library (test-infra/bench-harness/lib/percentile.sh) — nearest-rank percentile computation and stats aggregation using pure bash + awk
  • Regression detection (scripts/compare-perf.sh) — diffs two baseline JSON files, outputs formatted comparison table, exits with warning on >20% p95 regression (configurable via --threshold)
  • Quality gate: shim overhead < 10% vs direct connection, enforced per-app

Test plan

  • 63 TDD tests pass (55 unit + 8 integration) in bench-harness.test.sh
  • Percentile correctness: single value, odd/even counts, all-same values, empty array
  • JSON schema validation: metadata, per-app sequential/concurrent sections, quality gate fields
  • Quality gate logic: pass at <10%, fail at >=10% overhead
  • Regression detection: no-regression exit 0, >20% regression exit 1, missing file exit 2, --threshold override
  • Argument parsing: --apps, --sequential-count, --concurrent-count, --output, --dry-run, --help
  • End-to-end: --dry-run --generate-sample produces valid JSON

Closes #94

🤖 Generated with Claude Code

nadilas and others added 2 commits March 15, 2026 09:18
Implements US-710: benchmark harness that measures per-request HTTP metrics
(latency, DNS time, DB proxy time, response size) across WarpGrid test apps
T1–T5, with p50/p95/p99 percentile breakdowns.

New files:
- test-infra/bench-harness/bench-harness.sh — main harness (sequential + concurrent requests)
- test-infra/bench-harness/bench-harness.test.sh — 63 TDD tests
- test-infra/bench-harness/lib/percentile.sh — percentile computation library
- scripts/compare-perf.sh — regression detection (>20% p95 triggers warning)

Quality gate: shim overhead < 10% vs direct connection.
All arithmetic uses awk (no bc dependency).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 3 tests covering critical paths not previously exercised:
- Quality gate boundary at 9.9% (just below 10% threshold)
- Unknown CLI option rejection (exit code 1)
- compare-perf.sh invalid JSON detection (exit code 2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nadilas nadilas merged commit 4b08f94 into main Mar 15, 2026
2 of 11 checks passed
@nadilas nadilas deleted the feat/implement-cross-domain-performance-baseline-measurement-kux6rtdynfmfbuctf2eck184 branch March 15, 2026 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

US-710: Cross-domain performance baseline

1 participant