
Commit 9ee584e

Om-Rohilla and Om Rohilla authored
community(benchmarks): add component benchmark results from Intel i3-6006U Linux (#2065)
* community(benchmarks): add component benchmark results from Intel i3-6006U

Ran all four component benchmark scripts on my local Linux machine (Intel i3-6006U, 4 cores, 3.7 GB RAM, Python 3.14.5) and recorded the output. Added a Community-submitted benchmarks section to docs/benchmarks/BENCHMARKS.md with actual measured numbers.

Contributes to #787.

* docs(benchmarks): address review feedback on community submission

- Add blank lines between metadata fields so they render as separate paragraphs (Sourcery)
- Expand command column to full uv invocations, e.g. 'uv run python benchmarks/bench_orchestrator.py' (Sourcery/Copilot)
- Clarify Python 3.14.5 is a pre-release build; add note that stable 3.12/3.13 should give lower startup latency (Sourcery/CodeRabbit)
- Include full distro kernel string: 6.17.0-35-generic (CodeRabbit)

---------

Co-authored-by: Om Rohilla <om@contributor>
1 parent d6fd9ba commit 9ee584e

1 file changed

Lines changed: 32 additions & 0 deletions


docs/benchmarks/BENCHMARKS.md

@@ -129,3 +129,35 @@ uv run python benchmarks/bench_startup.py
 Benchmarks measure scheduling efficiency, not code quality. A fast wrong answer is still wrong. Bernstein's janitor and quality gates ensure the output is correct before it lands - which adds overhead but saves you from debugging agent mistakes.
 
 The real metric that matters: **how much of your day do you save?** If a single agent would take 4 hours on your backlog and Bernstein finishes it in 2.5 hours with verified output, you got back 1.5 hours. That compounds across every run.
+
+---
+
+## Community-submitted benchmarks
+
+Real runs from the community. Submit yours via [issue #787](https://github.com/sipyourdrink-ltd/bernstein/issues/787) or open a PR adding a row here.
+
+### Component benchmarks — Intel i3-6006U, Linux, Python 3.14 (pre-release)
+
+**Hardware:** Intel Core i3-6006U @ 2.00GHz, 4 cores, 3.7 GB RAM, Ubuntu (kernel 6.17.0-35-generic), SSD
+
+**Bernstein version:** v2.7.0
+
+**Python:** 3.14.5 (pre-release dev build, inside project venv — `uv venv`)
+
+**Submitted by:** [@Om-Rohilla](https://github.com/Om-Rohilla)
+
+| Benchmark | Result | Command |
+|-----------|--------|---------|
+| Orchestrator tick latency (100-task backlog) — avg | 5.89 ms | `uv run python benchmarks/bench_orchestrator.py` |
+| Orchestrator tick latency (100-task backlog) — max | 7.35 ms | `uv run python benchmarks/bench_orchestrator.py` |
+| Task store: creations | 251.83 tasks/sec | `uv run python benchmarks/bench_task_store.py` |
+| Task store: claims | 253.38 tasks/sec | `uv run python benchmarks/bench_task_store.py` |
+| Task store: completions | 162.19 tasks/sec | `uv run python benchmarks/bench_task_store.py` |
+| Task store: flush latency (buffer=1) | 3.32 ms | `uv run python benchmarks/bench_task_store.py` |
+| Quality gate verify_task — 1 signal | 0.038 ms | `uv run python benchmarks/bench_quality_gates.py` |
+| Quality gate verify_task — 10 signals | 0.231 ms | `uv run python benchmarks/bench_quality_gates.py` |
+| Quality gate verify_task — 50 signals | 1.134 ms | `uv run python benchmarks/bench_quality_gates.py` |
+| Quality gate verify_task — 100 signals | 1.915 ms | `uv run python benchmarks/bench_quality_gates.py` |
+| Startup latency (avg, 5 runs) | 3048.61 ms | `uv run python benchmarks/bench_startup.py` |
+
+**Notes:** Low-end consumer laptop (budget i3, 2016 generation, 3.7 GB RAM). Startup latency is higher than expected — likely cold import overhead from running a Python 3.14 pre-release build; expect lower on stable Python 3.12/3.13. Orchestrator tick and task store throughput look normal for this hardware class. Quality gate scaling is near-linear as the docs describe.
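
One way to sanity-check the near-linear scaling remark in the notes is to divide each `verify_task` latency by its signal count and compare the results. The sketch below is only a reader-side check: the four values are copied from the table, and the names `measurements_ms` and `per_signal_ms` are illustrative assumptions, not part of Bernstein's benchmark scripts.

```python
# Quality gate verify_task latencies copied from the table (signal count -> ms).
measurements_ms = {1: 0.038, 10: 0.231, 50: 1.134, 100: 1.915}


def per_signal_ms(measurements):
    """Return latency divided by signal count for each measurement."""
    return {n: ms / n for n, ms in measurements.items()}


costs = per_signal_ms(measurements_ms)
for n, cost in costs.items():
    # Report in microseconds so the small values are easier to compare.
    print(f"{n:>3} signals: {cost * 1000:.1f} us per signal")

# With perfectly linear scaling and no fixed overhead, every ratio would be equal.
# The 1-signal case is excluded here because fixed overhead dominates it.
multi = [cost for n, cost in costs.items() if n > 1]
spread = max(multi) / min(multi)
print(f"spread across 10-100 signals: {spread:.2f}x")
```

On these numbers the per-signal cost falls from about 38 µs at one signal to about 19 µs at 100 signals, and the 10 to 100 signal rows stay within roughly 1.2x of each other. That is broadly in line with the near-linear description, with the single-signal run carrying proportionally more fixed overhead.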
