perf: Go benchmark CLI with Vegeta — LLM, MCP, and Code Mode by christianromeni · Pull Request #32 · voidmind-io/voidllm

christianromeni · 2026-03-29T18:48:59Z

Summary

Replaces the bash benchmark script with a Go CLI. It drives load with Vegeta used as a Go library, runs against embedded mock servers, performs a warmup phase before measuring, and emits structured reports.

Baseline results (1000 RPS, 60s):

LLM Proxy: 149µs P50
MCP Proxy: 428µs P50
Code Mode: 3.36ms pure JS, 33µs warm eval

Scenarios

| Scenario | RPS | Duration | What it measures |
|---|---|---|---|
| `quick` | 500 | 15s | Sanity check — all paths |
| `sustained` | 5000 | 5 min | Memory leaks, GC pressure |
| `burst` | 200→10k→200 | 90s | Spike handling and recovery |
| `large-payload` | 100 | 60s | 100KB bodies |
| `mixed` | 500 total | 60s | 60% LLM + 30% MCP + 10% Code Mode |
| `endurance` | 500 | 30 min | Long-running stability |

Usage

go run ./scripts/bench quick
go run ./scripts/bench sustained --rps 2000 --duration 120s
go run ./scripts/bench quick --json > results.json

Test plan

go build ./scripts/bench/ compiles
quick scenario: 100% success, all 4 phases + Code Mode
JSON output includes all metrics
Warmup phase primes caches before measurement
Code review (2 rounds, all findings fixed)

…N reports

Replace bash script with a proper Go CLI using Vegeta as a library. Embedded mock servers (LLM + MCP), warmup phase, text + JSON reports.

Scenarios: quick, sustained, burst, large-payload, mixed, endurance.

Usage: go run ./scripts/bench [scenario] [--rps N] [--duration D] [--json]

codecov · 2026-03-29T18:54:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


…ion (voidmind-io#30)

* feat: Code Mode — WASM-sandboxed JS execution for MCP tool orchestration (voidmind-io#30)

Add Code Mode: LLMs write JavaScript that orchestrates multiple MCP tool calls in a single execution, reducing token usage by 30-80%. Scripts run in a QuickJS/WASM sandbox (fastschema/qjs + Wazero) embedded in VoidLLM.

Three new built-in MCP tools:
- list_servers: discover available MCP servers
- search_tools: find tools by keyword across servers
- execute_code: run JS with MCP tools as async functions

Runtime: pool of QJS runtimes (default 8), fresh runtime per execution, tool schema cache with lazy fetch, console capture in results.

Configurable via voidllm.yaml (disabled by default): mcp.code_mode.enabled, pool_size, memory_limit_mb, timeout, max_tool_calls

Also fixes session re-init double-check in MCP proxy.

* feat: Code Mode Phase 3 — blocklist, refresh, toggle, MCP server split (voidmind-io#31)

Split built-in MCP into two servers:
- /api/v1/mcp — Code Mode (list_servers, search_tools, execute_code)
- /api/v1/mcp/voidllm — Management (list_models, get_usage, etc.)
- /api/v1/mcp/:alias — External MCP server proxy

Per-tool blocklist for Code Mode:
- Migration 0005: mcp_tool_blocklist table
- CRUD API: GET/POST/DELETE /mcp-servers/:id/blocklist
- Defense in depth: filtered before sandbox injection + checked in ToolCaller
- Blocklist also applied to search_tools and list_servers tool counts

Tool refresh endpoint:
- POST /mcp-servers/:id/refresh-tools with 60s cooldown

Admin controls:
- code_mode_enabled toggle in API response and PATCH
- UI: Code Mode toggle column, expanded row with blocklist management

Shared MCP handler helper eliminates POST/SSE handler duplication.

* feat: Code Mode Phase 4 — Proxy pattern, SSE upstream, execution history, TypeScript types (voidmind-io#32)

JS Proxy pattern replaces static preamble generator:
- Single __callTool dispatch via ES6 Proxy interception
- Any tool name characters supported, preamble is O(1) in tool count

SSE upstream transport support:
- Auto-detect Streamable HTTP vs deprecated SSE protocol
- Lazy detection with sync.Once, origin validation on endpoints

Execution history:
- Migration 0006: code_mode_execution_id on mcp_tool_calls
- UUIDv7 per execute_code call groups all tool calls

Dynamic TypeScript types in execute_code description:
- GenerateToolTypeDefs converts cached tool schemas to TS declarations
- OnToolsListHook injects types at tools/list time

Bug fixes:
- MCP access control enforced in all Code Mode closures for global servers
- ToolCache fetcher resolves servers across all scopes
- Frontend blocklist DELETE matches backend query parameter API

* feat: persistent tool cache, SSE detection, tools list UI with block buttons (voidmind-io#33)

Persistent tool cache:
- Migration 0007: mcp_server_tools table for DB-backed tool schemas
- Startup loads from DB (zero HTTP calls, TypeScript types immediately available)
- 24h background refresh keeps schemas current
- Write-through on every fetch (RefreshServer, GetTools)
- DB entries marked stale on load so they refresh within maxAge

SSE transport detection:
- Servers using deprecated SSE protocol auto-deactivated at startup
- Clear error message: "server uses deprecated SSE transport"
- Test connection also detects and deactivates SSE servers

Tools list UI:
- GET /mcp-servers/:id/tools endpoint returns cached tools with blocked status
- Expanded row shows all tools with Block/Unblock buttons
- Block buttons work for YAML-sourced servers (blocklist is independent of source)
- Plug icon centered in sidebar and MCP servers page

Also fixes:
- ToolStore.Delete uses server ID (not alias) to avoid soft-delete lookup failure
- Corrupt JSON schemas skipped on DB load instead of serving empty schemas

* ci: lower patch coverage target to 50%

* refactor: extract Code Mode service from app.go + add 46 tests

Extract 3 closures (ExecuteCode, ListAccessibleMCPServers, SearchMCPTools) from app.go into codeModeService in code_mode.go. Shared accessibleServers helper eliminates duplicated server-listing + access-check logic.

New tests:
- code_mode_test.go: 21 tests (mock DB, real WASM executor)
- mcp_tool_blocklist_test.go: 11 tests (CRUD, conflicts, isolation)
- mcp_server_tools_test.go: 14 tests (upsert, replace, active filter)

app.go reduced by ~400 lines.

* test: add 13 handler tests for blocklist, refresh, and tools list endpoints

* docs: add Code Mode section to README with config, limitations, and IDE setup

* test: add 17 tests for CallMCPTool and dbToolStore

christianromeni added 4 commits March 29, 2026 17:10

perf: add Vegeta proxy overhead benchmark (baseline: ~400µs P50)

8f1336d

perf: add MCP proxy benchmark (LLM: 149µs P50, MCP: 428µs P50 at 1000…

39473cf

… RPS)

perf: unified benchmark — LLM proxy, MCP proxy, and Code Mode in one …

aea1320

…report

christianromeni merged commit 8d2943b into main Mar 29, 2026
6 checks passed

christianromeni deleted the perf/benchmark-and-optimize branch March 29, 2026 18:54
