perf: sonic JSON, MCP caches, transport pool, unified parse by christianromeni · Pull Request #33 · voidmind-io/voidllm

christianromeni · 2026-03-29T22:51:35Z

Summary

Performance optimizations for the MCP proxy hot path, reducing P50 overhead from 670µs to 427µs at 1000 RPS.

sonic JSON via internal/jsonx — drop-in wrapper with ConfigStd for encoding/json compatibility. All 17 production files migrated.
MCP server + access in-memory cache — eliminates DB queries from every proxy request. Scope-aware lookup (team > org > global) with 30s periodic refresh.
Scoped ToolCache by server ID — tool cache keyed by server ID instead of alias. BREAKING: MCP access is now closed-by-default at org level (orgs must explicitly grant access to global servers).
Persistent HTTP transport cache — reuses TCP connections and pre-decrypts auth tokens once at cache load. Stale transport list for safe eviction of in-flight requests.
Unified JSON-RPC parse — single parseMCPRequestMeta replaces 3 separate JSON parses per request.
Bench script fix — registers MCP server as org-scoped via Admin API for closed-by-default compatibility.
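
The unified parse item can be pictured roughly as follows. This is a minimal sketch using only the standard library, not the code from this PR: `requestMeta` and `parseRequestMeta` are hypothetical names standing in for `parseMCPRequestMeta`, and the choice of fields (method, id, tool name) is an assumption about what the proxy needs for routing and metrics.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// requestMeta collects the fields a proxy typically needs from a JSON-RPC
// request. Decoding them in one pass avoids re-parsing the body per field.
// The field selection is an assumption, not taken from the PR.
type requestMeta struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"` // kept raw: may be a string or a number
	Method  string          `json:"method"`
	Params  struct {
		Name string `json:"name"` // tool name, present for tools/call
	} `json:"params"`
}

// parseRequestMeta is a hypothetical stand-in for parseMCPRequestMeta.
// It expects params to be a JSON object (or absent); positional array
// params would fail to decode into the struct.
func parseRequestMeta(body []byte) (requestMeta, error) {
	var m requestMeta
	if err := json.Unmarshal(body, &m); err != nil {
		return requestMeta{}, fmt.Errorf("parse JSON-RPC request: %w", err)
	}
	if m.Method == "" {
		return requestMeta{}, errors.New("parse JSON-RPC request: missing method")
	}
	return m, nil
}

func main() {
	body := []byte(`{"jsonrpc":"2.0","id":7,"method":"tools/call","params":{"name":"search","arguments":{"q":"x"}}}`)
	m, err := parseRequestMeta(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Method, m.Params.Name, string(m.ID)) // prints "tools/call search 7"
}
```

Unknown fields such as `arguments` are skipped by the decoder, so the raw body can still be forwarded upstream unchanged.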

Breaking Changes

MCP access at org level is now closed-by-default. Organizations must explicitly grant access to global MCP servers via org_mcp_access. Org-scoped and team-scoped servers are unaffected.
ToolCache is keyed by server ID instead of alias.
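
To illustrate why keying by server ID matters, here is a small sketch of a tool cache in which two servers sharing an alias keep separate entries. The `Tool` and `ToolCache` types and the ID strings are illustrative assumptions, not the PR's actual definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// Tool is a simplified tool descriptor (hypothetical shape).
type Tool struct {
	Name string
}

// ToolCache stores tool lists per server ID. An alias may repeat across
// scopes, but a server ID is unique, so entries cannot collide.
type ToolCache struct {
	mu         sync.RWMutex
	byServerID map[string][]Tool
}

func NewToolCache() *ToolCache {
	return &ToolCache{byServerID: make(map[string][]Tool)}
}

// Set stores a copy of tools so later changes by the caller do not leak in.
func (c *ToolCache) Set(serverID string, tools []Tool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byServerID[serverID] = append([]Tool(nil), tools...)
}

// Get returns a copy of the cached tools for serverID, if present.
func (c *ToolCache) Get(serverID string) ([]Tool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	tools, ok := c.byServerID[serverID]
	return append([]Tool(nil), tools...), ok
}

func main() {
	cache := NewToolCache()
	// Both servers use the alias "github" in different scopes (assumed IDs).
	cache.Set("srv-global-1", []Tool{{Name: "list_repos"}})
	cache.Set("srv-team-9", []Tool{{Name: "create_issue"}})

	g, _ := cache.Get("srv-global-1")
	t, _ := cache.Get("srv-team-9")
	fmt.Println(g[0].Name, t[0].Name) // prints "list_repos create_issue"
}
```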

Benchmark Results (1000 RPS, 30s)

Metric	Before	After
LLM Proxy P50	371µs	442µs
MCP Proxy P50	670µs	427µs
Code Mode (pure JS)	3.36ms	3.35ms
Code Mode (warm eval)	33µs	32µs

Test plan

go test ./... -race -count=1 — all pass, no race conditions
go vet ./... — clean
go run ./scripts/bench quick --rps 1000 --duration 30s — 100% success all paths
Code review (3 rounds)
Security audit (3 rounds)

BREAKING CHANGES:

1. ToolCache keyed by server ID instead of alias. Prevents tool schema conflicts when multiple scopes use the same alias. ToolFetcher resolves servers by ID. Persistent tool store keys by server ID.
2. MCP access for global servers is now closed-by-default at the org level. Orgs must explicitly configure org_mcp_access to grant access to global MCP servers. Team and key levels remain open (inherit from org). Built-in VoidLLM server is unaffected (always accessible).
3. Removed GetMCPServerByAliasAny (no longer used).

Tests: 30 new tests for CheckMCPAccess, MCPAccessCache, and accessibleServers.
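
Since the same alias can exist in several scopes, an alias has to be resolved to one concrete server before anything is looked up by ID. The sketch below shows one way to do a scope-aware lookup with team > org > global precedence over an atomically swapped snapshot. All type and method names are hypothetical, and the periodic refresh is reduced to an explicit `Refresh` call that a ticker could drive.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Server is a simplified cache row. Empty OrgID and TeamID mean global scope.
type Server struct {
	ID, Alias     string
	OrgID, TeamID string
}

type scopeKey struct{ scopeID, alias string }

// snapshot is immutable once published, so readers need no locking.
type snapshot struct {
	team, org map[scopeKey]Server
	global    map[string]Server
}

// ServerCache serves lookups from memory; Refresh swaps in a new snapshot.
type ServerCache struct {
	snap atomic.Pointer[snapshot]
}

// Refresh rebuilds the index from a full server list (for example, the
// result of one DB query) and publishes it atomically.
func (c *ServerCache) Refresh(servers []Server) {
	s := &snapshot{
		team:   make(map[scopeKey]Server),
		org:    make(map[scopeKey]Server),
		global: make(map[string]Server),
	}
	for _, srv := range servers {
		switch {
		case srv.TeamID != "":
			s.team[scopeKey{srv.TeamID, srv.Alias}] = srv
		case srv.OrgID != "":
			s.org[scopeKey{srv.OrgID, srv.Alias}] = srv
		default:
			s.global[srv.Alias] = srv
		}
	}
	c.snap.Store(s)
}

// Resolve returns the most specific server for alias: team, then org,
// then global.
func (c *ServerCache) Resolve(alias, orgID, teamID string) (Server, bool) {
	s := c.snap.Load()
	if s == nil {
		return Server{}, false // cache not loaded yet
	}
	if srv, ok := s.team[scopeKey{teamID, alias}]; ok && teamID != "" {
		return srv, true
	}
	if srv, ok := s.org[scopeKey{orgID, alias}]; ok && orgID != "" {
		return srv, true
	}
	srv, ok := s.global[alias]
	return srv, ok
}

func main() {
	var cache ServerCache
	cache.Refresh([]Server{
		{ID: "srv-global", Alias: "github"},
		{ID: "srv-org", Alias: "github", OrgID: "org-1"},
		{ID: "srv-team", Alias: "github", OrgID: "org-1", TeamID: "team-1"},
	})
	a, _ := cache.Resolve("github", "org-1", "team-1")
	b, _ := cache.Resolve("github", "org-1", "team-2")
	c, _ := cache.Resolve("github", "org-2", "")
	fmt.Println(a.ID, b.ID, c.ID) // prints "srv-team srv-org srv-global"
}
```

The resolved `ID` is then the key for anything cached per server, which is what keeps same-alias servers from sharing tool schemas.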

…refresh
- Deferred transport closure via stale list for safe eviction
- Consolidated refreshMCPCaches into single DB query for both caches
- Removed unused field from resolved server struct
- Added error logging for decrypt failures in cache load
- Removed dead code guard in metrics path

codecov · 2026-03-29T22:56:36Z

Codecov Report

❌ Patch coverage is 41.07366% with 472 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/api/admin/mcp_access.go	48.38%	72 Missing and 24 partials ⚠️
internal/app/app.go	0.00%	91 Missing ⚠️
internal/proxy/mcp_transport_cache.go	0.00%	80 Missing ⚠️
internal/api/admin/mcp_proxy.go	43.88%	61 Missing and 17 partials ⚠️
internal/proxy/mcp_server_cache.go	0.00%	45 Missing ⚠️
internal/db/mcp_servers.go	0.00%	16 Missing ⚠️
internal/api/admin/handler.go	31.57%	13 Missing ⚠️
internal/api/admin/mcp_servers.go	35.00%	12 Missing and 1 partial ⚠️
internal/mcp/tool_cache.go	72.41%	8 Missing ⚠️
internal/app/code_mode.go	77.77%	3 Missing and 1 partial ⚠️
... and 13 more

Closed-by-default MCP access requires a way for org admins to grant access to global MCP servers. Adds:
- 6 API endpoints for org/team/key MCP access (GET/PUT)
- Lightweight available-servers endpoint for org admins
- Org MCP Access tab in Organization settings
- Team MCP Access tab in Team settings
- System admin bypass for MCP access checks
- 14 integration tests
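
The access rules described in this PR can be summarized in a small decision function. This is a sketch under assumptions, not the PR's `CheckMCPAccess`: the `Caller` and `Server` shapes are invented for illustration, "inherit from org" is modeled as an unset (nil) allowlist, and scoped servers are assumed to belong to the caller's own org or team.

```go
package main

import "fmt"

// Server describes the target MCP server (hypothetical shape).
type Server struct {
	ID      string
	Global  bool // false for org-scoped and team-scoped servers
	BuiltIn bool // the built-in VoidLLM server
}

// Caller carries the identity and configured grants for one request.
// A nil TeamAccess or KeyAccess means "not configured, inherit from org".
type Caller struct {
	SystemAdmin bool
	OrgAccess   map[string]bool // global server IDs granted to the org
	TeamAccess  map[string]bool
	KeyAccess   map[string]bool
}

// canAccess applies the rules in order: bypasses first, then the
// closed-by-default org check, then optional team and key restrictions.
func canAccess(c Caller, s Server) bool {
	if c.SystemAdmin || s.BuiltIn {
		return true
	}
	if !s.Global {
		// Assumption: scoped servers were already matched to the
		// caller's own org or team by the lookup step.
		return true
	}
	if !c.OrgAccess[s.ID] {
		return false // closed by default: no explicit org grant
	}
	if c.TeamAccess != nil && !c.TeamAccess[s.ID] {
		return false
	}
	if c.KeyAccess != nil && !c.KeyAccess[s.ID] {
		return false
	}
	return true
}

func main() {
	global := Server{ID: "srv-global", Global: true}
	member := Caller{}
	granted := Caller{OrgAccess: map[string]bool{"srv-global": true}}
	admin := Caller{SystemAdmin: true}

	fmt.Println(canAccess(member, global))  // prints false
	fmt.Println(canAccess(granted, global)) // prints true
	fmt.Println(canAccess(admin, global))   // prints true
}
```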

…ion (voidmind-io#30)

* feat: Code Mode — WASM-sandboxed JS execution for MCP tool orchestration (voidmind-io#30)

Add Code Mode: LLMs write JavaScript that orchestrates multiple MCP tool calls in a single execution, reducing token usage by 30-80%. Scripts run in a QuickJS/WASM sandbox (fastschema/qjs + Wazero) embedded in VoidLLM.

Three new built-in MCP tools:
- list_servers: discover available MCP servers
- search_tools: find tools by keyword across servers
- execute_code: run JS with MCP tools as async functions

Runtime: pool of QJS runtimes (default 8), fresh runtime per execution, tool schema cache with lazy fetch, console capture in results.

Configurable via voidllm.yaml (disabled by default): mcp.code_mode.enabled, pool_size, memory_limit_mb, timeout, max_tool_calls

Also fixes session re-init double-check in MCP proxy.

* feat: Code Mode Phase 3 — blocklist, refresh, toggle, MCP server split (voidmind-io#31)

Split built-in MCP into two servers:
- /api/v1/mcp — Code Mode (list_servers, search_tools, execute_code)
- /api/v1/mcp/voidllm — Management (list_models, get_usage, etc.)
- /api/v1/mcp/:alias — External MCP server proxy

Per-tool blocklist for Code Mode:
- Migration 0005: mcp_tool_blocklist table
- CRUD API: GET/POST/DELETE /mcp-servers/:id/blocklist
- Defense in depth: filtered before sandbox injection + checked in ToolCaller
- Blocklist also applied to search_tools and list_servers tool counts

Tool refresh endpoint:
- POST /mcp-servers/:id/refresh-tools with 60s cooldown

Admin controls:
- code_mode_enabled toggle in API response and PATCH
- UI: Code Mode toggle column, expanded row with blocklist management

Shared MCP handler helper eliminates POST/SSE handler duplication.

* feat: Code Mode Phase 4 — Proxy pattern, SSE upstream, execution history, TypeScript types (voidmind-io#32)

JS Proxy pattern replaces static preamble generator:
- Single __callTool dispatch via ES6 Proxy interception
- Any tool name characters supported, preamble is O(1) in tool count

SSE upstream transport support:
- Auto-detect Streamable HTTP vs deprecated SSE protocol
- Lazy detection with sync.Once, origin validation on endpoints

Execution history:
- Migration 0006: code_mode_execution_id on mcp_tool_calls
- UUIDv7 per execute_code call groups all tool calls

Dynamic TypeScript types in execute_code description:
- GenerateToolTypeDefs converts cached tool schemas to TS declarations
- OnToolsListHook injects types at tools/list time

Bug fixes:
- MCP access control enforced in all Code Mode closures for global servers
- ToolCache fetcher resolves servers across all scopes
- Frontend blocklist DELETE matches backend query parameter API

* feat: persistent tool cache, SSE detection, tools list UI with block buttons (voidmind-io#33)

Persistent tool cache:
- Migration 0007: mcp_server_tools table for DB-backed tool schemas
- Startup loads from DB (zero HTTP calls, TypeScript types immediately available)
- 24h background refresh keeps schemas current
- Write-through on every fetch (RefreshServer, GetTools)
- DB entries marked stale on load so they refresh within maxAge

SSE transport detection:
- Servers using deprecated SSE protocol auto-deactivated at startup
- Clear error message: "server uses deprecated SSE transport"
- Test connection also detects and deactivates SSE servers

Tools list UI:
- GET /mcp-servers/:id/tools endpoint returns cached tools with blocked status
- Expanded row shows all tools with Block/Unblock buttons
- Block buttons work for YAML-sourced servers (blocklist is independent of source)
- Plug icon centered in sidebar and MCP servers page

Also fixes:
- ToolStore.Delete uses server ID (not alias) to avoid soft-delete lookup failure
- Corrupt JSON schemas skipped on DB load instead of serving empty schemas

* ci: lower patch coverage target to 50%

* refactor: extract Code Mode service from app.go + add 46 tests

Extract 3 closures (ExecuteCode, ListAccessibleMCPServers, SearchMCPTools) from app.go into codeModeService in code_mode.go. Shared accessibleServers helper eliminates duplicated server-listing + access-check logic.

New tests:
- code_mode_test.go: 21 tests (mock DB, real WASM executor)
- mcp_tool_blocklist_test.go: 11 tests (CRUD, conflicts, isolation)
- mcp_server_tools_test.go: 14 tests (upsert, replace, active filter)

app.go reduced by ~400 lines.

* test: add 13 handler tests for blocklist, refresh, and tools list endpoints

* docs: add Code Mode section to README with config, limitations, and IDE setup

* test: add 17 tests for CallMCPTool and dbToolStore

christianromeni added 6 commits March 29, 2026 21:45

perf: migrate encoding/json to sonic via internal/jsonx wrapper (ConfigStd)

893567f

perf: MCP server + access in-memory cache — eliminate DB queries from hot path

b983ac0

perf: persistent MCP transport cache + decrypted token cache + unified JSON-RPC parse

2d714b1

fix: bench script MCP access for closed-by-default orgs

3955dae

Register bench MCP server as org-scoped via Admin API instead of relying on global YAML server. Updates MCP proxy target alias to match the org-scoped registration.

christianromeni added 2 commits March 30, 2026 02:39

fix: update accessibleServers test for system admin bypass

69469bc

Test was asserting 0 servers for system_admin, but the bypass now grants unrestricted access. Split into member (denied) and system_admin (allowed) assertions.

christianromeni merged commit 3889570 into main Mar 30, 2026
5 of 6 checks passed

christianromeni deleted the perf/sonic-json-optimization branch March 30, 2026 00:49

christianromeni merged 8 commits into main from perf/sonic-json-optimization
