refactor(module)/codemode improvements combined#212

Draft
Mat4m0 wants to merge 15 commits into nuxt-modules:main from Mat4m0:refactor(module)/codemode-improvements-combined

@Mat4m0 Mat4m0 commented Apr 7, 2026

Code Mode Integration — Combined Changeset

This PR combines the earlier PRs (#210, #206, #207, #208) into one branch, with a few additional improvements. The goal is to make review easier and give us a single discussion baseline; it is not intended to be merge-ready yet.

This document describes every change in this branch, why it was made, and how it was validated.


Table of Contents

  1. Per-Session RPC Architecture
  2. AsyncLocalStorage Context Preservation
  3. Resource Limits & Abuse Prevention
  4. structuredContent Dispatch
  5. Tool Error Surfacing in Sandbox
  6. Typed Output Schemas
  7. Type Generation Improvements
  8. Description Templates & Progressive Mode
  9. Structured Envelope Responses
  10. Handler Reuse Between Registration and Code Mode
  11. Hardening Pass (PR Review Fixes)
  12. Annotation Surfacing & Grouped Progressive Search
  13. Test Coverage
  14. Documentation Updates

1. Per-Session RPC Architecture

Problem: The original executor used a singleton RPC server and a singleton V8 runtime. All concurrent execute() calls shared the same fns map, onReturn callback, and RPC token. This meant:

  • Concurrent executions could overwrite each other's dispatch functions
  • A sandbox from execution A could call tools meant for execution B
  • The onReturn callback was a single slot — racing executions would lose return values

Solution: Each execute() call now creates its own isolated RPC session:

  • Fresh createServer() on port 0 (OS-assigned ephemeral port)
  • Unique execId (8 random bytes) and token (32 random bytes) per execution
  • ExecutionContext struct holds per-execution state: fns, onReturn, deadlineMs, rpcCallCount
  • The fns map is frozen with Object.freeze() so sandbox code can't mutate it
  • The sandbox sends execId in every RPC call; the server validates it matches the session
  • ActiveSession tracking with cleanup on completion, error, or timeout

Files: executor.ts (complete rewrite of RPC layer)
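
The per-execution state described above can be pictured with a minimal sketch. The field names follow the list in this section, but the hex encoding, the status codes (taken from the test list later in this description), and the constant-time token comparison are assumptions made for illustration, not the code in executor.ts. The onReturn callback and the HTTP server itself are left out.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

type ToolFn = (input: unknown) => Promise<unknown>;

// Hypothetical shape; field names follow the description above.
interface ExecutionContext {
  execId: string;
  token: string;
  fns: Readonly<Record<string, ToolFn>>;
  rpcCallCount: number;
  deadlineMs: number;
}

function createExecutionContext(
  fns: Record<string, ToolFn>,
  wallTimeLimitMs: number,
): ExecutionContext {
  return {
    execId: randomBytes(8).toString("hex"), // 8 random bytes per execution
    token: randomBytes(32).toString("hex"), // 32 random bytes per execution
    fns: Object.freeze({ ...fns }), // frozen copy that cannot be mutated afterwards
    rpcCallCount: 0,
    deadlineMs: Date.now() + wallTimeLimitMs,
  };
}

// Constant-time comparison (an assumption of this sketch, not confirmed by the PR).
function safeEqual(a: string, b: string): boolean {
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  return bufA.length === bufB.length && timingSafeEqual(bufA, bufB);
}

// Returns an HTTP-style status code for an incoming RPC call.
function validateRpcCall(ctx: ExecutionContext, token: string, execId: string): number {
  if (!safeEqual(ctx.token, token)) return 403; // wrong token
  if (ctx.execId !== execId) return 400; // call belongs to another execution
  return 200;
}
```

Because every execution owns its own context, a sandbox holding execution A's token and execId cannot reach execution B's fns map.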


2. AsyncLocalStorage Context Preservation

Problem: Nitro uses AsyncLocalStorage to carry per-request context (H3 event, auth, etc.). When Code Mode dispatches a tool call via HTTP RPC, the async context is lost — the RPC handler runs in the HTTP server's context, not the original request's.

Solution: On entry to execute(), we capture the current async context via AsyncLocalStorage.snapshot(). Every tool dispatch is wrapped through restoreContext(), which re-enters the original request's async context before calling the tool handler. This means tools called from sandbox code have access to the same useEvent(), auth state, etc. as if called directly.

When snapshot() is unavailable (Node.js < 18.16.0), execution degrades gracefully: a passthrough function is used instead, and a one-time warning is logged via console.warn. Tools still work but lose access to request context (useEvent(), auth, etc.) — matching the behavior of main before this branch. This preserves backward compatibility with Node.js 18.0+.

Files: executor.ts
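
A minimal sketch of the capture-and-restore pattern with the passthrough fallback, assuming a helper named captureContext (hypothetical; the branch calls its wrapper restoreContext) and a demo store standing in for Nitro's request context:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type RunInContext = <R>(fn: () => R) => R;

let warnedAboutSnapshot = false;

// Capture the caller's async context, or fall back to a passthrough on older Node.js.
function captureContext(): RunInContext {
  // snapshot() is looked up dynamically so the code also loads where it does not exist.
  const als = AsyncLocalStorage as unknown as {
    snapshot?: () => <R>(fn: () => R) => R;
  };
  if (typeof als.snapshot === "function") {
    const runInSnapshot = als.snapshot();
    return (fn) => runInSnapshot(fn);
  }
  if (!warnedAboutSnapshot) {
    warnedAboutSnapshot = true; // warn only once per process
    console.warn("[sketch] AsyncLocalStorage.snapshot() unavailable; request context is not restored");
  }
  return (fn) => fn();
}

// Demo store standing in for the per-request context.
const requestStore = new AsyncLocalStorage<{ user: string }>();

// Captured while the "request" is active, as on entry to execute().
const restoreContext = requestStore.run({ user: "alice" }, () => captureContext());

// Later, outside the request (for example inside the RPC server's handler):
const outsideUser = requestStore.getStore()?.user;
const restoredUser = restoreContext(() => requestStore.getStore()?.user);
```

Outside the request the store is empty; inside restoreContext the original store is visible again whenever snapshot() is available.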


3. Resource Limits & Abuse Prevention

Problem: The original executor had no protection against:

  • Oversized RPC request bodies (memory exhaustion)
  • Infinite tool call loops
  • Runaway wall-clock time (sandbox waiting forever on slow tools)
  • Oversized tool responses

Solution: Added configurable limits with sensible defaults:

| Limit | Default | Config Key | Enforcement |
| --- | --- | --- | --- |
| RPC request body | 1 MB | maxRequestBodyBytes | HTTP 413, streaming byte count |
| Tool calls per execution | 200 | maxToolCalls | HTTP 429, counter on ExecutionContext |
| Wall-clock timeout | 60s | wallTimeLimitMs | setTimeout → runtime.dispose() |
| Tool response size | 1 MB | maxToolResponseSize | Truncation (same strategy as maxResultSize) |

The wall-clock timeout is separate from the V8 CPU time limit. CPU time only counts isolate execution; wall time caps total elapsed time including host-side tool calls. On timeout, the runtime is forcefully disposed and the sandbox gets a clear error.
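
The body and call-count limits can be sketched as two small counters. The option names and defaults come from the table; the BodyBudget class, the countToolCall helper, and the reading of 1 MB as 1024 * 1024 bytes are assumptions made for illustration.

```typescript
// Option names and defaults follow the table above.
interface CodeModeLimits {
  maxRequestBodyBytes: number;
  maxToolCalls: number;
  wallTimeLimitMs: number;
  maxToolResponseSize: number;
}

const DEFAULT_LIMITS: CodeModeLimits = {
  maxRequestBodyBytes: 1024 * 1024, // assumed meaning of "1 MB"
  maxToolCalls: 200,
  wallTimeLimitMs: 60000,
  maxToolResponseSize: 1024 * 1024,
};

// Streaming byte counter: feed it each chunk's length as the chunk arrives.
class BodyBudget {
  private received = 0;
  private readonly maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Returns false once the running total exceeds the limit (respond with HTTP 413).
  accept(chunkLength: number): boolean {
    this.received += chunkLength;
    return this.received <= this.maxBytes;
  }
}

// Per-execution call counter: 429 once the quota is exhausted, 200 otherwise.
function countToolCall(state: { rpcCallCount: number }, limits: CodeModeLimits): number {
  state.rpcCallCount += 1;
  return state.rpcCallCount > limits.maxToolCalls ? 429 : 200;
}
```

Counting bytes per chunk lets the server reject an oversized body without buffering all of it first.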

Error messages returned to sandbox/client are sanitized: file paths are replaced with [path], stack traces are stripped, and messages are truncated to 500 chars. Full errors are logged server-side with console.error('[nuxt-mcp-toolkit] ...').
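
A rough sketch of that log-then-sanitize order. The path pattern below is an illustrative guess and only catches absolute paths with at least two segments; the regexes actually used in the branch are not shown in this description.

```typescript
const MAX_ERROR_LENGTH = 500;

// Illustrative pattern: absolute Unix or Windows paths with two or more segments.
const PATH_PATTERN = /(?:[A-Za-z]:\\|\/)[\w.-]+(?:[\\/][\w.-]+)+/g;

function sanitizeErrorMessage(error: unknown): string {
  const raw = error instanceof Error ? error.message : String(error);
  // Drop everything from the first stack-frame line ("\n    at ...") onward.
  const withoutStack = raw.split(/\n\s*at /)[0] ?? raw;
  const withoutPaths = withoutStack.replace(PATH_PATTERN, "[path]");
  return withoutPaths.slice(0, MAX_ERROR_LENGTH);
}

// Log the raw error server-side first, then hand only the sanitized text to the client.
function reportError(error: unknown): string {
  console.error("[nuxt-mcp-toolkit] Execution error:", error);
  return sanitizeErrorMessage(error);
}
```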

Files: executor.ts, types.ts (new options), 8.code-mode.md (docs)


4. structuredContent Dispatch

Problem: When a tool handler returned structuredContent (the MCP spec's typed data channel), Code Mode ignored it and fell through to extracting text from content[].text. This silently lost IDs, booleans, nested objects — breaking operation chaining where a returned ID is needed for follow-up calls.

Solution: normalizeDispatchResult() now checks structuredContent first:

  1. If rawResult.structuredContent != null → return it directly (preserves typed data)
  2. If rawResult.isError → convert to CodeModeToolError sentinel (see section 5, Tool Error Surfacing in Sandbox)
  3. If rawResult.content has text items → return joined text (no JSON.parse — intentional, avoids ambiguity)
  4. Plain objects/primitives pass through unchanged

The function also uses isCallToolResult() to distinguish MCP results from plain objects returned by handlers. This duck-type check was tightened: isError alone no longer matches unless it's a boolean (prevents false positives from objects that happen to have an isError property).

Files: index.ts (normalizeDispatchResult, extractTextContent), results.ts (isCallToolResult)
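
The dispatch rules can be sketched as follows. This is a simplified stand-in, not the code in index.ts: the newline join for text items is an assumption, and the sketch handles isError before the structuredContent shortcut so that an error result keeps its details. The list above gives the opposite order, so treat the ordering here as unverified.

```typescript
interface ContentItem {
  type: string;
  text?: string;
}

interface CallToolResultLike {
  content?: ContentItem[];
  structuredContent?: unknown;
  isError?: boolean;
}

// Duck-type check: isError only counts when it is a real boolean.
function isCallToolResult(value: unknown): value is CallToolResultLike {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    Array.isArray(v["content"]) ||
    v["structuredContent"] != null ||
    typeof v["isError"] === "boolean"
  );
}

function extractTextContent(result: CallToolResultLike): string {
  return (result.content ?? [])
    .filter((item) => item.type === "text" && typeof item.text === "string")
    .map((item) => item.text)
    .join("\n"); // join separator is an assumption
}

function normalizeDispatchResult(toolName: string, raw: unknown): unknown {
  if (!isCallToolResult(raw)) return raw; // plain objects and primitives pass through
  if (raw.isError === true) {
    return {
      __mcp_toolkit_error__: true,
      message: extractTextContent(raw) || "Tool error",
      tool: toolName,
      details: raw.structuredContent,
    };
  }
  if (raw.structuredContent != null) return raw.structuredContent; // typed data wins over text
  return extractTextContent(raw); // no JSON.parse on purpose
}
```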


5. Tool Error Surfacing in Sandbox

Problem: Tool errors (isError: true results or thrown exceptions) were returned as plain strings to sandbox code, making them indistinguishable from successful results. try/catch never fired, and structured error details from structuredContent were lost.

Solution: Tool errors are now wrapped as a sentinel object with a namespaced key:

{
  __mcp_toolkit_error__: true,
  message: "Permission denied",
  tool: "delete_item",
  details: { /* structuredContent if available */ }
}

The sandbox proxy code detects this sentinel and throws a structured Error with .tool, .isToolError, and .details properties. This lets sandbox code use try/catch to handle tool errors with full context.

The sentinel key is namespaced (__mcp_toolkit_error__) rather than generic (__toolError) to avoid collisions with legitimate tool return values. Similarly, the stderr error prefix is __MCP_EXEC_ERR__ rather than __ERROR__.

Files: index.ts (toToolError, CodeModeToolError), executor.ts (proxy boilerplate)
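
On the sandbox side, the detection step might look like the sketch below. The helper name unwrapToolResult and the ToolCallError interface are hypothetical; only the sentinel key and the three error properties come from this description.

```typescript
const SENTINEL_KEY = "__mcp_toolkit_error__";

interface ToolCallError extends Error {
  tool: string;
  isToolError: true;
  details?: unknown;
}

// Pass successful values through; turn a sentinel into a thrown, structured Error.
function unwrapToolResult(value: unknown): unknown {
  if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    if (record[SENTINEL_KEY] === true) {
      const error: ToolCallError = Object.assign(new Error(String(record["message"])), {
        tool: String(record["tool"]),
        isToolError: true as const,
        details: record["details"],
      });
      throw error;
    }
  }
  return value;
}
```

Sandbox code can then wrap a tool call in try/catch and branch on error.isToolError, error.tool, and error.details.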


6. Typed Output Schemas

Problem: All Code Mode tools returned Promise<unknown> regardless of whether an outputSchema was defined. The LLM had no type information about what to expect back.

Solution: When a tool definition includes outputSchema, the type generator now emits typed return values:

  • Small schemas (<=3 primitive fields) are inlined: Promise<{ id: string; ok: boolean }>
  • Larger schemas get named interfaces: Promise<GetReportOutput>

This reuses the same generateSchemaTypeInfo() helper used for input schemas, extracted from the duplicated inline logic.

Files: types.ts (generateSchemaTypeInfo, generateToolTypeInfo output handling)
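
The inline-versus-named decision can be sketched with a flattened field map. The FieldKind type, the returnTypeFor helper, and the reading of the rule as "at most three fields, all primitive" are assumptions for illustration; the real generator works on the tool's schema objects.

```typescript
// Hypothetical flattened schema: field name mapped to its kind.
type FieldKind = "string" | "number" | "boolean" | "object";

function toPascalCase(name: string): string {
  return name
    .split(/[^A-Za-z0-9]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

function returnTypeFor(toolName: string, fields: Record<string, FieldKind>): string {
  const entries = Object.entries(fields);
  const allPrimitive = entries.every(([, kind]) => kind !== "object");
  if (entries.length <= 3 && allPrimitive) {
    // Small schema: inline the object type in the signature.
    const body = entries.map(([key, kind]) => `${key}: ${kind}`).join("; ");
    return `Promise<{ ${body} }>`;
  }
  // Larger or nested schema: refer to a named interface instead.
  return `Promise<${toPascalCase(toolName)}Output>`;
}
```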


7. Type Generation Improvements

Several improvements to the TypeScript type generation for sandbox code:

  • Property key escaping: Keys with special characters or reserved words are now properly quoted using JSON.stringify() via formatTsPropertyKey(). Before: my-key?: string (invalid TS). After: "my-key"?: string.
  • Enum value escaping: Enum strings containing quotes or backslashes are now escaped with JSON.stringify(v) instead of template literals. Before: "he said "hello"" (broken). After: "he said \"hello\"".
  • Name collision detection: buildToolNameMap() now warns via console.warn if two tools sanitize to the same name (e.g., get-user and get_user both become get_user), keeping last-wins behavior for backward compatibility. The warning message includes a notice that this will become an error in a future version.

Files: types.ts
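
A compact sketch of the three behaviors. The function names mirror the ones mentioned above, but the bodies, the sanitization rule, and the injectable warn callback are assumptions; reserved-word handling is left out.

```typescript
const SAFE_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Quote any key that is not a plain identifier.
function formatTsPropertyKey(key: string): string {
  return SAFE_IDENTIFIER.test(key) ? key : JSON.stringify(key);
}

// JSON.stringify escapes quotes and backslashes inside each enum member.
function formatEnumType(values: string[]): string {
  return values.map((value) => JSON.stringify(value)).join(" | ");
}

// Assumed rule: replace every character that is not valid in an identifier with "_".
function sanitizeToolName(name: string): string {
  return name.replace(/[^A-Za-z0-9_$]/g, "_");
}

// Maps sanitized name to original name; warns on collisions and keeps the last tool.
function buildToolNameMap(
  names: string[],
  warn: (message: string) => void = console.warn,
): Map<string, string> {
  const map = new Map<string, string>();
  for (const name of names) {
    const sanitized = sanitizeToolName(name);
    const previous = map.get(sanitized);
    if (previous !== undefined && previous !== name) {
      warn(`Tools "${previous}" and "${name}" both map to "${sanitized}"; keeping "${name}"`);
    }
    map.set(sanitized, name); // last wins
  }
  return map;
}
```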


8. Description Templates & Progressive Mode

Problem: The code tool descriptions were overly verbose, with multi-paragraph instructions about combining sequential/parallel/conditional logic. This wasted context-window tokens on every request that included the tool description.

Solution:

  • Simplified templates to essential information: what the tool does, how to write code, available types
  • Added {{example}} placeholder support alongside existing {{types}} and {{count}}
  • Example blocks are automatically omitted when there are >10 tools (they become noise at that scale)
  • Collapsed excessive newlines in template output
  • Progressive mode gets its own concise example showing the search→code workflow

Files: index.ts (templates, applyDescriptionTemplate), types.ts (CodeModeOptions.description docs)
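
Placeholder substitution with example omission and newline collapsing could look roughly like this. The signature and the DescriptionVars shape are assumptions, not the actual applyDescriptionTemplate in index.ts.

```typescript
interface DescriptionVars {
  types: string;
  count: number;
  example: string;
}

const MAX_TOOLS_WITH_EXAMPLE = 10;

function applyDescriptionTemplate(template: string, vars: DescriptionVars): string {
  const values: Record<string, string> = {
    types: vars.types,
    count: String(vars.count),
    // With more than 10 tools the example block is dropped.
    example: vars.count > MAX_TOOLS_WITH_EXAMPLE ? "" : vars.example,
  };
  return template
    .replace(/\{\{(types|count|example)\}\}/g, (_match: string, key: string) => values[key] ?? "")
    .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines left by empty placeholders
    .trim();
}
```

Collapsing after substitution keeps the output tidy when the example placeholder expands to an empty string.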


9. Structured Envelope Responses

Problem: The code tool returned only text content — no structured data for programmatic consumption by MCP clients.

Solution: The code tool now returns both structuredContent and content:

{
  isError: false,
  structuredContent: {
    ok: true,
    result: { id: "abc123" },
    durationMs: 142,
    logs: ["[stdout] Processing..."]
  },
  content: [{ type: "text", text: "..." }]  // human-readable fallback
}

  • CodeToolEnvelope is a discriminated union (ok: true | false) preventing impossible states (simultaneous result + error)
  • ExecuteResult is also a discriminated union — result and error are mutually exclusive at the type level
  • durationMs tracks wall-clock execution time
  • outputSchema is declared on the code tool definition so MCP clients can validate responses

Files: index.ts (CodeToolEnvelope, createCodeToolEnvelope, formatSuccessContent), types.ts (ExecuteResult)
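
The two unions and the envelope construction can be sketched as below. Field names follow the example response above; the exact member shapes, the CodeToolResponse interface, the toCodeToolResponse helper, and the text formatting are assumptions.

```typescript
// result and error are mutually exclusive at the type level.
type ExecuteResult =
  | { result: unknown; error?: undefined; logs: string[]; durationMs: number }
  | { error: string; result?: undefined; logs: string[]; durationMs: number };

// ok is the discriminant: success carries result, failure carries error.
type CodeToolEnvelope =
  | { ok: true; result: unknown; logs: string[]; durationMs: number }
  | { ok: false; error: string; logs: string[]; durationMs: number };

interface CodeToolResponse {
  isError: boolean;
  structuredContent: CodeToolEnvelope;
  content: Array<{ type: "text"; text: string }>;
}

function createCodeToolEnvelope(exec: ExecuteResult): CodeToolEnvelope {
  const { logs, durationMs } = exec;
  return exec.error !== undefined
    ? { ok: false, error: exec.error, logs, durationMs }
    : { ok: true, result: exec.result, logs, durationMs };
}

function toCodeToolResponse(exec: ExecuteResult): CodeToolResponse {
  const envelope = createCodeToolEnvelope(exec);
  // Human-readable fallback; this formatting is illustrative only.
  const text = envelope.ok
    ? JSON.stringify(envelope.result, null, 2) ?? "undefined"
    : `Error: ${envelope.error}`;
  return { isError: !envelope.ok, structuredContent: envelope, content: [{ type: "text", text }] };
}
```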


10. Handler Reuse Between Registration and Code Mode

Problem: Code Mode reimplemented tool handler invocation: it manually resolved tool names, manually checked for inputSchema, and called handlers differently than registerToolFromDefinition(). This meant cache wrappers, error normalization, and future middleware would only apply to registered tools, not code mode dispatches.

Solution: Extracted shared utilities from registerToolFromDefinition():

  • resolveToolDefinitionName(tool) — applies enrichNameTitle to get the canonical name
  • createWrappedToolHandler(tool) — applies cache wrappers and returns the handler
  • invokeWrappedToolHandler(tool, handler, input, extra) — calls the handler with the correct argument shape. Uses tool.inputSchema !== undefined (not Object.keys(...).length > 0) to determine the call shape, correctly handling inputSchema: {} which is a valid empty schema where the handler still receives (args, extra)

Code Mode now uses these same functions, so cache, auth, and any future middleware apply consistently whether a tool is called via MCP protocol or via sandbox code.

The buildDispatchFunctions() helper is now exported (used by tests) and accepts McpRequestExtra, which gets passed through to tool handlers so they have access to requestId, signal, sendNotification, etc.

Files: tools.ts (resolveToolDefinitionName, createWrappedToolHandler, invokeWrappedToolHandler), index.ts (buildDispatchEntries, buildDispatchFunctionsFromEntries)
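
The call-shape rule is small enough to show in isolation. ToolLike, RequestExtra, and invokeHandlerSketch are hypothetical stand-ins for the real tool definition, McpRequestExtra, and invokeWrappedToolHandler; defaulting a missing input to {} is also an assumption.

```typescript
// Stand-in for McpRequestExtra.
interface RequestExtra {
  requestId: string;
}

interface ToolLike {
  name: string;
  inputSchema?: Record<string, unknown>;
  handler: (...args: unknown[]) => unknown;
}

function invokeHandlerSketch(tool: ToolLike, input: unknown, extra: RequestExtra): unknown {
  // A defined schema, even an empty {}, means the handler takes (args, extra).
  return tool.inputSchema !== undefined
    ? tool.handler(input ?? {}, extra)
    : tool.handler(extra);
}
```

Checking for undefined rather than counting keys is what keeps inputSchema: {} on the (args, extra) path.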


11. Hardening Pass (PR Review Fixes)

After the feature work, a comprehensive PR review identified issues that were fixed:

Error sentinel spoofability

Changed sentinel key from __toolError to __mcp_toolkit_error__ and stderr prefix from __ERROR__ to __MCP_EXEC_ERR__ to avoid collisions with legitimate tool return values or user code logging.

Empty catch blocks

Four truly empty catch {} blocks in cleanup paths (server.close, runtime.dispose, wall-timer dispose) now log warnings with console.warn('[nuxt-mcp-toolkit] ...') and context about which operation failed.

Unhandled promise rejection

void handleRpcRequest(...) had no catch handler. If the inner try/catch failed AND sendJson also threw, the rejection escaped. Added .catch(() => { if (!res.headersSent) res.destroy() }) as a safety net.

Missing server-side error logging

The outer catch in execute() sanitized errors before returning them but never logged the raw error. Added console.error('[nuxt-mcp-toolkit] Execution error:', error) before sanitization so infrastructure errors are visible in server logs.

isCallToolResult false positives

isError alone was enough to match isCallToolResult(). Now requires typeof isError === 'boolean' — objects with incidental string/number isError properties aren't misrouted.

Dead proxy cache

The cachedProxyKey/cachedProxyCode module-level variables never produced a cache hit because each execution creates a new port+token. Removed entirely.

Impossible type states

ExecuteResult was { result: unknown, error?: string } — both could be set. Now a discriminated union where result and error are mutually exclusive. Same for CodeToolEnvelope with ok as discriminant.

disposeCodeMode error swallowing

.catch(() => {}) replaced with .catch(error => console.warn(...)).

Empty-schema dispatch fix

invokeWrappedToolHandler was checking Object.keys(tool.inputSchema).length > 0 which treated inputSchema: {} as "no schema", calling handler(extra) instead of handler({}, extra). Fixed to check tool.inputSchema !== undefined. This was a pre-existing bug on main that was surfaced during the handler extraction refactor.

Files: executor.ts, index.ts, types.ts, results.ts, tools.ts


12. Annotation Surfacing & Grouped Progressive Search

Problem: MCP annotations (readOnlyHint, destructiveHint, idempotentHint) were defined on tool definitions but invisible in code mode. Progressive mode search results were a flat list with no grouping.

Solution: Derive more from existing metadata, configure nothing new.

12a. Annotation surfacing

buildAnnotationTags() converts existing MCP annotations into compact tags prepended to the // comment in type signatures:

  • readOnlyHint: true → [read-only]
  • destructiveHint: true → [destructive]
  • idempotentHint: true → [idempotent]

declare const codemode: {
  list_items: () => Promise<Runbook[]>; // [read-only] [idempotent] List all runbooks
  delete_runbook: (input: { id: string }) => Promise<unknown>; // [destructive] Permanently removes
};

Tags use text (not emoji) for token efficiency. Both annotation tags and description comments are always preserved in the type block — the // description comment is the primary way the LLM understands tool semantics in standard mode. When annotations are present, tags are prepended to the description (e.g., // [read-only] List all items). Tools without annotations retain their plain // description comment.
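
A minimal sketch of the tag derivation, assuming the tag order shown in the example above. The formatToolComment helper is hypothetical and only illustrates how tags are prepended to the description.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
}

function buildAnnotationTags(annotations?: ToolAnnotations): string {
  const tags: string[] = [];
  if (annotations?.readOnlyHint === true) tags.push("[read-only]");
  if (annotations?.destructiveHint === true) tags.push("[destructive]");
  if (annotations?.idempotentHint === true) tags.push("[idempotent]");
  return tags.join(" ");
}

// Tags come first; tools without annotations keep their plain description comment.
function formatToolComment(description: string, annotations?: ToolAnnotations): string {
  const text = [buildAnnotationTags(annotations), description]
    .filter((part) => part.length > 0)
    .join(" ");
  return text.length > 0 ? `// ${text}` : "";
}
```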

12b. Grouped progressive search

formatSearchResults() groups results by the existing group field (auto-inferred from directory structure, e.g., server/mcp/tools/workspace/ → 'workspace'):

Found 3/6 tools matching "runbook":

## workspace

codemode.create_runbook: (input: CreateRunbookInput) => Promise<{ ok: boolean; data: { id: string } }>;
codemode.delete_runbook: (input: { id: string }) => Promise<unknown>; // [destructive] Permanent

## public

codemode.search_public: (input: { q: string }) => Promise<Runbook[]>; // [read-only] Full-text search

  • Groups come from tool.group or tool._meta.group
  • When no tools have groups, output is flat — fully backward-compatible
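
The grouping can be sketched as follows. The CatalogEntry shape and the function signature are assumptions, as is the choice to list ungrouped entries before the group headers when results are mixed; this description does not say how the branch orders that case.

```typescript
interface CatalogEntry {
  signature: string;
  group?: string;
}

function formatSearchResults(query: string, matches: CatalogEntry[], total: number): string {
  const header = `Found ${matches.length}/${total} tools matching "${query}":`;
  const groups = new Map<string, string[]>();
  const ungrouped: string[] = [];

  for (const entry of matches) {
    if (entry.group === undefined) {
      ungrouped.push(entry.signature);
      continue;
    }
    const bucket = groups.get(entry.group) ?? [];
    bucket.push(entry.signature);
    groups.set(entry.group, bucket);
  }

  const sections: string[] = [header];
  // With no groups at all, this is the only section, so the output stays flat.
  if (ungrouped.length > 0) sections.push(ungrouped.join("\n"));
  groups.forEach((signatures, group) => {
    sections.push(`## ${group}`, signatures.join("\n"));
  });
  return sections.join("\n\n");
}
```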

13. Test Coverage

codemode.test.ts (~530 lines)

  • Type generation: declarations with descriptions, output types, name collision warning
  • Tool catalog: signatures, search formatting
  • createCodemodeTools: standard mode, progressive mode, example block omission, outputSchema
  • buildDispatchFunctions: structuredContent preference, plain text preservation, native object passthrough, thrown error → sentinel conversion, MCP extra propagation, empty-schema extra passing, undefined-schema extra passing
  • Code tool envelope: structured success/error responses
  • normalizeCode: markdown fence stripping, arrow function unwrapping, export default removal
  • isCallToolResult: content arrays, structuredContent, boolean isError, non-boolean isError rejection, plain object rejection
  • Enum escaping: quotes, backslashes in enum values
  • isError CallToolResult → tool error sentinel flow (including structuredContent details)
  • Annotation surfacing: [read-only], [destructive], [idempotent] tags; combined tags
  • Grouped search: group headers, mixed grouped/ungrouped, flat fallback
  • Backward compat: tools without annotations produce identical output
  • Annotation vs description: all tools preserve descriptions, annotated tools get tags prepended

codemode-executor.test.ts (new, ~450 lines)

  • Concurrency: parallel executions with no cross-talk
  • AsyncLocalStorage context preservation
  • AsyncLocalStorage fallback: graceful degradation when snapshot unavailable, one-time warning
  • RPC token validation: wrong token → 403
  • Execution ID validation: wrong execId → 400
  • Request body limits: oversized payload → 413
  • Tool call quota: exceeding maxToolCalls → 429
  • Tool response truncation: arrays, objects, primitives
  • Wall-clock timeout behavior
  • Cleanup verification after success/error/timeout

14. Documentation Updates

  • 8.code-mode.md: Added new config options (maxRequestBodyBytes, maxToolResponseSize, wallTimeLimitMs, maxToolCalls) to the configuration reference and resource limits table. Added "Error Sanitization" section. Added Node.js >=18.16.0 callout. Added "Annotation Surfacing" section. Updated progressive mode section with grouped search output example.
  • handlers.ts: Expanded experimental_codeMode JSDoc with config keys and Node.js requirement.
  • 2.installation.md: Minor addition (context for secure-exec requirement).

Files Changed Summary

| File | What changed |
| --- | --- |
| executor.ts | Rewritten: per-session RPC, AsyncLocalStorage, resource limits, error handling |
| index.ts | Rewritten: structuredContent dispatch, tool error sentinels, envelope responses, handler reuse |
| types.ts | Extended: output schemas, property key escaping, enum escaping, discriminated unions, buildAnnotationTags, grouped formatSearchResults |
| tools.ts | Extracted resolveToolDefinitionName, createWrappedToolHandler, invokeWrappedToolHandler |
| results.ts | Exported isCallToolResult, tightened isError check |
| handlers.ts | JSDoc update for experimental_codeMode |
| executor.cloudflare.ts | Re-export alignment |
| codemode.test.ts | Expanded from ~180 to ~450 lines |
| codemode-executor.test.ts | New: ~400 lines of executor integration tests |
| 8.code-mode.md | New config options, resource limits table, error sanitization, annotation surfacing, grouped progressive search |
| 2.installation.md | Minor addition |
| 5.handlers.md | Minor addition |

How Everything Connects

Tool Definition (defineMcpTool)
  ├── name, description, inputSchema, outputSchema, annotations, group
  │
  ▼
registerToolFromDefinition (tools.ts)
  ├── enrichNameTitle → canonical name
  ├── _meta: { inputExamples, group, tags }
  │
  ▼
createCodemodeTools (index.ts)
  ├── Standard mode: generateTypesFromTools → type block
  │   ├── buildAnnotationTags reads annotations → [read-only] etc. prepended to description
  │   ├── All tool descriptions preserved in type block (LLM's primary tool semantics source)
  │   └── hardcoded example block (omitted when >10 tools)
  │
  ├── Progressive mode: generateToolCatalog → searchable entries
  │   ├── ToolCatalogEntry: { group, annotations }
  │   └── formatSearchResults → grouped by group field with ## headers
  │
  ▼
buildDispatchEntries → buildDispatchFunctionsFromEntries (index.ts)
  ├── createWrappedToolHandler (shared with registerToolFromDefinition)
  ├── invokeWrappedToolHandler (shared with registerToolFromDefinition)
  ├── normalizeDispatchResult → structuredContent preference, error sentinels
  │
  ▼
execute (executor.ts)
  ├── Per-session RPC server (port 0, unique token + execId)
  ├── AsyncLocalStorage.snapshot → context preservation
  ├── Resource limits (CPU, wall-clock, memory, request body, tool calls)
  ├── Error sanitization
  │
  ▼
Code tool response (CodeToolEnvelope)
  ├── structuredContent: { ok, result?, error?, logs?, durationMs }
  └── content: [{ type: "text", text: "..." }]  (human-readable fallback)

Key design principle: Derive from existing MCP metadata (annotations, group, outputSchema, description), don't add new per-tool config surface. Every feature is additive and optional — tools without annotations or groups produce identical output to before.

Mat4m0 and others added 14 commits April 7, 2026 06:51
Tools with outputSchema now emit typed Promise return values in code
mode type definitions instead of Promise<unknown>. Small schemas (<=3 primitive fields) are inlined; larger schemas get named interfaces.
…pe generation

Added a new function to format TypeScript property keys, ensuring non-safe identifiers are quoted. Introduced a utility to generate schema type information, streamlining the handling of input and output schemas in tools. This improves type safety and clarity in generated type definitions.
…nd concurrency safety

- Fix AsyncLocalStorage context loss through singleton RPC server by
  using per-execution snapshots via AsyncLocalStorage.snapshot()
- Fix concurrent execute() calls overwriting shared fns/onReturn by
  introducing per-execution ExecutionContext keyed by execId
- Add bounded RPC body reader (1MB default, HTTP 413)
- Add wall-clock execution timeout (60s default, HTTP 408)
- Add per-execution RPC call quota (200 default, HTTP 429)
- Add per-tool response size limit (1MB default, truncation)
- Sanitize error messages to strip file paths and stack traces
- Add server-side logging to all catch blocks
- Fix exec.returned set before callback completes
- Replace void handleRpcRequest with .catch() safety net
When a tool handler returns structuredContent (the MCP spec's typed data channel), buildDispatchFunctions() ignored it and fell through to extracting text from content[].text — silently losing IDs, booleans, and nested objects. This broke operation chaining where a returned ID is needed for follow-up calls.
…edContent handling

Updated the codemode tests to utilize a mockMcpExtra function, ensuring that structuredContent is correctly processed in tool handlers. Changed tool definitions from McpToolDefinition to McpToolDefinitionListItem for better type alignment.
Tool errors (isError: true) were returned as plain strings, making them
indistinguishable from successful results in sandbox code. try/catch
never fires, and structured error details from structuredContent are
lost.

Return a __toolError sentinel from dispatch so the sandbox proxy can
throw a structured Error with .tool, .isToolError, and .details fields.
…example handling

- Updated description templates for code execution to be more concise.
- Added conditional logic to include or omit example blocks based on the number of available tools.
- Improved the handling of description formatting to collapse excessive newlines.
- Adjusted tests to verify the new description formats and example inclusion logic.
- Introduced a durationMs field in the execution result to track execution time.
- Updated ExecuteResult type to include durationMs for both success and error cases.
- Improved error handling in tool dispatch to ensure structured error messages are returned.
- Refactored code to maintain clarity and consistency in execution context management.
- Changed error prefix from '__ERROR__' to '__MCP_EXEC_ERR__' for clarity.
- Enhanced error handling in RPC session creation and cleanup processes with console warnings for better debugging.
- Updated tool error structure to use '__mcp_toolkit_error__' instead of '__toolError__'.
- Refactored ExecuteResult and CodeToolEnvelope types for consistency in error representation.
- Improved tests to validate new error handling and ensure proper functionality.
- Added support for surfacing MCP tool annotations (`readOnlyHint`, `destructiveHint`, `idempotentHint`) as tags in code mode type signatures and search results.
- Updated the description template to include an `{{example}}` placeholder for usage examples.
- Implemented grouping of search results by tool `group` field, improving organization in output.
- Refactored related functions to accommodate new features and ensure backward compatibility.
- Expanded tests to validate annotation surfacing and grouped search results functionality.
…l name collisions

- Added a fallback mechanism for AsyncLocalStorage.snapshot to handle Node.js versions <18.16.0, logging a warning when the snapshot is unavailable.
- Updated tool name collision handling to issue a warning instead of throwing an error, preserving the last tool in case of a name conflict.
- Adjusted tests to validate the new fallback behavior and ensure proper logging for name collisions.
vercel bot commented Apr 7, 2026

@Mat4m0 is attempting to deploy a commit to the Nuxt Team on Vercel.

A member of the Team first needs to authorize it.

github-actions bot commented Apr 7, 2026

Hey there and thank you for opening this pull request! 👋🏼

We require pull request titles to follow the Conventional Commits specification and it looks like your proposed title needs to be adjusted.

Details:

No release type found in pull request title "refactor(module)/codemode improvements combined". Add a prefix to indicate what kind of release this pull request corresponds to. For reference, see https://www.conventionalcommits.org/

Available types:
 - breaking
 - feat
 - fix
 - build
 - ci
 - docs
 - enhancement
 - chore
 - perf
 - style
 - test
 - refactor
 - revert

pkg-pr-new bot commented Apr 7, 2026

npm i https://pkg.pr.new/@nuxtjs/mcp-toolkit@212

commit: af95280

@HugoRCD HugoRCD marked this pull request as draft April 7, 2026 08:01