feat: add experimental Rust MCP runtime and session core#3617
crivetimihai merged 76 commits into main
Interesting PR. I love the general idea of Rust components being fully independent. However, I have been working on the a2a_service in #3250, and one of the main challenges in fully replacing the Python service there is how the database layer is handled. The current Python implementation already contains assumptions about schema management, migrations, connection handling, and ORM/query behavior. Re-implementing or partially replacing that logic can easily introduce inconsistencies (e.g., differences in transaction handling, migration state, connection pooling, or query semantics), which makes the database layer particularly tricky to replicate safely.
Using a Python fallback may not provide the same feature set, safety guarantees, or performance characteristics. It could also behave differently under load, which makes this approach quite risky for production use.
If this PR were merged and then integrated with the changes introduced in #3161, it would benefit from an existing and more robust CI setup. For that reason, I would recommend working toward merging #3161 first and placing this implementation within the crates folder so it can leverage that infrastructure.
lucarlig
left a comment
I’m a bit concerned about the long-term architecture here. The Rust runtime is no longer just a transport edge; it now depends on an implicit contract with the Python backend across internal headers, internal endpoints, duplicated auth/session semantics, and direct DB knowledge. That works for a staged migration, but it also means changes to Python routing, token-scope behavior, or payload/header shapes can silently break Rust unless both sides evolve in lockstep.
The DB path has a similar issue. Right now migrations and schema ownership still live in the Python ORM/Alembic layer, while Rust is starting to read the same schema directly and reimplement some of the filtering/visibility logic itself. That creates two sources of truth for data access semantics. Every schema change or business-rule change now has to be maintained in both languages, which is going to get harder over time.
I think it would be healthier to converge on one shared boundary instead of growing both layers independently: either keep Python as the single DB/business-logic layer and expose a narrower internal API to Rust, or move the DB/data layer into Rust and have Python consume that same layer as well. The second option is more work, but it avoids permanent duplication and gives us a cleaner path to Rust-side performance gains.
I tried using the Cloudflare pingora crate as an MCP proxy to fast-time-server:
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Tested the functionality with Rust mode disabled and with full Rust mode; both worked well for all basic functions. In the load tests, I saw slightly higher response times and lower RPS than the Python version at 800 users, which is odd.
Thanks, I went through this list and split it into "addressed in this PR" vs "tracked follow-up".
Addressed in this PR
No longer a merge blocker after the above fix
So the main immediate trust-boundary issues from this list were #5 / #7, and those are now fixed. The rest are either follow-up hardening or no longer blockers after the
I really like the idea of running Rust in the core and gradually implementing new modules in Rust as we need them. I am still testing and reading the docs, but I ran Docker with full Rust mode and manually tested from the UI. Everything runs as expected, including adding an external MCP provider (Cloudflare) and running its tools.
Suggestions.
AI catches.
gcgoncalves
left a comment
Can confirm the app is still functional.
I tested it directly, and clippy is passing for tools_rust/mcp_runtime. Ran:
Thanks. I checked both points.
So I'm not treating either as a blocker for this PR, but the wrapper secret-logging issue is worth tracking/fixing separately. @dima-zakharov please investigate.
I reproduced the code state, but not the warning output on this machine. What I verified:
What happened locally:
I did not get the warning locally in this run, likely due to toolchain/clippy-version differences or lint configuration differences. Either way, it is a minor warning-cleanup item, not a failing clippy error in my current environment.
Test Setup
Steps
Results — 125 users / 60s
On failures
All 24 failures were on The At 125 concurrent users the Python path was already saturated: requests were arriving faster than workers could drain them. Rust processes the same requests ~60x faster, so the queue never builds up. Python is showing stress at 125 users; Rust has not reached its saturation point.
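The saturation argument above can be illustrated with a toy backlog model. This is only a sketch: the arrival and service rates are made-up numbers chosen to show the shape of the curve, not measurements from this PR, and only the ~60x ratio is taken from the comment.

```python
def backlog_over_time(arrival_rate: float, service_rate: float, seconds: int) -> list[float]:
    """Return the queue depth at the end of each second for a simple fluid model."""
    depth = 0.0
    history = []
    for _ in range(seconds):
        # Requests arrive, workers drain what they can; depth never goes negative.
        depth = max(0.0, depth + arrival_rate - service_rate)
        history.append(depth)
    return history


# Hypothetical rates in requests/second (assumptions, not benchmark data).
ARRIVALS = 125.0
SLOW_SERVICE = 100.0               # slower than arrivals: backlog grows every second
FAST_SERVICE = SLOW_SERVICE * 60   # ~60x faster: backlog stays empty

slow = backlog_over_time(ARRIVALS, SLOW_SERVICE, 60)
fast = backlog_over_time(ARRIVALS, FAST_SERVICE, 60)
print(f"slow path backlog after 60s: {slow[-1]:.0f}")
print(f"fast path backlog after 60s: {fast[-1]:.0f}")
```

Once the service rate drops below the arrival rate, the backlog grows linearly and latency grows with it, which is consistent with failures appearing only on the saturated path.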
araujof
left a comment
Nice work @crivetimihai! This review focuses on potential regressions or interactions with the Plugin Framework.
Summary
No plugin framework regressions identified. The Rust runtime integration is cleanly separated from plugin execution via the has_hooks_for guard. The only semantic change affecting plugins is the prompt_id UUID→name switch in post-hook payloads, which aligns with the pre-hook behavior and is unlikely to affect any plugin in practice.
Test Results
| Suite | Result |
|---|---|
| Plugin framework unit tests | All passed |
| Service unit tests (tool/prompt/resource) | All passed |
| New sentinel plugin tests | All passed |
| Plugin performance profiling (31 plugins) | All profiled, 0 errors |
Performance numbers are consistent with baseline. No degradation detected.
Findings
1. Plugin bypass in Rust path
tool_service.py: prepare_rust_mcp_tool_execution correctly falls back to Python when any tool plugin hooks are configured:

    if self._plugin_manager and (self._plugin_manager.has_hooks_for(ToolHookType.TOOL_PRE_INVOKE)
            or self._plugin_manager.has_hooks_for(ToolHookType.TOOL_POST_INVOKE)):
        return {"eligible": False, "fallbackReason": "plugin-hooks-configured"}

The Rust hot path is only taken when no tool hooks are registered.
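The guard can be exercised in isolation with stand-in types. The sketch below is not the gateway's code: `FakePluginManager`, the free function `rust_eligibility`, and the `{"eligible": True}` return value for the no-hooks case are all assumptions made for illustration; only `has_hooks_for`, the two hook names, and the fallback payload come from the snippet above.

```python
from enum import Enum
from typing import Optional


class ToolHookType(Enum):
    # Stand-in enum with only the two members named in the review.
    TOOL_PRE_INVOKE = "tool_pre_invoke"
    TOOL_POST_INVOKE = "tool_post_invoke"


class FakePluginManager:
    """Hypothetical minimal plugin manager exposing only has_hooks_for."""

    def __init__(self, registered: set):
        self._registered = registered

    def has_hooks_for(self, hook: ToolHookType) -> bool:
        return hook in self._registered


def rust_eligibility(plugin_manager: Optional[FakePluginManager]) -> dict:
    """Mirror the quoted guard: any registered tool hook forces the Python path."""
    tool_hooks = (ToolHookType.TOOL_PRE_INVOKE, ToolHookType.TOOL_POST_INVOKE)
    if plugin_manager and any(plugin_manager.has_hooks_for(h) for h in tool_hooks):
        return {"eligible": False, "fallbackReason": "plugin-hooks-configured"}
    # Assumed shape for the eligible case; the real payload may carry more fields.
    return {"eligible": True}


print(rust_eligibility(None))
print(rust_eligibility(FakePluginManager({ToolHookType.TOOL_POST_INVOKE})))
```

Checking both hook types means a post-invoke-only plugin still keeps the call on the Python path, so no hook is silently skipped.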
2. prompt_id UUID→name switch
In prompt_service.py, PromptPosthookPayload.prompt_id changed from prompt.id (UUID) to prompt.name (string):
Before: PromptPosthookPayload(prompt_id=str(prompt.id), result=result)
After: PromptPosthookPayload(prompt_id=prompt.name, result=result)
All plugins use prompt_id for logging, passthrough in modified payloads, or as a Cedar/OPA policy resource identifier. No plugin parses it as a UUID. The pre-hook already uses the caller-supplied prompt_id (name or ID string), so this change improves consistency between pre-hook and post-hook: both now carry the human-readable prompt name.
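A plugin that does care about the distinction could stay robust to either form by classifying the value instead of assuming a UUID. This is a minimal sketch; `classify_prompt_id` is a hypothetical helper, not part of the plugin framework.

```python
import uuid


def classify_prompt_id(prompt_id: str) -> str:
    """Return "uuid" if prompt_id parses as a UUID, otherwise "name"."""
    try:
        uuid.UUID(prompt_id)
    except ValueError:
        # Not a well-formed UUID string, so treat it as a human-readable name.
        return "name"
    return "uuid"


print(classify_prompt_id(str(uuid.uuid4())))
print(classify_prompt_id("welcome-prompt"))
```

Since a prompt name could in principle look like a UUID, this is a heuristic for logging or metrics, not an authorization check.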
3. New internal MCP handlers correctly propagate plugin context
The new handle_internal_mcp_* functions in main.py all extract plugin_context_table and plugin_global_context from request.state and pass them through to service methods. This matches the existing pattern in handle_rpc.
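The propagation pattern described here can be sketched with a stand-in request object. The helper name `extract_plugin_context` and the use of `getattr` with a `None` default are assumptions; only the two attribute names and `request.state` come from the review.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def extract_plugin_context(request: Any) -> Tuple[Optional[Any], Optional[Any]]:
    """Read plugin context from request.state, tolerating missing attributes."""
    state = getattr(request, "state", None)
    table = getattr(state, "plugin_context_table", None)
    global_ctx = getattr(state, "plugin_global_context", None)
    return table, global_ctx


# Stand-in for a framework request whose middleware populated request.state.
request = SimpleNamespace(
    state=SimpleNamespace(
        plugin_context_table={"plugin-a": {"seen": True}},
        plugin_global_context={"request_id": "req-1"},
    )
)
table, global_ctx = extract_plugin_context(request)
print(table, global_ctx)
```

Both values would then be passed through to the service method, so hooks see the same context on the internal MCP path as on the existing RPC path.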
4. resource_service.py server-scoping and MultipleResultsFound handling
The resource service changes add server-scoped queries and handle MultipleResultsFound gracefully. These changes are upstream of plugin hooks (the plugin hooks fire after resource resolution). No interaction with plugin execution.
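Server-scoped lookup with an explicit ambiguity error might look roughly like the in-memory sketch below. It does not use SQLAlchemy and is not the service's implementation: `Resource`, `AmbiguousResourceError`, and `find_resource` are hypothetical names standing in for the ORM query and its `MultipleResultsFound` handling.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Resource:
    uri: str
    server_id: str


class AmbiguousResourceError(LookupError):
    """Hypothetical error for a URI that matches more than one resource."""


def find_resource(resources: Iterable[Resource], uri: str,
                  server_id: Optional[str] = None) -> Optional[Resource]:
    """Return the single match for uri, optionally scoped to one server."""
    matches = [
        r for r in resources
        if r.uri == uri and (server_id is None or r.server_id == server_id)
    ]
    if len(matches) > 1:
        # Surface a clear, typed error instead of an unhandled ORM exception.
        raise AmbiguousResourceError(f"{len(matches)} resources match {uri!r}")
    return matches[0] if matches else None


catalog = [Resource("file:///a.txt", "srv-1"), Resource("file:///a.txt", "srv-2")]
print(find_resource(catalog, "file:///a.txt", server_id="srv-2"))
```

Scoping by server resolves the duplicate URI; without a scope the same lookup raises the ambiguity error, and plugin hooks would only run after this resolution step succeeds.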
5. _load_invocable_tools refactoring
The extraction of tool lookup logic into _load_invocable_tools is purely structural. The existing invoke_tool now calls this helper, and the new prepare_rust_mcp_tool_execution also calls it. Plugin invocation in invoke_tool is untouched.
Thanks, this matches my read. I don’t see any plugin-framework regression or new blocker here. The only semantic change worth tracking is the PromptPosthookPayload.prompt_id UUID→name behavior, and that is already captured in tools_rust/mcp_runtime/FOLLOWUPS.md as a contract/schema follow-up.
e2e68d9
* feat: add Rust MCP runtime prototype
* feat: integrate experimental Rust MCP runtime
* fix: complete Rust MCP compose parity
* test: improve Rust MCP runtime observability
* docs: add Rust MCP runtime status report
* fix: harden Rust MCP parity coverage
* perf: streamline Rust MCP proxy path
* perf: streamline server-scoped Rust MCP proxying
* perf: narrow Rust MCP sidecar dispatch
* perf: trim trusted Rust MCP dispatch overhead
* perf: specialize Rust MCP tools list dispatch
* perf: optimize Rust MCP tools call hot path
* perf: validate Rust MCP benchmark curve
* perf: add Rust MCP tuning knobs
* docs: record Rust MCP load-testing updates
* feat: expand Rust MCP transport parity
* feat: expose active MCP runtime mode
* feat: narrow more MCP methods through Rust runtime
* feat: narrow more MCP methods through Rust runtime
* feat: wire Rust MCP rmcp runtime into compose
* docs: refresh Rust MCP runtime README
* feat: add Rust MCP session core slice
* feat: add Rust MCP event store session core
* feat: move MCP replay resume path into Rust
* feat: move MCP live streaming into Rust
* feat: simplify Rust MCP build and runtime UX
* feat: add Rust MCP affinity core slice
* perf: reuse auth cache on MCP transport path
* perf: move MCP read paths and auth cache hot path
* fix: restore RPC permission and server scope semantics
* fix: resolve flake8 and bandit findings
* fix: satisfy pylint checks
* fix: align RPC permission tests and docstrings
* test: raise diff coverage for Rust MCP paths
* perf: route public MCP ingress directly to Rust
* feat: add Rust MCP benchmark targets and safe fallback
* feat: clarify Rust MCP mode workflow
* test: add MCP session isolation coverage
* docs: add Rust MCP follow-up tracker and quick reference
* test: raise Rust MCP diff coverage to 100 percent
* test: stabilize logger capture assertions
* test: improve Rust MCP runtime coverage and tooling
* test: expand Rust MCP runtime unit coverage
* refactor: clean up Rust MCP runtime lint issues
* perf: reduce Rust MCP RMCP and header overhead
* docs: document Rust MCP runtime architecture
* docs: improve Rust MCP runtime code documentation
* docs: refresh Rust MCP runtime guides
* docs: add Rust MCP follow-up checklist
* test: isolate Rust-only MCP E2E coverage
* feat: show MCP runtime mode in admin UI
* fix: harden auth service and test stack startup
* fix: harden Rust MCP public ingress
* fix: redact Rust MCP transport errors
* feat: add optional Postgres TLS for Rust MCP runtime
* test: extend Rust MCP isolation validation
* fix: handle ambiguous MCP resource reads cleanly
* test: add Rust MCP access matrix coverage
* fix: tighten Rust MCP response shaping
* test: expand Rust MCP runtime unit coverage
* fix: clean up Rust MCP helper plumbing
* fix: normalize Rust MCP resource fallback payloads
* test: add MCP plugin parity coverage
* test: gate MCP prompt and plugin parity
* docs: expand Rust MCP release checklist
* test: finalize Rust MCP release validation
* docs: add modular runtime specification
* docs: document implemented MCP module
* test: harden Rust runtime coverage
* fix: harden Rust runtime fail-closed handling
* fix: harden internal MCP trust boundaries
* fix: tolerate string auth secrets in MCP trust helpers
* test: skip parity E2Es without parity config
* fix: stabilize minikube release validation
* fix: record rust tools call metrics
* fix: satisfy metrics buffer lint
---------
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: KRISHNAN, SANTHANA <sk8069@exo.att.com>
Signed-off-by: calculus-ask <a.santhana.k@gmail.com>
🔗 Related Issue
N/A
📝 Summary
This PR adds an experimental Rust MCP runtime and incrementally moves the MCP transport, high-volume JSON-RPC routing, and the first session/event-store slices into Rust while keeping the Python implementation available behind feature flags.
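The feature-flag gating mentioned here could be read along the lines of the sketch below. The three flag names are the ones listed later in this description; the accepted truthy strings, the `False` default, and the `flag_enabled` helper are assumptions for illustration, not the gateway's actual settings code.

```python
import os
from typing import Mapping, Optional

# Assumed set of truthy spellings; the real settings layer may differ.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def flag_enabled(name: str, environ: Optional[Mapping[str, str]] = None,
                 default: bool = False) -> bool:
    """Interpret an environment variable as a boolean feature flag."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


# Example environment with the runtime on and the session-core slices off.
example_env = {
    "EXPERIMENTAL_RUST_MCP_RUNTIME_ENABLED": "true",
    "EXPERIMENTAL_RUST_MCP_SESSION_CORE_ENABLED": "false",
}
for flag in (
    "EXPERIMENTAL_RUST_MCP_RUNTIME_ENABLED",
    "EXPERIMENTAL_RUST_MCP_SESSION_CORE_ENABLED",
    "EXPERIMENTAL_RUST_MCP_EVENT_STORE_ENABLED",
):
    print(flag, flag_enabled(flag, example_env))
```

Defaulting unset flags to off keeps the Python path as the baseline, which matches the stated goal of remaining reversible.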
Key changes:
- `tools_rust/mcp_runtime` and integrate it into `Containerfile.lite`, `docker-compose.yml`, and the managed entrypoint flow
- `/mcp` traffic through Rust with explicit runtime visibility in `/health`, `/ready`, startup logs, and MCP response headers
- `initialize`, `notifications/initialized`, `tools/list`, `tools/call`, `resources/*`, `prompts/*`, and several other MCP methods off the generic Python dispatcher
- `modelcontextprotocol/rust-sdk` behind a feature flag for the upstream `tools/call` client path
- `DELETE`
- `tools_rust/mcp_runtime/README.md` and `tools_rust/mcp_runtime/STATUS.md`

This remains intentionally reversible:
- `EXPERIMENTAL_RUST_MCP_RUNTIME_ENABLED=false`
- `EXPERIMENTAL_RUST_MCP_SESSION_CORE_ENABLED` and `EXPERIMENTAL_RUST_MCP_EVENT_STORE_ENABLED`

🏷️ Type of Change
🧪 Verification
- `cargo test --release --manifest-path tools_rust/mcp_runtime/Cargo.toml`
- `make test-mcp-cli`
- `make test-mcp-rbac`
- `uv run pytest -q --with-integration tests/integration/test_streamable_http_redis.py`
- `docker compose build gateway` with Rust enabled

Performance highlights on the compose-built Rust stack:
- `1007.35 RPS` at `120` users, `1033.01 RPS` at `150` users, `0%` failures
- `1126.96 RPS` overall and `1068.5 RPS` on `MCP tools/call [rapid]` at `125` users, `0%` failures

✅ Checklist
make black isort pre-commit)

📓 Notes
Current boundary:
Follow-on work after this PR is the remaining transport/session ownership: replay/resume behavior, session existence/owner checks on the public path, and multi-worker session-affinity forwarding.