Description
Summary
An agent that analyzes a software project, discovers which MCP servers would be useful, reads their documentation, generates correct configuration, collects credentials with human guidance, installs the servers with explicit approval, validates every connection end-to-end, and self-heals when something breaks — all through LLM reasoning that adapts to every server's unique setup requirements.
This is the first template to demonstrate a feedback loop with error diagnosis (diagnose_fix → validate → retry), alongside function nodes, HITL approval, conditional credential routing, on_failure recovery edges, and client_facing conversational interaction — patterns that exist in the framework but have zero template coverage today.
The core insight: Between discovering that an MCP server exists and having a working `mcp_servers.json` entry with credentials configured, there is an interpretation gap that only an LLM can bridge. A CLI tool can run `npm install`. It cannot read an arbitrary README, understand that "set the `GITHUB_TOKEN` environment variable with `repo` scope" means it needs to create a credential entry, infer the transport type from installation instructions, and generate a correct configuration for your specific project context. That gap is the entire product.
The Problem
The MCP ecosystem has reached critical mass. The official MCP Registry is growing, Smithery lists 7,300+ servers, and new ones appear weekly. But adoption is gated by a manual, error-prone process:
- Discovery is overwhelming — Thousands of servers exist, but no way to know which ones are relevant to YOUR project and YOUR agent's purpose
- Configuration is the real pain — Every server has different transport requirements (stdio vs HTTP), different environment variables, different auth flows, different dependencies. A GitHub MCP server needs a personal access token with specific scopes. A Postgres server needs a connection string. A Stripe server needs webhook secrets. There is no standard configuration format across servers
- Validation is absent — You install a server, add it to config, and then discover at runtime that it doesn't work. No pre-flight check. No "did this actually connect?" feedback
- Credential management is disconnected — Even when Hive's `CredentialStore` supports encrypted storage and template resolution (`{{cred.key}}`), the user still needs to know which env vars each server expects, what scopes are required, and where to create the tokens
- Debugging is blind — When a server fails to connect, the error message is often cryptic (`ENOENT`, `connection refused`, `timeout`). The user is left to Google the error and manually fix the config
Who needs this? Every Hive agent developer. But more broadly: anyone who has ever spent 30 minutes debugging why their MCP server config isn't loading — and that's essentially everyone building with MCP today.
Why An Agent, Not A CLI Tool
This is the crux. Tools like install-mcp and the Smithery CLI can install packages. But a CLI tool cannot:
- Analyze your project — look at your `package.json`, `requirements.txt`, `docker-compose.yml`, and `.env` to understand what services you actually use and infer what tools would be useful
- Reason about intent — determine whether you need the "filesystem" MCP server (local file operations) or the "git" MCP server (repository operations) based on what your agent is trying to accomplish
- Comprehend documentation — read the README of an MCP server it has never seen before and figure out the correct configuration, extracting transport type, command path, args, and required env vars from unstructured text
- Detect existing credentials — notice that your project already has a `SLACK_BOT_TOKEN` configured in `.env` and reuse it for the Slack MCP server instead of asking you to create a new one
- Diagnose and self-heal — try a server, see it fail, read the error message, reason about the cause, fix the configuration, and retry — something a CLI tool fundamentally cannot do
- Compare alternatives — when two candidate servers provide overlapping functionality, evaluate maturity (GitHub stars, last update, open issues) and recommend the one that better fits your architecture
The LLM IS the differentiator. The fact that every MCP server has unique, varied setup requirements is not a limitation — it is the exact justification for using an agent instead of a CLI tool.
Architecture
9 nodes, 13 edges — showcasing function nodes, HITL approval, conditional credential routing, feedback loops with error diagnosis, on_failure recovery, and client_facing conversational interaction.
+-------------------+
| project_scanner | function (deterministic)
| Scan files, |
| detect stack, |
| output profile |
+--------+----------+
|
on_success
|
+--------v----------+
| discover_servers | event_loop
| Query MCP |
| Registry API + |
| web search for |
| candidates |
+--------+----------+
|
on_success
|
+--------v----------+
| evaluate_ | event_loop
| candidates |
| Read READMEs, |
| compare options, |
| generate config |
| templates |
+--------+----------+
|
on_success
|
+--------v----------+
| approval_gate | event_loop, client_facing
| Present ranked |
| recommendations |
| with tradeoffs. |
| HITL: user |
| approves plan |
+---+--------+------+
| |
on_success on_failure
(approved) (rejected)
| |
| +----v-----------+
| | report_results | (early exit)
| +----------------+
|
+---------+---------+
| |
conditional conditional
credentials_needed > 0 credentials_needed == 0
| |
+--------v---------+ |
| collect_ | |
| credentials | |
| event_loop, | |
| client_facing | |
+--------+----------+ |
| |
on_success |
| |
+--------+-----------+
|
+--------v----------+
| install_ | event_loop
| configure |
| Run installs, |
| write |
| mcp_servers.json |
+--------+----------+
|
on_success
|
+--------v----------+
| validate_ | function (deterministic)
| connections |
| MCPClient test |
| per server |
+---+----------+----+
| |
conditional conditional
all_passed has_failures
| |
| +------v----------+
| | diagnose_fix | event_loop
| | Read error, | max_node_visits: 3
| | reason about |
| | cause, fix |
| | config, retry |
| +---+--------+----+
| | |
| on_success on_failure
| (retry) (give up)
| | |
| v |
| validate_ |
| connections |
| (loop) |
| |
+---v--------+-------v
| report_results | function
| Final summary: |
| installed, |
| tools available, |
| manual steps |
+--------------------+
Node Specifications
1. project_scanner — function (deterministic)
Input: project_path
Output: project_profile
No LLM call. Pure Python file scanning. This is deterministic work — zero cost, zero hallucination risk.
Files scanned (priority order):
- `package.json` — dependencies, scripts (npm-based projects)
- `requirements.txt` / `pyproject.toml` / `Pipfile` — Python dependencies
- `docker-compose.yml` — services (postgres, redis, elasticsearch, rabbitmq)
- `.env` / `.env.example` — referenced API keys hint at integrations
- `mcp_servers.json` — existing MCP config (preserve, don't duplicate)
- `README.md` — project description for semantic context
- `.github/workflows/*.yml` — CI/CD platform detection
- `Makefile` / `justfile` — common tool invocations
- `tsconfig.json` / `deno.json` — TypeScript/Deno detection
- `.tool-versions` / `.nvmrc` / `.python-version` — runtime versions
@dataclass(frozen=True)
class ProjectProfile:
languages: list[str] # ["python", "typescript"]
package_managers: list[str] # ["pip", "npm"]
frameworks: list[str] # ["fastapi", "react"]
databases: list[str] # ["postgresql", "redis"]
cloud_providers: list[str] # ["aws"]
ci_cd: list[str] # ["github-actions"]
apis_detected: list[str] # ["github", "slack", "stripe"]
existing_mcp_servers: list[dict] # from mcp_servers.json if present
env_vars_available: list[str] # from .env (names only, never values)
project_description: str # from README first paragraph
file_manifest: dict[str, bool] # {"package.json": True, "Cargo.toml": False, ...}

Why function, not event_loop: Scanning for package.json and parsing JSON is deterministic. The LLM adds no value here and would only add cost and latency. Every token spent on file scanning is a token not available for the high-value reasoning in later nodes.
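The scan itself is plain file-existence checks and line parsing. A minimal sketch, assuming nothing beyond `pathlib` (field names follow ProjectProfile above; this covers only a subset and is illustrative, not the template's actual implementation):

```python
from pathlib import Path

def scan_project(project_path: str) -> dict:
    """Deterministic scan sketch: detect key files, languages, and env var names."""
    root = Path(project_path)
    manifest = {
        name: (root / name).exists()
        for name in ("package.json", "requirements.txt", "pyproject.toml",
                     "docker-compose.yml", ".env", "mcp_servers.json")
    }
    languages: list[str] = []
    if manifest["package.json"]:
        languages.append("typescript" if (root / "tsconfig.json").exists() else "javascript")
    if manifest["requirements.txt"] or manifest["pyproject.toml"]:
        languages.append("python")
    env_vars: list[str] = []
    env_file = root / ".env"
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            # Record names only — never values — so secrets never enter LLM context
            if "=" in line and not line.lstrip().startswith("#"):
                env_vars.append(line.split("=", 1)[0].strip())
    return {"languages": languages, "env_vars_available": env_vars,
            "file_manifest": manifest}
```

Because every step is a pure filesystem read, the output is fully reproducible for a given project tree — exactly the property that justifies a function node here.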
2. discover_servers — event_loop
Input: project_profile
Output: candidate_servers
Tools: web_search, fetch_url
Max iterations: 15
Queries the official MCP Registry API for each detected need, then performs broader web search for servers not yet in the registry.
Discovery strategy (ordered by reliability):
- MCP Registry API (primary source — OpenAPI spec):
GET https://registry.modelcontextprotocol.io/v0.1/servers
?search={query} # case-insensitive substring match on server name
&limit={n} # items per page (default 30, max 100)
&cursor={token} # pagination cursor from previous response
&version=latest # only latest versions (filters out old releases)
&updated_since={rfc3339} # only servers updated after this timestamp
Each server entry follows the ServerJSON schema: name (reverse-DNS format), description, version, packages (array with registryType [npm/pypi/oci], identifier, transport [stdio/streamable-http/sse], environmentVariables, packageArguments with isRequired/isSecret flags), repository (source + URL), and remotes (direct transport endpoints).
- npm search for Node.js servers: query `@modelcontextprotocol/*` and popular community packages
- PyPI search for Python servers: query `mcp-server-*`
- GitHub search for emerging servers: `"mcp server" + {technology keyword}`
- General web search for niche cases not in any registry
System prompt strategy:
You are an MCP server discovery specialist. Given a project profile,
identify MCP servers that match ACTUAL project needs.
For each candidate, determine:
- Package name and source (npm/PyPI/GitHub)
- Transport type (stdio or http)
- What credentials/env vars it requires
- What tools it provides
- Confidence score (0.0-1.0) based on match quality
RULES:
- Only recommend servers that match detected project signals
- Prefer servers from the official @modelcontextprotocol org
- Prefer servers with recent activity (updated within 6 months)
- Do NOT recommend servers the project doesn't need
- Do NOT recommend servers already in existing mcp_servers.json
Why event_loop: Discovery requires multi-step reasoning. The LLM might search the registry, find a promising result, search again with refined terms, compare two alternatives. A single LLM call cannot navigate the search→evaluate→refine cycle reliably.
3. evaluate_candidates — event_loop
Input: candidate_servers, project_profile
Output: ranked_recommendations, config_templates, credentials_needed
Tools: fetch_url, web_search
Max iterations: 20
This is the highest-value node — where the agent earns its cost. For each candidate server, the LLM:
- Reads the documentation — fetches the README/docs page via `fetch_url` and extracts configuration patterns from unstructured text
- Assesses maturity — checks GitHub stars, last commit date, open issues count, maintenance signals
- Generates a config template — produces the exact `mcp_servers.json` entry this server needs:
{
"name": "postgres-mcp",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-postgres"],
"env": {
"POSTGRES_CONNECTION_STRING": "{{cred.postgres_url}}"
},
"description": "PostgreSQL database tools — query, schema inspection"
}

- Maps credentials — identifies what secrets are needed, where to get them, and whether the project ALREADY has them:
{
"credential_id": "postgres_url",
"env_var": "POSTGRES_CONNECTION_STRING",
"description": "PostgreSQL connection URI",
"help_url": "https://www.postgresql.org/docs/current/libpq-connect.html",
"already_available": true, # detected POSTGRES_URL in .env
"source": ".env (as POSTGRES_URL)"
}

- Compares alternatives — when multiple servers serve the same need, explains tradeoffs and recommends one
System prompt includes the exact mcp_servers.json schema (from tool_registry.py's load_mcp_config()) and 2-3 real examples from existing Hive templates to ground the LLM in the correct format.
Context window management: Each server's docs are processed individually. The LLM reads one README (~2K-8K tokens), extracts config, moves to the next. This prevents context bloat from processing 5+ servers simultaneously.
4. approval_gate — event_loop, client_facing
Input: ranked_recommendations, config_templates, credentials_needed
Output: approved_servers, approved_credentials
Presents the full recommendation to the user in a clear, structured format:
I analyzed your project and recommend 3 MCP servers:
1. postgres-mcp (HIGH priority)
Why: Detected PostgreSQL in docker-compose.yml
Package: @modelcontextprotocol/server-postgres
Credentials: POSTGRES_CONNECTION_STRING (already in your .env)
Tools: query, describe_table, list_tables
2. slack-mcp (HIGH priority)
Why: Detected SLACK_BOT_TOKEN in .env
Package: @modelcontextprotocol/server-slack
Credentials: SLACK_BOT_TOKEN (already in your .env)
Tools: send_message, search_messages, list_channels
3. github-mcp (MEDIUM priority)
Why: Detected .github/ directory and GitHub Actions workflows
Package: @github/github-mcp-server
Credentials: GITHUB_TOKEN (needs to be created)
Tools: create_issue, list_prs, search_code, get_file_contents
I also evaluated @alternative/pg-server but it hasn't been
updated in 8 months and has 12 unresolved issues.
Install all 3? Or select specific ones?
The user can:
- Approve all — proceed with everything
- Approve selectively — "Install postgres and slack, skip github for now"
- Request alternatives — "Show me other options for Postgres"
- Reject all — graceful exit with report of what was found
HITL checkpoint #1 — Nothing is installed, no commands are executed, and no credentials are collected until the user explicitly approves. This is the equivalent of Claude Code's "Accept" button, applied at the plan level.
5. collect_credentials — event_loop, client_facing
Input: approved_credentials (only credentials not already available)
Output: collected_credentials
Tools: store_credential
Max iterations: 20
Only activated when credentials_needed > 0 — if all required credentials are already available in .env or the Hive credential store, this node is skipped entirely via conditional routing.
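The conditional routing described above reduces to a plain predicate over the approval_gate's output. A sketch (node names are the ones used in this document; this is not the framework's edge API):

```python
def route_after_approval(credentials_needed: list[dict]) -> str:
    """Mirror of edges 4 and 5: branch on whether any credentials remain to collect."""
    if len(credentials_needed) > 0:
        return "collect_credentials"   # at least one secret must be gathered first
    return "install_configure"         # everything already available — skip collection
```

Skipping collect_credentials entirely when nothing is needed keeps the happy path to a single conversational turn.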
Conversational credential collection:
I need 1 credential to complete the setup:
GITHUB_TOKEN (GitHub Personal Access Token)
Required scopes: repo, read:org
Create one here: https://github.com/settings/tokens/new
Please paste your token:
The user might say "I don't have one yet" — the agent responds with step-by-step instructions for creating the token. This conversational flexibility is why this is an event_loop with client_facing, not a rigid human_input node.
Security rules:
- Credential values are stored immediately via the `store_credential` tool (wraps `CredentialStore.save_credential()`)
- The agent NEVER echoes credential values back in response text
- Credentials are stored in Hive's encrypted store (`~/.hive/credentials/`), never in plain text
- `mcp_servers.json` uses template references (`{{cred.key}}`), never raw values
6. install_configure — event_loop
Input: approved_servers, config_templates, collected_credentials
Output: installation_results, mcp_config_path
Tools: execute_command, write_file, read_file
Max iterations: 30
For each approved server:
- Check if already installed — run the command with `--version` or `--help`
- Install if needed — execute the install command (`npm install -g`, `pip install`, `npx`, `uvx`)
- Verify installation — confirm the command is available after install
- Handle failures — if global install fails, try an `npx`/`uvx` wrapper (no global install needed)
After all installations, generate the final mcp_servers.json by merging:
- New server configs from templates
- Existing `mcp_servers.json` entries (preserved, never overwritten)
- Credential template references (`{{cred.key}}` format)
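The merge rule ("existing entries are preserved, never overwritten") can be sketched as a dict merge where the existing config wins on name collisions (hypothetical helper, not framework code):

```python
def merge_mcp_config(existing: dict, new_entries: dict) -> dict:
    """Merge newly generated server entries into an existing mcp_servers.json dict.
    Entries already present in the existing config always take precedence."""
    merged = dict(new_entries)
    merged.update(existing)  # existing config overwrites any colliding new entry
    return merged
```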
# Generated mcp_servers.json example
{
"postgres-mcp": {
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-postgres"],
"env": {
"POSTGRES_CONNECTION_STRING": "{{cred.postgres_url}}"
}
},
"slack-mcp": {
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-slack"],
"env": {
"SLACK_BOT_TOKEN": "{{cred.slack_token}}"
}
},
"github-mcp": {
"transport": "stdio",
"command": "npx",
"args": ["-y", "@github/github-mcp-server"],
"env": {
"GITHUB_TOKEN": "{{cred.github_token}}"
}
}
}

The agent installs with the user's explicit permission — the approval_gate already granted approval for these specific servers and commands. This mirrors how Claude Code executes commands after the user clicks "Accept." Hive agents run on the host machine; there is no sandbox limitation.
Why event_loop: Installation is inherently multi-step with conditional logic. The LLM needs to read error messages and adapt (global install failed → try npx; permission denied → suggest --prefix ~/.local; command not found after install → check PATH). A function node cannot handle this variety of failure modes.
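One of those adaptations — "global install failed → try npx" — can be sketched as a pure decision function. The command runner is injected so the logic stays testable offline; command names are illustrative:

```python
from typing import Callable

def ensure_installed(package: str, run: Callable[[list[str]], int]) -> list[str]:
    """Try a global npm install; on failure, fall back to an npx wrapper
    invocation that downloads the package on demand (no global install).
    `run(cmd)` executes a command and returns its exit code."""
    if run(["npm", "install", "-g", package]) == 0:
        return [package]             # globally installed binary is now on PATH
    return ["npx", "-y", package]    # fallback: on-demand wrapper command
```

In the real agent this decision is made by the LLM reading the actual error output, which is why the node is an event_loop rather than a hard-coded fallback chain like this one.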
7. validate_connections — function (deterministic)
Input: installation_results, mcp_config_path
Output: validation_results, has_failures
No LLM call. Programmatic validation using the existing MCPClient class from framework/runner/mcp_client.py:
async def validate_connections(ctx: NodeContext) -> NodeResult:
"""For each installed server, attempt connect() + list_tools()."""
config_path = Path(ctx.input_data["mcp_config_path"])
with open(config_path) as f:
servers = json.load(f)
validation = []
for name, server_config in servers.items():
try:
config = MCPServerConfig(name=name, **server_config)
client = MCPClient(config)
await client.connect() # 10s timeout (from mcp_client.py)
tools = await client.list_tools() # Discover available tools
await client.disconnect()
validation.append({
"name": name,
"status": "connected",
"tools_count": len(tools),
"tool_names": [t.name for t in tools],
})
except Exception as e:
validation.append({
"name": name,
"status": "failed",
"error": str(e),
})
has_failures = any(v["status"] == "failed" for v in validation)
return NodeResult(
success=True,
output={
"validation_results": validation,
"has_failures": has_failures,
},
)

Why function: Validation is connect → list_tools → report. The existing MCPClient handles all transport logic. The 10-second connection timeout provides a natural health check boundary. No LLM reasoning needed — this is pure I/O verification.
8. diagnose_fix — event_loop (feedback loop, max_node_visits: 3)
Input: validation_results (only failures)
Output: fix_applied
Tools: read_file, write_file, execute_command, web_search
Max iterations: 15
Max node visits: 3
This is the most innovative pattern in the agent — a self-healing loop that no existing template demonstrates. When validation fails, instead of just reporting the error, the agent:
- Reads the error — parses the failure message with full context
- Diagnoses the cause — reasons about what went wrong:
  - `ENOENT` → command not found; check PATH or try the full path
  - `Connection refused` → wrong port for HTTP server, or server not running
  - `ETIMEDOUT` → server takes too long to start; increase the timeout
  - `Authentication failed` → credential is wrong or expired
  - `Permission denied` → file permissions or missing scopes
- Applies a fix — modifies `mcp_servers.json` or re-runs an install command
- Routes back to `validate_connections` for retry
After 3 failed attempts, the node gives up gracefully and routes to report_results with detailed error context for the user.
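The error-to-cause mapping above can be sketched as a keyword heuristic (illustrative only — the actual node uses LLM reasoning over the full error context, not string matching; the category names are made up for this sketch):

```python
def classify_mcp_error(message: str) -> str:
    """First-pass triage: map a raw validation error to a likely cause."""
    rules = [
        ("enoent", "command_not_found"),
        ("connection refused", "wrong_port_or_not_running"),
        ("etimedout", "slow_startup"),
        ("authentication failed", "bad_credential"),
        ("permission denied", "missing_permissions"),
    ]
    lowered = message.lower()
    for needle, cause in rules:
        if needle in lowered:
            return cause
    return "unknown"  # hand off to full LLM diagnosis
```

A lookup table like this only covers the known-common cases; the value of the event_loop node is precisely the long tail this function returns "unknown" for.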
System prompt:
A configured MCP server failed validation. Diagnose and fix the issue.
PREVIOUS CONFIG:
{config_json}
ERROR:
{error_message}
Common fixes:
- "ENOENT" on command → command not found, try full path to binary
or alternative runner (npx instead of global, uvx instead of pip)
- "Connection refused" on HTTP → wrong port or server not running
- "ETIMEDOUT" → server startup too slow, check if dependencies are met
- "Authentication failed" → credential wrong or insufficient scopes
- STDIO with no output → command hangs, check if it needs args
Read the current mcp_servers.json, apply your fix, and write it back.
If the error is unfixable (missing system dependency, incompatible OS),
report that clearly instead of retrying.
Why max_node_visits: 3: First pass fixes obvious issues (wrong path). Second pass catches issues revealed by the first fix (e.g., fixing the path reveals a missing dependency). Third pass is the last chance. After that, diminishing returns — hand off to the user with full diagnostic context.
9. report_results — function (deterministic)
Input: validation_results, installation_results, approved_servers
Output: final_report
No LLM call. Compiles the final summary:
def build_report(ctx: NodeContext) -> NodeResult:
validation = ctx.memory.read("validation_results") or []
connected = [v for v in validation if v["status"] == "connected"]
failed = [v for v in validation if v["status"] == "failed"]
total_tools = sum(v.get("tools_count", 0) for v in connected)
report = {
"summary": {
"servers_installed": len(connected) + len(failed),
"servers_connected": len(connected),
"servers_failed": len(failed),
"total_tools_available": total_tools,
},
"connected_servers": [
{
"name": v["name"],
"tools": v["tool_names"],
"tools_count": v["tools_count"],
}
for v in connected
],
"failed_servers": [
{
"name": v["name"],
"error": v["error"],
"action": f"Manual setup needed: {v['error']}",
}
for v in failed
],
"config_path": str(ctx.memory.read("mcp_config_path")),
"next_steps": [],
}
for server in failed:
report["next_steps"].append(
f"Manually configure '{server['name']}': {server['error']}"
)
if not failed:
report["next_steps"].append(
"All servers connected. Your agent now has "
f"{total_tools} new tools available."
)
return NodeResult(success=True, output={"final_report": report})

Edge Specifications
| # | Source | Target | Condition | Notes |
|---|---|---|---|---|
| 1 | project_scanner | discover_servers | on_success | |
| 2 | discover_servers | evaluate_candidates | on_success | |
| 3 | evaluate_candidates | approval_gate | on_success | |
| 4 | approval_gate | collect_credentials | conditional: `len(credentials_needed) > 0` | Credentials required |
| 5 | approval_gate | install_configure | conditional: `len(credentials_needed) == 0` | No credentials needed |
| 6 | approval_gate | report_results | on_failure | User rejected recommendations |
| 7 | collect_credentials | install_configure | on_success | |
| 8 | install_configure | validate_connections | on_success | |
| 9 | validate_connections | report_results | conditional: `has_failures == false` | All servers healthy |
| 10 | validate_connections | diagnose_fix | conditional: `has_failures == true` | Some servers failed |
| 11 | diagnose_fix | validate_connections | on_success | Feedback loop — retry after fix |
| 12 | diagnose_fix | report_results | on_failure | Give up after 3 attempts |
| 13 | install_configure | report_results | on_failure | Installation failed catastrophically |
Advanced Framework Features Demonstrated
| Feature | Where | Existing Template Coverage |
|---|---|---|
| Function nodes (3x) | project_scanner, validate_connections, report_results | 0/4 templates |
| Feedback loop (max_node_visits > 1) | diagnose_fix → validate → retry (max 3 cycles) | 0/4 templates — first ever |
| Conditional routing (4x) | Credential branching, validation branching, failure exits | Only deep_research (basic) |
| on_failure edges (2x) | User rejection, catastrophic install failure | 0/4 templates |
| client_facing (2x) | approval_gate, collect_credentials | Templates use 1 max |
| HITL approval | approval_gate — nothing executes without permission | 0/4 templates |
| Credential integration | Detects existing creds, stores new ones via CredentialStore | 0/4 templates |
| Mixed node types | 3 function + 4 event_loop + 2 client_facing | All templates use event_loop only |
What Makes This Different From install-mcp / Smithery CLI
| Capability | install-mcp | Smithery CLI | Toolsmith Agent |
|---|---|---|---|
| Install a known server by name | Yes | Yes | Yes |
| Search a registry | No | Yes (curated list) | Yes (MCP Registry API + web search) |
| Analyze your project to determine needs | No | No | Yes — reads codebase, detects stack |
| Read a server's README and generate config | No | No | Yes — LLM reads docs, extracts config |
| Detect existing credentials and reuse them | No | No | Yes — checks .env, docker-compose, credential store |
| Validate installation end-to-end | No | Partial (basic check) | Yes — starts server, calls list_tools() |
| Diagnose failures and auto-fix | No | No | Yes — feedback loop with LLM reasoning |
| Compare alternatives and explain tradeoffs | No | No | Yes — evaluates maturity, maintenance, API quality |
| Human approval before any action | N/A | N/A | Yes — HITL gate before install |
The CLI tools operate at "you tell me what to install, I install it." The agent operates at "let me understand what you need, find the best options, explain my reasoning, and set everything up correctly."
Tool Usage
| Tool | Node | Purpose | Bundled? |
|---|---|---|---|
| `web_search` | discover_servers, diagnose_fix | Search MCP Registry, npm, PyPI, GitHub | tools.py |
| `fetch_url` | discover_servers, evaluate_candidates | Read MCP server READMEs and documentation | tools.py |
| `read_file` | project_scanner*, evaluate_candidates, diagnose_fix | Read project files, existing configs | tools.py |
| `write_file` | install_configure, diagnose_fix | Write mcp_servers.json | tools.py |
| `list_directory` | project_scanner* | Scan project structure | tools.py |
| `execute_command` | install_configure, diagnose_fix | Run npm/pip install commands, verify binaries | tools.py |
| `store_credential` | collect_credentials | Save secrets to encrypted Hive credential store | tools.py |
*project_scanner is a function node — it uses Python's pathlib directly, not LLM tool calls.
The bootstrapping problem: This agent installs MCP servers — but it can't USE MCP servers to do so (what if none are installed yet?). Therefore, all tools are bundled in a tools.py file shipped with the agent, using Python's standard library (subprocess.run, pathlib, httpx). This makes the agent self-contained — it works even when zero MCP servers are configured.
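A bundled tool in this style needs nothing beyond the standard library. A sketch of what the `execute_command` wrapper might look like (the real tools.py signature and return shape may differ — this is an assumption):

```python
import subprocess
import sys  # used in the usage example below

def execute_command(command: list[str], timeout: int = 120) -> dict:
    """Run a command and return a structured result the LLM can reason over.
    Pure stdlib: works even when zero MCP servers are configured."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Returning stdout and stderr verbatim matters here: the diagnose_fix node's whole job is reading raw error text, so the tool layer should never swallow it. Usage: `execute_command([sys.executable, "--version"])`.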
Technical Implementation Notes
MCP Registry API Integration
The discover_servers node queries the official MCP Registry:
import httpx
async def search_mcp_registry(query: str, limit: int = 30) -> list[dict]:
"""Query the official MCP Registry API (v0.1)."""
async with httpx.AsyncClient() as client:
response = await client.get(
"https://registry.modelcontextprotocol.io/v0.1/servers",
params={
"search": query,
"limit": limit, # max 100
"version": "latest", # skip old releases
},
)
data = response.json()
# Response: { servers: [...], metadata: { count, nextCursor } }
return data.get("servers", [])

Each server entry follows the ServerJSON schema: name (reverse-DNS), description, version, packages (with registryType, identifier, transport, environmentVariables, packageArguments), repository (source + URL), and remotes (direct transport config).
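When a query matches more than one page, the `metadata.nextCursor` token from each response feeds the next request. A sketch of the pagination loop, with the page fetcher injected so the logic is transport-agnostic (the response shape follows the API description above):

```python
import asyncio  # only needed to drive the coroutine in the usage example

async def search_all_servers(fetch_page) -> list[dict]:
    """Drain every result page by following metadata.nextCursor.
    `fetch_page(cursor)` is a coroutine returning one parsed response dict."""
    servers: list[dict] = []
    cursor = None
    while True:
        page = await fetch_page(cursor)
        servers.extend(page.get("servers", []))
        cursor = page.get("metadata", {}).get("nextCursor")
        if not cursor:  # no cursor in the last page — we are done
            return servers
```

In practice `fetch_page` would wrap the `httpx` call above, adding `cursor` to the query params when it is not None.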
Configuration Generation
The evaluate_candidates node generates entries compatible with both list and dict formats supported by load_mcp_config() in tool_registry.py:
{
"postgres-mcp": {
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-postgres"],
"env": {
"POSTGRES_CONNECTION_STRING": "{{cred.postgres_url}}"
},
"description": "PostgreSQL database tools (auto-configured by Toolsmith)"
}
}

Credential Detection Chain
The evaluate_candidates node checks multiple sources before asking the user:
- Hive's encrypted credential store (`~/.hive/credentials/`) — via `CredentialStore`
- Environment variables (`os.environ`)
- `.env` files in the project directory (names only, never reads values into LLM context)
- `docker-compose.yml` environment sections
- Existing `mcp_servers.json` env mappings
This leverages the existing CredentialStore, CredentialSpec, and CredentialStoreAdapter infrastructure already in the Hive codebase. Only credentials NOT found in any source are routed to the collect_credentials node.
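The chain reduces to checking sources in priority order and reporting only where a name was found. A sketch covering two of the sources (the store and docker-compose lookups are omitted; helper name and return values are illustrative):

```python
import os
from pathlib import Path

def find_credential_source(env_var: str, project_dir: Path) -> "str | None":
    """Return which source already provides `env_var`, or None if it must be
    collected from the user. Only names are inspected — values never leave disk."""
    if env_var in os.environ:
        return "environment"
    env_file = project_dir / ".env"
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            if line.split("=", 1)[0].strip() == env_var:
                return ".env"
    return None  # not found anywhere — route to collect_credentials
```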
Validation Using MCPClient
The validate_connections node uses the existing MCPClient class from framework/runner/mcp_client.py:
from framework.runner.mcp_client import MCPClient, MCPServerConfig
config = MCPServerConfig(
name="postgres-mcp",
transport="stdio",
command="npx",
args=["-y", "@modelcontextprotocol/server-postgres"],
env={"POSTGRES_CONNECTION_STRING": "postgresql://..."},
)
client = MCPClient(config)
await client.connect() # 10-second timeout (built-in)
tools = await client.list_tools() # Discover available tools
await client.disconnect() # Clean shutdown

The 10-second connection timeout in MCPClient._connect_stdio() provides a natural health check boundary.
Preference Memory
.toolsmith-history.json in the project directory tracks:
- Previously installed servers and their versions
- Servers the user declined (with reason and date)
- Configuration overrides the user made manually
- Last scan date
This prevents the agent from re-suggesting servers the user has explicitly rejected.
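A minimal sketch of the decline check, assuming a `declined` list of `{"name", "reason", "date"}` entries in the history file (the exact layout of `.toolsmith-history.json` is an assumption based on the fields listed above):

```python
import json
from pathlib import Path

def was_declined(history_path: Path, server_name: str) -> bool:
    """True if the user previously rejected this server — skip re-suggesting it."""
    if not history_path.exists():
        return False  # no history yet: nothing has been declined
    history = json.loads(history_path.read_text())
    return any(d.get("name") == server_name for d in history.get("declined", []))
```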
The Killer Demo
User: "I have a Hive agent that helps with customer support.
Analyze my project and set up any useful MCP servers."
Toolsmith: "Scanning your project..."
[project_scanner reads package.json, docker-compose.yml, .env, README]
Toolsmith: "I found the following in your project:
- Python 3.11 with FastAPI backend
- PostgreSQL database (detected from docker-compose.yml)
- Slack integration (SLACK_BOT_TOKEN found in .env)
- GitHub repository for issue tracking
- No existing MCP servers configured
Based on your customer support use case, I recommend:
1. postgres-mcp (priority: HIGH)
Query your database directly for customer records.
Package: @modelcontextprotocol/server-postgres
I detected your POSTGRES_URL in docker-compose.yml — I'll reuse it.
2. slack-mcp (priority: HIGH)
Send messages, search history, manage channels.
Package: @modelcontextprotocol/server-slack
I can reuse your existing SLACK_BOT_TOKEN from .env.
3. github-mcp (priority: MEDIUM)
Manage issues, read code, track PRs.
Package: @github/github-mcp-server
You'll need to create a GitHub Personal Access Token.
I evaluated 2 other Postgres servers but @modelcontextprotocol/server-postgres
has the most recent updates (3 days ago) and the cleanest API.
Shall I proceed with all 3?"
User: "Go ahead."
[collect_credentials asks only for GITHUB_TOKEN — the other 2 are already available]
Toolsmith: "I need 1 credential:
GITHUB_TOKEN (GitHub Personal Access Token)
Required scopes: repo, read:org
Create one here: https://github.com/settings/tokens/new"
User: [pastes token]
[install_configure runs npm installs, generates mcp_servers.json]
[validate_connections tests each server]
Toolsmith: "All 3 servers installed and verified:
✓ postgres-mcp: 8 tools (query, describe_table, list_tables, ...)
✓ slack-mcp: 12 tools (send_message, search_messages, ...)
✓ github-mcp: 15 tools (create_issue, list_prs, search_code, ...)
Your mcp_servers.json has been written to ./mcp_servers.json
All credentials stored securely in ~/.hive/credentials/
Your agent now has 35 new tools available."
From zero to 35 working tools in one conversation, with the agent doing all the thinking.
Built On Existing Hive Infrastructure
This agent is not built from scratch — it directly leverages infrastructure that already exists in the Hive codebase. Every integration point listed below has been verified against the current source.
MCP Client & Transport (core/framework/runner/mcp_client.py)
The validate_connections node reuses MCPClient directly — the same class that powers MCP tool execution throughout Hive:
- `MCPClient.connect()` — starts STDIO servers or connects to HTTP endpoints (10-second timeout built-in)
- `MCPClient.list_tools()` — discovers available tools from a running server
- `MCPClient.disconnect()` — clean shutdown
- `MCPServerConfig` dataclass — defines the exact config shape Toolsmith generates
No new transport code is needed. The validation node is a thin wrapper around existing infrastructure.
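As an illustration, the validation wrapper might look like the sketch below. The `MCPServerConfig` stand-in and the exact `connect`/`list_tools`/`disconnect` signatures are assumptions based on the description above, not the real Hive classes:

```python
from dataclasses import dataclass, field

@dataclass
class MCPServerConfig:
    """Stand-in for Hive's real MCPServerConfig dataclass (fields assumed)."""
    name: str
    command: str
    args: list = field(default_factory=list)
    env: dict = field(default_factory=dict)

def validate_server(client, config):
    """Connect, enumerate tools, disconnect; return a result dict that the
    diagnose_fix node can inspect on failure."""
    try:
        client.connect()                 # MCPClient has a 10s timeout built in
        tools = client.list_tools()
        return {"server": config.name, "ok": True, "tools": list(tools)}
    except Exception as exc:             # surface full diagnostics for retry
        return {"server": config.name, "ok": False, "error": str(exc)}
    finally:
        try:
            client.disconnect()          # clean shutdown even on failure
        except Exception:
            pass
```

The result dict is deliberately uniform so the downstream node can branch on `ok` without parsing exceptions.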
Tool Registry (core/framework/runner/tool_registry.py)
Toolsmith generates mcp_servers.json files that are consumed directly by:
- `load_mcp_config()` — loads server configs at agent startup (supports both list and dict formats)
- `register_mcp_server()` — registers servers with context injection for tool execution
- The closure-based tool executor pattern — wraps each MCP tool call with runtime context
The agent's output IS the registry's input. Zero format translation needed.
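For concreteness, a generated entry could look like the fragment below. The top-level `mcpServers` key and exact field layout are assumptions drawn from common MCP client conventions, not a confirmed Hive schema (the loader reportedly accepts both list and dict formats); the `{{cred.github_token}}` reference uses the credential template syntax described later:

```json
{
  "mcpServers": {
    "github-mcp": {
      "command": "npx",
      "args": ["-y", "@github/github-mcp-server"],
      "env": {
        "GITHUB_TOKEN": "{{cred.github_token}}"
      }
    }
  }
}
```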
Credential Store (tools/src/aden_tools/credentials/)
The collect_credentials node integrates with Hive's existing multi-backend credential system:
- `CredentialStore` — encrypted file storage at `~/.hive/credentials/` (Fernet AES-128-CBC + HMAC)
- `CredentialSpec` dataclass — tracks `env_var`, `tools`, `help_url`, `health_check_endpoint` per credential
- `CredentialStoreAdapter.validate_for_tools()` — pre-flight credential validation (reused by `validate_connections`)
- Template resolution — `{{cred.key}}` references in `mcp_servers.json` are resolved at runtime by `TemplateResolver`
The credential flow is already built. Toolsmith fills the gap between "credentials exist" and "user knows what credentials are needed."
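A minimal sketch of the template-resolution step, assuming `{{cred.key}}` placeholders as described above (`resolve_credentials` is a hypothetical stand-in for Hive's `TemplateResolver`, not its actual API):

```python
import re

def resolve_credentials(config_text, creds):
    """Replace {{cred.key}} placeholders with values from a credential store.
    Raises KeyError for a missing credential so validation fails loudly
    instead of passing an unexpanded template to the server."""
    def substitute(match):
        key = match.group(1)
        if key not in creds:
            raise KeyError(f"missing credential: {key}")
        return creds[key]
    return re.sub(r"\{\{cred\.([A-Za-z0-9_]+)\}\}", substitute, config_text)
```

Resolving at runtime rather than at write time is what keeps raw secret values out of `mcp_servers.json` and out of the LLM's context.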
Agent Runner (core/framework/runner/runner.py)
The `AgentRunner.__init__()` already auto-loads `tools.py` and `mcp_servers.json` at startup (lines 300-329). Toolsmith's output integrates seamlessly:
- Agent generates `mcp_servers.json` → Runner auto-loads it on next run
- Agent stores credentials via `CredentialStore` → Runner's `_validate_credentials()` (lines 331-409) validates them at startup
- No manual wiring needed between Toolsmith's output and any Hive agent's startup
Agent Builder MCP Server (core/framework/mcp/agent_builder_server.py)
The builder already has manual MCP management tools that Toolsmith automates:
- `add_mcp_server()` (line 2024) — validates and registers a server manually. Toolsmith does this automatically after reading docs
- `list_mcp_servers()` — lists registered servers. Toolsmith's `report_results` provides richer output
- `list_mcp_tools()` — lists tools from servers. Toolsmith's `validate_connections` does this programmatically
- Export writes `mcp_servers.json` if servers are registered — same format Toolsmith generates
Toolsmith is the intelligent layer on top of what the builder already does manually.
EventLoopNode (core/framework/graph/event_loop_node.py)
The diagnose_fix feedback loop relies on framework features already implemented:
- `max_node_visits` (from `NodeSpec`, line 210 of `node.py`) — controls retry limit, already supported
- `client_facing` mode (line 168 of `event_loop_node.py`) — enables `ask_user()` for interactive credential collection
- `LoopConfig.max_tool_calls_per_turn` (default 10) — controls tool call budget per LLM response
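To make the diagnose_fix feedback loop concrete, here is an illustrative Python sketch of the control flow that the graph expresses declaratively. `run_with_diagnosis` and its callbacks are hypothetical names for this sketch; the real framework drives this via edges and `max_node_visits`, not imperative code:

```python
def run_with_diagnosis(validate, diagnose_fix, max_node_visits=3):
    """Sketch of validate -> diagnose_fix -> validate retry semantics.
    validate() returns {"ok": bool, "error": str}; diagnose_fix(error)
    lets the LLM repair the config before the next attempt."""
    result = {"ok": False, "error": "validation never ran"}
    for _visit in range(max_node_visits):
        result = validate()
        if result["ok"]:
            return result                     # success: exit the loop early
        diagnose_fix(result["error"])         # LLM diagnoses and patches config
    return result                             # gave up after max_node_visits
```

The retry budget caps the loop so a persistently broken server degrades gracefully instead of spinning forever.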
Relationship to Existing Issues
This agent is the orchestrator that unifies capabilities proposed across 8 existing issues. It does not replace them — it connects them through LLM reasoning.
| Existing Issue | What It Proposes | How Toolsmith Relates |
|---|---|---|
| #4470 Hive Skills Watchtower | Central registry that auto-downloads skills | Toolsmith queries this as a source for Hive-native MCP servers when available |
| #4052 Agent Template Marketplace | Browse, search, one-click install of templates | Toolsmith could recommend templates that leverage discovered tools |
| #4374 hive doctor | Health check, config validation | Toolsmith's validate_connections overlaps — should reuse same MCPClient logic |
| #4201 / #3971 OpenAPI → MCP Generator | Generate MCP servers from OpenAPI specs | Future: when no existing server matches, Toolsmith could invoke this to generate one |
| #4381 Generic API Connector | Connect any REST API without custom integration | Fallback path for APIs without dedicated MCP servers |
| #3846 hive new CLI | Scaffolding for agents and tools | Toolsmith generates configs compatible with scaffolded projects |
| #3944 Intent-to-Template Matcher | Semantic matching of user needs to templates | Toolsmith's discover_servers uses similar intent→capability matching |
| #3009 Declarative Tool Definitions | Tools via YAML/JSON without Python | Toolsmith's output (mcp_servers.json) IS a declarative tool definition |
Key positioning: Those issues build individual infrastructure pieces (registry, marketplace, CLI, health check). Toolsmith is the agent that orchestrates those pieces together, end-to-end, through LLM reasoning. It is the user-facing experience that makes those infrastructure pieces accessible.
Security & Safety Policy
- No action without approval — The `approval_gate` is a hard HITL checkpoint. Nothing is installed, no commands are executed, and no credentials are collected until the user explicitly approves the plan. The agent presents what it wants to do and waits.
- Credentials never in LLM context — Credential values are stored immediately via `store_credential` and never echoed in response text. The `.env` scanner reads variable NAMES only. `mcp_servers.json` uses template references (`{{cred.key}}`), never raw values.
- Preserve existing config — Existing `mcp_servers.json` entries are never overwritten or removed. New servers are merged alongside existing ones.
- Command allowlisting — The `execute_command` tool should restrict execution to package manager commands (`npm`, `npx`, `pip`, `uv`, `uvx`, `pipx`) and MCP server binaries. No arbitrary shell execution.
- No credential store writes without user action — The agent only writes to `CredentialStore` after the user explicitly provides a credential in the `collect_credentials` node. It never generates, guesses, or fabricates credentials.
- Graceful degradation — If the agent cannot install or configure a server after 3 retry cycles, it reports the failure with full diagnostic context and suggests manual steps. It never silently drops a server.
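The command-allowlisting policy above can be sketched as a simple pre-execution check. This is a hypothetical helper, not existing Hive code; real enforcement would also vet arguments, flags, and resolved binary paths:

```python
import shlex

# Package managers permitted by the policy; MCP server binaries would be
# appended to this set as they are installed.
ALLOWED_BINARIES = {"npm", "npx", "pip", "uv", "uvx", "pipx"}

def is_command_allowed(command: str) -> bool:
    """Return True only if the command's executable is on the allowlist.
    shlex.split handles quoting so 'npm "install x"' parses correctly."""
    parts = shlex.split(command)
    return bool(parts) and parts[0] in ALLOWED_BINARIES
```

Checking only the first token is intentionally conservative for the sketch: anything that is not a known package manager is rejected outright.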
Non-Overlap Verification
| Existing Proposal | Scope | Overlap |
|---|---|---|
| #4205 Account Intelligence Agent | CRM account research using LLM | None — different domain |
| #4224 Vulnerability Auditor Agent | Security scanning of dependencies | None — different domain (but could USE Toolsmith to set up its MCP tools) |
| #4301 Lead Intelligence Agent | GitHub signals to CRM pipeline | None — different domain |
| #4286 Agent QA Pipeline | Meta-circular agent testing | None — different domain |
| #4470 Skills Watchtower | Infrastructure: registry/download system | Complementary — Toolsmith would consume this as a data source |
| #4374 hive doctor | Infrastructure: health check CLI | Complementary — shares validation logic, different purpose |
| #4052 Marketplace | Infrastructure: template browsing UI | Complementary — different interface (agent vs. UI) |
| #3846 hive new CLI | Infrastructure: scaffolding commands | Complementary — Toolsmith generates what scaffolding consumes |
Implementation Complexity
Estimated effort: Medium (9 nodes, but each has a clearly bounded scope)
Can be built incrementally:
Phase 1 — Core Discovery (MVP)
- `project_scanner` — file scanning logic
- `discover_servers` — MCP Registry API integration
- `evaluate_candidates` — README reading + config generation
- `approval_gate` — HITL presentation
- `install_configure` — install + write mcp_servers.json
- `validate_connections` — MCPClient health check
- `report_results` — summary
- Bundled `tools.py` with `execute_command`, `read_file`, `write_file`, `web_search`, `fetch_url`
Demonstrates: function nodes, client_facing, HITL, conditional routing, on_failure edges
Phase 2 — Credential Intelligence
- `collect_credentials` — conversational credential collection
- Credential detection from `.env`, `docker-compose.yml`, `CredentialStore`
- Conditional routing (skip when no creds needed)
- `store_credential` tool integration
Demonstrates: conditional branching, credential store integration, multi-node HITL
Phase 3 — Self-Healing Loop
- `diagnose_fix` — error diagnosis + feedback loop
- `max_node_visits: 3` on diagnose_fix
- Retry edges from diagnose_fix → validate_connections
- Preference memory (`.toolsmith-history.json`)
Demonstrates: feedback loops (max_node_visits > 1), on_failure recovery, adaptive LLM reasoning
Phase 4 — Ecosystem Integration
- Integration with Skills Watchtower ([Feature]: Auto-download whitelisted skills from Hive Skills Watchtower #4470) as server source
- Integration with `hive doctor` ([Feature]: `hive doctor` — Environment Health Check CLI #4374) for ongoing health monitoring
- Integration with OpenAPI → MCP generator ([Integration]: OpenAPI → MCP Tool Generator (MVP: GET-only + allowlist + auth injection) #4201) as fallback for missing servers
- Expose as `hive mcp setup` CLI command ([Feature]: Add `hive new` CLI for agent & tool scaffolding (no Claude dependency) #3846)
Goal Definition
goal:
id: mcp-toolsmith
name: MCP Toolsmith — Intelligent Server Setup
description: >
Analyze a software project, discover relevant MCP servers, install and
configure them with correct credentials, validate every connection, and
produce a working mcp_servers.json integrated with Hive's infrastructure.
success_criteria:
- id: project-analyzed
description: Generate a structured project profile with detected languages, frameworks, databases, and existing integrations
weight: 0.10
- id: servers-discovered
description: Identify at least 1 relevant MCP server from the MCP Registry or web search
weight: 0.15
- id: docs-comprehended
description: Read server documentation and generate correct mcp_servers.json config entries with proper transport, command, args, and env
weight: 0.20
- id: human-approved
description: Present recommendations to user with tradeoffs and receive explicit approval before any action
weight: 0.15
- id: credentials-handled
description: Detect existing credentials, collect missing ones securely, store via CredentialStore
weight: 0.10
- id: servers-installed
description: Successfully install approved MCP server packages
weight: 0.10
- id: connections-validated
description: Every installed server passes connect() + list_tools() validation
weight: 0.15
- id: self-healed
description: When validation fails, diagnose the error, fix the config, and retry (up to 3 attempts)
weight: 0.05
constraints:
- id: no-action-without-approval
description: Nothing is installed or executed until the user explicitly approves via approval_gate
type: safety
- id: no-credential-exposure
description: Credential values are never included in LLM output text or logged
type: security
- id: preserve-existing-config
description: Existing mcp_servers.json entries are never overwritten or removed
type: safety
- id: command-allowlist
description: execute_command restricted to package managers (npm, pip, npx, uvx) and MCP binaries
type: security
- id: graceful-degradation
description: Failed servers are reported with diagnostics, never silently dropped
      type: operational

Open Questions
- Should Toolsmith update agent node definitions? When a new MCP server is installed, the agent's nodes need their `tools` field updated to reference the new tools. Should Toolsmith modify `agent.json` automatically, or just generate the `mcp_servers.json` and let the developer wire it up?
- Scope of package managers at launch: The install node needs to handle npm, pip/uv, and potentially Docker and Go binaries. Should Phase 1 support all package managers, or start with npm+pip and expand?
- Offline/air-gapped environments: Enterprise users may have internal MCP servers not in the public registry. Should Toolsmith support a local catalog file or private registry URL as a discovery source?
- Ongoing monitoring: After initial setup, MCP servers can break (new versions, revoked credentials, changed APIs). Should a future version of Toolsmith support periodic re-validation, or should that be delegated to `hive doctor` ([Feature]: `hive doctor` — Environment Health Check CLI #4374)?