awslabs
diff --git a/‎INSTALL.md‎
Lines changed: 49 additions & 36 deletions b/‎INSTALL.md‎
Lines changed: 49 additions & 36 deletions
diff --git a/‎README.md‎
Lines changed: 22 additions & 14 deletions b/‎README.md‎
Lines changed: 22 additions & 14 deletions
diff --git a/‎eval_mcp/INSTALL.md‎
Lines changed: 49 additions & 36 deletions b/‎eval_mcp/INSTALL.md‎
Lines changed: 49 additions & 36 deletions
@@ -2,47 +2,35 @@
 
 Follow this when a user asks you to install eval-mcp into their IDE.
 
-## 1. Prerequisite check
+## Happy path (one command)
 
-User needs `uv`. Check: `command -v uv`. If missing, tell them to run `curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv` on macOS).
-
-## 2. Pick the IDE
-
-Detect from what's present in the user's home:
-
-| IDE | Check |
-|---|---|
-| Claude Code | `claude` on PATH or `~/.claude/` exists |
-| Cursor | `~/.cursor/` exists |
-| VS Code | `code` on PATH |
-| Kiro | `~/.kiro/` exists |
-| Codex CLI | `~/.codex/` exists |
-
-If multiple, ask the user which.
-
-## 3. Install
-
-Run the one-command install for that IDE:
+```bash
+uvx --from llm-evaluation-system eval-mcp install --yes
+```
 
-- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
-- **Cursor**: open `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==` (or edit `~/.cursor/mcp.json`)
-- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
-- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval` — same JSON shape as Cursor. Back up first, atomic write.
-- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
+The subcommand detects which IDEs are present (Claude Code, Kiro, VS Code,
+Cursor, Codex), registers `eval` in each, and warms the uvx cache so the
+first IDE launch isn't 60s of "disconnected". Then tell the user to
+restart whichever IDE(s) it touched.
 
-For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
+If the user wants a single IDE only:
+```bash
+uvx --from llm-evaluation-system eval-mcp install --ide claude-code --yes
+```
 
-Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI resolution on every MCP start (~20s cold start) and causes the MCP to show "disconnected" on first connect after any new release. Plain `llm-evaluation-system` is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`, etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
+Valid `--ide` values: `claude-code`, `kiro`, `vscode`, `cursor`, `codex`.
+Comma-separate for multiple.
 
-## 4. Warm the cache
+If `eval` is already registered, the subcommand skips with a message.
+Pass `--force` to overwrite.
 
-Run this so the user's first IDE launch is instant instead of waiting for uvx to fetch the package:
+## Prerequisite
 
-```bash
-uvx --from llm-evaluation-system eval-mcp --help >/dev/null
-```
+User needs `uv`. Check: `command -v uv`. If missing, tell them to run
+`curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv`
+on macOS).
 
-## 5. Ask about S3 team sharing
+## After install: S3 team sharing prompt
 
 > Want to share eval results, datasets, judges, and reports with teammates via a shared S3 bucket?
 
@@ -64,12 +52,37 @@ Suggested bucket name if they don't have a preference: `eval-mcp-$(aws sts get-c
 
 If **no**: skip. They can enable it later with `eval-mcp init <bucket>`.
 
-## 6. Tell the user to restart
+## Fallback: manual per-IDE install
+
+If `eval-mcp install` won't run for some reason (e.g. corporate network blocks PyPI
+mid-install and the cache warm-up fails), here are the same operations done by hand:
+
+- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
+- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
+- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval`:
+  ```json
+  {"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
+  ```
+- **Cursor**: same JSON shape, in `~/.cursor/mcp.json`. Or use the one-click deeplink: `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==`
+- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
+
+For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
+
+Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI
+resolution on every MCP start (~20s cold start) and causes the MCP to show
+"disconnected" on first connect after any new release. Plain `llm-evaluation-system`
+is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`,
+etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
+
+## Tell the user to restart
+
+The `install` subcommand prints per-IDE restart hints. Quick recap:
 
 - Claude Code / Cursor / VS Code: reload/restart the window or app
-- Kiro: save mcp.json — picks up automatically
+- Kiro: picks up `mcp.json` automatically — no restart needed
 - Codex: must restart the CLI
 
 ## Uninstall
 
-Reverse step 3 (remove the `eval` entry), restart.
+Per IDE: remove the `eval` entry from the relevant config (Claude Code:
+`claude mcp remove eval -s user`; others: edit the JSON/TOML). Then restart.
@@ -22,40 +22,48 @@ An LLM evaluation system where you describe what you want to evaluate in natural
 
 ### Install
 
-Pick your IDE and paste / click.
+One command — auto-detects which IDEs are on your machine (Claude Code,
+Kiro, VS Code, Cursor, Codex), asks which to configure, registers the
+MCP server in each, and warms the uvx cache:
 
-**Claude Code** — one CLI command:
+```bash
+uvx --from llm-evaluation-system eval-mcp install
+```
+
+Add `--yes` to skip prompts, `--ide claude-code` (or comma-separated)
+to target a specific subset, `--force` to overwrite an existing entry.
+Then restart your IDE.
+
+<details>
+<summary>Manual install (if you'd rather not use <code>eval-mcp install</code>)</summary>
+
+**Claude Code:**
 ```bash
 claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp
 ```
 
 **Cursor** — one-click deeplink: [Install eval-mcp in Cursor](cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==)
 
-**Kiro** — add to `~/.kiro/settings/mcp.json`:
+**Kiro** — `~/.kiro/settings/mcp.json`:
 ```json
-{
-  "mcpServers": {
-    "eval": {
-      "command": "uvx",
-      "args": ["--from", "llm-evaluation-system", "eval-mcp"]
-    }
-  }
-}
+{"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
 ```
 
-**Codex CLI** — add to `~/.codex/config.toml`, then restart Codex:
+**Codex** — `~/.codex/config.toml`:
 ```toml
 [mcp_servers.eval]
 command = "uvx"
 args = ["--from", "llm-evaluation-system", "eval-mcp"]
 ```
 
-**VS Code** (with GitHub Copilot MCP) — one CLI command:
+**VS Code:**
 ```bash
 code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'
 ```
 
-Using a coding agent to install? Point it at [INSTALL.md](INSTALL.md) — it handles the config edit and asks about optional S3 team sharing.
+</details>
+
+Using a coding agent to install? Point it at [INSTALL.md](INSTALL.md) — it just runs `eval-mcp install --yes` and asks about optional S3 team sharing.
 
 ### Upgrading
 
 
@@ -2,47 +2,35 @@
 
 Follow this when a user asks you to install eval-mcp into their IDE.
 
-## 1. Prerequisite check
+## Happy path (one command)
 
-User needs `uv`. Check: `command -v uv`. If missing, tell them to run `curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv` on macOS).
-
-## 2. Pick the IDE
-
-Detect from what's present in the user's home:
-
-| IDE | Check |
-|---|---|
-| Claude Code | `claude` on PATH or `~/.claude/` exists |
-| Cursor | `~/.cursor/` exists |
-| VS Code | `code` on PATH |
-| Kiro | `~/.kiro/` exists |
-| Codex CLI | `~/.codex/` exists |
-
-If multiple, ask the user which.
-
-## 3. Install
-
-Run the one-command install for that IDE:
+```bash
+uvx --from llm-evaluation-system eval-mcp install --yes
+```
 
-- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
-- **Cursor**: open `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==` (or edit `~/.cursor/mcp.json`)
-- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
-- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval` — same JSON shape as Cursor. Back up first, atomic write.
-- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
+The subcommand detects which IDEs are present (Claude Code, Kiro, VS Code,
+Cursor, Codex), registers `eval` in each, and warms the uvx cache so the
+first IDE launch isn't 60s of "disconnected". Then tell the user to
+restart whichever IDE(s) it touched.
 
-For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
+If the user wants a single IDE only:
+```bash
+uvx --from llm-evaluation-system eval-mcp install --ide claude-code --yes
+```
 
-Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI resolution on every MCP start (~20s cold start) and causes the MCP to show "disconnected" on first connect after any new release. Plain `llm-evaluation-system` is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`, etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
+Valid `--ide` values: `claude-code`, `kiro`, `vscode`, `cursor`, `codex`.
+Comma-separate for multiple.
 
-## 4. Warm the cache
+If `eval` is already registered, the subcommand skips with a message.
+Pass `--force` to overwrite.
 
-Run this so the user's first IDE launch is instant instead of waiting for uvx to fetch the package:
+## Prerequisite
 
-```bash
-uvx --from llm-evaluation-system eval-mcp --help >/dev/null
-```
+User needs `uv`. Check: `command -v uv`. If missing, tell them to run
+`curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv`
+on macOS).
 
-## 5. Ask about S3 team sharing
+## After install: S3 team sharing prompt
 
 > Want to share eval results, datasets, judges, and reports with teammates via a shared S3 bucket?
 
@@ -64,12 +52,37 @@ Suggested bucket name if they don't have a preference: `eval-mcp-$(aws sts get-c
 
 If **no**: skip. They can enable it later with `eval-mcp init <bucket>`.
 
-## 6. Tell the user to restart
+## Fallback: manual per-IDE install
+
+If `eval-mcp install` won't run for some reason (e.g. corporate network blocks PyPI
+mid-install and the cache warm-up fails), here are the same operations done by hand:
+
+- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
+- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
+- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval`:
+  ```json
+  {"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
+  ```
+- **Cursor**: same JSON shape, in `~/.cursor/mcp.json`. Or use the one-click deeplink: `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==`
+- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
+
+For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
+
+Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI
+resolution on every MCP start (~20s cold start) and causes the MCP to show
+"disconnected" on first connect after any new release. Plain `llm-evaluation-system`
+is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`,
+etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
+
+## Tell the user to restart
+
+The `install` subcommand prints per-IDE restart hints. Quick recap:
 
 - Claude Code / Cursor / VS Code: reload/restart the window or app
-- Kiro: save mcp.json — picks up automatically
+- Kiro: picks up `mcp.json` automatically — no restart needed
 - Codex: must restart the CLI
 
 ## Uninstall
 
-Reverse step 3 (remove the `eval` entry), restart.
+Per IDE: remove the `eval` entry from the relevant config (Claude Code:
+`claude mcp remove eval -s user`; others: edit the JSON/TOML). Then restart.