Skip to content

Commit 3a211b6

Browse files
authored
Merge pull request #69 from awslabs/feat/install-subcommand
feat(install): one-command auto-detecting IDE installer
2 parents 0cb4577 + 7cbec8a commit 3a211b6

24 files changed

Lines changed: 1672 additions & 114 deletions

INSTALL.md

Lines changed: 49 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -2,47 +2,35 @@
22

33
Follow this when a user asks you to install eval-mcp into their IDE.
44

5-
## 1. Prerequisite check
5+
## Happy path (one command)
66

7-
User needs `uv`. Check: `command -v uv`. If missing, tell them to run `curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv` on macOS).
8-
9-
## 2. Pick the IDE
10-
11-
Detect from what's present in the user's home:
12-
13-
| IDE | Check |
14-
|---|---|
15-
| Claude Code | `claude` on PATH or `~/.claude/` exists |
16-
| Cursor | `~/.cursor/` exists |
17-
| VS Code | `code` on PATH |
18-
| Kiro | `~/.kiro/` exists |
19-
| Codex CLI | `~/.codex/` exists |
20-
21-
If multiple, ask the user which.
22-
23-
## 3. Install
24-
25-
Run the one-command install for that IDE:
7+
```bash
8+
uvx --from llm-evaluation-system eval-mcp install --yes
9+
```
2610

27-
- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
28-
- **Cursor**: open `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==` (or edit `~/.cursor/mcp.json`)
29-
- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
30-
- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval` — same JSON shape as Cursor. Back up first, atomic write.
31-
- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
11+
The subcommand detects which IDEs are present (Claude Code, Kiro, VS Code,
12+
Cursor, Codex), registers `eval` in each, and warms the uvx cache so the
13+
first IDE launch isn't 60s of "disconnected". Then tell the user to
14+
restart whichever IDE(s) it touched.
3215

33-
For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
16+
If the user wants a single IDE only:
17+
```bash
18+
uvx --from llm-evaluation-system eval-mcp install --ide claude-code --yes
19+
```
3420

35-
Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI resolution on every MCP start (~20s cold start) and causes the MCP to show "disconnected" on first connect after any new release. Plain `llm-evaluation-system` is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`, etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
21+
Valid `--ide` values: `claude-code`, `kiro`, `vscode`, `cursor`, `codex`.
22+
Comma-separate for multiple.
3623

37-
## 4. Warm the cache
24+
If `eval` is already registered, the subcommand skips with a message.
25+
Pass `--force` to overwrite.
3826

39-
Run this so the user's first IDE launch is instant instead of waiting for uvx to fetch the package:
27+
## Prerequisite
4028

41-
```bash
42-
uvx --from llm-evaluation-system eval-mcp --help >/dev/null
43-
```
29+
User needs `uv`. Check: `command -v uv`. If missing, tell them to run
30+
`curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv`
31+
on macOS).
4432

45-
## 5. Ask about S3 team sharing
33+
## After install: S3 team sharing prompt
4634

4735
> Want to share eval results, datasets, judges, and reports with teammates via a shared S3 bucket?
4836
@@ -64,12 +52,37 @@ Suggested bucket name if they don't have a preference: `eval-mcp-$(aws sts get-c
6452

6553
If **no**: skip. They can enable it later with `eval-mcp init <bucket>`.
6654

67-
## 6. Tell the user to restart
55+
## Fallback: manual per-IDE install
56+
57+
If `eval-mcp install` won't run for some reason (e.g. corporate network blocks PyPI
58+
mid-install and the cache warm-up fails), here are the same operations done by hand:
59+
60+
- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
61+
- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
62+
- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval`:
63+
```json
64+
{"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
65+
```
66+
- **Cursor**: same JSON shape, in `~/.cursor/mcp.json`. Or use the one-click deeplink: `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==`
67+
- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
68+
69+
For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
70+
71+
Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI
72+
resolution on every MCP start (~20s cold start) and causes the MCP to show
73+
"disconnected" on first connect after any new release. Plain `llm-evaluation-system`
74+
is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`,
75+
etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
76+
77+
## Tell the user to restart
78+
79+
The `install` subcommand prints per-IDE restart hints. Quick recap:
6880

6981
- Claude Code / Cursor / VS Code: reload/restart the window or app
70-
- Kiro: save mcp.json — picks up automatically
82+
- Kiro: picks up `mcp.json` automatically — no restart needed
7183
- Codex: must restart the CLI
7284

7385
## Uninstall
7486

75-
Reverse step 3 (remove the `eval` entry), restart.
87+
Per IDE: remove the `eval` entry from the relevant config (Claude Code:
88+
`claude mcp remove eval -s user`; others: edit the JSON/TOML). Then restart.

README.md

Lines changed: 22 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -22,40 +22,48 @@ An LLM evaluation system where you describe what you want to evaluate in natural
2222

2323
### Install
2424

25-
Pick your IDE and paste / click.
25+
One command — auto-detects which IDEs are on your machine (Claude Code,
26+
Kiro, VS Code, Cursor, Codex), asks which to configure, registers the
27+
MCP server in each, and warms the uvx cache:
2628

27-
**Claude Code** — one CLI command:
29+
```bash
30+
uvx --from llm-evaluation-system eval-mcp install
31+
```
32+
33+
Add `--yes` to skip prompts, `--ide claude-code` (or comma-separated)
34+
to target a specific subset, `--force` to overwrite an existing entry.
35+
Then restart your IDE.
36+
37+
<details>
38+
<summary>Manual install (if you'd rather not use <code>eval-mcp install</code>)</summary>
39+
40+
**Claude Code:**
2841
```bash
2942
claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp
3043
```
3144

3245
**Cursor** — one-click deeplink: [Install eval-mcp in Cursor](cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==)
3346

34-
**Kiro**add to `~/.kiro/settings/mcp.json`:
47+
**Kiro**`~/.kiro/settings/mcp.json`:
3548
```json
36-
{
37-
"mcpServers": {
38-
"eval": {
39-
"command": "uvx",
40-
"args": ["--from", "llm-evaluation-system", "eval-mcp"]
41-
}
42-
}
43-
}
49+
{"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
4450
```
4551

46-
**Codex CLI**add to `~/.codex/config.toml`, then restart Codex:
52+
**Codex**`~/.codex/config.toml`:
4753
```toml
4854
[mcp_servers.eval]
4955
command = "uvx"
5056
args = ["--from", "llm-evaluation-system", "eval-mcp"]
5157
```
5258

53-
**VS Code** (with GitHub Copilot MCP) — one CLI command:
59+
**VS Code:**
5460
```bash
5561
code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'
5662
```
5763

58-
Using a coding agent to install? Point it at [INSTALL.md](INSTALL.md) — it handles the config edit and asks about optional S3 team sharing.
64+
</details>
65+
66+
Using a coding agent to install? Point it at [INSTALL.md](INSTALL.md) — it just runs `eval-mcp install --yes` and asks about optional S3 team sharing.
5967

6068
### Upgrading
6169

eval_mcp/INSTALL.md

Lines changed: 49 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -2,47 +2,35 @@
22

33
Follow this when a user asks you to install eval-mcp into their IDE.
44

5-
## 1. Prerequisite check
5+
## Happy path (one command)
66

7-
User needs `uv`. Check: `command -v uv`. If missing, tell them to run `curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv` on macOS).
8-
9-
## 2. Pick the IDE
10-
11-
Detect from what's present in the user's home:
12-
13-
| IDE | Check |
14-
|---|---|
15-
| Claude Code | `claude` on PATH or `~/.claude/` exists |
16-
| Cursor | `~/.cursor/` exists |
17-
| VS Code | `code` on PATH |
18-
| Kiro | `~/.kiro/` exists |
19-
| Codex CLI | `~/.codex/` exists |
20-
21-
If multiple, ask the user which.
22-
23-
## 3. Install
24-
25-
Run the one-command install for that IDE:
7+
```bash
8+
uvx --from llm-evaluation-system eval-mcp install --yes
9+
```
2610

27-
- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
28-
- **Cursor**: open `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==` (or edit `~/.cursor/mcp.json`)
29-
- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
30-
- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval` — same JSON shape as Cursor. Back up first, atomic write.
31-
- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
11+
The subcommand detects which IDEs are present (Claude Code, Kiro, VS Code,
12+
Cursor, Codex), registers `eval` in each, and warms the uvx cache so the
13+
first IDE launch isn't 60s of "disconnected". Then tell the user to
14+
restart whichever IDE(s) it touched.
3215

33-
For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
16+
If the user wants a single IDE only:
17+
```bash
18+
uvx --from llm-evaluation-system eval-mcp install --ide claude-code --yes
19+
```
3420

35-
Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI resolution on every MCP start (~20s cold start) and causes the MCP to show "disconnected" on first connect after any new release. Plain `llm-evaluation-system` is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`, etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
21+
Valid `--ide` values: `claude-code`, `kiro`, `vscode`, `cursor`, `codex`.
22+
Comma-separate for multiple.
3623

37-
## 4. Warm the cache
24+
If `eval` is already registered, the subcommand skips with a message.
25+
Pass `--force` to overwrite.
3826

39-
Run this so the user's first IDE launch is instant instead of waiting for uvx to fetch the package:
27+
## Prerequisite
4028

41-
```bash
42-
uvx --from llm-evaluation-system eval-mcp --help >/dev/null
43-
```
29+
User needs `uv`. Check: `command -v uv`. If missing, tell them to run
30+
`curl -LsSf https://astral.sh/uv/install.sh | sh` (or `brew install uv`
31+
on macOS).
4432

45-
## 5. Ask about S3 team sharing
33+
## After install: S3 team sharing prompt
4634

4735
> Want to share eval results, datasets, judges, and reports with teammates via a shared S3 bucket?
4836
@@ -64,12 +52,37 @@ Suggested bucket name if they don't have a preference: `eval-mcp-$(aws sts get-c
6452

6553
If **no**: skip. They can enable it later with `eval-mcp init <bucket>`.
6654

67-
## 6. Tell the user to restart
55+
## Fallback: manual per-IDE install
56+
57+
If `eval-mcp install` won't run for some reason (e.g. corporate network blocks PyPI
58+
mid-install and the cache warm-up fails), here are the same operations done by hand:
59+
60+
- **Claude Code**: `claude mcp add eval -s user -- uvx --from llm-evaluation-system eval-mcp`
61+
- **VS Code**: `code --add-mcp '{"name":"eval","command":"uvx","args":["--from","llm-evaluation-system","eval-mcp"]}'`
62+
- **Kiro**: merge into `~/.kiro/settings/mcp.json` under `mcpServers.eval`:
63+
```json
64+
{"mcpServers": {"eval": {"command": "uvx", "args": ["--from", "llm-evaluation-system", "eval-mcp"]}}}
65+
```
66+
- **Cursor**: same JSON shape, in `~/.cursor/mcp.json`. Or use the one-click deeplink: `cursor://anysphere.cursor-deeplink/mcp/install?name=eval&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyItLWZyb20iLCJsbG0tZXZhbHVhdGlvbi1zeXN0ZW0iLCJldmFsLW1jcCJdfQ==`
67+
- **Codex**: append to `~/.codex/config.toml`: `[mcp_servers.eval]\ncommand = "uvx"\nargs = ["--from", "llm-evaluation-system", "eval-mcp"]`
68+
69+
For any manual JSON/TOML edit: back up first, merge — never clobber, atomic write.
70+
71+
Do **not** use `llm-evaluation-system@latest` in the IDE config — it forces a PyPI
72+
resolution on every MCP start (~20s cold start) and causes the MCP to show
73+
"disconnected" on first connect after any new release. Plain `llm-evaluation-system`
74+
is what's documented across the MCP ecosystem (reference servers, `mcp-atlassian`,
75+
etc.). Upgrade separately via `uv cache clean llm-evaluation-system`.
76+
77+
## Tell the user to restart
78+
79+
The `install` subcommand prints per-IDE restart hints. Quick recap:
6880

6981
- Claude Code / Cursor / VS Code: reload/restart the window or app
70-
- Kiro: save mcp.json — picks up automatically
82+
- Kiro: picks up `mcp.json` automatically — no restart needed
7183
- Codex: must restart the CLI
7284

7385
## Uninstall
7486

75-
Reverse step 3 (remove the `eval` entry), restart.
87+
Per IDE: remove the `eval` entry from the relevant config (Claude Code:
88+
`claude mcp remove eval -s user`; others: edit the JSON/TOML). Then restart.

0 commit comments

Comments
 (0)