Commit 0e7a3df

WebMCP Evals UI Sidecar (#24)
* Adding ui for running evals locally and on the website
* Organizing css styles per component
1 parent 3291005 commit 0e7a3df

38 files changed: +5777 -127 lines changed

evals-cli/.gitignore

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@ node_modules/
 dist/
 .DS_Store
 .env
-report.html
+report*.html

evals-cli/README.md

Lines changed: 15 additions & 2 deletions
@@ -15,11 +15,12 @@ The project is structured as follows:
 - `src/`: Source code.
 - `bin/runevals.ts`: Entry point that loads tool schemas from a JSON file and runs the evaluation loop.
 - `bin/webmcpevals.ts`: Entry point that loads tool schemas live from a browser page via the WebMCP API.
+- `bin/serve.ts`: Entry point that starts the WebMCP Evals Web UI sidecar.
 - `backend/`: Implementation of LLM backends (e.g., `googleai.ts`, `ollama.ts`).
 - `browser/`: Browser automation for WebMCP tool discovery (`webmcp.ts`).
 - `types/`: TypeScript definitions for tools, messages, and evaluations.
 - `examples/`: Detailed examples and test data.
-- `travel/`: A travel agent example containing `schema.json` and `evals.json`.
+- `travel/`: A travel agent example containing `schema.json` and `evals.json`.
 
 ## Prerequisites
 
@@ -59,7 +60,7 @@ The project is structured as follows:
 Loads tool schemas from a local JSON file.
 
 ```bash
-node dist/bin/runevals.js --model=gemini-2.5-flash --tools=examples/travel/schema.json --evals=examples/travel/evals.json
+node dist/bin/runevals.js --tools=examples/travel/schema.json --evals=examples/travel/evals.json
 ```
 
 With Ollama:
@@ -90,6 +91,18 @@ node dist/bin/webmcpevals.js --url=https://example.com/my-webmcp-app --evals=exa
 | `--backend` | No | `gemini` | Backend to use (`gemini` or `ollama`) |
 | `--model` | No | `gemini-2.5-flash` | Model name |
 
+### `serve` — WebMCP Evals UI sidecar
+
+Starts a local web server to provide a visual interface for configuring and running evaluations.
+
+```bash
+node dist/bin/serve.js --port=8080
+```
+
+| Argument | Required | Default | Description |
+| -------- | -------- | ------- | ------------------------- |
+| `--port` | No | `8080` | Port to run the server on |
+
 ## Argument Constraints
 
 You can use constraint operators to match argument values flexibly. A constraint object is identified when **all** its keys start with `$`.
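
The README context above says a constraint object is recognized when all of its keys start with `$`. The diff does not show how the project implements this check, so the following is only a minimal sketch of that rule. The function name `isConstraintObject` and the operator key `$hypotheticalOp` are illustrative and not taken from the codebase. Treating an empty object as a literal value rather than a constraint is an assumption.

```typescript
// Sketch only: `isConstraintObject` is a hypothetical helper, not the project's API.
// Returns true when `value` is a plain (non-array) object whose keys all start with "$".
function isConstraintObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const keys = Object.keys(value);
  // Assumption: an empty object has no operators, so it is treated as a literal value.
  return keys.length > 0 && keys.every((key) => key.startsWith("$"));
}

// `$hypotheticalOp` stands in for whichever operators the README defines.
const constraintLike = isConstraintObject({ $hypotheticalOp: "Paris" });
// Mixing a "$" key with a plain key makes the object a literal value.
const mixedKeys = isConstraintObject({ $hypotheticalOp: "Paris", city: "Paris" });
```

Under this rule an object that mixes `$`-prefixed and plain keys is compared as an ordinary argument value, which keeps literal arguments that happen to contain a `$` key from being partially interpreted as operators.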
