Commit bd69c39

docs
1 parent eabc6d6 commit bd69c39

1 file changed

Lines changed: 20 additions & 5 deletions

evals/README.md
@@ -1,6 +1,6 @@
 # Language Model Evaluations
 
-Works with `v1.0+`
+Deprecated in `v2.0+`. Works with `v1.0+`
 
 Spice can be used to both run language models but also to evaluate their performance on specific tasks.
 
@@ -24,12 +24,27 @@ spice run
 2. Run an evaluation against `my_model`. This will take a moment to complete.
 
 ```shell
-spice eval tetris --model my_model
+curl -XPOST "http://localhost:8090/v1/evals/tetris" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "my_model"
+  }'
 ```
 
-```text
-ID CREATEDAT DATASET MODEL STATUS SCORERS METRICS
-15b8c5351cff98d96db28b8c76ad19dc 2024-12-30T06:14:54 small_tetris my_model Completed [match] map[match/mean:0.375]
+```json
+[
+  {
+    "id": "15b8c5351cff98d96db28b8c76ad19dc",
+    "created_at": "2024-12-30T06:14:54",
+    "dataset": "small_tetris",
+    "model": "my_model",
+    "status": "Completed",
+    "scorers": ["match"],
+    "metrics": {
+      "match/mean": 0.375
+    }
+  }
+]
 ```
 
 3. Inspect the results
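
Because this change replaces the tabular CLI output with a JSON array, scripts that read eval results need to parse JSON rather than split columns. The following is a minimal Python sketch of that parsing step, assuming the response has exactly the shape shown in the diff. The function name `summarize_eval_runs` and the constant `SAMPLE_RESPONSE` are hypothetical, and no HTTP request is made; the sample body is copied from the diff rather than fetched from a running server.

```python
import json

# Sample body copied from the JSON response in the diff.
# Assumption: a live server returns the same field names.
SAMPLE_RESPONSE = """
[
  {
    "id": "15b8c5351cff98d96db28b8c76ad19dc",
    "created_at": "2024-12-30T06:14:54",
    "dataset": "small_tetris",
    "model": "my_model",
    "status": "Completed",
    "scorers": ["match"],
    "metrics": {"match/mean": 0.375}
  }
]
"""


def summarize_eval_runs(body: str) -> list[tuple[str, str, dict]]:
    """Return (id, status, metrics) for each eval run in a JSON response body."""
    runs = json.loads(body)
    # Use .get for metrics so a run without computed metrics yields an empty dict.
    return [(run["id"], run["status"], run.get("metrics", {})) for run in runs]


if __name__ == "__main__":
    for run_id, status, metrics in summarize_eval_runs(SAMPLE_RESPONSE):
        print(run_id, status, metrics)
```

In practice the `body` argument would be the text returned by the `curl` call in step 2, for example captured to a file and read back in.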
