Commit 39430fb

Merge pull request #33 from agentevals-dev/peterj/customevals
add support for custom graders (eval metrics)
2 parents b40c5ca + a94a5e9 commit 39430fb

60 files changed

Lines changed: 6073 additions & 3272 deletions

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+name: Publish evaluator SDK
+
+on:
+  push:
+    tags:
+      - 'evaluator-sdk-v*'
+  workflow_dispatch:
+    inputs:
+      tag:
+        description: 'Release tag (e.g. evaluator-sdk-v0.1.0)'
+        required: true
+
+permissions:
+  contents: read
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Verify tag matches pyproject.toml version
+        run: |
+          REF=${{ github.event.inputs.tag || github.ref_name }}
+          TOML_VERSION=$(grep '^version' packages/evaluator-sdk-py/pyproject.toml | cut -d'"' -f2)
+          TAG_VERSION=${REF#evaluator-sdk-v}
+          if [ "$TOML_VERSION" != "$TAG_VERSION" ]; then
+            echo "Error: $REF does not match packages/evaluator-sdk-py/pyproject.toml version $TOML_VERSION"
+            exit 1
+          fi
+
+      - name: Install build tool
+        run: pip install build
+
+      - name: Build wheel and sdist
+        run: python -m build packages/evaluator-sdk-py/ --outdir dist/
+
+      - uses: actions/upload-artifact@v7
+        with:
+          name: evaluator-sdk-dist
+          path: dist/
+
+  publish:
+    needs: build
+    runs-on: ubuntu-latest
+    environment: pypi
+    permissions:
+      id-token: write
+
+    steps:
+      - uses: actions/download-artifact@v8
+        with:
+          name: evaluator-sdk-dist
+          path: dist/
+
+      - uses: pypa/gh-action-pypi-publish@release/v1
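
The "Verify tag matches pyproject.toml version" step in the workflow above can be tried locally before pushing a tag. The following is a minimal Python sketch of the same comparison, not part of the commit: the function names `tag_version`, `toml_version`, and `versions_match` are hypothetical, and the sample `pyproject.toml` text is made up for illustration. It mirrors the shell logic (strip the `evaluator-sdk-v` prefix, take the quoted value from the first line starting with `version`) rather than parsing TOML properly.

```python
TAG_PREFIX = "evaluator-sdk-v"  # same prefix as the workflow's tag filter


def tag_version(ref: str) -> str:
    """Mirror the shell expansion ${REF#evaluator-sdk-v}: drop the prefix if present."""
    return ref.removeprefix(TAG_PREFIX)


def toml_version(pyproject_text: str) -> str:
    """Mirror grep '^version' | cut -d'"' -f2, using the first matching line only."""
    for line in pyproject_text.splitlines():
        if line.startswith("version"):
            parts = line.split('"')
            if len(parts) >= 2:
                return parts[1]  # text between the first pair of double quotes
    raise ValueError("no quoted version line found")


def versions_match(ref: str, pyproject_text: str) -> bool:
    """Return True when the tag's version equals the pyproject.toml version."""
    return tag_version(ref) == toml_version(pyproject_text)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = '[project]\nname = "example-sdk"\nversion = "0.1.0"\n'
    print(versions_match("evaluator-sdk-v0.1.0", sample))  # True
    print(versions_match("evaluator-sdk-v0.2.0", sample))  # False
```

Like the `grep`-based step, this matches any line that begins with `version`, whichever table it sits in. A TOML parser such as the standard-library `tomllib` (Python 3.11+) would be a sturdier choice if the file ever contains more than one such line.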

README.md

Lines changed: 2 additions & 2 deletions
@@ -58,7 +58,7 @@ Point any OTel-instrumented agent at the receiver. No SDK, no code changes:

 ```bash
 # Terminal 1
-agentevals serve --dev
+uv run agentevals serve --dev

 # Terminal 2
 export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
@@ -116,7 +116,7 @@ agentevals serve

 ```bash
 uv run agentevals serve --dev  # Terminal 1
-cd ui && npm run dev  # Terminal 2 → http://localhost:5173
+cd ui && npm install && npm run dev  # Terminal 2 → http://localhost:5173
 ```

 Upload traces and eval sets, select metrics, and view results with interactive span trees. Live-streamed traces appear in the "Local Dev" tab, grouped by session ID.
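
The first README hunk points an agent at the receiver through `OTEL_EXPORTER_OTLP_ENDPOINT`. As a rough illustration of what that variable means to an OTLP/HTTP exporter, the sketch below resolves the traces URL the way the OpenTelemetry exporter specification describes: a signal-specific `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` is used as-is, otherwise `v1/traces` is appended to the base endpoint. The helper name `traces_url` is hypothetical, and whether the agentevals receiver serves exactly this path is an assumption that the diff does not confirm.

```python
import os
from typing import Mapping, Optional


def traces_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the OTLP/HTTP traces URL from exporter environment variables."""
    env = os.environ if env is None else env

    # A signal-specific endpoint, when set, is used exactly as given.
    specific = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if specific:
        return specific

    # Otherwise the per-signal path is appended to the base endpoint.
    # http://localhost:4318 is the conventional OTLP/HTTP default.
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    return base.rstrip("/") + "/v1/traces"


if __name__ == "__main__":
    print(traces_url({"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318"}))
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the helper easy to exercise without touching the real environment.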