Learn a lightweight eval loop for AI outputs.
This example grades candidate responses against simple criteria and prints a pass/fail summary.
- Define eval cases as data.
- Apply deterministic scoring rules.
- Track pass rate over time.
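The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of run.py: the `EvalCase` fields (`case_id`, `response`, `must_include`, `max_chars`) and the two scoring rules are assumptions made for the example, and the real eval_cases.json may use a different schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EvalCase:
    # Hypothetical schema for illustration only.
    case_id: str
    response: str
    must_include: Tuple[str, ...] = ()
    max_chars: Optional[int] = None


def passes(case: EvalCase) -> bool:
    """Deterministic rules: the same case always gets the same verdict."""
    text = case.response.lower()
    # Rule 1: every required term must appear (case-insensitive).
    if any(term.lower() not in text for term in case.must_include):
        return False
    # Rule 2: the response must not exceed the length limit, if one is set.
    if case.max_chars is not None and len(case.response) > case.max_chars:
        return False
    return True


def summarize(cases: List[EvalCase]) -> Dict:
    """Score every case and compute the overall pass rate."""
    results = {c.case_id: passes(c) for c in cases}
    passed = sum(results.values())
    total = len(results)
    return {
        "total": total,
        "passed": passed,
        "pass_rate": passed / total if total else 0.0,
        "results": results,
    }


# Eval cases defined as plain data (made-up examples).
CASES = [
    EvalCase("greeting", "Hello! How can I help?", must_include=("hello",)),
    EvalCase("too_long", "x" * 50, max_chars=20),
]

if __name__ == "__main__":
    summary = summarize(CASES)
    for case_id, ok in summary["results"].items():
        print(f"{case_id}: {'PASS' if ok else 'FAIL'}")
    print(f"pass rate: {summary['pass_rate']:.0%}")
```

Keeping the rules free of randomness and model calls means a change in pass rate reflects a change in the responses, not in the grader.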
python3 run.py --cases sample_input/eval_cases.json
python3 -m unittest discover -s tests -p "test_*.py"
- Input: sample_input/eval_cases.json
- Output: sample_output/report.json
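A single report describes one run. To track pass rate over time, one option is to append each run's totals to a history file. The sketch below assumes a hypothetical JSON history file and record layout (`timestamp`, `passed`, `total`, `pass_rate`); neither is part of this example's actual output format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def append_run(history_path: Path, passed: int, total: int) -> List[Dict]:
    """Append one run's pass rate to a JSON history file and return the history.

    The file name and record fields are assumptions for illustration.
    """
    # Start a new history if the file does not exist yet.
    if history_path.exists():
        history = json.loads(history_path.read_text(encoding="utf-8"))
    else:
        history = []

    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "passed": passed,
        "total": total,
        "pass_rate": passed / total if total else 0.0,
    })

    history_path.parent.mkdir(parents=True, exist_ok=True)
    history_path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history
```

Calling `append_run` once per evaluation run yields a list of timestamped pass rates that can be compared or plotted later.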