Releases: hud-evals/hud-python
v0.5.30 - Chat, citation, scenario and agent updates
What's Changed
- gpt 5.4 computer use + tool search by @jdchawla29 in #361
- Improve telemetry output for silent tool calls by @nancyjlau in #363
- Anthropic tooling doesn't return screenshot each turn by @rsalaks2 in #362
- Feat/a2a chat orchestrator clean #HUD-935 by @ryantzr1 in #360
- Add Claude citations (HUD-985) by @ryantzr1 in #365
- L/api docs by @lorenss-m in #358
New Contributors
- @nancyjlau made their first contribution in #363
- @rsalaks2 made their first contribution in #362
Full Changelog: v0.5.29...v0.5.30
v0.5.29
What's Changed
- fix connection by @jdchawla29 in #353
- Propagate ctx.reward to Trace returned by run_single_task by @shfunc in #352
- Surface per-scenario tool configs in lock file by @shfunc in #354
- Add allowed_tools rescue list to @env.scenario decorator by @shfunc in #355
- Fix job linking and taskset lookup by @jdchawla29 in #356
Full Changelog: v0.5.28...v0.5.29
v0.5.28 - CLI changes, hud dev proxy fix, tool filtering enhancements
What's Changed
- Maintenance by @jdchawla29 in #343
- Add metadata field and score alias to SubScore + tests by @shfunc in #345
- Dylanbowman314/fix tool prefixing by @dylanbowman314 in #347
- Dylanbowman314/exclude tool sources by @dylanbowman314 in #348
- patch hud dev proxy to call _hud_submit by @jdchawla29 in #349
- hotfix by @dylanbowman314 in #350
- More Maintenance by @jdchawla29 in #351
Full Changelog: v0.5.27...v0.5.28
v0.5.27
v0.5.26
What's Changed
- scenario.task by @jdchawla29 in #336
- claude enhancements by @jdchawla29 in #337
- Add annotation field to MCPToolCall by @shfunc in #339
- Task Slugs etc. by @jdchawla29 in #341
- Surface scenario evaluate errors by @farrelmahaztra in #338
- run jobs on existing tasksets only by @jdchawla29 in #342
Full Changelog: v0.5.25...v0.5.26
v0.5.25 - update bash tool, integration agent, ckpt configs
What's Changed
- Dylanbowman314/heredoc fix for other bash by @dylanbowman314 in #332
- stuff by @jdchawla29 in #333
- L/checkpoint configs by @lorenss-m in #334
Full Changelog: v0.5.24...v0.5.25
v0.5.24 - Reliability improvements for native tools
version bump
v0.5.23 - Updates to tools and context
v0.5.22
bump version
v0.5.21 - Environment conversion and robustness
Merge pull request #320 from hud-evals/l/small-fixes-3: L/small fixes 3