feat: integrate six traceable benchmarks with unified smoke test (#1202)
* refactor: remove dead read_video_pyav_pil and deduplicate _resize_image in load_video
* refactor: rename read_video_pyav -> read_video, remove dead code
- Rename read_video_pyav to read_video in load_video.py with backward-compat alias
- Delete _resize_image and read_video_pyav_base64 dead functions
- Update all 12 caller files to use read_video directly
- Inline base64 encoding logic in qwen2_5_omni.py (was read_video_pyav_base64)
- Fix missing import in vila.py (latent bug)
- Remove use_custom_video_loader dead code from 5 models that declared but never checked it (qwen2_5_vl, qwen3_vl, qwen3_omni, llava_onevision1_5, huggingface)
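The rename with a backward-compat alias could look roughly like the sketch below. This is an illustration only: the `read_video` body and its `num_frames` parameter are hypothetical stand-ins, and whether the real alias emits a deprecation warning (rather than being a plain `read_video_pyav = read_video` assignment) is an assumption.

```python
import warnings


def read_video(path, num_frames=8):
    """Hypothetical stand-in for the real loader in load_video.py."""
    return {"path": path, "num_frames": num_frames}


def read_video_pyav(*args, **kwargs):
    """Backward-compat alias: forwards every argument to read_video."""
    # Assumption: warn so that out-of-tree callers notice the rename.
    warnings.warn(
        "read_video_pyav is deprecated; use read_video instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return read_video(*args, **kwargs)
```

Old call sites keep working and return the same value as `read_video`, so the caller files can be migrated independently of the alias removal.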
* docs: rewrite Section 7.1 to document read_video backends, remove dead Section 7.2
* feat: unified CLI with subcommand dispatch and interactive wizard
Add lmms_eval/cli/ package with subcommand-based architecture:
  eval    - run evaluation (wizard mode when no args)
  tasks   - list/groups/subtasks/tags browser
  models  - list backends with optional --aliases
  ui      - launch Web UI
  serve   - start HTTP eval server
  power   - statistical power analysis
  version - version and environment info
  tui     - terminal UI (textual)
Full backward compat: lmms-eval --model X --tasks Y still works.
Entrypoint rewired through cli.dispatch:main in pyproject.toml.
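
A minimal sketch of how subcommand dispatch with a legacy fallback might be wired using `argparse`. This is not the actual `cli.dispatch` code: the function names `build_parser`, `normalize_argv`, and `dispatch` are hypothetical, and only the subcommand names and the `--model`, `--tasks`, and `--aliases` flags come from the list above.

```python
import argparse

SUBCOMMANDS = ("eval", "tasks", "models", "ui", "serve", "power", "version", "tui")


def build_parser():
    """Build a parser with one subparser per subcommand."""
    parser = argparse.ArgumentParser(prog="lmms-eval")
    sub = parser.add_subparsers(dest="command")
    eval_parser = sub.add_parser("eval")
    eval_parser.add_argument("--model")
    eval_parser.add_argument("--tasks")
    models_parser = sub.add_parser("models")
    models_parser.add_argument("--aliases", action="store_true")
    for name in SUBCOMMANDS:
        if name not in ("eval", "models"):
            sub.add_parser(name)
    return parser


def normalize_argv(argv):
    """Route legacy flag-only invocations to the eval subcommand."""
    # Assumption: a first token that is a flag (other than help) means legacy usage.
    if argv and argv[0].startswith("-") and argv[0] not in ("-h", "--help"):
        return ["eval", *argv]
    return list(argv)


def dispatch(argv):
    """Parse argv and return the namespace; a real CLI would call a handler here."""
    return build_parser().parse_args(normalize_argv(argv))
```

With this shape, `dispatch(["--model", "X", "--tasks", "Y"])` resolves to the `eval` subcommand, which is one way to keep the old invocation working.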
* docs: add external usage guide for CLI and library access
Add docs/external_usage.md covering CLI subcommands (tasks, models,
eval wizard, ui, serve, power, version) and Python library usage
(TaskManager, datasets, evaluator, metrics). Update docs index link.
Polish v0.7 release notes for consistency.
* feat(tasks): add six benchmark tasks and unified smoke report
* fix(smoke): enable audio payloads for openrouter omni runs
* fix(smoke): use 1fps video sampling for api smoke runs
* fix(multimodal): correct audio routing and video fps sampling
* test(cli): add dispatch and task pipeline coverage
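
For reference, the 1 fps video sampling mentioned in the smoke-run fixes can be illustrated as below. This is a sketch under assumptions, not the code from this commit: `sample_frame_indices` and its parameters are hypothetical names, and the optional `max_frames` cap is an added illustration.

```python
import math


def sample_frame_indices(total_frames, native_fps, target_fps=1.0, max_frames=None):
    """Pick frame indices so that roughly target_fps frames per second are kept."""
    if total_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    # Number of source frames between two kept frames (never below 1).
    step = max(native_fps / target_fps, 1.0)
    count = math.ceil(total_frames / step)
    indices = [int(i * step) for i in range(count)]
    if max_frames is not None and len(indices) > max_frames:
        # Uniformly thin out the selection to respect the frame budget.
        stride = len(indices) / max_frames
        indices = [indices[int(i * stride)] for i in range(max_frames)]
    return indices
```

For a 3 second clip at 30 fps (90 frames), this keeps frames 0, 30, and 60, one per second of video.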