
[Benchmark Backfill] Integrate OfficeQA into lmms-eval #1150

Merged: Luodian merged 1 commit into dev-v0d7 from feat/lmm-281-officeqa on Feb 22, 2026


@Luodian Luodian commented Feb 22, 2026

Summary

  • Add officeqa benchmark task integration with YAML config and utility functions.
  • Update document-understanding benchmark list in docs/current_tasks.md.
  • Point dataset loading at the publicly accessible OfficeQA CSV source so that smoke validation can run.
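
As a rough illustration of what the utility functions behind such a task might look like, here is a minimal sketch. It is not the code merged in this PR. The column names (`question`, `answer`), the helper `load_rows`, the `exact_match` metric, and the function names are all hypothetical. The `doc_to_text(doc, lmms_eval_specific_kwargs)` and `process_results(doc, results)` shapes are assumed from how lmms-eval tasks are commonly written.

```python
import csv
import io
from itertools import islice

# Hypothetical column names; the real OfficeQA CSV schema may differ.
QUESTION_KEY = "question"
ANSWER_KEY = "answer"


def load_rows(csv_text, limit=None):
    """Parse CSV text into dict rows, keeping only the first `limit` rows if given."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(islice(reader, limit))


def officeqa_doc_to_text(doc, lmms_eval_specific_kwargs=None):
    """Build the prompt from the question, with optional pre/post prompt text."""
    kwargs = lmms_eval_specific_kwargs or {}
    pre_prompt = kwargs.get("pre_prompt", "")
    post_prompt = kwargs.get("post_prompt", "")
    return f"{pre_prompt}{doc[QUESTION_KEY]}{post_prompt}"


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.strip().lower().split())


def officeqa_process_results(doc, results):
    """Score the first generation against the reference (assumed exact-match metric)."""
    prediction = results[0] if results else ""
    score = float(normalize(prediction) == normalize(doc[ANSWER_KEY]))
    return {"exact_match": score}
```

Passing `limit=1` to `load_rows` mirrors the intent of `--limit 1` in the validation run: a single row is enough to confirm that prompt construction and scoring work end to end.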

Validation

  • uv run python -m lmms_eval --tasks list (includes officeqa)
  • uv run python -m lmms_eval --model openai --model_args model_version=bytedance-seed/seed-1.6-flash,api_key=$OPENROUTER_API_KEY,base_url=https://openrouter.ai/api/v1 --tasks officeqa --batch_size 1 --limit 1

@Luodian Luodian changed the base branch from main to dev-v0d7 February 22, 2026 11:10
@Luodian Luodian force-pushed the feat/lmm-281-officeqa branch from 17d222c to 5cdcc8b Compare February 22, 2026 11:25
@Luodian Luodian merged commit 79ddef8 into dev-v0d7 Feb 22, 2026
2 checks passed
@Luodian Luodian deleted the feat/lmm-281-officeqa branch February 23, 2026 08:24
Luodian added a commit that referenced this pull request Feb 28, 2026