feat: integrate Neptune long-video benchmark tasks #1187

Merged
Luodian merged 3 commits into dev-v0d7 from feat/neptune-long-video-v0d7 on Feb 23, 2026
@Luodian commented Feb 22, 2026

Summary

  • Add the Neptune long-video benchmark task family (neptune_full_*, neptune_mma_*, neptune_mmh_*) with YAML configs, task utilities, and unit tests.
  • Update the task catalog (docs/current_tasks.md) and add a Neptune task README that documents hydration notes and the two full-split videos that are currently unavailable.
  • Cap video frame ingestion in the OpenAI-compatible chat path by honoring max_frames_num during message conversion, which avoids oversized video payloads.
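
The frame cap in the last bullet can be illustrated with a minimal sketch. This is not the code from this PR: the helper name `sample_frame_indices` and the uniform-sampling strategy are assumptions made for illustration, and only the `max_frames_num` parameter name comes from the description above.

```python
# Hypothetical helper, not part of lmms_eval: choose which frames of a video
# to keep so that at most `max_frames_num` frames end up in a chat payload.
def sample_frame_indices(total_frames: int, max_frames_num: int) -> list[int]:
    """Return at most `max_frames_num` evenly spaced frame indices."""
    if total_frames <= 0 or max_frames_num <= 0:
        return []
    # Short videos already fit under the cap: keep every frame.
    if total_frames <= max_frames_num:
        return list(range(total_frames))
    # A cap of one frame: take the middle frame as a representative.
    if max_frames_num == 1:
        return [total_frames // 2]
    # Otherwise spread the indices uniformly from the first to the last frame.
    step = (total_frames - 1) / (max_frames_num - 1)
    return [round(i * step) for i in range(max_frames_num)]


if __name__ == "__main__":
    # Mirrors the smoke command's max_frames_num=2 on a 10-frame clip.
    print(sample_frame_indices(10, 2))
```

With a cap like this, the number of encoded frames in a request is bounded by `max_frames_num` regardless of video length, which is the property the bullet describes.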

Validation

  • uv run python -m unittest test/eval/test_neptune_task.py (pass)
  • uv run pre-commit run --files <Neptune-related files> (black/isort pass)
  • Smoke command:
    uv run python -m lmms_eval --model openai_compatible --model_args "model_version=google/gemini-3-flash-preview,max_frames_num=2,num_concurrent=1,adaptive_concurrency=false,max_retries=1,retry_backoff_s=0.1" --tasks neptune_mma_v --batch_size 1 --limit 8 --log_samples --verbosity INFO

Output Table

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| neptune_mma_v | Yaml | none | 0 | neptune_acc | 0.75 | ± N/A |

Throughput Summary

| Metric | Value | Unit |
|---|---|---|
| total_gen_tokens | 8.0000 | tokens |
| total_elapsed_time | 19.8799 | seconds |
| avg_speed | 0.4024 | tokens/s |
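
As a quick consistency check on the throughput figures, the reported avg_speed matches total_gen_tokens divided by total_elapsed_time. The sketch below assumes that this is how the summary derives the value; the function name `tokens_per_second` is hypothetical.

```python
# Hypothetical helper: recompute avg_speed from the two reported totals.
def tokens_per_second(total_gen_tokens: float, total_elapsed_time: float) -> float:
    """Average generation speed in tokens per second."""
    if total_elapsed_time <= 0:
        raise ValueError("elapsed time must be positive")
    return total_gen_tokens / total_elapsed_time


if __name__ == "__main__":
    # Values copied from the Throughput Summary of the smoke run.
    speed = tokens_per_second(8.0, 19.8799)
    print(f"{speed:.4f} tokens/s")
```

Eight generated tokens across a run limited to 8 samples is consistent with one short answer token per sample, though the PR does not state this explicitly.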

@Luodian merged commit ecbed1c into dev-v0d7 on Feb 23, 2026
@Luodian deleted the feat/neptune-long-video-v0d7 branch on February 23, 2026 08:25
Luodian added a commit that referenced this pull request Feb 28, 2026
* LMM-271: [P0][Benchmark] Neptune long-video benchmark integration...

* fix(neptune): cap chat video frame loading and document missing full videos

* style: auto-fix lint (black + isort)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>