Feat: add model support for penguinvl #1257

Merged
Luodian merged 2 commits into EvolvingLMMs-Lab:main from taintaintainu:feat/add-model-penguinvl
Mar 26, 2026

Conversation

@taintaintainu
Contributor

Summary

  • Add a new simple-model integration for Penguin-VL exposed as --model penguinvl in lmms-eval.
  • Register penguinvl in the model registry and add an example launch script for multi-benchmark evaluation.

In scope

  • Add lmms_eval/models/simple/penguinvl.py, register the model ID, provide examples/models/penguin_vl.sh, and add penguinvl prompt overrides for mmmu_pro_standard and mmmu_pro_vision.

Out of scope

  • No new benchmark/task is introduced, and no metric/aggregation logic or dataset definitions are changed outside the Penguin-VL-specific prompt configuration.

Validation

  • accelerate launch --num_processes=8 --main_process_port=12346 -m lmms_eval --model penguinvl --model_args=pretrained=tencent/Penguin-VL-8B,attn_implementation=flash_attention_2,dtype=bfloat16 --tasks "ai2d,mmmu_pro_standard,ocrbench" --batch_size 1 --log_samples --log_samples_suffix penguinvl --verbosity DEBUG --output_path ./logs/ | sample size: N=3088+1730+1000 | key metrics: ai2d exact_match=0.8491, mmmu_pro_standard mmmu_acc=0.32139, ocrbench_accuracy=0.8430 | result: pass
  • accelerate launch --num_processes=8 --main_process_port=12346 -m lmms_eval --model penguinvl --model_args=pretrained=tencent/Penguin-VL-8B,attn_implementation=flash_attention_2,dtype=bfloat16 --tasks "videomme,longvideobench_val_v" --batch_size 1 --log_samples --log_samples_suffix penguinvl --verbosity DEBUG --output_path ./logs/ | sample size: N=2700+1337 | key metrics: videomme_perception_score=66.30, longvideobench_val_v lvb_acc=0.64996 | result: pass

Risk / Compatibility

  • Runtime compatibility depends on the upstream Penguin-VL Hugging Face implementation; this integration was evaluated with transformers==4.51.3 and attn_implementation=flash_attention_2.
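
Since the results above were obtained with one specific transformers release, a small pre-flight check can flag a mismatched environment before a long evaluation run. The following is a minimal sketch, not part of this PR: the helper names `same_release` and `check_transformers` are hypothetical, and it assumes that an exact version match (ignoring a local build suffix such as `+cu121`) is the comparison you want.

```python
from importlib import metadata

# Version reported in the Risk / Compatibility note of this PR.
TESTED_TRANSFORMERS = "4.51.3"


def same_release(installed: str, tested: str = TESTED_TRANSFORMERS) -> bool:
    """Return True if `installed` equals `tested`, ignoring any local suffix after '+'."""
    return installed.split("+", 1)[0] == tested


def check_transformers() -> str:
    """Describe how the installed transformers version compares to the tested one."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if same_release(installed):
        return f"transformers {installed} matches the tested version"
    return (
        f"transformers {installed} differs from the tested "
        f"{TESTED_TRANSFORMERS}; results may vary"
    )


if __name__ == "__main__":
    print(check_transformers())
```

Running the script before `accelerate launch` prints a one-line status. It only reports and never raises, so a different version does not block the run.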

Type of Change

  • Bug fix (non-breaking change)
  • New feature
  • New benchmark/task
  • New model integration
  • Breaking change
  • Documentation update
  • Refactoring (no functional changes)

Contributor

@Luodian Luodian left a comment

LGTM — clean model integration for Penguin-VL. Well-validated with benchmark results across image and video tasks.

@Luodian Luodian merged commit 27f09b6 into EvolvingLMMs-Lab:main Mar 26, 2026
5 checks passed

2 participants