
LMMs-Eval v0.7 - Audio Update #1124

Merged
Luodian merged 3 commits into EvolvingLMMs-Lab:dev-v0d7 from YichenG170:main
Feb 22, 2026

YichenG170 (Collaborator) commented Feb 21, 2026

Description

This PR is the main Audio update for LMMs-Eval v0.7.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • New benchmark/task
  • New model integration
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

Changes Made

Models

  • Fixed dtype errors in Whisper
  • Fixed typos in Qwen2-Audio
  • Applied better debug methods in Audio Flamingo 3

Datasets

  • Added AMI (WER evaluation for speech recognition)
  • Added CN College MCQ (accuracy for knowledge-intensive audio MCQ)
  • Added Dream-TTS MCQ (accuracy for TTS-generated MCQ)
  • Added Europal-ASR (WER for long speech recognition)
  • Added Song Describer (LLM-as-judge for music description quality)
  • Added VoxPopuli (WER for multilingual speech recognition)

(Europal-ASR and Song Describer are uploaded to https://huggingface.co/lmms-lab-audio rather than the primary organization due to storage limits.)
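
For readers unfamiliar with the WER metric used by the ASR tasks above, here is a minimal sketch of a per-utterance word error rate: word-level edit distance divided by the reference length. It is an illustration only, not the scoring code in this PR. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions; real ASR scoring typically applies heavier text normalization and aggregates errors over the whole corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for explanation only; normalization here is just
    lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, so values in the results table should be read as error ratios rather than percentages capped at 100%.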

Testing

  • Tested all datasets with the three modified audio LLMs (Whisper, Qwen2-Audio, Audio Flamingo 3). For VoxPopuli, only the en and en_accented splits were tested.

New audio results table (from local testing):

| Model | AMI (WER) | CN College MCQ (Acc) | Dream-TTS MCQ (Acc) | Europal ASR (WER) | Song Describer (LLM-as-Judge /5) | VoxPopuli EN (WER) | VoxPopuli EN acc. (WER) |
|---|---|---|---|---|---|---|---|
| Qwen2-Audio-7B-Instruct | 0.8793 | 0.6596 | 0.7047 | 0.7047 | 2.2681 | 0.2254 | 0.4549 |
| NVIDIA Audio-Flamingo-3 | 0.2733 | 0.8661 | 0.8677 | 0.3492 | 2.4169 | 0.1581 | 0.4826 |

OpenAI Whisper-Large-v3 0.4019

YichenG170 changed the title from "LMMs-Eval v0.6 - Audio Update" to "LMMs-Eval v0.7 - Audio Update" on Feb 21, 2026
Luodian changed the base branch from main to dev-v0d7 on February 21, 2026 12:51
Luodian merged commit bb846e1 into EvolvingLMMs-Lab:dev-v0d7 on Feb 22, 2026
2 checks passed
Luodian added a commit that referenced this pull request Feb 28, 2026
* [Model] Fixed the parameter name error in qwen2_audio

* LMMs-Eval v0.6 - Audio Updates

* fix: align whisper world_size init for single-process runtime

---------

Co-authored-by: Bo Li <drluodian@gmail.com>