Feat: MVP ASR Benchmarking using CV22 #5

@Yip-Jia-Qi

Description

Approach

For ASR, word error rate (WER) and character error rate (CER) will be the main metrics.
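To make the two metrics concrete, here is a minimal sketch of how WER and CER can be computed from a reference and a hypothesis transcript using Levenshtein distance. The function names are illustrative and not taken from any existing script in this repo. It assumes whitespace tokenization for WER and counts spaces as characters for CER; text normalization (casing, punctuation) is left out and would need to be agreed on separately. For languages without whitespace word boundaries, such as Chinese and Japanese, CER is usually the more meaningful of the two.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # assumption: spaces count as characters
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For a whole test set, the usual convention is to sum the edit distances over all utterances and divide by the total reference length, rather than averaging per-utterance rates.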

Things to consider

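  • Jan User Profiles (languages)

| Rank | Language   | Count  | Percentage |
|------|------------|--------|------------|
| 1    | English    | 92,537 | 61.55%     |
| 2    | Russian    | 11,014 | 7.33%      |
| 3    | Spanish    | 8,522  | 5.67%      |
| 4    | French     | 5,959  | 3.96%      |
| 5    | German     | 5,886  | 3.92%      |
| 6    | Chinese    | 4,582  | 3.05%      |
| 7    | Portuguese | 3,353  | 2.23%      |
| 8    | Polish     | 2,587  | 1.72%      |
| 9    | Italian    | 2,081  | 1.38%      |
| 10   | Japanese   | 1,911  | 1.27%      |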
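One possible way to take the language distribution into account, offered only as a sketch and not as a decided approach, is to weight each model's per-language error rate by the user counts above. The function name `weighted_error_rate` is hypothetical, and the weighting covers only the languages that have actually been evaluated.

```python
# User counts per language, copied from the Jan user profile table above.
LANGUAGE_COUNTS = {
    "English": 92537,
    "Russian": 11014,
    "Spanish": 8522,
    "French": 5959,
    "German": 5886,
    "Chinese": 4582,
    "Portuguese": 3353,
    "Polish": 2587,
    "Italian": 2081,
    "Japanese": 1911,
}


def weighted_error_rate(per_language_error, counts=LANGUAGE_COUNTS):
    """Average per-language error rates, weighted by user count.

    Only languages present in both `per_language_error` and `counts`
    contribute; weights are renormalized over that subset.
    """
    covered = [lang for lang in per_language_error if lang in counts]
    total = sum(counts[lang] for lang in covered)
    if total == 0:
        raise ValueError("no evaluated language has a known user count")
    return sum(per_language_error[lang] * counts[lang] for lang in covered) / total
```

A call would look like `weighted_error_rate({"English": wer_en, "Russian": wer_ru})`, where `wer_en` and `wer_ru` are placeholders for measured values. With English at over 60% of users, this score would be dominated by English performance, so the per-language numbers should probably be reported alongside it.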

Out of scope

  • We will not benchmark model efficiency, only model size and ASR performance
  • Running the benchmark. Once we have selected the benchmark, we will figure out how to download the data and run it

Stage plans

1. Lit Review
Select a panel of relevant benchmarks
2. Experiments
Download and run the benchmarks on a few ASR models
3. Evaluate
Compile the results and share them with the team so we can decide on the benchmark panel
4. Translation
Incorporate the benchmark into a replicable script so we can always quickly evaluate new ASR models that come out
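As a rough illustration of what the replicable script in stage 4 might look like, the sketch below runs a set of models over the same samples and writes one summary row per model. Everything here is an assumption for illustration: `run_benchmark`, `write_csv`, the `(audio_path, reference)` sample format, and the idea that each model is wrapped as a plain callable from audio path to transcript. The metric is passed in as a function so that WER, CER, or anything else can be plugged in.

```python
import csv
from typing import Callable, Dict, Iterable, List, Tuple

Sample = Tuple[str, str]  # hypothetical format: (audio_path, reference_transcript)


def run_benchmark(
    models: Dict[str, Callable[[str], str]],
    samples: Iterable[Sample],
    metric: Callable[[str, str], float],
) -> List[dict]:
    """Score every model on the same samples; return one summary row per model."""
    samples = list(samples)  # materialize so each model sees identical data
    if not samples:
        raise ValueError("samples must not be empty")
    rows = []
    for name, transcribe in models.items():
        scores = [metric(ref, transcribe(path)) for path, ref in samples]
        rows.append({
            "model": name,
            "n_samples": len(scores),
            # Note: this is the mean of per-utterance scores, not a corpus-level rate.
            "mean_score": sum(scores) / len(scores),
        })
    return rows


def write_csv(rows: List[dict], fh) -> None:
    """Write the summary rows to an open text file handle as CSV."""
    writer = csv.DictWriter(fh, fieldnames=["model", "n_samples", "mean_score"])
    writer.writeheader()
    writer.writerows(rows)
```

Adding a new ASR model then means adding one more entry to the `models` dictionary, which keeps the data loading and scoring identical across models.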
