Thank you very much for sharing this useful benchmark. I am currently using it to evaluate our foundation model. However, I’ve noticed that both GPU and CPU utilization remain very low throughout the process, so the overall evaluation takes a very long time to complete. Is there any way to speed up the evaluation?