Sharing our benchmarks and metrics!

I am Zhiqiu Lin, a final-year PhD student at Carnegie Mellon University working with Prof. Deva Ramanan. I came across your paper and like it a lot! Congratulations!

I also wanted to share some of our recent work on benchmarking generative VLMs:

1. [VQAScore](https://linzhiqiu.github.io/papers/vqascore/) (ECCV'24): A simple but effective alignment score for text-to-image/video/3D generation that agrees strongly with human judgments. VQAScore can be run with a single line of Python code [here](https://github.com/linzhiqiu/t2v_metrics)! Google's [Imagen3](https://arxiv.org/abs/2408.07009) used VQAScore as the strongest replacement for CLIPScore.
2. [GenAI-Bench](https://linzhiqiu.github.io/papers/genai_bench/) (CVPR'24 SynData Workshop): A benchmark with 1,600 prompts from professional designers for compositional visual generation. We also show that VQAScore can serve as a strong reward metric for re-ranking DALLE-3 generated images. GenAI-Bench was awarded Best Short Paper at the [SynData@CVPR24 workshop](https://syndata4cv.github.io/) and was adopted in [Imagen3's report](https://arxiv.org/abs/2408.07009).
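
As a rough illustration of the re-ranking idea in item 2, here is a minimal sketch. It assumes some alignment scorer is available as a plain function; `rerank` and `toy_score` are hypothetical names invented for this example and are not part of the t2v_metrics API. The toy scorer only measures word overlap between a prompt and a text description of a candidate, as a stand-in for a real image-text alignment score such as VQAScore.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    prompt: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate against the prompt and sort best-first."""
    scored = [(candidate, score_fn(candidate, prompt)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(candidate_description: str, prompt: str) -> float:
    """Hypothetical stand-in scorer: fraction of prompt words found in the candidate.

    A real pipeline would pass an image and a prompt to an alignment model instead.
    """
    prompt_words = set(prompt.lower().split())
    if not prompt_words:
        return 0.0
    candidate_words = set(candidate_description.lower().split())
    return len(prompt_words & candidate_words) / len(prompt_words)


prompt = "a red cube on a blue sphere"
# Each string stands in for one generated image (here described in text).
candidates = ["a blue cube", "a red cube on a blue sphere", "a dog"]
ranked = rerank(prompt, candidates, toy_score)
best_candidate, best_score = ranked[0]
```

With a real scorer, `candidates` would be image paths and `score_fn` would wrap the alignment model; the sorting step stays the same.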

Hope you find them useful!

Best,
Zhiqiu
