Authors: Pritam Sarkar and Ali Etemad
This repository provides the official implementation of VCRBench.
Clone the repository and navigate to the VCRBench directory:
git clone https://github.com/pritamqu/VCRBench
cd VCRBench
This repository supports several LVLMs for direct evaluation on VCRBench.
Our data can be accessed via this link: [VCRBench]
Please download the videos and questions from the link and save them in your local directory.
mkdir HF_DATA # create a dir where you want to download the data
cd HF_DATA # go to that dir
git lfs install
git clone https://huggingface.co/datasets/pritamqu/VCRBench
Please make sure to update the video-folder and question-file entries in the inference scripts to match your local paths.
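As a sketch, the paths might be set as shell variables like the ones below. The directory layout and file names here are assumptions based on the download step above; match them to the actual video-folder and question-file entries in the inference scripts.

```shell
# Hypothetical paths, assuming the dataset was cloned into HF_DATA as above.
# Adjust these to the actual locations on your machine.
VIDEO_FOLDER="$HOME/HF_DATA/VCRBench/videos"
QUESTION_FILE="$HOME/HF_DATA/VCRBench/questions.json"
echo "video-folder:  $VIDEO_FOLDER"
echo "question-file: $QUESTION_FILE"
```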
See our leaderboard here.
If you want to add your model to our leaderboard, please send model responses to [email protected], in the same format as the provided sample response.
You can download the open-source weights using:
git lfs install
git clone [email protected]:Qwen/Qwen2.5-VL-72B-Instruct
Alternatively, you can evaluate models through their APIs, as done for Gemini and GPT-4o.
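API-based evaluation typically requires credentials to be available in the environment. The variable names below are assumptions (they are the conventional ones for these providers); check the inference scripts for the names they actually read.

```shell
# Hypothetical: export provider credentials before running API-based evaluation.
export OPENAI_API_KEY="sk-..."   # for GPT-4o (assumed variable name)
export GOOGLE_API_KEY="..."      # for Gemini (assumed variable name)
```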
conda create -n vcr python=3.10 -y
conda activate vcr
pip install -r requirements.txt
We provide scripts to directly evaluate several open-source (e.g., Qwen2.5-VL-Instruct, InternVL2_5, VideoLLaMA3, VideoLLaVA) and closed-source (e.g., Gemini, GPT-4o) models on VCRBench.
Evaluation scripts are located here. For example, to evaluate Qwen2.5-VL-72B-Instruct:
bash scripts/qwen_2_5_vl/inference72.sh
You can use the given evaluation scripts as a reference for evaluating other models.
We also provide scripts to test open-source models equipped with RRD.
For example, to evaluate Qwen2.5-VL-72B-Instruct with RRD:
bash scripts/qwen_2_5_vl/rrd72.sh
If you find this work useful, please consider citing our paper:
@misc{sarkar2025vcrbench,
title={VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models},
author={Pritam Sarkar and Ali Etemad},
year={2025},
eprint={2505.08455},
archivePrefix={arXiv},
primaryClass={cs.CV},
}
This project incorporates datasets and model checkpoints that are subject to their respective original licenses.
Users must adhere to the terms and conditions specified by these licenses.
Assets used in this work include, but are not limited to, CrossTask.
This project does not impose any additional constraints beyond those stipulated in the original licenses. Users must ensure their usage complies with all applicable laws and regulations.
This repository is released under the MIT License. See LICENSE for details.
For any issues or questions, please open an issue or contact Pritam Sarkar at [email protected]!