This repository provides a Hierarchical Coresets Selection (HCS) module that can be plugged into any open-source VLM baseline to select informative regions in a coarse-to-fine manner for scene understanding tasks.
🥇[ACMMM2025] Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection
conda create -n vlm-hcs python=3.10 -y
conda activate vlm-hcs
pip install -r requirements.txt
bash scripts/demo_clip_hcs.sh
- Backbones are frozen by default; HCS acts as a pre-selection module.
- You can optionally train a tiny MLP scorer (train/train_hcs_scorer.py) to stabilize scores.
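
To make the idea of a coarse-to-fine pre-selection step in front of a frozen backbone more concrete, here is a minimal, illustrative sketch. It is not the repository's implementation or API: the function names (`coarse_to_fine_select`, `cosine`, `mean_vector`), the parameters (`cell`, `top_cells`, `top_patches`), and the choice of mean pooling with cosine similarity as the score are all assumptions made for illustration. The patch features and the query vector are assumed to come from a frozen encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def coarse_to_fine_select(patch_feats, query, cell=2, top_cells=1, top_patches=2):
    """Hypothetical two-stage selection over an H x W grid of patch features.

    Coarse stage: split the grid into cell x cell blocks, mean-pool each block,
    and keep the top_cells blocks most similar to the query.
    Fine stage: rank the individual patches inside the kept blocks and
    return the (row, col) coordinates of the top_patches best ones.
    """
    h, w = len(patch_feats), len(patch_feats[0])

    # Coarse stage: score pooled blocks against the query.
    blocks = []
    for r0 in range(0, h, cell):
        for c0 in range(0, w, cell):
            coords = [(r, c)
                      for r in range(r0, min(r0 + cell, h))
                      for c in range(c0, min(c0 + cell, w))]
            pooled = mean_vector([patch_feats[r][c] for r, c in coords])
            blocks.append((cosine(pooled, query), coords))
    blocks.sort(key=lambda item: item[0], reverse=True)

    # Fine stage: score individual patches inside the surviving blocks.
    candidates = [rc for _, coords in blocks[:top_cells] for rc in coords]
    candidates.sort(key=lambda rc: cosine(patch_feats[rc[0]][rc[1]], query),
                    reverse=True)
    return candidates[:top_patches]


# Toy 4 x 4 grid of 2-D features; only the bottom-right block resembles the query.
grid = [[[0.0, 1.0] for _ in range(4)] for _ in range(4)]
grid[2][2] = [1.0, 0.0]
grid[2][3] = [1.0, 0.1]
grid[3][2] = [0.5, 0.5]
grid[3][3] = [1.0, 1.0]
query = [1.0, 0.0]

selected = coarse_to_fine_select(grid, query, cell=2, top_cells=1, top_patches=2)
print(selected)
```

Only the selected patch coordinates would then be passed on to the downstream model. In this sketch the scoring function has no trainable parameters; a learned scorer such as the optional MLP mentioned above could replace `cosine` in either stage.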
If you find our work and code useful, please consider citing our paper and starring our repository (🥰🎉Thanks!!!):
@misc{wang2025advancing,
  title={Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection},
  author={Jingyao Wang and Yiming Chen and Lingyu Si and Changwen Zheng},
  year={2025},
  eprint={2507.13061},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.13061},
}