This is the official repository for our paper:
"Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?" 📄 [Paper](https://arxiv.org/abs/2507.11569)
Authors: Hanxue Gu*, Yaqian Chen*, Nick Konz, Qihang Li, and Maciej A. Mazurowski
This repository implements a training-free (zero-shot) medical image registration pipeline that uses vision foundation models as feature encoders. We evaluate five models: SAM, DINOv2, SSLSAM, MedSAM, and BiomedCLIP-SAM (alongside MIND, a classic handcrafted feature, as a baseline).
Each model extracts image features that are then aligned by a training-free registration optimization pipeline; no fine-tuning is required. Although our paper focuses heavily on breast image registration, we are excited to see how it can be extended to other tasks!
```shell
# Create and activate a clean conda environment
conda create -n dinov2 python=3.10 -y
conda activate dinov2

# Install PyTorch with CUDA 11.7
pip install torch==2.0.0+cu117 torchvision==0.15.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

# Install dependencies
pip install torchmetrics==0.10.3 timm opencv-python

# Install DINOv2
pip install git+https://github.com/facebookresearch/dinov2.git

# Install other required libraries
pip install -r requirements.txt
```

- Organize your image volumes as `.nii.gz` files under a directory, for example: `sample_dataset_dir/`
- Create a `task.csv` file that defines each image pair for registration, with columns:
  - `mov_volume`: moving image
  - `fix_volume`: fixed image
✅ You can refer to the provided sample_dataset_dir/ for a template and example setup.
- You can download our prepared examples with two pre-contrast breast MRIs from the same patient taken at different dates from Google drive.
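To illustrate the expected `task.csv` layout, here is a minimal sketch that writes one image pair with Python's standard `csv` module (the file names below are hypothetical placeholders; substitute your own `.nii.gz` paths):

```python
import csv

# Hypothetical image pair; replace with your own .nii.gz paths
# under your dataset directory.
pairs = [
    {"mov_volume": "sample_dataset_dir/patient01_date1.nii.gz",
     "fix_volume": "sample_dataset_dir/patient01_date2.nii.gz"},
]

# Write task.csv with the two required columns.
with open("task.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["mov_volume", "fix_volume"])
    writer.writeheader()
    writer.writerows(pairs)
```

Each row defines one registration task: the `mov_volume` image is warped onto the `fix_volume` image.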
- For all models, you can find the model weights to be downloaded in their original repo.
- For SAM, please download the ViT-B version.
- For SSLSAM, you can find the model weights under SSLSAM.
Edit the configuration file for your experiment. You can choose different models and adjust paths:
Key config fields:
- `exp_note`: a name for your experiment
- `model_ver`: one of `'sam'`, `'dino-v2'`, `'sslsam'`, `'medsam'`, `'biomedclip-sam'`, or `'MIND'` (a classic handcrafted feature)
- `data_dir`: the path to your dataset
- `save_feature`: set to `'True'` to save extracted features for reuse
📌 Example config files are provided in the repo to help you get started quickly.
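As a rough sketch, a config file using the fields above might look like the following (the values are placeholders, not taken from the repo; the provided example configs are the authoritative reference):

```python
# Hypothetical config sketch; check the example configs shipped
# with the repo for the exact format and any additional fields.
exp_note = "breast_mri_dinov2"    # a name for your experiment
model_ver = "dino-v2"             # one of: sam, dino-v2, sslsam, medsam, biomedclip-sam, MIND
data_dir = "sample_dataset_dir/"  # dataset root containing task.csv and .nii.gz volumes
save_feature = "True"             # save extracted features for reuse
```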
```shell
python inference_reg.py --cfg config-dinov2-task1.py
```

You can validate and visualize registration performance using the following tools:
- 📈 Visualize registration results: open and run `vis_result.ipynb`
- 🩻 Overlay masks on registered volumes: see Part 1 in `eval_dsc.ipynb`
- 📉 Calculate the Dice Similarity Coefficient (DSC) across registered cases: see Part 2 in `eval_dsc.ipynb`
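The DSC measures overlap between two binary masks as 2|A∩B| / (|A| + |B|). A minimal pure-Python sketch of the metric (independent of the notebook's actual implementation):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks,
    given as flat sequences of 0/1 values of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    if total == 0:
        return 1.0  # two empty masks: perfect overlap by convention
    return 2.0 * intersection / total

# Example: two 6-voxel masks sharing 2 of their 3 foreground voxels
print(dice_coefficient([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0]))  # 2*2/(3+3) ≈ 0.667
```

A DSC of 1.0 indicates perfect overlap between the registered and fixed masks, and 0.0 indicates no overlap.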
This work builds heavily on the excellent DINO-reg repository. Big thanks to the original authors for their contributions to cross-modality medical image registration.
We are currently releasing the zero-shot, image-only registration pipeline.
- [x] Zero-shot registration with image pairs only
- [ ] Mask-guided registration support
Feel free to ⭐️ this repo, and cite our work if you find it helpful!
```bibtex
@misc{gu2025visionfoundationmodelsready,
      title={Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?},
      author={Hanxue Gu and Yaqian Chen and Nicholas Konz and Qihang Li and Maciej A. Mazurowski},
      year={2025},
      eprint={2507.11569},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2507.11569},
}
```