This is the official repository for our paper:
"Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?" 📄 [Paper](https://arxiv.org/abs/2507.11569)
Authors: Hanxue Gu*, Yaqian Chen*, Nick Konz, Qihang Li, and Maciej A. Mazurowski
This repository implements a training-free (zero-shot) medical image registration pipeline that uses vision foundation models as feature encoders. We evaluate five models: SAM, DINOv2, SSLSAM, MedSAM, and BiomedCLIP-SAM (alongside MIND, a classic handcrafted feature, as a baseline).
Each model extracts image features that are then aligned by a training-free registration optimization pipeline; no fine-tuning is required. Although our paper focuses heavily on breast image registration, we are excited to see how it can be extended to other tasks!
```shell
# Create and activate a clean conda environment
conda create -n dinov2 python=3.10 -y
conda activate dinov2

# Install PyTorch with CUDA 11.7
pip install torch==2.0.0+cu117 torchvision==0.15.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

# Install dependencies
pip install torchmetrics==0.10.3 timm opencv-python

# Install DINOv2
pip install git+https://github.com/facebookresearch/dinov2.git

# Install other required libraries
pip install -r requirements.txt
```

- Organize your image volumes as `.nii.gz` files under a directory, for example: `sample_dataset_dir/`
- Create a `task.csv` file that defines each image pair for registration, with columns:
  - `mov_volume`: moving image
  - `fix_volume`: fixed image
✅ You can refer to the provided sample_dataset_dir/ for a template and example setup.
- You can download our prepared examples with two pre-contrast breast MRIs from the same patient taken at different dates from Google drive.
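To illustrate the expected `task.csv` layout, here is a minimal sketch that writes one image pair with Python's standard `csv` module (the file names below are hypothetical placeholders; substitute your own `.nii.gz` paths):

```python
import csv

# Hypothetical image pair; replace with your own .nii.gz paths
# under your dataset directory.
pairs = [
    {"mov_volume": "sample_dataset_dir/patient01_date1.nii.gz",
     "fix_volume": "sample_dataset_dir/patient01_date2.nii.gz"},
]

# Write task.csv with the two required columns.
with open("task.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["mov_volume", "fix_volume"])
    writer.writeheader()
    writer.writerows(pairs)
```

Each row defines one registration task: the `mov_volume` image is warped onto the `fix_volume` image.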
- For all models, you can find the model weights to be downloaded in their original repo.
- For SAM, please download the ViT-B version.
- For SSLSAM, you can find the model weights under SSLSAM.
Edit the configuration file for your experiment. You can choose different models and adjust paths:
Key config fields:
- `exp_note`: a name for your experiment
- `model_ver`: one of `'sam'`, `'dino-v2'`, `'sslsam'`, `'medsam'`, `'biomedclip-sam'`, or `'MIND'` (a classic handcrafted feature)
- `data_dir`: the path to your dataset
- `save_feature`: set to `'True'` to save extracted features for reuse
📌 Example config files are provided in the repo to help you get started quickly.
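As a rough sketch, a config file using the fields above might look like the following (the values are placeholders, not taken from the repo; the provided example configs are the authoritative reference):

```python
# Hypothetical config sketch; check the example configs shipped
# with the repo for the exact format and any additional fields.
exp_note = "breast_mri_dinov2"    # a name for your experiment
model_ver = "dino-v2"             # one of: sam, dino-v2, sslsam, medsam, biomedclip-sam, MIND
data_dir = "sample_dataset_dir/"  # dataset root containing task.csv and .nii.gz volumes
save_feature = "True"             # save extracted features for reuse
```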
```shell
python inference_reg.py --cfg config-dinov2-task1.py
```

You can validate and visualize registration performance using the following tools:
- 📈 Visualize registration results: open and run `vis_result.ipynb`
- 🩻 Overlay masks on registered volumes: see Part 1 in `eval_dsc.ipynb`
- 📉 Calculate the Dice Similarity Coefficient (DSC) across registered cases: see Part 2 in `eval_dsc.ipynb`
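The DSC measures overlap between two binary masks as 2|A∩B| / (|A| + |B|). A minimal pure-Python sketch of the metric (independent of the notebook's actual implementation):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks,
    given as flat sequences of 0/1 values of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    if total == 0:
        return 1.0  # two empty masks: perfect overlap by convention
    return 2.0 * intersection / total

# Example: two 6-voxel masks sharing 2 of their 3 foreground voxels
print(dice_coefficient([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0]))  # 2*2/(3+3) ≈ 0.667
```

A DSC of 1.0 indicates perfect overlap between the registered and fixed masks, and 0.0 indicates no overlap.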
This work builds heavily on the excellent DINO-reg repository. Big thanks to the original authors for their contributions to cross-modality medical image registration.
We are currently releasing the zero-shot, image-only registration pipeline.
- [x] Zero-shot registration with image pairs only
- [ ] Mask-guided registration support
Feel free to ⭐️ this repo, and cite our work if you find it helpful!
```bibtex
@misc{gu2025visionfoundationmodelsready,
      title={Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?},
      author={Hanxue Gu and Yaqian Chen and Nicholas Konz and Qihang Li and Maciej A. Mazurowski},
      year={2025},
      eprint={2507.11569},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2507.11569},
}
```