We release implementations of 50+ downstream datasets across various medical tasks, including segmentation, classification, registration, and vision-language. We will keep updating this repo to build a comprehensive validation benchmark for medical pre-training.
| Dataset | Modality | Task |
|---|---|---|
| BTCV | CT | Abdomen Seg. |
| AMOS22 | CT | Abdomen Seg. |
| WORD | CT | Abdomen Seg. |
| FLARE22 | CT | Abdomen Seg. |
| FLARE23 | CT | Abdomen Seg. |
| Abdomenct1k | CT | Abdomen Seg. |
| AbdomenAtlas | CT | Abdomen Seg. |
| TotalSegmentator | CT | 104 Structures Seg. |
| MM-WHS | CT | Heart Seg. |
| ASOCA | CT | Coronary Seg. |
| AVT | CT | Aorta Seg. |
| CHAOS | CT | Liver Seg. |
| Sliver07 | CT | Liver Seg. |
| IRCADb | CT | Liver Tumor Seg. |
| KiTS | CT | Kidney Tumor Seg. |
| KiPA22 | CT | Kidney Tumor Seg. |
| TCIA-Panc. | CT | Panc. Seg. |
| PANORAMA | CT | Panc. Tumor Seg. |
| SegThor | CT | Thoracic Risk Seg. |
| BHSD | CT | Brain Bleed Seg. |
| StructSeg19 | CT | Nasopharynx Cancer Seg. |
| Verse20 | CT | Vertebrae Seg. |
| PENGWIN | CT | Vertebrae Seg. |
| Covid-19-20 | CT | Covid Seg. |
| FUMPE | CT | Pulmonary Embolism Seg. |
| Parse22 | CT | Pulmonary Artery Seg. |
| AIIB23 | CT | Fibrotic Lung Seg. |
| CC-CCII | CT | Covid Classi. |
| AutoPET-II23 | PET-CT | HeadNeck Lesion Seg. |
| AMOS-MRI | MRI | Abdomen Seg. |
| MM-WHS-MRI | MRI | Heart Seg. |
| ACDC | MRI | Heart Seg. |
| ATLAS-MRI | MRI | Liver Tumor Seg. |
| BraTs21 | MRI | Brain Tumor Seg. |
| IXI | MRI | Brain MRI Registration |
| OASIS | MRI | Brain MRI Registration |
| CTRG-Chest | VLP | Report Generation |
| CT-RATE | VLP | Vocabulary Classi. |
| CT-RATE | VLP | Report-Volume Retrieval |
| MSD Challenge | | |
| Task01 Brain | MRI | Brain Tumor Seg. |
| Task02 Heart | MRI | Heart Seg. |
| Task03 Liver | CT | Liver Tumor Seg. |
| Task04 Hip. | MRI | Hip. Seg. |
| Task05 Pros. | MRI | Prostate Seg. |
| Task06 Lung | CT | Lung Cancer Seg. |
| Task07 Panc. | CT | Pancreas Tumor Seg. |
| Task08 Vessel | CT | Vessel Tumor Seg. |
| Task09 Spleen | CT | Spleen Seg. |
| Task10 Colon | CT | Colon Cancer Seg. |
NOTE THAT we are not the authors of these datasets. Although all of them are publicly available for academic research, you need to cite the original works as listed in our paper. Datasets that require approval from their authors (e.g., WORD) must be downloaded from the original link.
You can also download our pre-processed datasets from our Hugging Face page. Most of them are organized in the nnUNet layout:
├── YOUR/DIRECTORY/OF/DOWNSTREAM/DATA
    ├── 3Dircadb1_convert
        ├── imagesTr
        ├── labelsTr
    ├── AIIB23
    ├── ATLAS-MRI
    ├── AVT
    ├── ...
    └── Segthor
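Before training, it can help to confirm that a downloaded dataset really contains matching images and labels. The sketch below is a minimal, hypothetical check and is not part of this repo. It assumes NIfTI files under `imagesTr` and `labelsTr`, and that image names may carry an nnUNet-style channel suffix such as `_0000`; `case_id` and `check_dataset` are illustrative names.

```python
from pathlib import Path


def case_id(file_name: str, channel_suffix: str = "") -> str:
    """Strip the NIfTI extension and an optional channel suffix from a file name."""
    for ext in (".nii.gz", ".nii"):
        if file_name.endswith(ext):
            file_name = file_name[: -len(ext)]
            break
    if channel_suffix and file_name.endswith(channel_suffix):
        file_name = file_name[: -len(channel_suffix)]
    return file_name


def check_dataset(dataset_dir, channel_suffix: str = "_0000") -> dict:
    """Compare the case ids found under imagesTr and labelsTr.

    Assumption: image names may end in `channel_suffix` (nnUNet-style) while
    label names do not. Pass channel_suffix="" if images carry no suffix.
    """
    root = Path(dataset_dir)
    images = {
        case_id(p.name, channel_suffix) for p in (root / "imagesTr").glob("*.nii*")
    }
    labels = {case_id(p.name) for p in (root / "labelsTr").glob("*.nii*")}
    return {
        "paired": sorted(images & labels),
        "missing_label": sorted(images - labels),
        "missing_image": sorted(labels - images),
    }
```

For example, `check_dataset("YOUR/DIRECTORY/OF/DOWNSTREAM/DATA/3Dircadb1_convert")` returns the case ids that are paired, plus any image without a label and any label without an image.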
We provide both MONAI and nnUNet implementations. For nnUNet, follow the instructions in their official repo.
We provide various models for fine-tuning on downstream tasks. For nnUNet, please refer to nnunet trainer.
- SSL_head denotes models trained with self-supervised pre-training.
- Omni denotes models trained with omni-supervised pre-training.
| Model | Params | Checkpoint |
|---|---|---|
| VoComni_nnunet | 31M | Download |
| VoCo_B_SSL_head | 53M | Download |
| VoCo_L_SSL_head | 206M | Download |
| VoCo_H_SSL_head | 818M | Download |
| VoComni_B | 72M | Download |
| VoComni_L | 290M | Download |
| VoComni_H | 1.2B | Download |
We downloaded the checkpoints of other methods from SuPreM for comparison (thanks for their great efforts!).
The path of pre-trained models should be organized as:
├── YOUR/DIRECTORY/OF/PRETRAINED/MODELS
    ├── VoComni_nnunet.pt
    ├── VoCo_B_SSL_head.pt
    ├── VoCo_L_SSL_head.pt
    ├── VoCo_H_SSL_head.pt
    ├── VoComni_B.pt
    ├── VoComni_L.pt
    ├── VoComni_H.pt
    ├── ...
    └── supervised_suprem_swinunetr_2100.pth
Here, we take 3D-IRCADb as an example:
cd 3D-IRCADb
source activate YOUR-CONDA-ENVIRONMENT
sh train.sh
A template for train.sh looks like this:
now=$(date +"%Y%m%d_%H%M%S")
name=VoCo
pretrained_root=/pretrained
logdir=runs/logs_swin_base_VoComni
feature_size=48
data_dir=/data/3Dircadb1_convert/
cache_dir=/data/cache/3D-IRCADb
mkdir -p $logdir
torchrun --master_port=21503 main.py \
--name $name \
--pretrained_root $pretrained_root \
--feature_size $feature_size \
--data_dir $data_dir \
--cache_dir $cache_dir \
--use_ssl_pretrained \
--use_persistent_dataset \
--logdir $logdir | tee $logdir/$now.txt
Parameters you need to modify:
- name: The name of the pre-trained model. Currently supported: [VoCo, suprem, swin, clip_driven, mg, unimiss, dodnet]. If None, the model is trained without pre-training.
- pretrained_root: The directory where the pre-trained models are stored.
- master_port: Use a different master_port for each concurrent process.
- logdir: The directory where results are saved.
- feature_size: 48 for Base (B), 96 for Large (L), 192 for Huge (H).
- data_dir: The directory where the dataset is stored.
- cache_dir: The directory where the dataset is cached (used when 'use_persistent_dataset' is set).
- use_ssl_pretrained: If True, use 'VoCo_SSL_head'; otherwise, use 'VoComni'.
- use_persistent_dataset: If True, cache data in 'cache_dir' for faster training. WARNING: this requires extra storage space.
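The relationship between `feature_size`, `use_ssl_pretrained`, and the checkpoint files can be sketched as a small lookup. This is an illustration inferred from the checkpoint file names listed earlier, not the actual logic in `main.py`, which is not shown here; `voco_checkpoint_name` and `voco_checkpoint_path` are hypothetical helpers.

```python
from pathlib import Path

# feature_size -> model scale, as given in the parameter list.
SCALE_BY_FEATURE_SIZE = {48: "B", 96: "L", 192: "H"}


def voco_checkpoint_name(feature_size: int, use_ssl_pretrained: bool) -> str:
    """Return the checkpoint file name expected for a VoCo model.

    Assumption: file names follow the pattern of the published checkpoints,
    i.e. VoCo_<scale>_SSL_head.pt or VoComni_<scale>.pt.
    """
    if feature_size not in SCALE_BY_FEATURE_SIZE:
        raise ValueError(f"unsupported feature_size: {feature_size}")
    scale = SCALE_BY_FEATURE_SIZE[feature_size]
    if use_ssl_pretrained:
        return f"VoCo_{scale}_SSL_head.pt"
    return f"VoComni_{scale}.pt"


def voco_checkpoint_path(pretrained_root, feature_size: int, use_ssl_pretrained: bool) -> Path:
    """Join pretrained_root with the expected checkpoint file name."""
    return Path(pretrained_root) / voco_checkpoint_name(feature_size, use_ssl_pretrained)
```

With the template values (`feature_size=48`, `--use_ssl_pretrained` set), this points at `VoCo_B_SSL_head.pt` under `pretrained_root`; a quick `path.exists()` check before launching `torchrun` can catch a misplaced checkpoint early.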
We carefully chose default settings for each downstream task, including "a_min, a_max, roi, spacing". We learned a lot from nnUNet, and after more than 10,000 GPU hours of evaluation, we believe the current settings in 'main.py' are a relatively good choice.
These settings may differ from those used in pre-training, e.g., 'roi=64' in pre-training but 'roi=96' in some downstream tasks. You can redefine these parameters yourself, but for fair comparison we recommend adopting our settings.
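To make the role of `a_min` and `a_max` concrete: in MONAI-style preprocessing these typically bound an intensity window that is clipped and then rescaled linearly to a target range. The sketch below shows that arithmetic in plain Python. The window values used in the example (-175 and 250) are illustrative only and are not taken from `main.py`; `scale_intensity` is a name chosen here.

```python
def scale_intensity(values, a_min, a_max, b_min=0.0, b_max=1.0):
    """Clip each value to [a_min, a_max] and map it linearly to [b_min, b_max].

    A plain-Python illustration of intensity windowing; real pipelines apply
    the same formula to whole image volumes.
    """
    if a_max <= a_min:
        raise ValueError("a_max must be greater than a_min")
    scaled = []
    for value in values:
        clipped = min(max(value, a_min), a_max)
        scaled.append((clipped - a_min) / (a_max - a_min) * (b_max - b_min) + b_min)
    return scaled


if __name__ == "__main__":
    # Illustrative CT intensities with a hypothetical window.
    print(scale_intensity([-1000, -175, 37.5, 250, 3000], a_min=-175, a_max=250))
```

Changing the window changes what the network sees, which is one reason to keep these values identical across methods being compared.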
We provide templates for validation and testing.
# Organized as
├── Task/DIRECTORY
    ├── utils
        ├── utils.py
    ├── val.py
    ├── test.py
    └── ...
# val
python val.py
# test
python test.py
Modify the parameters in these two files according to your own settings. The parameters are described in the files:
- test_data_path: The data path.
- test_label_path: The label path (only in 'val.py').
- trained_pth: The path to the trained model.
- input & output channels: Model settings.
- processing params: Must be consistent with training.
- ......
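Because the processing parameters must match those used in training, a quick comparison can catch a mismatch before a long evaluation run. The sketch below is a hypothetical helper, not part of `val.py` or `test.py`. It assumes both sets of settings are available as plain dictionaries keyed by the parameter names mentioned in this README (`a_min`, `a_max`, `roi`, `spacing`); the example values in the comments are made up.

```python
PROCESSING_KEYS = ("a_min", "a_max", "roi", "spacing")


def processing_mismatches(train_cfg: dict, eval_cfg: dict, keys=PROCESSING_KEYS) -> dict:
    """Return {key: (train_value, eval_value)} for every key that differs.

    A key missing from either dictionary is reported with None on that side.
    """
    mismatches = {}
    for key in keys:
        train_value = train_cfg.get(key)
        eval_value = eval_cfg.get(key)
        if train_value != eval_value:
            mismatches[key] = (train_value, eval_value)
    return mismatches


# Example with made-up values:
# train_cfg = {"a_min": -175, "a_max": 250, "roi": (96, 96, 96), "spacing": (1.5, 1.5, 1.5)}
# eval_cfg = dict(train_cfg, roi=(64, 64, 64))
# processing_mismatches(train_cfg, eval_cfg) reports only the "roi" entry.
```

An empty result means the listed parameters agree; anything else is worth fixing before trusting the reported scores.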
For classification, please refer to CC-CCII and LUNA16.
For registration, please refer to Registration. Please cite TransMorph and follow their instructions.
For vision-language tasks, please refer to M2KT and CT_CLIP. Please cite both works and follow their instructions.
We are uploading our fine-tuning checkpoints to BaiduYun.
If you find this repo useful for your research, please consider citing the paper as follows:
@article{wu2024large,
title={Large-Scale 3D Medical Image Pre-training with Geometric Context Priors},
author={Wu, Linshan and Zhuang, Jiaxin and Chen, Hao},
journal={arXiv preprint arXiv:2410.09890},
year={2024}
}
@InProceedings{voco-v1,
author = {Wu, Linshan and Zhuang, Jiaxin and Chen, Hao},
title = {VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis},
booktitle = {CVPR},
month = {June},
year = {2024},
pages = {22873-22882}
}
@article{monai,
title={MONAI: An open-source framework for deep learning in healthcare},
author={Cardoso, M Jorge and others},
journal={arXiv preprint arXiv:2211.02701},
year={2022}
}
@article{nnunet,
title={nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation},
author={Isensee, Fabian and others},
journal={Nature Methods},
volume={18},
number={2},
pages={203--211},
year={2021},
}