Object-Aware Diffusion Model for Controllable Surgical Video Generation (SurgSora)

Tong Chen*, Shuya Yang*, Junyi Wang*, Long Bai†, Hongliang Ren, and Luping Zhou†

Medical Image Computing and Computer Assisted Intervention (MICCAI) 2025

[`arXiv`]	[`Project Page`]

Update

· Oct/2025: 📢📢📢 Training Code Released!

· Jul/2025: 🎉🎉🎉 Our work has been accepted to MICCAI 2025!

· Apr/2025: 🔥🔥🔥 SurgSora Gradio is online!

Environment Setup

pip install -r requirements.txt

Install SAM2 as follows:

git clone https://github.com/facebookresearch/sam2.git && cd sam2

pip install -e .
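After both installs, a quick import check can confirm the environment is usable before training. This is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the module names `torch`, `gradio`, and `sam2` are assumptions about what requirements.txt and the SAM2 install provide, so adjust the list to match your setup.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Assumed module names; edit to match requirements.txt.
    required = ["torch", "gradio", "sam2"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules found.")
```

If anything is reported missing, rerun the corresponding install step above.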

Training

Stage 1

bash train_stage1.sh

Stage 2

bash train_stage2.sh

Download checkpoints

Download the pretrained checkpoint of DAV2 from Hugging Face to ./models/dav2.
Download the pretrained checkpoint of CMP from Hugging Face to ./models/cmp.

The final structure of checkpoints should be:

./models/
|-- DAV2
|-- CMP
|-- controlnet
|   |-- config.json
|   `-- diffusion_pytorch_model.safetensors
|-- stable-video-diffusion-img2vid-xt-1-1
|   |-- ...
|   `-- model_index.json
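
Before launching the demo, it can help to check that the checkpoint layout matches the tree above. The following is a minimal sketch under stated assumptions: the helper `missing_checkpoints` is hypothetical and not part of the repository, it only checks the entries named explicitly in the tree (the contents elided as `...` are not checked), and it uses the directory names exactly as written there, which matters on case-sensitive filesystems.

```python
from pathlib import Path

# Entries named explicitly in the tree above, relative to the models directory.
EXPECTED = [
    "DAV2",
    "CMP",
    "controlnet/config.json",
    "controlnet/diffusion_pytorch_model.safetensors",
    "stable-video-diffusion-img2vid-xt-1-1/model_index.json",
]


def missing_checkpoints(root="./models"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint entries:")
        for entry in missing:
            print("  " + entry)
    else:
        print("Checkpoint layout looks complete.")
```

An empty result only means the listed paths exist; it does not validate the checkpoint files themselves.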

Run Gradio Demo

python gradio_demo_run.py

