SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting

MICCAI 2025

Yiming Huang*, Long Bai*, Beilei Cui*, Kun Yuan,
Guankun Wang, Mobarak I. Hoque, Nicolas Padoy, Nassir Navab, Hongliang Ren

|| Paper || Project Page ||

Environment

Install the CUDA toolkit on Ubuntu from the Download link, then export the following environment variables:

export PATH=/usr/local/cuda-11.7/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-11.7
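Before building the CUDA submodules, it can help to confirm that the variables exported above point at a real toolkit. The following is a minimal sketch; `check_cuda_env` is a hypothetical helper and is not part of this repository. It only checks that an executable `nvcc` exists under the given CUDA directory.

```shell
# Hypothetical helper (not part of the repository).
# Usage: check_cuda_env [cuda_dir]   (defaults to $CUDA_HOME)
check_cuda_env() {
    cuda_home="${1:-${CUDA_HOME:-}}"
    if [ -z "$cuda_home" ]; then
        echo "CUDA_HOME is not set"
        return 1
    fi
    # nvcc is the CUDA compiler driver shipped in <cuda_dir>/bin
    if [ ! -x "$cuda_home/bin/nvcc" ]; then
        echo "nvcc not found under $cuda_home/bin"
        return 1
    fi
    echo "nvcc found: $cuda_home/bin/nvcc"
    return 0
}
```

A typical call would be `check_cuda_env && nvcc --version`, which should report the toolkit release if `PATH` was updated as shown above.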

Install the Python environment

git clone https://github.com/lastbasket/SurgTPGS
cd SurgTPGS
conda create -n SurgTPGS python=3.7 
conda activate SurgTPGS

pip install -r requirements.txt
pip install -e submodules/depth-diff-gaussian-rasterization
pip install -e submodules/simple-knn
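After installation, a quick check that the active interpreter is the Python 3.7 environment created above can save a failed training run later. This is a sketch only; `python_matches` is a hypothetical helper, and the second argument (the interpreter to query) is an assumption added for illustration.

```shell
# Hypothetical helper (not part of the repository).
# Usage: python_matches <major.minor> [python_binary]
python_matches() {
    expected="$1"
    py="${2:-python}"
    # Ask the interpreter for its own major.minor version
    actual="$("$py" -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)" || return 1
    [ "$actual" = "$expected" ]
}
```

For example, `python_matches 3.7 || echo "activate the SurgTPGS environment first"`. Assuming `requirements.txt` installs PyTorch, `python -c "import torch; print(torch.cuda.is_available())"` is a further check that the GPU is visible.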

Datasets and Pre-trained Checkpoints

We provide processed versions of the CholecSeg and EndoVis 2018 datasets with disparity maps. Download the datasets from the Download Link and unzip them into the following structure:

├── data
│   ├── cholecseg_sub
│   │   ├── video01_00080
│   │   ├── video01_00240
│   │   ├── ...
│   ├── endovis_2018
│   │   ├── seq_5_sub
│   │   ├── seq_9_sub

Download the SAM checkpoint and the VLM checkpoints (CLIP fine-tuned with CAT-Seg) for CholecSeg and EndoVis 2018. Place the checkpoints as follows:

├── ckpts
│   ├── model_final_cholecseg.pth
│   ├── model_final_endovis.pth
│   ├── sam_vit_h_4b8939.pth
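The two directory trees above can be verified with a short script before starting preprocessing. The sketch below assumes that `data` and `ckpts` sit directly under the repository root; `check_layout` is a hypothetical helper and only tests that each listed path exists.

```shell
# Hypothetical helper (not part of the repository).
# Usage: check_layout [repo_root]   (defaults to the current directory)
check_layout() {
    root="${1:-.}"
    missing=0
    for path in \
        data/cholecseg_sub \
        data/endovis_2018 \
        ckpts/model_final_cholecseg.pth \
        ckpts/model_final_endovis.pth \
        ckpts/sam_vit_h_4b8939.pth
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every expected path is present
    [ "$missing" -eq 0 ]
}
```

Running `check_layout` from the repository root prints one `missing: ...` line per absent path and returns a non-zero status if anything is missing.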

Training

# 1. Preprocess the data to extract the VLM and SAM features
bash pre_data.sh
# 2. Run the autoencoder on the semantic features
bash pre_VL_features.sh
# 3. Train SurgTPGS
bash train.sh
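Because each step depends on the output of the previous one, it may be convenient to run the scripts in order and stop at the first failure. The following is a minimal sketch; `run_steps` is a hypothetical wrapper, not something the repository provides.

```shell
# Hypothetical wrapper (not part of the repository).
# Usage: run_steps <script> [<script> ...]
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        # Abort the sequence as soon as one script exits non-zero
        if ! bash "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}
```

For training this would be called as `run_steps pre_data.sh pre_VL_features.sh train.sh`; the same wrapper can be reused for `render.sh` and `eval_fine.sh`.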

Rendering and Evaluation

# 1. Render the RGB images, depth maps, and semantic features
bash render.sh
# 2. Evaluate semantic segmentation on novel views with text prompts
bash eval_fine.sh

Related Works

You are welcome to follow our related works:

Endo-4DGX: Robust Endoscopic Gaussian Splatting with Illumination Correction
Endo2DTAM: Gaussian Splatting SLAM for Endoscopic Scene
Endo-4DGS: Monocular Endoscopic Scene Reconstruction with Gaussian Splatting

Citation

@misc{huang2025surgtpgssemantic3dsurgical,
      title={SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting}, 
      author={Yiming Huang and Long Bai and Beilei Cui and Kun Yuan and Guankun Wang and Mobarakol Islam and Nicolas Padoy and Nassir Navab and Hongliang Ren},
      year={2025},
      eprint={2506.23309},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2506.23309}, 
}
