Mona Sheikh Zeinoddin · Mobarak I. Hoque · Zafer Tandogdu · Greg L. Shaw · Matthew J. Clarkson · Evangelos B. Mazomenos · Danail Stoyanov
This work was early-accepted and selected for an oral presentation at MICCAI 2025.
We ran our experiments with PyTorch 2.1.0, CUDA 12.0, Python 3.10 and Ubuntu 22.04.
You can predict scaled disparity for a single image or a folder of images with:
CUDA_VISIBLE_DEVICES=0 python test_simple.py --model_path <your_model_path> --image_path <your_image_or_folder_path>

You can download the AF-SfMLearner weights that we use for initialization with:
gdown 1kf7LjQ6a2ACKr6nX5Uyee3of3bXn1xWB
unzip -q Model_trained_end_to_end.zip
mv Model_trained_end_to_end af_sfmlearner_weights

You can download the Endovis or SCARED dataset by signing the challenge rules and emailing them to max.allan@intusurg.com.
Endovis split
The train/test/validation split of the Endovis dataset used in our work is defined in the splits/endovis folder.
Endovis data preprocessing
We use ffmpeg to convert the RGB.mp4 videos into PNG images:
find . -name "*.mp4" -print0 | xargs -0 -I {} sh -c 'output_dir=$(dirname "$1"); ffmpeg -i "$1" "$output_dir/%10d.png"' _ {}

We only use the left frames in our experiments; please refer to extract_left_frames.py. For datasets 8 and 9, we rename keyframes 0-4 to keyframes 1-5.
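The keyframe renaming for datasets 8 and 9 can be sketched as follows. This is a minimal illustration, not the script used in this repository: the helper name `shift_keyframes` is hypothetical, and it assumes the folders are named `keyframe0` to `keyframe4` directly under each dataset directory. It renames from the highest index down so that a target name is never already taken.

```python
from pathlib import Path


def shift_keyframes(dataset_dir, prefix="keyframe", first=0, last=4):
    """Rename <prefix><first>..<prefix><last> to indices shifted up by one.

    Hypothetical helper: the folder naming is an assumption.
    Returns a list of (old_name, new_name) pairs for the folders renamed.
    """
    dataset_dir = Path(dataset_dir)
    renamed = []
    # Go from the highest index down so the destination is always free.
    for idx in range(last, first - 1, -1):
        src = dataset_dir / f"{prefix}{idx}"
        dst = dataset_dir / f"{prefix}{idx + 1}"
        if not src.is_dir():
            continue  # nothing to rename for this index
        if dst.exists():
            raise FileExistsError(f"{dst} already exists; refusing to overwrite")
        src.rename(dst)
        renamed.append((src.name, dst.name))
    return renamed
```

Under those assumptions, calling `shift_keyframes("/path/to/endovis_data/dataset8")` and then the same for `dataset9` would apply the renaming once. A second call raises `FileExistsError` instead of shifting the folders again.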
Data structure
The dataset directory structure is as follows:
    /path/to/endovis_data/
        dataset1/
            keyframe1/
                image_02/
                    data/
                        0000000001.png
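As a quick sanity check of the layout above, the sketch below builds the expected path of a frame and lists the sequences found under a data root. Both helper names (`frame_path`, `list_sequences`) are hypothetical and not part of this repository. The sketch assumes frames are numbered with ten zero-padded digits, as in the example file name.

```python
from pathlib import Path


def frame_path(root, dataset, keyframe, frame, camera="image_02"):
    """Expected location of one left frame under the layout shown above."""
    return (Path(root) / f"dataset{dataset}" / f"keyframe{keyframe}"
            / camera / "data" / f"{frame:010d}.png")


def list_sequences(root, camera="image_02"):
    """Return (dataset_name, keyframe_name, n_png_frames) per sequence found."""
    found = []
    pattern = f"dataset*/keyframe*/{camera}/data"
    for data_dir in sorted(Path(root).glob(pattern)):
        keyframe_dir = data_dir.parent.parent  # .../datasetX/keyframeY
        n_frames = len(list(data_dir.glob("*.png")))
        found.append((keyframe_dir.parent.name, keyframe_dir.name, n_frames))
    return found
```

Running `list_sequences` on your data path before training may help catch a missing `image_02/data` folder or an empty sequence early.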
CUDA_VISIBLE_DEVICES=0 python train_end_to_end.py --data_path <your_data_path> --log_dir <path_to_save_model (depth, pose, appearance flow, optical flow)>

To prepare the ground truth depth maps, run:
CUDA_VISIBLE_DEVICES=0 python export_gt_depth.py --data_path <your_data_path> --split endovis

Depth evaluation:

python evaluate_depth.py --data_path <your_data_path> --load_weights_folder <path_to_weights_i_folder> --eval_mono

Pose evaluation:

python evaluate_pose.py --data_path <your_data_path> --load_weights_folder <path_to_weights_i_folder> --scared_pose_seq <trajectory_1_or_2>

Want to see our project in action? ✨ Dive into our interactive Colab demo: Launch in Colab
The StereoMIS sequence we used to evaluate our model is available here.
| Model | Abs Rel | Sq Rel | RMSE | ATE-Trajectory 1 | ATE-Trajectory 2 | Link |
|---|---|---|---|---|---|---|
| End-to-end best model weights | 0.051 | 0.354 | 4.480 | 0.0702 | 0.0438 | |
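For readers unfamiliar with the depth columns, the sketch below computes Abs Rel, Sq Rel and RMSE using their commonly used definitions in monocular depth estimation. It is an illustration only and is not taken from evaluate_depth.py. The function names are hypothetical, and the median scaling step is an assumption about what monocular evaluation typically does before computing errors.

```python
import math
import statistics


def median_scale(gt, pred):
    """Rescale predictions so their median matches the ground-truth median."""
    ratio = statistics.median(gt) / statistics.median(pred)
    return [p * ratio for p in pred]


def depth_errors(gt, pred):
    """Return (abs_rel, sq_rel, rmse) for equal-length lists of positive depths."""
    if len(gt) != len(pred) or not gt:
        raise ValueError("gt and pred must be non-empty and of equal length")
    n = len(gt)
    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(g - p) / g for g, p in zip(gt, pred)) / n
    # Mean squared error relative to the ground-truth depth.
    sq_rel = sum((g - p) ** 2 / g for g, p in zip(gt, pred)) / n
    # Root of the mean squared error, in the same unit as the depths.
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in zip(gt, pred)) / n)
    return abs_rel, sq_rel, rmse
```

In practice these errors are computed only over pixels with valid ground truth, so `gt` and `pred` here stand for the already masked and flattened depth values.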
If you find this code or work useful in your own research, please consider citing the following:
@article{zeinoddin2025endo,
title={Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion},
author={Zeinoddin, Mona Sheikh and Islam, Mobarakol and Tandogdu, Zafer and Shaw, Greg and Clarkson, Mathew J and Mazomenos, Evangelos and Stoyanov, Danail},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
year={2025},
organization={Springer}
}

If you have any questions, please feel free to contact mona.zeinoddin.22@ucl.ac.uk.



