GitHub - CAMMA-public/OR_GazeFollowing

Where are they looking in the Operating Room?

Keqi Chen*, Séraphin Baributsa*, Lilien Schewski*, Vinkle Srivastav, Didier Mutter, Guido Beldi, Sandra Keller, Nicolas Padoy, IPCAI 2026

*equal contribution

Introduction

Purpose: Gaze-following, the task of inferring where individuals are looking, has been widely studied in computer vision, advancing research in visual attention modeling, social scene understanding, and human-robot interaction. However, gaze-following has never been explored in the operating room (OR), a complex, high-stakes environment where visual attention plays an important role in surgical workflow analysis. In this work, we introduce the concept of gaze-following to the surgical domain, and demonstrate its great potential for understanding clinical roles, surgical phases, and team communications in the OR.

Methods: We extend the 4D-OR dataset with gaze-following annotations, and the Team-OR dataset with gaze-following annotations and new team communication activity annotations. We then propose novel approaches to clinical role prediction, surgical phase recognition, and team communication detection using a gaze-following model. For role and phase recognition, we propose a gaze heatmap-based approach that relies solely on gaze predictions; for team communication detection, we train a spatial-temporal model in a self-supervised way to encode gaze-based clip features, and then feed these features into a temporal activity detection model.

Results: Experimental results on the 4D-OR and Team-OR datasets demonstrate that our approach achieves state-of-the-art performance on all downstream tasks. Quantitatively, our approach obtains F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition. Furthermore, it significantly outperforms existing baselines in team communication detection, improving previous best performances by over 30%.

Conclusion: We introduce gaze-following in the OR as a novel research direction in surgical data science, highlighting its great potential to advance surgical workflow analysis in computer-assisted interventions. Although limited to monocular 2D gaze prediction relying on manual annotations, our research clearly demonstrates the clinical value of gaze analysis from ceiling-mounted cameras. Future work will explore semantic understanding, multi-view learning, and few-shot approaches to further improve scalability and robustness.

Gaze following in the operating room

Gaze target estimation on the 4D-OR / Team-OR datasets.

Overall framework

In this repo we provide:

Training and inference code for gaze following in the OR.
Trained models on the public datasets.

Installation

Please follow the Gaze-LLE installation instructions.

Data preparation

We are currently unable to make Team-OR publicly available due to strict privacy and ethical concerns. The gaze-following annotations of the 4D-OR dataset are released under ./gaze_following/annotations/.

4D-OR dataset

Download the 4D-OR dataset and place it in ./data/ as:

${ROOT_DIR}
|-- data
    |-- 4D-OR
        |-- export_holistic_take1_processed
            |-- colorimage
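Before launching training, it can help to confirm that the folder layout above is in place. The following is a minimal sketch, not part of the released code: the helper name `list_color_frames` and the assumed image extensions (`.jpg`, `.jpeg`, `.png`) are illustrative assumptions, and only the directory names are taken from the tree shown above.

```python
from pathlib import Path


def list_color_frames(root_dir, take="export_holistic_take1_processed"):
    """Return the sorted image files under <root_dir>/data/4D-OR/<take>/colorimage.

    Hypothetical helper for sanity-checking the dataset layout.
    """
    color_dir = Path(root_dir) / "data" / "4D-OR" / take / "colorimage"
    if not color_dir.is_dir():
        raise FileNotFoundError(f"Expected directory not found: {color_dir}")
    # Assumption: frames are stored as .jpg/.jpeg/.png files; adjust if needed.
    image_exts = {".jpg", ".jpeg", ".png"}
    return sorted(p for p in color_dir.iterdir() if p.suffix.lower() in image_exts)
```

Calling `list_color_frames(".")` from `${ROOT_DIR}` should return a non-empty list once the dataset is in place; a `FileNotFoundError` indicates that the folder structure differs from the one expected here.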

Training

4D-OR dataset

cd gaze_following
python main.py

Checkpoints

Two main checkpoints are provided for the gaze-following demo.

Global checkpoint

epoch_14_global_inout.pt

This checkpoint was trained on a combination of general and operating-room gaze-following datasets.

Surgical checkpoint

epoch_14_Surg_inout.pt

This checkpoint corresponds to a Gaze-LLE model first trained on GazeFollow, then fine-tuned on operating-room datasets:

4D-OR
MM-OR

Demo

A simple notebook is provided to run gaze following on a single image:

demo_ORgazefollowing.ipynb

The demo requires:

an image,
one or several head bounding boxes.
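The inputs listed above are often pre- and post-processed with a few lines of plain Python. The sketch below assumes, without having verified it against the notebook, that head boxes are passed as `(xmin, ymin, xmax, ymax)` normalized to `[0, 1]` and that the model returns a 2D gaze heatmap per head; both helper names are hypothetical and not part of this repository.

```python
def normalize_bbox(bbox, image_width, image_height):
    """Convert a pixel-space (xmin, ymin, xmax, ymax) box to [0, 1] coordinates."""
    xmin, ymin, xmax, ymax = bbox
    if xmax <= xmin or ymax <= ymin:
        raise ValueError(f"Degenerate bounding box: {bbox}")

    def clamp(value):
        # Keep boxes that spill past the image border inside [0, 1].
        return min(max(value, 0.0), 1.0)

    return (
        clamp(xmin / image_width),
        clamp(ymin / image_height),
        clamp(xmax / image_width),
        clamp(ymax / image_height),
    )


def heatmap_peak_to_pixel(heatmap, image_width, image_height):
    """Map the arg-max cell of a 2D heatmap (list of rows) to image pixel coordinates."""
    rows, cols = len(heatmap), len(heatmap[0])
    best_row, best_col = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: heatmap[rc[0]][rc[1]],
    )
    # Use the cell centre so the estimate is not biased towards the top-left corner.
    x = (best_col + 0.5) / cols * image_width
    y = (best_row + 0.5) / rows * image_height
    return x, y
```

Taking the heatmap peak is only one simple way to read out a gaze target; the notebook may use a different readout, so treat this as an illustration of the data flow rather than the repository's method.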

Citation

If you use our code or models in your research, please cite with:

@article{chen2026they,
  title={Where are they looking in the operating room?},
  author={Chen, Keqi and Baributsa, S{\'e}raphin and Schewski, Lilien and Srivastav, Vinkle and Mutter, Didier and Beldi, Guido and Keller, Sandra and Padoy, Nicolas},
  journal={International Journal of Computer Assisted Radiology and Surgery},
  pages={1--10},
  year={2026},
  publisher={Springer}
}

References

This project builds on Gaze-LLE. We thank the authors for releasing their code.

License

The code and models are available for non-commercial scientific research purposes as defined in the CC BY-NC-SA 4.0 license. By downloading and using this code, you agree to the terms in the LICENSE. Third-party code is subject to its respective licenses.
