Do Echocardiography Foundation Models Generalize Beyond the Lab?
CardioBench is a standardized benchmark that unifies 8 public echocardiography datasets spanning 4 regression and 5 classification tasks, evaluating cardiac-specific, biomedical, and general-purpose foundation models under zero-shot, probing, and alignment protocols.
Preprint · Datasets · Getting Started · Evaluation · Prediction Formats · Citation
- EchoNet-Dynamic
- EchoNet-Pediatric
- EchoNet-LVH
- CAMUS
- TMED-2
- HMC-QU
- SegRWMA (Regional Wall Motion Abnormality)
- CardiacNet (Abnormal Cardiac Echo Videos)
- `data/` – split CSVs generated by the workflow.
- `docs/downloads.md` – where and how to obtain each dataset.
- `evaluation/` – per-dataset evaluation scripts, config, sample predictions, and helpers.
- `src/` – training / probing utilities and baseline model code.
Follow the per-dataset instructions in docs/downloads.md.
Edit evaluation/config.py to point to:
- Your prediction folders.
- Desired output directories.
The config file also holds shared constants such as bootstrap count (B), seed, view labels, and evaluation split (SPLIT="test" by default).
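For orientation, here is a minimal sketch of the kind of settings `evaluation/config.py` holds. The variable names below (`PREDICTION_DIR`, `OUTPUT_DIR`, `N_BOOTSTRAP`, `SEED`, `VIEW_CLASS_NAMES`) are illustrative assumptions; check the actual file for the exact names it uses.

```python
# Illustrative sketch only: names and values are assumptions, not the repo's exact config.
from pathlib import Path

# Where your prediction CSVs live and where metrics/plots should be written.
PREDICTION_DIR = Path("predictions/my_model")
OUTPUT_DIR = Path("results/my_model")

# Shared constants referenced throughout the evaluation scripts.
N_BOOTSTRAP = 1000   # bootstrap count (B) for confidence intervals
SEED = 42            # random seed for reproducibility
SPLIT = "test"       # evaluation split used by all scripts
VIEW_CLASS_NAMES = ["A2C", "A3C", "A4C", "PLAX", "PSAX"]  # example view labels
```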
Each dataset/task has a standalone script:
python evaluation/<script>.py
# Example:
python evaluation/camus.py

Outputs (metrics CSVs, per-class accuracy, prediction histograms, plots, etc.) are written to the directories configured in `evaluation/config.py`.
Helpful tips:
- Ensure prediction IDs (`patient_id`, `HashedFileName`, `FileName`, etc.) match the split CSVs exactly.
- View classifiers must include a `prob_<VIEW>` column for every entry in `VIEW_CLASS_NAMES`, plus `view_pred`. Use the CSVs inside `evaluation/example_predictions/` as templates (see the sketch after this list).
- View models can omit `prob_Other`; the evaluator derives it as `1 - sum(prob_<known view>)`.
- Multi-target datasets (e.g., EchoNet-LVH) expect one CSV per measurement, named `<TARGET>_pred`.
- When in doubt, mirror the filenames inside `evaluation/example_predictions/<DATASET>/`.
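As a concrete illustration of the view-classifier format, the sketch below builds a prediction CSV with pandas. The ID column, view names, and probabilities are placeholders, so copy the real headers from `evaluation/example_predictions/` rather than from this snippet.

```python
# Hypothetical view-classifier prediction CSV: one prob_<VIEW> column per class
# plus view_pred, as described above. All values here are made up.
import pandas as pd

view_names = ["A2C", "A4C", "PLAX"]  # placeholder subset of VIEW_CLASS_NAMES

rows = [
    {"FileName": "patient0001_video01", "prob_A2C": 0.05, "prob_A4C": 0.90, "prob_PLAX": 0.03},
    {"FileName": "patient0002_video01", "prob_A2C": 0.70, "prob_A4C": 0.20, "prob_PLAX": 0.05},
]
df = pd.DataFrame(rows)

# prob_Other may be omitted; the evaluator derives it as 1 - sum(prob_<known view>).
prob_cols = [f"prob_{v}" for v in view_names]
df["prob_Other"] = 1.0 - df[prob_cols].sum(axis=1)
df["view_pred"] = df[prob_cols].idxmax(axis=1).str.replace("prob_", "", regex=False)

df.to_csv("view_predictions.csv", index=False)
```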
`src/` contains code for zero-shot similarity, language-aligned prompting, and linear probes on top of frozen encoders.
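As a rough illustration of the linear-probing protocol, the sketch below fits a logistic-regression probe on frozen-encoder embeddings with scikit-learn. The feature arrays and labels are random placeholders; the actual training utilities live in `src/`.

```python
# Minimal linear-probe sketch on top of frozen encoder features (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

# Placeholder features: in practice these come from a frozen foundation-model encoder.
rng = np.random.default_rng(0)
train_feats, train_labels = rng.normal(size=(512, 768)), rng.integers(0, 2, size=512)
test_feats, test_labels = rng.normal(size=(128, 768)), rng.integers(0, 2, size=128)

# Fit a linear classifier on the frozen features; the encoder itself is never updated.
probe = LogisticRegression(max_iter=1000)
probe.fit(train_feats, train_labels)

print("balanced accuracy:", balanced_accuracy_score(test_labels, probe.predict(test_feats)))
```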
Contributions are welcome! Please open an issue or pull request for:
- New evaluation tasks.
- Bug fixes within the existing scripts.
- Documentation or visualization improvements.
When submitting predictions or scripts, ensure you do not upload raw patient data—only derived metrics.
If CardioBench is useful for your work, please cite:
@article{taratynova2025cardiobench,
  title={CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?},
  author={Taratynova, Darya and Aly, Ahmed and Saeed, Numan and Yaqub, Mohammad},
  journal={arXiv preprint arXiv:2510.00520},
  year={2025}
}

Preprint: arXiv:2510.00520
