We need to add more segmentation backbones to turn this into a real benchmark; this is important for reviewers. A few options:

- scaling nnU-Net ResEnc (M/L/XL)
- MedNeXt
- transformer-based (e.g. SwinUNETR)
- Mamba-based (e.g. U-Mamba)