
CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation

Ziqi Ye1, 2 ∗, Ziyang Gong3 ∗ † , Ning Liao3 ∗, Xiaoxing Hu4, Di Wang5, 6, Hongruixuan Chen7, Chen Huang8, Yiguo He3,

Yuru Jia9, 10, Xiaoxing Wang3, Yuan Cheng3, Haipeng Wang1 ‡, Xue Yang3 ‡, Junchi Yan3, 2 ‡

1 Fudan University, 2 Shanghai Innovation Institute, 3 Shanghai Jiao Tong University,

4 Beijing Institute of Technology, 5 Wuhan University, 6 Zhongguancun Academy,

7 The University of Tokyo, 8 Sun Yat-sen University, 9 KU Leuven, 10 KTH

∗ Equal Contribution, † Project Lead, ‡ Corresponding Author

This repository contains the official implementation of CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation.



🔥🔥🔥 News

  • [2025/12/07] We are releasing the benchmark collection. Click here to get the benchmark datasets!
  • [2025/12/07] The checkpoints have been uploaded; you can access them via the Hugging Face badges.
  • [2025/12/07] The training and inference code for all 22 benchmarks is released below.

Abstract

Synthetic Aperture Radar (SAR) enables global, all-weather earth observation. However, owing to diverse imaging mechanisms, domain shifts across sensors and regions severely hinder its semantic generalization. To address this, we present CrossEarth-SAR, the first billion-scale SAR vision foundation model built upon a novel physics-guided sparse mixture-of-experts (MoE) architecture incorporating physical descriptors, explicitly designed for cross-domain semantic segmentation. To facilitate large-scale pre-training, we develop CrossEarth-SAR-200K, a weakly and fully supervised dataset that unifies public and private SAR imagery. We also introduce a benchmark suite comprising 22 sub-benchmarks across 8 distinct domain gaps, establishing the first unified standard for domain generalization semantic segmentation on SAR imagery. Extensive experiments demonstrate that CrossEarth-SAR achieves state-of-the-art results on 20 benchmarks, surpassing previous methods by over 10% mIoU on some benchmarks under multi-gap transfer. All code will be publicly available.

CrossEarth-SAR-200K and Benchmarks

CrossEarth-SAR-200K:

  • CrossEarth-SAR-200K consists of three components: 37K private and 126K public SAR–optical pairs, together with 40K public SAR segmentation samples. Among them, 163K SAR images are assigned pseudo labels generated by applying CrossEarth to their paired optical images. All data are unified under the 7-class LoveDA semantic scheme.
  • To the best of our knowledge, CrossEarth-SAR-200K is the first large-scale SAR semantic segmentation dataset, and its size surpasses that of the widely used COCO-Stuff benchmark (164K images) for general-purpose semantic segmentation. The scale and diversity of CrossEarth-SAR-200K, with imagery collected from 109 regions worldwide, effectively emulate real-world deployment scenarios in which SAR semantic segmentation models are applied across multiple data sources. CrossEarth-SAR-200K thus provides a robust foundation for training and evaluation, advancing research in SAR semantic segmentation and image understanding.
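Since every source dataset is remapped onto the 7-class LoveDA scheme, a label harmonization step can be sketched as follows. The class names follow the public LoveDA taxonomy; the source-to-LoveDA ID mapping shown here is a purely hypothetical placeholder, not the repository's actual mapping:

```python
# Remap a source dataset's label IDs onto the unified 7-class LoveDA scheme.
# LOVEDA_CLASSES follows the public LoveDA taxonomy; SOURCE_TO_LOVEDA is a
# hypothetical example mapping, not the one used in this repository.
LOVEDA_CLASSES = ["background", "building", "road", "water",
                  "barren", "forest", "agriculture"]

SOURCE_TO_LOVEDA = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6}  # identity placeholder

def remap_labels(label_ids, mapping):
    """Map per-pixel source label IDs to LoveDA IDs (unknown IDs -> background)."""
    return [mapping.get(i, 0) for i in label_ids]
```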

Benchmarks:

  • For a fair evaluation of the generalizability of existing models in the SAR modality, we curate benchmarks based on widely used SAR remote sensing semantic segmentation datasets, including AIR-PolSAR-Seg-2.0, DDHR-SK, FUSAR-Map, OpenEarthMap, SARBuD, and WHU-OPT-SAR, and extend them to domain generalization (DG) settings.
  • Our benchmark comprises different tasks across eight compositional domain gaps: (1) Unseen Region; (2) Unseen Polarization; (3) Unseen Complex Number; (4) Unseen Region and Polarization; (5) Unseen Region and Platform; (6) Unseen Region and Microwave Band; (7) Unseen Region, Polarization and Microwave Band; (8) Unseen Region, Platform and Microwave Band.

Results and Visualization

  • Visualizations of predicted segmentation maps on several benchmarks. The experimental results demonstrate that CrossEarth-SAR possesses a strong capacity for SAR remote-sensing domain generalization (SAR RSDG), yielding semantically accurate and visually coherent segmentation predictions across diverse unseen scenarios.
  • CrossEarth-SAR achieves SOTA performance on 20 evaluation benchmarks across diverse segmentation scenes, demonstrating strong generalizability.

Environment Requirements:

conda create -n CrossEarthSAR python=3.9 -y
conda activate CrossEarthSAR
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia -y
pip install fsspec
pip install -U openmim
mim install mmengine
mim install "mmcv==2.1.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0"
conda install xformers -c xformers
pip install numpy==1.24.4
pip install ftfy scipy prettytable matplotlib regex timm einops
pip install future tensorboard
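After installation, a quick sanity check of the pinned versions can catch mismatches early. This stdlib-only sketch assumes the pins from the commands above; adjust the list if your setup differs:

```python
# Sanity-check installed package versions against the pins used above.
# The pin list mirrors the install commands; this is a convenience sketch,
# not part of the repository's tooling.
from importlib import metadata

PINS = {"torch": "2.1.1", "mmcv": "2.1.0", "numpy": "1.24.4"}

def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    out = {}
    for pkg, want in pins.items():
        try:
            have = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            have = None
        out[pkg] = (have, have == want)
    return out

if __name__ == "__main__":
    for pkg, (have, ok) in check_pins(PINS).items():
        print(f"{pkg}: {have or 'missing'} ({'ok' if ok else 'MISMATCH'})")
```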

Training Steps:

First, download the CPT (continual pre-training) model weights crossearthsar_vitl.pth from the Hugging Face badges above, then place them in the ./checkpoints/ directory.

Second, replace the dataset path in configs/_base_/datasets/×××.py (i.e., the line path/to/datasets...) with your own directory.

Third, run the training code (taking the VV2F benchmark as an example):

python tools/train.py configs/CrossEarthSAR_dinov2/CrossEarthSAR_vv2RGB_VV2F.py --work-dir ./work_dirs/...  --cfg-options load_from=./checkpoints/crossearthsar_vitl.pth
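To launch several benchmarks in sequence, the command above can be assembled programmatically. A minimal sketch, assuming the config and checkpoint paths shown in this README (the helper name is our own, not part of the repository):

```python
# Hypothetical helper that builds the tools/train.py argv for one config,
# following the command shown above. Not part of the repository.
from pathlib import Path

def train_cmd(config, ckpt="./checkpoints/crossearthsar_vitl.pth"):
    """Build the tools/train.py argv for one benchmark config."""
    work_dir = f"./work_dirs/{Path(config).stem}"
    return ["python", "tools/train.py", config,
            "--work-dir", work_dir,
            "--cfg-options", f"load_from={ckpt}"]
```

Each resulting argv can be passed to `subprocess.run` to train the corresponding benchmark.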

Inference Steps:

First, use your own training weights or download the model weights from the Hugging Face badges above.

Second, run the test code (taking the VV2F benchmark as an example):

python tools/test.py configs/CrossEarthSAR_dinov2/CrossEarthSAR_vv2RGB_VV2F.py ./checkpoints/xxx.pth

CrossEarth-SAR Model Weights

| Model | Params (all / activated) | CrossEarth-SAR-200K val | Download |
|-------|--------------------------|-------------------------|----------|
| ViT-S | 90M / 20M | 59.06% | crossearthsar_vits.pth |
| ViT-B | 300M / 80M | 60.79% | crossearthsar_vitb.pth |
| ViT-L | 1.3B / 300M | 62.42% | crossearthsar_vitl.pth |
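The all/activated split reflects the sparse MoE design: only a subset of expert parameters is active per forward pass. The activated fraction follows directly from the table:

```python
# Activated-parameter fraction per variant, using the (total, activated)
# parameter counts from the table above.
SIZES = {"ViT-S": (90e6, 20e6), "ViT-B": (300e6, 80e6), "ViT-L": (1.3e9, 300e6)}
ratios = {name: act / total for name, (total, act) in SIZES.items()}
# e.g. ViT-L activates roughly 23% of its 1.3B parameters per forward pass
```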

Benchmarks Model Weights with Configs

| Domain Gap | Benchmark | Model | Config |
|------------|-----------|-------|--------|
| Unseen Region | N2S | Unseen_Region_N2S.pth | CrossEarthSAR_bj2shgz_N2S.py |
| - | S2N | Unseen_Region_S2N.pth | CrossEarthSAR_shgz2bj_S2N.py |
| - | K2C | Unseen_Region_K2C.pth | CrossEarthSAR_kr2sd_K2C.py |
| - | C2K | Unseen_Region_C2K.pth | CrossEarthSAR_sd2kr_C2K.py |
| Unseen Polarization | VV2F | Unseen_Region_VV2F.pth | CrossEarthSAR_vv2RGB_VV2F.py |
| - | F2VV | Unseen_Region_F2VV.pth | CrossEarthSAR_RGB2vv_F2VV.py |
| - | HH2F | Unseen_Region_HH2F.pth | CrossEarthSAR_hh2RGB_HH2F.py |
| - | F2HH | Unseen_Region_F2HH.pth | CrossEarthSAR_RGB2hh_F2HH.py |
| Unseen Complex Values | C(r)2R | Unseen_Region_Cr2R.pth | CrossEarthSAR_real2RGB_Cr2R.py |
| - | R2C(r) | Unseen_Region_R2Cr.pth | CrossEarthSAR_RGB2real_R2Cr.py |
| - | C(i)2R | Unseen_Region_Ci2R.pth | CrossEarthSAR_imgy2RGB_Ci2R.py |
| - | R2C(i) | Unseen_Region_R2Ci.pth | CrossEarthSAR_RGB2imgy_R2Ci.py |
| Unseen Region and Polarization | F2A | Unseen_Region_F2A.pth | CrossEarthSAR_fusar_sar2airpolsar_F2A.py |
| - | A2F | Unseen_Region_A2F.pth | CrossEarthSAR_fusar_airpolsar2sar_A2F.py |
| Unseen Region and Platform | O2D | Unseen_Region_O2D.pth | CrossEarthSAR_dfc252ddhr_O2D.py |
| - | D2O | Unseen_Region_D2O.pth | CrossEarthSAR_ddhr2dfc25_D2O.py |
| Unseen Region and Microwave Band | S2A | Unseen_Region_S2A.pth | CrossEarthSAR_sarbud_sar2airpolsar_S2A.py |
| - | A2S | Unseen_Region_A2S.pth | CrossEarthSAR_sarbud_airpolsar2sar_A2S.py |
| Unseen Region, Polarization and Microwave Band | D2F | Unseen_Region_D2F.pth | CrossEarthSAR_fusar_ddhr2sar_D2F.py |
| - | F2D | Unseen_Region_F2D.pth | CrossEarthSAR_fusar_sar2ddhr_F2D.py |
| Unseen Region, Platform and Microwave Band | W2D | Unseen_Region_W2D.pth | CrossEarthSAR_whu2ddhr_W2D.py |
| - | D2W | Unseen_Region_D2W.pth | CrossEarthSAR_ddhr2whu_D2W.py |

Todo List

  • Release the paper on arXiv.
  • Release the 22 SAR RSDG benchmarks.
  • Release the CrossEarth-SAR-200K dataset and Continuous Pre-Training code.
  • Release the benchmarks fine-tuning training and inference code.
  • Release the model weights with configs.

Citation

If you find CrossEarth-SAR helpful, please consider giving this repo a ⭐ and citing:

@article{gong2025crossearth,
  title={Crossearth: Geospatial vision foundation model for domain generalizable remote sensing semantic segmentation},
  author={Gong, Ziyang and Wei, Zhixiang and Wang, Di and Hu, Xiaoxing and Ma, Xianzheng and Chen, Hongruixuan and Jia, Yuru and Deng, Yupeng and Ji, Zhenming and Zhu, Xiangwei and others},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2025},
  publisher={IEEE}
}
@article{hu2025earth,
  title={Earth-adapter: Bridge the geospatial domain gaps with mixture of frequency adaptation},
  author={Hu, Xiaoxing and Gong, Ziyang and Wang, Yupei and Jia, Yuru and Lin, Fei and Gao, Dexiang and An, Ke and Han, Jianhong and Sun, Zhuoran and Luo, Gen and others},
  journal={arXiv preprint arXiv:2504.06220},
  year={2025}
}
@article{cao2025crossearth,
  title={CrossEarth-Gate: Fisher-Guided Adaptive Tuning Engine for Efficient Adaptation of Cross-Domain Remote Sensing Semantic Segmentation},
  author={Cao, Shilei and Gong, Ziyang and Lin, Hehai and Liu, Yang and Cheng, Jiashun and Hu, Xiaoxing and Liang, Haoyuan and Li, Guowen and Qin, Chengwei and Cheng, Hong and others},
  journal={arXiv preprint arXiv:2511.20302},
  year={2025}
}

About

The official repo of CrossEarth-SAR, a SAR-centric and billion-scale geospatial foundation model for cross-domain semantic segmentation.
