Description
Summary
I just stumbled across FLAIR, a high res (.2 meter) semantic segmentation dataset of 19 land cover categories. Full description here https://ignf.github.io/FLAIR/#FLAIR1 That link also references a U-Net model baseline trained on FLAIR
It is described in an OGC Report on ML Engineering as "a comprehensive and high-quality collection of labeled satellite imagery aimed at advancing land cover classification and geospatial analysis tasks". It's maintained by the French National Institute of Geographic and Forest Information (IGN).
Rationale
I'm interested in composing a list of high quality, challenging datasets for benchmarking semantic segmentation and object detection models and at a glance FLAIR seems like one of them. It seems like the rigor of maintenance and description of the FLAIR dataset is high compared to other datasets and so I would like to see this available in torchgeo.
Also, I think I recall seeing that torchgeo would like to offer models that are inference-ready, not just pretrained with self-superivison but fine-tuned to address popular tasks in remote sensing. This U-Net model seems like a good start, but I can raise a separate issue for adding the model.
Implementation
I haven't contributed to torchgeo before but I would check out other PRs that added datamodules and pretrained models and follow that example.
Alternatives
No response
Additional information
This comment is on my mind: Clay-foundation/model#269 (comment)
I'd love for more challenging datasets to be front and center when benchmarking models rather than Eurosat. I'd also like for us to use datasets that we have a solid understanding of the geographic and class distribution and I like that FLAIR lays this out on their site.
I might be a bit slow to implement this but would like to submit a PR when I have time if this sounds like a good idea.