Overview of Satellite-Derived Surface Solar Radiation Datasets useful for Nowcasting 🌞 #44

@irenelivia

This issue aims to consolidate and clarify the current set of satellite-derived datasets relevant for solar radiation nowcasting within the MLCast community.

The goal is to:

  • Provide a clear overview of available and relevant satellite-derived surface solar radiation datasets
  • Clarify differences in data source, processing level, and intended use
  • Identify preferred datasets (or complementary roles)
  • Serve as a hub linking to dataset-specific sub-issues

This follows recent discussions in the MLCast community about avoiding duplication of preprocessing efforts and aligning on a shared AI-ready data foundation.


🧭 Motivation

There is currently some ambiguity across the community regarding:

  • Which satellite-derived solar radiation datasets exist
  • How they differ in terms of:
    • satellite source (MSG vs MTG)
    • processing method
    • spatial/temporal resolution
    • intended use (climate product vs AI-ready training data)

This issue aims to resolve that ambiguity by collecting and structuring the dataset landscape.


Summary tables

🟦 1. Operational input datasets

These datasets represent the direct satellite observation space, closest to the operational data that flows into meteorological services through EUMETCast.

They are intended to be used as:

  • inputs (X) to ML models

| Dataset | Satellite | Role |
|---|---|---|
| MSG SEVIRI L1C | MSG (operational) | Raw multispectral radiances |
| MTG FCI L1C (proposed) | MTG (operational) | Operational successor to MSG |

Related issue on obtaining an ML-ready zarr dataset from MTG FCI level 1c data: #43

🟨 2. Derived geophysical products (target variables) for surface solar radiation (SSR)

These datasets represent physically interpreted variables, derived from satellite observations via retrieval algorithms or radiative transfer models.

They are typically used as:

  • training targets
  • evaluation / benchmarking datasets
  • physical reference products

| Dataset | Satellite/Instrument | Variable | Type |
|---|---|---|---|
| MSGCPP | MSG SEVIRI | GHI | Operational surface solar radiation + clearsky radiation product |
| SARAH-3 | MSG SEVIRI | GHI | Climate data record of surface solar radiation, 5 day latency |
| HANNA | MSG SEVIRI | GHI | High-resolution surface solar radiation demonstrator |
| DWD SSR | MSG SEVIRI / MTG FCI (emerging) | GHI | Operational surface solar radiation product |
| LSA-SAF MDSSFTD | MSG SEVIRI | GHI? | Surface solar radiation (land only) |
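
Since MSGCPP is listed as providing both surface solar radiation and clearsky radiation, one possible way to turn it into a training target is the clear-sky index (GHI divided by clear-sky GHI). The sketch below is only an illustration: the function name, the array inputs, and the `min_clear` threshold of 10 W/m² are assumptions, not something any of the products above prescribes.

```python
import numpy as np


def clear_sky_index(ghi, ghi_clear, min_clear=10.0):
    """Return GHI / clear-sky GHI, with NaN where clear-sky GHI is too small.

    ghi, ghi_clear: arrays of the same shape, in W/m^2 (assumed units).
    min_clear: hypothetical threshold that masks night and very low sun,
               where the ratio becomes numerically unstable.
    """
    ghi = np.asarray(ghi, dtype=float)
    ghi_clear = np.asarray(ghi_clear, dtype=float)
    k = np.full(ghi.shape, np.nan)
    valid = ghi_clear >= min_clear
    # Divide only where the denominator is large enough; elsewhere keep NaN.
    np.divide(ghi, ghi_clear, out=k, where=valid)
    return k


k = clear_sky_index([400.0, 0.0, 5.0], [800.0, 600.0, 2.0])
```

Masking low clear-sky values with NaN rather than zero keeps night-time pixels out of any loss or metric that ignores NaNs.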

🛰️ Candidate Surface Solar Radiation datasets

1. MSGCPP (MSG Cloud Physical Properties) (KNMI)


2. SARAH-3 (CM SAF)

  • Source: MSG-based climate data record
  • Type: Surface solar radiation (climate product)
  • Resolution: ~0.05° (~5 km), 30 min
  • Notes:
    • Long-term climate consistency focus
    • Not optimized for ML nowcasting workflows
    • Useful for climatology / baseline comparisons / emulating surface solar radiation from raw satellite channels
    • More information and archive: https://user.eumetsat.int/catalogue/EO:EUM:DAT:0863
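
As a quick sanity check on the "~0.05° (~5 km)" figure, the grid spacing of a regular lat/lon grid can be converted to kilometres. This is a minimal sketch assuming a spherical Earth with a mean radius of 6371 km; the function name is made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def grid_spacing_km(step_deg, lat_deg):
    """Approximate (north-south, east-west) spacing in km of a regular
    lat/lon grid with step `step_deg` at latitude `lat_deg`."""
    km_per_deg = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = step_deg * km_per_deg
    # Meridians converge towards the poles, so east-west spacing shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = grid_spacing_km(0.05, 60.0)
print(f"N-S: {ns:.2f} km, E-W at 60N: {ew:.2f} km")
```

Under these assumptions, 0.05° is about 5.6 km north-south, while the east-west spacing at 60°N is roughly half of that.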

3. HANNA (CM SAF demonstrator)


4. DWD SSR product (MSG + MTG)

  • Source: DWD satellite processing chain
  • Type: Surface solar radiation
  • Notes:

5. LSA-SAF MDSSFTD (EUMETSAT SAF)

  • Source: MSG + Land Surface Analysis SAF
  • Type: Surface solar radiation product
  • Coverage: Land-only
  • Resolution: ~3–5 km, 15 min
  • Notes:
    • Used in solar applications: retraining IrradianceNet (@olah-soma); operational use at @geosphere with IrradPhyDNet and DE_330 on the EURO-CORDEX domain; post-processing for PV production nowcasts
    • Limited by land-only coverage
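
The land-only coverage matters when comparing this product against a full-disc one: scores should only be computed where both products have data. A minimal sketch follows, assuming both fields are already on the same grid and timestamp and that missing (e.g. sea) pixels are stored as NaN; the function name and the NaN convention are assumptions, not part of any product specification.

```python
import numpy as np


def masked_scores(reference, candidate):
    """Bias and RMSE of `candidate` against `reference`, using only pixels
    where both arrays are finite (e.g. skipping sea pixels stored as NaN)."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    valid = np.isfinite(reference) & np.isfinite(candidate)
    if not valid.any():
        return {"n": 0, "bias": float("nan"), "rmse": float("nan")}
    diff = candidate[valid] - reference[valid]
    return {
        "n": int(valid.sum()),
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
    }


# Toy 2x2 fields in W/m^2; NaN marks pixels missing in one of the products.
ref = [[100.0, np.nan], [200.0, 300.0]]
cand = [[110.0, 50.0], [190.0, np.nan]]
scores = masked_scores(ref, cand)
```

Reporting `n` alongside the scores makes it visible how much of the domain actually entered the comparison.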

❓ Open questions for the community

  1. Which derived SSR product(s) should be used as:
    • training targets?
    • evaluation benchmarks?
  2. Do we want a community benchmark comparison paper/study across SSR products?

👥 Contributors / interested parties

Tagging MLCasters possibly interested in this discussion:
@pdebuyl @ladc @leifdenby @franchg

A quick look at some of the datasets by @irenelivia:
Image

Image