Challenge #21 - Machine Learning to improve the CAMS global air quality forecasts

# Challenge 21 - Machine Learning to improve the CAMS global air quality forecasts

> **Stream 2** - Machine Learning for weather, climate and atmosphere applications


### Goal
Develop an ML algorithm to predict (and correct) the time-varying bias of the global CAMS forecast for surface PM2.5, O3 and NO2 at the locations of air quality monitoring stations.

### Mentors and skills
* **Mentors:** @JohannesFlemming, @miha-at-ecmwf, @jerome-barre-ecmwf
* **Skills required:**
  * Machine Learning
  * Time-series analysis
  * Handling geospatial data
  * Basic understanding of air quality observations and models 
  * Python

<br>

> <b> *Note: Challenge is funded by Copernicus. Only nationals from the European Union and ECMWF Member States are eligible to apply (see [Terms and Conditions](https://esowc.ecmwf.int/terms-and-conditions)).* </b>
<hr>


### Challenge description
CAMS/ECMWF runs a computer model to predict global air pollution at a spatial resolution of about 40 km x 40 km (the grid-box size). While the CAMS model predicts the observed air quality reasonably well for the most part, errors in the prediction can occur because of necessary simplifications in the CAMS model and uncertainties in input data such as emissions.
The main task is to develop an ML approach that predicts the forecast errors at the locations of air quality stations so that they can be corrected in a post-processing step. The observations to be used are hourly measurements of surface ozone (O3), NO2 and PM2.5 from about 2000 stations worldwide, as provided in the OpenAQ data repository.
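
As a rough illustration of what "forecast error at the location of a station" could mean in practice, the sketch below pairs a station with its nearest model grid cell and defines the error as forecast minus observation. The grid layout (a regular 0.4 degree latitude/longitude grid starting at 90N and 0E) and the function names `nearest_grid_index` and `forecast_error` are assumptions made for this example, not a description of the actual CAMS data format.

```python
def nearest_grid_index(lat, lon, grid_step=0.4):
    """Return (row, col) of the grid cell nearest to a station.

    Assumes a hypothetical regular grid: latitudes run from 90 down
    to -90, longitudes run eastward from 0, both spaced by grid_step
    degrees. Real forecast files may use a different layout.
    """
    n_lon = round(360 / grid_step)
    row = round((90.0 - lat) / grid_step)
    # Wrap longitudes so that e.g. -0.4 and 359.6 map to the same column.
    col = round((lon % 360.0) / grid_step) % n_lon
    return row, col


def forecast_error(forecast, observation):
    """Forecast error with the sign convention forecast minus observation."""
    return forecast - observation
```

With this sign convention, a positive error means the model over-predicts the observed concentration, and the corrected forecast is the raw forecast minus the predicted error.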

We suggest the following steps towards the solution:

1) Build ML model
Train the ML model to predict the difference between forecast and station observations using data from a recent previous year (2019 or 2018).
The input to the correction algorithm can be the CAMS model forecast of the air quality value (O3, NO2, PM2.5) and forecast meteorological parameters (temperature, wind speed, etc.).

2) Test performance of ML bias prediction with independent data
Test the performance of the ML model for recent forecasts of the 2020-2021 period. Use basic error statistics such as bias, RMSE and correlation to compare the forecast accuracy of the ML-corrected forecast against the uncorrected forecast.

3) Model error analysis (optional)
Investigate the importance of the individual predictors and do a spatial analysis to identify patterns that could be used to better understand or improve the forecast model.
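
Steps 1 and 2 above can be sketched end to end in a few lines. The example below is only a minimal illustration: it uses a one-predictor least-squares fit as a stand-in for a real ML model, the forecast value as the only input, and made-up numbers instead of CAMS or OpenAQ data. The helper names `fit_linear` and `scores` are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fit_linear(x, y):
    """Least-squares fit y ~ a + b * x (stand-in for a real ML model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def scores(forecast, observed):
    """Bias, RMSE and Pearson correlation of a forecast against observations."""
    errors = [f - o for f, o in zip(forecast, observed)]
    mf, mo = mean(forecast), mean(observed)
    cov = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed))
    var_f = sum((f - mf) ** 2 for f in forecast)
    var_o = sum((o - mo) ** 2 for o in observed)
    return {
        "bias": mean(errors),
        "rmse": math.sqrt(mean([e * e for e in errors])),
        "correlation": cov / math.sqrt(var_f * var_o),
    }


# Made-up training data (step 1): forecast and observed values at one station.
train_fc = [10.0, 20.0, 30.0, 40.0]
train_obs = [8.0, 17.0, 26.0, 35.0]
train_err = [f - o for f, o in zip(train_fc, train_obs)]
a, b = fit_linear(train_fc, train_err)

# Made-up independent test data (step 2).
test_fc = [15.0, 25.0, 35.0]
test_obs = [12.5, 21.5, 30.5]

# Subtract the predicted error from the raw forecast.
corrected = [f - (a + b * f) for f in test_fc]

raw_scores = scores(test_fc, test_obs)
corrected_scores = scores(corrected, test_obs)
print("uncorrected:", raw_scores)
print("corrected:  ", corrected_scores)
```

In a real solution the linear fit would be replaced by the chosen ML model with additional predictors such as temperature and wind speed, and the scores would be computed per station and per pollutant over the 2020-2021 test period.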

<br>

<img src='https://user-images.githubusercontent.com/8168920/112647836-35205400-8e49-11eb-96ff-c220bd0969ab.jpg' alt='ESoWC' align='center' width='80%'></img>

