GitHub - UW-MLGEO/2026MLGeo-CloudsTeam

Data information

Accessing our data

We downloaded the data from the download site linked in the TorNet GitHub repository. The full dataset is already on Sofia's hard drive. It totals around 100 GB, so we are currently uploading it to an Amazon server under Sofia's name; the upload should finish by the end of the week (02/13). We have tested that everyone in our group can access it through our Python notebook.

Our project questions as a reminder

1. How do we use supervised machine learning to identify tornado hook echoes in radar imagery? Inputs: PPIs (plan position indicator scans) of radar reflectivity and radial velocity, each labeled as containing a hook or not. Outputs: a binary classification of whether a storm event has a hook echo.
2. Can we use unsupervised machine learning on storms with identified hook echoes to determine the pre-existing conditions of tornadoes and predict whether a tornado will occur? Inputs: PPIs of radar reflectivity and radial velocity from the times before a hook echo develops (determined from the output of part 1). Outputs: pre-tornadic conditions, plus a binary classification of whether a tornado will occur based on a storm's structure.

How we are preparing our training data

Our data already comes split into training and testing sets and includes a tornado category, so we can select the images we need fairly easily. From the 13,000 tornadic events in our data, we will take a random subsample. We will keep our classes balanced to prevent bias, i.e., 50 samples of tornadoes with hooks, 50 samples of tornadoes with no hooks, and 50 non-tornadic cells. To get there, we will choose about 50 events from each category and label each frame within those events with one of the categories above. For our classification model, we will then randomly subset the labeled frames to get the 50 frames we need for each category. This will be the training dataset for the model. The output of this model will be the input to our unsupervised predictive model.
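
As a rough illustration of the balanced subsampling step, here is a minimal Python sketch. It is not our notebook code: the category names, the `frame_id` field, the `balanced_subsample` helper, and the toy frame list are all hypothetical placeholders, and the real labels will come from our own frame-by-frame labeling.

```python
import random
from collections import defaultdict

# Hypothetical category names; the real labels come from our own hand-labeling.
CATEGORIES = ("tornado_hook", "tornado_no_hook", "non_tornadic")


def balanced_subsample(items, per_class=50, seed=0):
    """Randomly pick exactly `per_class` items from each category.

    `items` is a list of dicts that each carry a "category" key.
    Sampling is without replacement and reproducible for a given seed.
    """
    rng = random.Random(seed)

    # Group the items by their category label.
    by_category = defaultdict(list)
    for item in items:
        by_category[item["category"]].append(item)

    sample = {}
    for category in sorted(by_category):
        members = by_category[category]
        if len(members) < per_class:
            raise ValueError(
                f"only {len(members)} items in {category!r}, need {per_class}"
            )
        sample[category] = rng.sample(members, per_class)
    return sample


# Toy stand-in for labeled frames: 200 fake frames per category.
toy_frames = [
    {"frame_id": f"{category}_{i}", "category": category}
    for category in CATEGORIES
    for i in range(200)
]

subset = balanced_subsample(toy_frames, per_class=50, seed=42)
for category, frames in subset.items():
    print(category, len(frames))
```

Fixing the seed keeps the subsample reproducible across group members, and raising an error when a category is too small avoids silently ending up with unbalanced classes.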

Resources

None are needed for processing the training/input dataset. We may need to discuss additional resources with Akshay once we reach the stage of choosing and running the ML model.

Files in this repository (31 commits):

.gitignore
MDFormatting.md
NBFormatting.ipynb
README.md
modeling_tornet.ipynb
new_tornet_load.ipynb
radar_labeler.ipynb
radar_labeler_whiteplots.ipynb
tornet_data_load.ipynb
