A brief tutorial on random forest applied to classification and regression problems, using a sample of central galaxies in the Sloan Digital Sky Survey

joanna-pk/random-forest-in-SDSS

Random Forest Tutorial with astronomical data

This tutorial contains two Jupyter notebooks in the notebooks directory, one covering a classification problem and one covering a regression problem.

Both notebooks demonstrate a simple application of the random forest algorithm to relevant problems in astronomy, using the scikit-learn machine learning package for Python. The classification notebook describes each step in more detail and links to the relevant scikit-learn documentation.
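For readers who want a feel for the workflow before opening the notebooks, here is a minimal sketch of fitting a random forest classifier and regressor with scikit-learn. It is not taken from the notebooks: the data are synthetic stand-ins rather than the SDSS sample, and the feature layout, target definitions and hyperparameters are assumptions chosen purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the SDSS catalogue): 1000 objects, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))

# Hypothetical targets: a binary label and a continuous quantity.
# The third feature is deliberately irrelevant to both.
y_class = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
y_reg = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

# Hold out a quarter of the sample to evaluate on unseen data.
X_train, X_test, yc_train, yc_test, yr_train, yr_test = train_test_split(
    X, y_class, y_reg, test_size=0.25, random_state=0
)

# Classification: score() returns the mean accuracy on the test set.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, yc_train)
accuracy = clf.score(X_test, yc_test)

# Regression: score() returns the coefficient of determination (R^2).
reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X_train, yr_train)
r2 = reg.score(X_test, yr_test)

# Impurity-based feature importances, normalised to sum to 1.
importances = clf.feature_importances_

print(f"accuracy = {accuracy:.3f}, R^2 = {r2:.3f}")
print("feature importances:", importances)
```

In this toy setup the irrelevant third feature should receive the lowest importance, which is the kind of diagnostic that makes random forests attractive for exploring which parameters drive a given outcome.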


The DATA directory contains a single text file with 6 entries for each of the ~200 000 central galaxies observed by the Sloan Digital Sky Survey (SDSS). The structure of the file is explained in detail in the classification notebook.

The compiled data uses the following publicly available catalogues:


Finally, the scripts directory contains a single script, support_tools, which defines my favourite configure_plots function.


I hope you enjoy these two notebooks and find them useful. If you have any questions, comments or requests, please feel free to reach out to me directly!


I would like to thank my supervisor Asa Bluck for introducing me to the joy of machine learning and inspiring this tutorial.
