This project is under active development and subject to change.
This project focuses on estimating photometric redshifts, which are crucial for studying the large-scale structure of the universe and the distribution of galaxies. It employs Gaussian processes as a flexible non-parametric approach to modeling uncertainties in photometric data. The project also integrates other machine learning and data analysis techniques, providing a framework for photometric redshift estimation in which the methods are benchmarked against one another in terms of accuracy and computational time.
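As a rough illustration of the benchmarking idea, the sketch below times three regressors on toy data and records their error. The toy colour-redshift relation, the choice of models, and the RMSE metric are all assumptions made for this example; they are not the project's actual data or benchmark suite.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(42)

# Hypothetical toy data: three "colours" that depend linearly on redshift plus noise.
n = 400
z = rng.uniform(0.0, 1.0, size=n)
X = z[:, None] * np.array([1.0, -0.6, 0.4]) + 0.05 * rng.normal(size=(n, 3))
X_train, X_test, z_train, z_test = X[:300], X[300:], z[:300], z[300:]

# Models chosen only for illustration.
models = {
    "gaussian_process": GaussianProcessRegressor(
        kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0
    ),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "k_nearest_neighbours": KNeighborsRegressor(n_neighbors=10),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, z_train)
    z_pred = model.predict(X_test)
    elapsed = time.perf_counter() - start  # fit + predict wall-clock time
    rmse = float(np.sqrt(np.mean((z_pred - z_test) ** 2)))
    results[name] = {"rmse": rmse, "seconds": elapsed}

for name, r in results.items():
    print(f"{name:>22s}  RMSE={r['rmse']:.4f}  time={r['seconds']:.3f}s")
```

Timing fit and predict together keeps the comparison simple; for larger datasets it may be more informative to time the two steps separately.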
The project first tests existing methods, such as Gaussian processes, for estimating photometric redshifts and masses on a dataset with known redshifts, such as Stripe 82X, in order to validate and benchmark the approach. The results will then be reproduced using the older dataset to check that they are consistent with published data.
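Validation against known redshifts is commonly summarized with a few statistics of the normalized residual (z_phot - z_spec) / (1 + z_spec). The sketch below computes the median bias, the normalized median absolute deviation, and the outlier fraction. The function name `photoz_metrics`, the 0.15 outlier threshold, and the sample values are assumptions for illustration; the project may use different metrics.

```python
import numpy as np


def photoz_metrics(z_spec, z_phot, outlier_threshold=0.15):
    """Summary statistics for photometric vs. spectroscopic redshifts.

    Hypothetical helper: the metric set and threshold are conventional
    choices, not taken from this project.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)

    # Normalized residual for each object.
    dz = (z_phot - z_spec) / (1.0 + z_spec)

    bias = np.median(dz)
    # Normalized median absolute deviation, a robust scatter estimate.
    sigma_nmad = 1.4826 * np.median(np.abs(dz - bias))
    # Fraction of objects whose normalized residual exceeds the threshold.
    outlier_fraction = np.mean(np.abs(dz) > outlier_threshold)

    return {
        "bias": float(bias),
        "sigma_nmad": float(sigma_nmad),
        "outlier_fraction": float(outlier_fraction),
    }


# Made-up values, with one deliberately bad estimate at the end.
z_spec = np.array([0.10, 0.50, 1.00, 1.50, 2.00])
z_phot = np.array([0.12, 0.48, 1.05, 1.45, 3.00])
metrics = photoz_metrics(z_spec, z_phot)
print(metrics)
```

Median-based statistics are used here because a small number of catastrophic outliers would otherwise dominate a mean or standard deviation.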
Once validated, the methodology will be adapted and applied to other wide X-ray fields with incomplete redshift data, such as XMM XXL, while addressing challenges posed by inhomogeneous data coverage. The performance of the approach will be evaluated across datasets with varying depths and completeness to optimize its reliability for diverse datasets.
Finally, the methodology will be scaled for fields with no redshifts, enabling broader application in X-ray AGN studies and mass estimation while leveraging advanced machine learning techniques.
- Gaussian Process Regression: Uses Gaussian processes to estimate redshifts together with a quantifiable measure of uncertainty
- Data Handling and Preprocessing: Tools for cleaning and preparing synthetic datasets based on the Sloan Digital Sky Survey (SDSS)
- Advanced Data Analysis: Combines Gaussian processes with other statistical and machine learning techniques to enhance predictive power
- Visualization Tools: Includes tools for visualizing redshift distributions, error margins, and overall model performance
- Thorough Documentation: Detailed explanations and example notebooks for easy understanding and reproducibility
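To make the Gaussian process feature concrete, here is a minimal sketch using scikit-learn's `GaussianProcessRegressor`, which returns a predictive standard deviation alongside each redshift estimate. The synthetic colours, the kernel, and the train/test split are assumptions for this example and do not reflect the project's notebooks or its SDSS-based data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical synthetic sample: four colours as features, redshift as target.
n = 300
z_true = rng.uniform(0.0, 1.0, size=n)
slopes = np.array([1.2, 0.8, -0.5, 0.3])  # toy colour-redshift relation
colours = z_true[:, None] * slopes + 0.05 * rng.normal(size=(n, 4))

X_train, X_test = colours[:200], colours[200:]
z_train, z_test = z_true[:200], z_true[200:]

# Anisotropic RBF (one length scale per colour) plus a white-noise term.
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(4)) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, z_train)

# Predictive mean and standard deviation for each test object.
z_pred, z_std = gpr.predict(X_test, return_std=True)

for zp, zs, zt in zip(z_pred[:5], z_std[:5], z_test[:5]):
    print(f"z_phot = {zp:.3f} +/- {zs:.3f}   (true z = {zt:.3f})")
```

Exact Gaussian process regression scales cubically with the number of training objects, so for large catalogues a sparse or variational approximation (for example in GPflow) may be more practical.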
Python Version: Python 3.8+
Key Packages:
| Category | Packages |
|---|---|
| Core Scientific Libraries | NumPy, Pandas, SciPy |
| Machine Learning & Statistical Modeling | Scikit-Learn, GPflow |
| Visualization Tools | Matplotlib, Seaborn |
| Astronomy-Specific Tools | Astroquery, Astropy |
| Deep Learning (Optional) | TensorFlow |
| Utilities | tqdm, h5py |
Clone the repository and install the dependencies from its root directory:

- `git clone https://github.com/Adrita-Khan/AstroPhotoZ.git`
- `pip install -r requirements.txt`

To help you get started with the project, you can run the following example notebooks:
| Notebook | Description |
|---|---|
| `Photometric_Redshift_Dataset_Exploration.ipynb` | An exploratory analysis of the photometric redshift dataset to understand underlying patterns and features |
| `Synthetic_Photometric_Redshift_Predictor.ipynb` | A step-by-step guide to predicting photometric redshifts using synthetic data |
| `Sample_Galaxy_Redshift_Prediction_py.ipynb` | An example notebook for predicting galaxy redshifts using real data |
Follow the notebooks to apply Gaussian processes and other data analysis techniques to photometric data. Hyperparameters and methods can be adjusted to suit specific research requirements. Notebooks and scripts will be updated and shared as the work progresses.
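One way to adjust the method is to change the Gaussian process kernel. The sketch below fits several candidate kernels with scikit-learn and compares their optimized log marginal likelihoods. The toy data, the candidate list, and the selection criterion are assumptions for illustration, not settings taken from the notebooks.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic, WhiteKernel

rng = np.random.default_rng(1)

# Hypothetical toy data: two features with a smooth, mildly nonlinear target.
X = rng.uniform(-1.0, 1.0, size=(150, 2))
z = 0.5 + 0.3 * X[:, 0] - 0.2 * X[:, 1] ** 2 + 0.02 * rng.normal(size=150)

# Candidate kernels, each with a white-noise term for observational noise.
candidate_kernels = {
    "rbf": RBF() + WhiteKernel(),
    "matern_1.5": Matern(nu=1.5) + WhiteKernel(),
    "rational_quadratic": RationalQuadratic() + WhiteKernel(),
}

scores = {}
for name, kernel in candidate_kernels.items():
    gpr = GaussianProcessRegressor(
        kernel=kernel,
        normalize_y=True,
        n_restarts_optimizer=2,  # extra optimizer restarts to reduce local-optimum risk
        random_state=0,
    )
    gpr.fit(X, z)
    # Log marginal likelihood at the optimized hyperparameters.
    scores[name] = gpr.log_marginal_likelihood_value_

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name:>20s}  log marginal likelihood = {score:.2f}")
print("highest:", best)
```

The log marginal likelihood is computed on the training data only, so held-out validation is still a useful check before settling on a kernel.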
Contributions are welcome! Please feel free to open issues, suggest improvements, or submit pull requests.
This project is licensed under the MIT License.
| Resource | Description |
|---|---|
| Scikit-Learn Astronomy Regression Tutorial | Regression tutorial for astronomy applications |
| Photo-z Regression Demo - Mofokeng Chaka | Classification and photo-z regression demonstration |
| Multi-Wavelength Classification and Regression | Multi-wavelength approach to classification and regression |
| PhotoZ_SDSS by Tasos Theodoropoulos | SDSS photo-z estimation implementation |
| TITAN Project - PhotoZ SDSS ML | Machine learning approaches for SDSS photo-z |
| Photometric Redshifts - Martian Side of the Moon | Photometric redshift estimation project |
| Photometric Redshift Estimation - Amber | Machine learning for photometric redshifts |
| Photometric Redshift Estimation by Qbeer | Comprehensive photo-z estimation guide |
| MLZ: Machine Learning Redshifts | Machine learning framework for redshift estimation |
| Resource | Description |
|---|---|
| AstroML - Forest Photometric Redshift Estimation | Random forest photo-z estimation examples |
| Photo-z using k-Nearest Neighbors | KNN-based photo-z estimation |
| Compute SDSS PCA | Principal component analysis on SDSS data |
| Resource | Description |
|---|---|
| Photometric Redshift Using Deep Learning - Shreever Shith | Deep learning approaches for photo-z estimation |
| Resource | Description |
|---|---|
| SDSS4 DR16Q Tutorial by Qiaoya Wu | Tutorial for SDSS Data Release 16 quasar catalog |
| SDSS DR8 Data Access | Data access for SDSS DR8 |
| SDSS DR9 Photo-z Algorithms | Photo-z algorithms documentation for DR9 |
| SDSS DR17 Photo-z Algorithms | Photo-z algorithms documentation for DR17 |
| sdss Python Package | Python package for SDSS data access |
| Astroquery SDSS Documentation | Astroquery module for SDSS queries |
| Astroquery SDSS API | API documentation for SDSS queries |
| SDSS DR14 SkyServer SQL Search | SQL search interface for DR14 |
| SDSS DR18 SkyServer SQL Search | SQL search interface for DR18 |
| Resource | Description |
|---|---|
| The Dark Energy Survey Data Management System | Data management insights from the Dark Energy Survey |
| Understanding Redshift - Sky at Night Magazine | Comprehensive explanation of redshift in astronomy |
| Hubble Law Introduction | Introduction to Hubble's Law and cosmology |
| Gaussian Process Regression Tutorial | Tutorial on Gaussian process regression |
| ArXiv Paper: Photo-z Methods | Research paper on photometric redshift methods |


