This repository contains code used for my project: Accelerating Search for Superconductors using Machine Learning.
- Python 3.10+ is required. You can check your Python version using:
python3 --version
- Clone this repository to your local machine:
git clone https://github.com/adigasuhas/Accelerating-Search-for-Superconductors-using-Machine-Learning.git
- Download the trained models from Google Drive: Click Here
- Rename the unzipped folder to `Machine_Learning_Models`.
- Move the `Machine_Learning_Models` folder into the cloned repository:
mv Machine_Learning_Models Accelerating-Search-for-Superconductors-using-Machine-Learning/
(OR)
To reproduce the results reported in the manuscript, use the trainer function provided in the Training folder.
A convenient bash wrapper script Trainer.sh is also included to execute training with minimal commands.
If the script does not have execute permission, grant it and run it as follows:
cd Accelerating-Search-for-Superconductors-using-Machine-Learning
chmod u+x Trainer.sh
./Trainer.sh
- Command Line Options:
  - `--save`: Saves the trained classification and regression models as joblib files in the `Machine_Learning_Models` directory.
  - `--model30`: Trains a random forest regression model using the 30 QSD-inspired descriptors used in the manuscript.
  - `--model5`: Trains a random forest regression model using the 5 key features identified through SHAP analysis.
- The full training source code can be found in `Training/Trainer.py`.
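The command-line options listed above could be parsed along the following lines. This is only a minimal sketch: it assumes all three options are simple on/off switches, and `build_parser` is a hypothetical helper name. The actual parsing in `Training/Trainer.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three documented options."""
    parser = argparse.ArgumentParser(
        description="Illustrative option parsing for a training script."
    )
    # Assumption: each option is a boolean switch that takes no value.
    parser.add_argument("--save", action="store_true",
                        help="save the trained models as joblib files")
    parser.add_argument("--model30", action="store_true",
                        help="train the 30-descriptor regression model")
    parser.add_argument("--model5", action="store_true",
                        help="train the 5-feature regression model")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# behaves the same wherever it is run.
args = build_parser().parse_args(["--save", "--model30"])
print(f"save={args.save}, model30={args.model30}, model5={args.model5}")
```

Using `store_true` switches lets the options be combined freely, for example saving while training only one of the two regression models.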
Note: If you retrain the models yourself, minor numerical variations in predictions may occur. This happens because the dataset is randomly split into training and test sets each time, which slightly changes the fitted parameters even though all hyperparameters remain the same.
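To illustrate why a random split changes results between runs, the sketch below shuffles sample indices with and without a fixed seed. It uses only the Python standard library; `split_indices` is a hypothetical helper and is not taken from `Trainer.py`. If the training code relies on a utility such as scikit-learn's `train_test_split`, its `random_state` parameter plays the same role, but whether the trainer exposes it is not stated here.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=None):
    """Shuffle indices 0..n_samples-1 and return (train, test) index lists."""
    rng = random.Random(seed)  # seed=None gives a different shuffle per run
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


# Two calls with the same seed produce the same split.
train_a, test_a = split_indices(10, seed=0)
train_b, test_b = split_indices(10, seed=0)
print(test_a == test_b)

# Without a seed, the test set will usually differ from run to run,
# so a model fitted on the training part differs slightly as well.
train_c, test_c = split_indices(10)
print(sorted(train_c + test_c) == list(range(10)))
```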
- Navigate to the directory containing the $T_c$ prediction scripts:
cd Accelerating-Search-for-Superconductors-using-Machine-Learning/Temp_Predictor/
- Open the file named `Material_prediction.csv`.
- Fill in the columns as follows:
  - `Material-ID` (optional): For your own reference and identification. It is not auto-filled by the code, so you may enter any identifier (e.g., a unique identification number or a sample name).
  - `Chemical_Formula`: The materials (chemical compositions) for which you wish to predict the critical temperature ($T_c$).
  - `Temp_critical` (optional): If known, enter the experimental $T_c$ here for comparison; otherwise, leave it blank.
- Follow the formatting rules illustrated in the reference image provided in the repository to ensure that your chemical composition input is valid.
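If you prefer to fill in the input file programmatically, a sketch along these lines may help. It assumes the file contains exactly the three columns described above, in that order, and the formula strings are placeholders: check them against the reference image before use. `write_input_rows` is a hypothetical helper, and the example writes to an in-memory buffer rather than overwriting `Material_prediction.csv`.

```python
import csv
import io

# Column names as described above; the order is an assumption.
COLUMNS = ["Material-ID", "Chemical_Formula", "Temp_critical"]


def write_input_rows(handle, rows):
    """Write a header plus one CSV line per row dictionary to `handle`."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    # Optional fields filled in: an identifier and a known experimental T_c (K).
    {"Material-ID": "sample-001", "Chemical_Formula": "MgB2", "Temp_critical": "39"},
    # Optional fields left blank.
    {"Material-ID": "", "Chemical_Formula": "Nb3Sn", "Temp_critical": ""},
]

buffer = io.StringIO()
write_input_rows(buffer, rows)
text = buffer.getvalue()
print(text)
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` instead of the buffer.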
- After entering the `Chemical_Formula` into `Material_prediction.csv`, you can check whether the compound is present in the `SuperCon-MTG` dataset by running the following command. If the compound is found, a warning is displayed together with its material ID and critical temperature.
cd Temp_Predictor
python3 Compound_Matcher.py
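One way to think about such a check is to compare compositions rather than raw strings, so that the same compound written in a different element order still matches. The sketch below is an assumption-laden illustration and not the logic of `Compound_Matcher.py`, which is not described here. `parse_formula`, `same_composition`, and the one-entry `known` dictionary are hypothetical, and the parser handles only flat formulas without parentheses.

```python
import re
from collections import Counter

# An element symbol followed by an optional integer or decimal amount.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Parse a flat formula such as 'YBa2Cu3O7' into {element: amount}."""
    counts = Counter()
    for element, amount in TOKEN.findall(formula):
        counts[element] += float(amount) if amount else 1.0
    return dict(counts)


def same_composition(formula_a, formula_b):
    """True if both formulas contain the same elements in the same amounts."""
    return parse_formula(formula_a) == parse_formula(formula_b)


# Hypothetical stand-in for a dataset lookup: formula -> critical temperature (K).
known = {"MgB2": 39.0}

query = "B2Mg"  # same compound, different element order
matches = [(f, tc) for f, tc in known.items() if same_composition(f, query)]
print(matches)
```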
- Once you have confirmed that the compound is not present in the `SuperCon-MTG` dataset, you can generate the descriptors and predict $T_c$. To do this, run the `Predict.sh` script, which generates the descriptors and predicts $T_c$ using both the 30-feature and 5-feature models. If the script does not have execute permission, grant it first (skip this step if already granted):
chmod u+x Predict.sh
./Predict.sh
- Upon successful execution, a `Material_prediction_results.csv` file is generated inside the `Temp_Predictor` folder. This file contains the following columns:
  - `Material-ID`
  - `Chemical_Formula`
  - `Temp_critical`
  - `Predicted_class`
  - `Predicted_Temp_critical_30_Features`
  - `Predicted_Temp_critical_5_Features`
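When experimental values were supplied in `Temp_critical`, the results file can be used to compare predictions against them. The following is a minimal sketch using the column names listed above; `absolute_errors` is a hypothetical helper, and the sample rows are made-up numbers used purely to exercise the code, not model outputs. The format of `Predicted_class` is not specified here, so the sketch does not interpret it.

```python
import csv
import io


def absolute_errors(handle, predicted_column="Predicted_Temp_critical_30_Features"):
    """Return (Chemical_Formula, |predicted - experimental|) pairs.

    Rows without an experimental or predicted value are skipped.
    """
    errors = []
    for row in csv.DictReader(handle):
        experimental = (row.get("Temp_critical") or "").strip()
        predicted = (row.get(predicted_column) or "").strip()
        if not experimental or not predicted:
            continue
        errors.append(
            (row["Chemical_Formula"], abs(float(predicted) - float(experimental)))
        )
    return errors


# Placeholder content with invented values, standing in for the real file.
sample = io.StringIO(
    "Material-ID,Chemical_Formula,Temp_critical,Predicted_class,"
    "Predicted_Temp_critical_30_Features,Predicted_Temp_critical_5_Features\n"
    "s1,MgB2,39,placeholder,35.0,30.0\n"
    "s2,Nb3Sn,,placeholder,17.0,16.0\n"
)

errors = absolute_errors(sample)
print(errors)
```

For the real file, replace `sample` with `open("Material_prediction_results.csv", newline="")`, and pass `predicted_column="Predicted_Temp_critical_5_Features"` to evaluate the 5-feature model instead.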
This work was made possible by the insights and datasets provided by the following sources, which served as key inspirations:
- Original Dataset:
  - Center for Basic Research on Materials. MDR SuperCon Datasheet Ver.240322. National Institute for Materials Science, 2024. DOI: 10.48505/NIMS.4487
- Descriptors:
  - Rabe, K. M., Phillips, J. C., Villars, P., & Brown, I. D. (1992). Global multinary structural chemistry of stable quasicrystals, high-$T_c$ ferroelectrics, and high-$T_c$ superconductors. Phys. Rev. B, 45(14), 7650–7676. DOI: 10.1103/PhysRevB.45.7650
- Composition Generation:
  - Davies, D., Butler, K., Jackson, A., Skelton, J., Morita, K., & Walsh, A. (2019). SMACT: Semiconducting Materials by Analogy and Chemical Theory. Journal of Open Source Software, 4(38), 1361. DOI: 10.21105/joss.01361
- Machine Learning-Based Work:
  - Stanev, V., Oses, C., Kusne, A. G., Rodriguez, E., Paglione, J., Curtarolo, S., & Takeuchi, I. (2018). Machine learning modeling of superconducting critical temperature. npj Computational Materials, 4(1). DOI: 10.1038/s41524-018-0085-8
If you would like to cite our work, please use the following BibTeX entry:
@misc{Adiga2025,
title={Accelerating the Search for Superconductors Using Machine Learning},
author={Suhas Adiga and Umesh V. Waghmare},
year={2025},
eprint={2505.11964},
archivePrefix={arXiv},
primaryClass={cond-mat.supr-con},
url={https://arxiv.org/abs/2505.11964},
}


