DENSE SENSE : A novel approach utilizing electron density augmented machine learning paradigm to understand complex odour landscape

CSIO-FPIL/Dense-Sense-ODOR

DENSE SENSE: A Novel Approach Utilizing an Electron Density Augmented Machine Learning Paradigm to Understand a Complex Odour Landscape

To date, only one experimentally determined structure of a human odour receptor has been deposited in the PDB (8F76), and it was obtained via cryo-EM. This dearth of structures limits structure-based analysis of odour receptors and motivates ligand-based ML approaches that predict the odorant properties of molecules, giving us insight into the world of olfaction. The current state-of-the-art model, a graph neural network ensemble by Lee et al., has a 5-fold cross-validation score of 0.89. In this work we synergistically combine quantum mechanics (QM) with graph neural networks to obtain an improved model. Our findings underscore the potential of this methodology for predicting odour perception directly from QM data, offering a novel machine learning approach to understanding olfaction.

Getting started

Creating environment:
conda create --name my_env --file requirements.txt

Dataset for Training

The data files that are used for training are as follows:
DMPNNpruned_without hydrogen_curated_GS_LF_merged_4812_QM_cleaned.csv
curated_GS_LF_merged_4983.csv
DMPNNpruned_graph_data_cleaned.npz
graph_data_cleaned.npz

The PyTorch data files listed above were created with the Pyg_data_creator_for_cleaned.ipynb notebook.

Models

We used several graph neural network architectures: message passing neural networks (MPNN), directed message passing neural networks (DMPNN), and graph convolutional neural networks (GCN).

The results of the QNN model are impressive on this challenging odour label classification task, given that the model was provided only with electron localization and delocalization data. The DMPNN + LDM model gives the best results among all the GNNs, achieving a validation score of 0.871, and is competitive with openPOM.
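This README refers to 5-fold cross-validation for these models. As a rough illustration of how such a split can be produced, here is a minimal sketch. It assumes a plain shuffled split with a fixed seed, which the README does not specify, and the function name `k_fold_indices` is hypothetical and not taken from the repository notebooks.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index splits over n_samples items.

    Assumption: a simple shuffled split; the notebooks may split differently.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        valid = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, valid))
    return splits


# Hypothetical usage with 12 molecules: each one is validated exactly once.
for fold_id, (train, valid) in enumerate(k_fold_indices(12, k=5, seed=0)):
    print(fold_id, len(train), len(valid))
```

Each molecule appears in exactly one validation fold, so the per-fold scores can be averaged into a single cross-validation score.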

We then employed an ensemble approach, combining graph neural networks for improved performance. We first explored combining DMPNN models, as they gave the best results. We tested ensembles of 10 and 30 DMPNNs and aggregated their results by averaging their predictions. The AUROC metric was used to evaluate model performance. We tested two cases: one in which the random seeds were varied and one without random seed variation.
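The averaging step and the AUROC evaluation described above can be sketched as follows. This is a minimal pure-Python illustration, not the code from the repository notebooks: the helper names `average_predictions` and `auroc` and the toy numbers are hypothetical, and it handles a single odour label, whereas the actual task is multi-label.

```python
from typing import List, Sequence


def average_predictions(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-molecule probabilities across ensemble members."""
    n_members = len(member_probs)
    n_samples = len(member_probs[0])
    return [
        sum(member[i] for member in member_probs) / n_members
        for i in range(n_samples)
    ]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for one odour label from three ensemble members.
labels = [1, 0, 1, 0]
member_probs = [
    [0.9, 0.6, 0.4, 0.2],
    [0.7, 0.3, 0.6, 0.4],
    [0.8, 0.5, 0.7, 0.1],
]
ensemble_probs = average_predictions(member_probs)
print("first member AUROC:", auroc(labels, member_probs[0]))
print("ensemble AUROC:", auroc(labels, ensemble_probs))
```

In this toy example the first member misranks one positive/negative pair, while the averaged prediction ranks every pair correctly. For real data a library routine such as scikit-learn's `roc_auc_score` would be the more usual choice; the pairwise version here is only meant to make the metric explicit.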

The notebooks for the 5 Fold Cross Validation for the above models are as follows:

The notebooks for the ensemble of models are as follows:

Explainability

We attempt to elucidate the structure-odour relationship using Integrated Gradients on the molecular graphs. The explanations are generated for the model that uses the DMPNN graph featurizer DMPNNFeaturizer() and LDM data, for a given SMILES string and odour label. The most direct way to validate the explanations is to analyse compounds with functional-group-based odour labels, e.g., ketonic or phenolic. For such labels the expected explanation is straightforward: the functional group is the part of the molecule responsible for its odorous property, so the analysis must highlight the functional group as the odourgenic region of the molecule. Our explainability analysis indeed found this to be true. We restricted this validation to the functional-group odour labels that our model predicted successfully (per-label AUROC > 0.8).

The code is available in the Dense_Sense_Explainability_g1.ipynb notebook.
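As a rough illustration of the Integrated Gradients technique named above, the sketch below attributes the output of a toy model to its input features by averaging gradients along a straight path from a baseline to the input. Everything here is an assumption made for illustration: the toy sigmoid model, its weights, the all-zero baseline, and the function names. The notebook applies the method to molecular graphs with the trained DMPNN model rather than to a hand-written function.

```python
import math
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 200,
) -> List[float]:
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line between the baseline and the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            totals[i] += g
    # Scale the average gradient by the input-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


# Toy stand-in for a trained model: sigmoid of a weighted feature sum.
WEIGHTS = [2.0, -1.0, 0.0]


def model(features: Sequence[float]) -> float:
    z = sum(w * f for w, f in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(features: Sequence[float]) -> List[float]:
    """Analytic gradient of the toy model with respect to its inputs."""
    p = model(features)
    return [w * p * (1.0 - p) for w in WEIGHTS]


x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(model_grad, x, baseline)
print(attributions)
print(sum(attributions), model(x) - model(baseline))
```

The two numbers on the last line should agree closely: Integrated Gradients attributions sum to the difference between the model output at the input and at the baseline (the completeness property). The third feature has zero weight and therefore receives zero attribution, which mirrors the expectation that atoms outside the odour-relevant functional group should receive little attribution.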

Contributors

Pinaki Saha, University of Hertfordshire, UH Biocomputation Group, United Kingdom
Mrityunjay Sharma, CSIR-CSIO, Chandigarh, India
Sarabeshwar Balaji, Indian Institute of Science Education and Research Bhopal (IISERB), India
Aryan Amit Barsainyan, National Institute of Technology Karnataka Surathkal, Karnataka, India
Ritesh Kumar, CSIR-CSIO, Chandigarh, India
Volker Steuber, University of Hertfordshire, UH Biocomputation Group, United Kingdom
Michael Schmuker, Helmholtz-Gemeinschaft, Berlin, Germany

Citing This Work

To cite this work, please use this bibtex entry:

@article{saha2025dense,
  title={DENSE SENSE: A novel approach utilizing an electron density augmented machine learning paradigm to understand a complex odour landscape},
  author={Saha, Pinaki and Sharma, Mrityunjay and Balaji, Sarabeshwar and Barsainyan, Aryan Amit and Kumar, Ritesh and Steuber, Volker and Schmuker, Michael},
  year={2025}
}
