VipinMI2024/MolecularExplorer

3D Molecule-Text Interpretation Project

The 3D Molecule-Text Interpretation Project is an interactive web application built with Streamlit. It takes a natural language query or a PubChem Compound ID (CID), renders the matching molecule as an interactive 3D structure, and retrieves its chemical properties. The app uses PubChem's API, RDKit, and py3Dmol, so users can look up a molecule with text such as "Acetylsalicylic acid" or a numeric CID such as "2244". The project combines natural language processing (spaCy) with cheminformatics tooling.


✨ Features

  • Text Query Interpretation: Parse compound names from natural language inputs using spaCy.
  • CID Support: Directly query molecules using PubChem Compound IDs.
  • 3D Visualization: Render interactive 3D molecular structures with atom labels using py3Dmol.
  • Chemical Properties: Retrieve properties like molecular weight, XLogP, complexity, and hydrogen bond counts from PubChem.
  • Dark Theme UI: Dark-themed interface for better visibility.
  • Robust Error Handling: Graceful handling of invalid inputs, API errors, and visualization issues.
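As a rough illustration of the name and CID lookups listed above, the sketch below builds a PubChem PUG REST property URL using only the standard library. It is not taken from the app's source: the helper name `build_property_url` and the `DEFAULT_PROPERTIES` tuple are hypothetical, and the actual application may query PubChem differently (for example, through a client library).

```python
from urllib.parse import quote

# Base URL of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Assumption: a property selection matching the ones this README mentions.
DEFAULT_PROPERTIES = (
    "IUPACName",
    "MolecularWeight",
    "ExactMass",
    "XLogP",
    "Complexity",
    "RotatableBondCount",
    "HBondDonorCount",
    "HBondAcceptorCount",
)


def build_property_url(identifier, namespace="name", properties=DEFAULT_PROPERTIES):
    """Return a PUG REST URL requesting `properties` as JSON.

    `namespace` is "name" for compound names or "cid" for PubChem CIDs.
    (Hypothetical helper, shown for illustration only.)
    """
    if namespace not in ("name", "cid"):
        raise ValueError(f"unsupported namespace: {namespace!r}")
    # Percent-encode the identifier so names containing spaces are URL-safe.
    encoded = quote(str(identifier).strip(), safe="")
    property_list = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/{namespace}/{encoded}/property/{property_list}/JSON"


# Example: the same molecule addressed by name and by CID.
by_name = build_property_url("Acetylsalicylic acid", properties=("MolecularWeight",))
by_cid = build_property_url(2244, namespace="cid", properties=("MolecularWeight", "XLogP"))
```

Fetching either URL with an HTTP client should return a JSON document containing a property table for the compound, provided the name or CID resolves.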

🚀 Installation

# Create a virtual environment (recommended)
python -m venv venv
# Activate the environment
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download spaCy language model
python -m spacy download en_core_web_sm
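
After installing, a quick check like the one below can confirm that the main packages are importable. This is only a sketch: the import names in `ASSUMED_IMPORTS` are inferred from the libraries this README mentions, not read from `requirements.txt`, and `missing_packages` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumption: top-level import names for the libraries named in this README.
ASSUMED_IMPORTS = ("streamlit", "rdkit", "py3Dmol", "spacy", "en_core_web_sm")


def missing_packages(import_names=ASSUMED_IMPORTS):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


# Usage: an empty list means every assumed dependency was found.
# print(missing_packages())
```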

🌐 Usage

# Run the Streamlit app
streamlit run molecular_explorer.py
  • Open the app in your browser (usually http://localhost:8501)

  • Select input type: "Compound Name" or "CID"

  • Enter a query (e.g., "Acetylsalicylic acid" or "2244")

  • Click Visualize to display:

    • 3D molecular structure with atom labels
    • IUPAC name and SMILES string
    • Chemical properties (MW, XLogP, etc.)
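
The input step above can be pictured as a small validation function that runs before any lookup. The sketch below assumes the two input types shown in the UI ("Compound Name" and "CID"); the function name `validate_query` and its error messages are hypothetical and not taken from `molecular_explorer.py`.

```python
def validate_query(input_type, raw_query):
    """Clean a user query according to the selected input type.

    Returns the stripped name for "Compound Name" or a positive int for "CID".
    Raises ValueError for empty or malformed input.
    (Hypothetical helper, shown for illustration only.)
    """
    text = raw_query.strip()
    if not text:
        raise ValueError("Please enter a compound name or CID.")

    if input_type == "Compound Name":
        return text

    if input_type == "CID":
        # isascii() guards against Unicode digits that int() cannot parse.
        if not (text.isascii() and text.isdigit()):
            raise ValueError("A CID must contain digits only.")
        cid = int(text)
        if cid <= 0:
            raise ValueError("A CID must be a positive integer.")
        return cid

    raise ValueError(f"Unknown input type: {input_type!r}")


name_query = validate_query("Compound Name", "  Acetylsalicylic acid ")
cid_query = validate_query("CID", "2244")
```

Rejecting malformed CIDs up front avoids a round trip to PubChem for input that cannot succeed.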

🔍 Example

Input (either of the following):

  • Compound Name: "Acetylsalicylic acid"
  • CID: "2244"

Output:

  • Interactive 3D molecular structure
  • IUPAC Name, SMILES string
  • Molecular Weight, Exact Mass, XLogP, Complexity
  • Rotatable Bond Count, H-Bond Donors/Acceptors
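
To show how the listed properties might be pulled out of a PubChem response, here is a minimal sketch that parses a PUG REST style property table. The `SAMPLE_RESPONSE` string is hand-written for illustration (it is a trimmed stand-in, not a captured API response), and `LABELS` and `extract_properties` are hypothetical names.

```python
import json

# Hand-written, trimmed sample in the shape of a PUG REST property response.
SAMPLE_RESPONSE = """
{"PropertyTable": {"Properties": [
  {"CID": 2244,
   "MolecularWeight": "180.16",
   "HBondDonorCount": 1,
   "HBondAcceptorCount": 4,
   "RotatableBondCount": 3}
]}}
"""

# Assumption: display labels for a subset of PubChem property keys.
LABELS = {
    "MolecularWeight": "Molecular Weight",
    "RotatableBondCount": "Rotatable Bond Count",
    "HBondDonorCount": "H-Bond Donors",
    "HBondAcceptorCount": "H-Bond Acceptors",
}


def extract_properties(payload):
    """Map the first record of a property table to display labels.

    Keys missing from the record are skipped rather than treated as errors,
    since not every compound has every property.
    """
    records = payload.get("PropertyTable", {}).get("Properties", [])
    if not records:
        raise LookupError("response contains no property records")
    record = records[0]
    return {label: record[key] for key, label in LABELS.items() if key in record}


properties = extract_properties(json.loads(SAMPLE_RESPONSE))
for label, value in properties.items():
    print(f"{label}: {value}")
```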

🙌 Contributing

We welcome contributions!

  1. Fork the repository

  2. Create a feature branch:

    git checkout -b feature/your-feature-name
  3. Commit your changes:

    git commit -m "Add your feature"
  4. Push to your branch:

    git push origin feature/your-feature-name
  5. Open a Pull Request with a clear description


📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


📢 Contact


📊 Acknowledgments

  • PubChem for providing chemical data APIs
  • RDKit for cheminformatics tools
  • py3Dmol for molecular visualization
  • spaCy for NLP
  • Streamlit for web deployment

Built with passion by Vipin Mishra
