Streamlit application for multi-gene TRAPI query integration with NetworkX clustering and LLM-assisted exploration using Cytoscape.js visualization.

gladstone-institutes/GeneSet_Translator

GeneSet Translator

Explore biomedical knowledge graphs for gene sets via NCATS Translator.

Features

  • Query NCATS Translator APIs to explore gene neighborhoods and disease connections
  • Interactive network visualization with Cytoscape.js
  • Human Protein Atlas integration for gene/protein cell type and tissue expression filtering
  • LLM-assisted summaries with citations (optional, requires API key)
  • Support for custom gene lists or built-in example datasets

How It Works

User Workflow

Installation

Prerequisites

Setup

  1. Clone and install:

    git clone https://github.com/gladstone-institutes/GeneSet_Translator.git
    cd GeneSet_Translator
    poetry install
  2. (Optional) Enable LLM summaries:

    cp .env.example .env
    # Add your Anthropic API key to .env
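
To confirm that the key was actually saved, a small check like the one below can help. This is a minimal sketch, not part of the app: the helper `env_has_key` is hypothetical, and the variable name `ANTHROPIC_API_KEY` is an assumption (it is the name the Anthropic Python SDK reads by default). Check `.env.example` for the name the app really expects.

```python
def env_has_key(env_text: str, name: str = "ANTHROPIC_API_KEY") -> bool:
    """Return True if `name` is assigned a non-empty value in dotenv-style text.

    The default variable name is an assumption; see .env.example for the real one.
    """
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes before testing for emptiness.
        if key.strip() == name and value.strip().strip("'\""):
            return True
    return False


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            print("key set:", env_has_key(handle.read()))
    except FileNotFoundError:
        print("no .env file found; LLM summaries stay disabled")
```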

If you have trouble installing the app dependencies, consider using Docker (instructions below).

Usage

Run the app:

streamlit run app.py

Docker

If you have Docker installed, you can run the app in a container without installing Python or Poetry:

./docker_run.sh

This pulls a pre-built image and runs the app. The script mounts your local .env (if present, for LLM summaries) and data/ folder (for query caching).

Video Tutorial

Coming soon!

Quick Start

  1. Select an example dataset
  2. Choose a query pattern and intermediate node types
  3. Click "Run Query" (takes 3-5 minutes)
  4. Explore results in the Network, Overview, and Summary tabs
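
For readers unfamiliar with TRAPI, the query pattern and intermediate node types chosen in step 2 correspond roughly to a query graph like the one sketched below. This is an illustration under assumptions, not the app's actual request: the function `build_two_hop_query`, the node keys, the `biolink:related_to` predicate, and the example gene CURIE are all chosen for the example. Only the overall TRAPI message shape (`nodes` with `ids` and `categories`, `edges` with `subject`, `object`, and `predicates`) follows the published specification.

```python
import json


def build_two_hop_query(gene_curies, intermediate_categories, disease_curie):
    """Build a gene -> intermediate -> disease TRAPI query graph (illustrative only)."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # Input genes, pinned by CURIE.
                    "genes": {
                        "ids": list(gene_curies),
                        "categories": ["biolink:Gene"],
                    },
                    # Unpinned intermediate node, constrained only by category.
                    "intermediate": {
                        "categories": list(intermediate_categories),
                    },
                    # Target disease, pinned by CURIE.
                    "disease": {
                        "ids": [disease_curie],
                        "categories": ["biolink:Disease"],
                    },
                },
                "edges": {
                    "e0": {
                        "subject": "genes",
                        "object": "intermediate",
                        "predicates": ["biolink:related_to"],
                    },
                    "e1": {
                        "subject": "intermediate",
                        "object": "disease",
                        "predicates": ["biolink:related_to"],
                    },
                },
            }
        }
    }


query = build_two_hop_query(
    gene_curies=["NCBIGene:7157"],  # TP53, used here only as an example
    intermediate_categories=["biolink:Protein", "biolink:ChemicalEntity"],
    disease_curie="MONDO:0100096",  # COVID-19
)
print(json.dumps(query, indent=2))
```

A broad predicate such as `biolink:related_to` tends to return more results than a narrow one.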

Custom Genes

Upload a CSV with a gene_symbol column or enter genes manually in the sidebar.
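
Before uploading, it can be useful to check that a file has the expected column. The sketch below uses only the Python standard library. The helper name `read_gene_symbols` is hypothetical, and skipping blanks and duplicates is an assumption made for the example, not a description of how the app parses uploads.

```python
import csv
import io


def read_gene_symbols(csv_text: str) -> list[str]:
    """Return gene symbols from the gene_symbol column, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "gene_symbol" not in reader.fieldnames:
        raise ValueError("CSV must contain a 'gene_symbol' column")
    seen = set()
    genes = []
    for row in reader:
        # Short rows yield None for missing fields, so fall back to "".
        symbol = (row["gene_symbol"] or "").strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            genes.append(symbol)
    return genes


example = "gene_symbol,score\nTP53,0.9\nBRCA1,0.8\nTP53,0.7\n"
print(read_gene_symbols(example))  # ['TP53', 'BRCA1']
```

Extra columns (such as `score` here) are ignored by this check.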

Troubleshooting

  • No results: Some APIs may fail (5-6 successes is normal). Try different genes or a less specific predicate filter.
  • Empty graph: Check disease CURIE format (e.g., MONDO:0100096 for COVID-19)
  • Slow visualization: Reduce the max_intermediates slider or use simpler layouts
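
For the empty-graph case above, a quick shape check on the disease identifier can rule out simple formatting mistakes. This is a rough sketch: the pattern only tests for a `PREFIX:local_id` shape and is an assumption for illustration. It does not verify that the prefix or the identifier exists, or that the app accepts it.

```python
import re

# A prefix starting with a letter, a colon, then a non-empty ID without whitespace.
CURIE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_.]*:\S+$")


def looks_like_curie(value: str) -> bool:
    """Return True if `value` has the general PREFIX:local_id shape of a CURIE."""
    return bool(CURIE_PATTERN.match(value.strip()))


for candidate in ["MONDO:0100096", "MONDO 0100096", "0100096", "MONDO:"]:
    print(f"{candidate!r}: {looks_like_curie(candidate)}")
```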

AI Disclosure

Generative AI tools (Claude Code, Anthropic) were used as coding assistants during development. The author maintains full responsibility for accuracy, reproducibility, and scientific validity. AI outputs were reviewed and validated before integration. Research questions, analytical approaches, and scientific interpretations were determined independently by the author.

License

MIT License
