kiharalab/queryome

Queryome: Orchestrating Retrieval, Reasoning, and Synthesis across Biomedical Literature

Queryome is a multi-agent deep research system for biomedical literature. It combines semantic vector search, lexical keyword search, and LLM-powered reasoning to provide comprehensive answers to complex biomedical research questions. This repository contains the command-line version of Queryome; a desktop app is available at https://www.queryome.app/.

Features

  • Hybrid Search: Combines FAISS vector search with BM25 keyword search for optimal retrieval
  • Multi-Agent Architecture: Uses specialized agents (PI Agent, SubAgent Team, Synthesizer) for planning, execution, and synthesis
  • Multiple Search Indices: Searches across title/abstract, author keywords, and MeSH terms
  • Comprehensive Reports: Generates well-structured research reports with proper citations
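
The README does not say how the FAISS and BM25 result lists are merged. As an illustration only, the sketch below uses reciprocal rank fusion (RRF), a common way to combine rankings from a vector search and a keyword search. The function name, the constant `k=60`, and the document IDs are assumptions made for this example, not Queryome's actual implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector index and a BM25 index.
vector_hits = ["pmid:3", "pmid:1", "pmid:2"]
bm25_hits = ["pmid:1", "pmid:4", "pmid:3"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents returned by both retrievers (`pmid:1` and `pmid:3` here) end up ahead of documents found by only one of them.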

Installation

1. Set up Environment Variables

Export your OpenAI API key:

export OPENAI_API_KEY="your-openai-api-key-here"

You can add this to your ~/.bashrc or ~/.zshrc for persistence.
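
If you call Queryome from your own scripts, it can help to fail early when the key is missing. The following is a minimal sketch; the helper name `get_openai_api_key` is hypothetical and not part of Queryome.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running Queryome."
        )
    return key
```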

2. Download Indices

Download the pre-built search indices:

wget https://kiharalab.org/queryome/indices.tar.gz
tar -xzf indices.tar.gz

This will create an indices/ directory containing:

  • vector_db/ - FAISS vector index and SQLite database
    • faiss.index - FAISS vector index
    • articles.db - SQLite database with article metadata
  • bm25_title_abstract/ - BM25 index for title and abstract search
  • bm25_author_keywords/ - BM25 index for author keywords search
  • bm25_mesh_terms/ - BM25 index for MeSH terms search
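
To confirm that the archive extracted correctly, a short check like the one below can be run from the repository root. It is a sketch that only tests for the paths listed above; the helper name `missing_index_paths` is hypothetical, and the check does not validate the contents of the files.

```python
from pathlib import Path

# Paths listed in this README, relative to the indices/ directory.
EXPECTED_PATHS = [
    "vector_db/faiss.index",
    "vector_db/articles.db",
    "bm25_title_abstract",
    "bm25_author_keywords",
    "bm25_mesh_terms",
]


def missing_index_paths(indices_dir="indices"):
    """Return the expected index paths that do not exist under indices_dir."""
    root = Path(indices_dir)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_index_paths()
    if missing:
        print("Missing index paths:", ", ".join(missing))
    else:
        print("All expected index paths are present.")
```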

3. Create Conda Environment

Create and activate a new conda environment:

conda create -n queryome python=3.10
conda activate queryome

Install the required dependencies:

pip install openai numpy torch faiss-cpu bm25s PyStemmer sentence-transformers numba
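
Several of these packages are imported under a different name than the one used with pip (for example, `faiss-cpu` is imported as `faiss` and `PyStemmer` as `Stemmer`). The sketch below checks that each package is importable in the active environment; the helper name `missing_packages` is hypothetical.

```python
import importlib.util

# Maps the pip package name to the module name used in `import` statements.
REQUIRED_MODULES = {
    "openai": "openai",
    "numpy": "numpy",
    "torch": "torch",
    "faiss-cpu": "faiss",
    "bm25s": "bm25s",
    "PyStemmer": "Stemmer",
    "sentence-transformers": "sentence_transformers",
    "numba": "numba",
}


def missing_packages(required=None):
    """Return pip names of packages whose module cannot be found."""
    required = REQUIRED_MODULES if required is None else required
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```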

Usage

Command Line Interface

Interactive Mode:

python queryome_cli.py

Single Query:

python queryome_cli.py --query "What are the latest treatments for Type 2 diabetes?"

With Custom Log Directory:

python queryome_cli.py --query "Efficacy of metformin in elderly patients" --log-dir ./my_logs
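
The same invocations can be scripted from Python. The sketch below only builds the argument list from the two flags shown above (`--query` and `--log-dir`); the helper name `build_queryome_command` is hypothetical, and no other CLI options are assumed.

```python
import sys


def build_queryome_command(query, log_dir=None, script="queryome_cli.py"):
    """Build the argument list for a single-query CLI run."""
    cmd = [sys.executable, script, "--query", query]
    if log_dir is not None:
        cmd += ["--log-dir", str(log_dir)]
    return cmd


cmd = build_queryome_command(
    "Efficacy of metformin in elderly patients", log_dir="./my_logs"
)
print(cmd[1:])
```

Pass the resulting list to `subprocess.run(cmd, check=True)` from the repository root to launch the run.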

Example Run (Expected Output + Runtime)

The example/ directory in this repo contains a captured run for the query "tell me about metformin" (see example/run_log.txt and example/pi_final_answer.txt).

python queryome_cli.py --query "tell me about metformin"

Truncated expected output:

Here's an overview of metformin based on recent research findings:

### Mechanism of Action
Metformin primarily works by reducing hepatic glucose production ... [1,2]

### Therapeutic Uses
Metformin is widely used as a first-line treatment for T2DM ... [3,4]

### Side Effects
Common side effects include gastrointestinal disturbances ... [5,6]

### Current Research Trends
Research is expanding metformin's application beyond diabetes ... [7,8]

## References
[1] Viollet & Foretz (2013). PMID: 23582849
[2] Foretz et al. (2019). PMID: 31439934
...
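
If you want to post-process a report, the numbered references can be pulled out with a regular expression. This sketch assumes every reference line follows the `[n] Authors (year). PMID: digits` shape shown in the truncated output above; other reference formats would need a different pattern. The function name `parse_references` is hypothetical.

```python
import re

# Matches lines such as "[1] Viollet & Foretz (2013). PMID: 23582849".
REFERENCE_PATTERN = re.compile(
    r"^\[(\d+)\]\s+(.*?)\s+PMID:\s*(\d+)\s*$", re.MULTILINE
)


def parse_references(report):
    """Map each citation number in a report to its PMID string."""
    return {
        int(number): pmid
        for number, _label, pmid in REFERENCE_PATTERN.findall(report)
    }


sample = """## References
[1] Viollet & Foretz (2013). PMID: 23582849
[2] Foretz et al. (2019). PMID: 31439934
...
"""
print(parse_references(sample))
```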

Runtime notes (from the captured run):

  • End-to-end (PI start → final answer): ~4m 03s
  • Subagent phase: 4 subquestions, 4 parallel workers, ~3m 49s, 73 total articles collected

Runtime scaling (rough estimates; dominated by LLM latency + search depth):

  • Without subagents (single-agent / sequential subquestions): ~1–4 min
  • With subagents (up to 10 workers): ~4–15 min for 5–10 subquestions (often bounded by the slowest subquestion)
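
To see why a parallel run is often bounded by the slowest subquestion, the sketch below estimates wall-clock time when each subquestion is handed to whichever worker frees up first. The per-subquestion durations are made-up numbers for illustration, and this simple scheduling model is an assumption, not a description of how Queryome dispatches its subagents.

```python
import heapq


def estimate_wall_time(durations, workers):
    """Estimate wall-clock time when tasks go to the earliest-free worker."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # One entry per worker: the time at which that worker becomes free.
    finish_times = [0.0] * min(workers, max(len(durations), 1))
    heapq.heapify(finish_times)
    for duration in durations:
        earliest_free = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest_free + duration)
    return max(finish_times)


# Hypothetical subquestion durations in minutes.
durations = [2.0, 3.0, 1.0, 4.0]
print(estimate_wall_time(durations, workers=4))  # one worker per subquestion
print(estimate_wall_time(durations, workers=1))  # fully sequential
```

With one worker per subquestion the estimate equals the longest single duration; with a single worker it equals the sum of all durations.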

Python API

Single Query:

from queryome import Queryome

queryome = Queryome()
result = queryome.research("What are the latest treatments for Type 2 diabetes?")
print(result)

Multiple Queries (Batch Processing):

from queryome import Queryome, batch_research

# Using the Queryome class
queryome = Queryome()
queries = [
    "Efficacy of metformin in elderly patients",
    "Side effects of insulin therapy",
    "Latest diabetes management guidelines"
]
results = queryome.research_multiple(queries)

for r in results:
    print(f"Query: {r['query']}")
    print(f"Result: {r['result']}")
    print("---")

# Or use the convenience function
results = batch_research(queries)
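
Batch results can be written to disk for later inspection. The sketch below relies only on the `query` and `result` keys used in the loop above and writes one JSON object per line; the helper name `save_results_jsonl` and the output format are assumptions for this example.

```python
import json


def save_results_jsonl(results, path):
    """Write each result as one JSON line; return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for r in results:
            record = {"query": r["query"], "result": r["result"]}
            # default=str keeps the sketch working if a result is not a plain string.
            fh.write(json.dumps(record, ensure_ascii=False, default=str) + "\n")
            count += 1
    return count
```

For example, `save_results_jsonl(results, "results.jsonl")` stores the batch output next to your script.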

Configuration Options

When initializing Queryome, you can customize:

queryome = Queryome(
    log_dir="./custom_logs",           # Custom log directory
    enable_search_engine=True,          # Enable/disable search engine
    openai_api_key="your-key",          # Provide API key programmatically
    embedding_device="cuda:0"           # GPU device for embeddings
)
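
On machines without a GPU, a hard-coded "cuda:0" may not be usable. The sketch below picks a device string at runtime using `torch.cuda.is_available()`. Whether `embedding_device` accepts "cpu" is an assumption here (the README only shows "cuda:0"), and the helper name `pick_embedding_device` is hypothetical.

```python
def pick_embedding_device(preferred="cuda:0"):
    """Return the preferred CUDA device if one is available, else "cpu"."""
    try:
        import torch
    except ImportError:
        # Without torch there is no CUDA support to query.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_embedding_device()
print(device)
# Assumed usage: Queryome(embedding_device=device)
```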

Benchmark Results

All of the Queryome benchmark data reported in the paper can be downloaded from https://kiharalab.org/queryome/benchmark_data.tar.gz

Authors

Pranav Punuru (1), Nabil Ibtehaz (2), Swagarika Giri (2), Harsha Srirangam (2), Emilia Tugolukova (1), and Daisuke Kihara (1,2,*)

(1) Department of Biological Sciences, Purdue University, West Lafayette, IN 47906, USA
(2) Department of Computer Science, Purdue University, West Lafayette, IN 47906, USA

* Corresponding Author Email: [email protected]

Citation

If you use Queryome in your research, please cite:

@article{punuru2025queryome,
  title={Queryome: Orchestrating Retrieval, Reasoning, and Synthesis across Biomedical Literature},
  author={Punuru, Pranav and Ibtehaz, Nabil and Giri, Swagarika and Srirangam, Harsha and Tugolukova, Emilia and Kihara, Daisuke},
  year={2025}
}

License

GPL v3. If you are interested in a different license (for example, for commercial use), please contact us ([email protected]).

Contact

For technical problems or questions, please reach out to Pranav Punuru ([email protected]).
