feat(examples): add arXiv papers knowledge base demo #2810

Closed
moshierming wants to merge 1 commit into HKUDS:main from moshierming:feat/arxiv-papers-demo
@moshierming

Summary

This PR adds a new unofficial sample demonstrating how to build a searchable knowledge base from arXiv papers using LightRAG's graph-enhanced retrieval.

What this example does

  1. Fetches paper metadata via the free arXiv API (no API key required)
  2. Inserts papers into LightRAG for graph-indexed storage
  3. Demonstrates Local / Global / Hybrid query modes on academic content
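
The first step above can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: the helper names `build_query_url`, `parse_feed`, `to_document`, and `fetch_papers` are hypothetical, and the sketch assumes the public arXiv endpoint `http://export.arxiv.org/api/query` with its `id_list` parameter, which returns an Atom feed.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint and Atom namespace; not taken from the PR itself.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(arxiv_ids):
    """Build an arXiv API URL that looks up papers by their IDs."""
    query = urllib.parse.urlencode(
        {"id_list": ",".join(arxiv_ids), "max_results": len(arxiv_ids)},
        safe=",",  # keep the comma-separated ID list readable
    )
    return f"{ARXIV_API}?{query}"


def _text(element, path):
    """Return the whitespace-normalized text of a child element, or ''."""
    raw = element.findtext(path, default="", namespaces=ATOM)
    return " ".join(raw.split())


def parse_feed(xml_text):
    """Convert an Atom feed into a list of paper metadata dicts."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        papers.append({
            "id": _text(entry, "atom:id"),
            "title": _text(entry, "atom:title"),
            "summary": _text(entry, "atom:summary"),
            "authors": [_text(a, "atom:name")
                        for a in entry.findall("atom:author", ATOM)],
        })
    return papers


def to_document(paper):
    """Flatten one paper into a plain-text document ready for indexing."""
    authors = ", ".join(paper["authors"])
    return (f"Title: {paper['title']}\n"
            f"Authors: {authors}\n\n"
            f"Abstract: {paper['summary']}")


def fetch_papers(arxiv_ids, timeout=30):
    """Fetch and parse metadata for the given IDs (performs a network call)."""
    with urllib.request.urlopen(build_query_url(arxiv_ids), timeout=timeout) as resp:
        return parse_feed(resp.read().decode("utf-8"))
```

Each string produced by `to_document` would then be inserted into LightRAG and queried in local, global, or hybrid mode. That wiring is left out here because it depends on the installed LightRAG version and the chosen LLM backend.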

Usage

# Using OpenAI
python examples/unofficial-sample/lightrag_arxiv_papers_demo.py \
    --ids 2410.05779 1706.03762 2005.11401

# Using local Ollama (free, no API key)
python examples/unofficial-sample/lightrag_arxiv_papers_demo.py \
    --ids 2410.05779 1706.03762 \
    --llm-model qwen2.5:7b \
    --embed-model nomic-embed-text \
    --ollama
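
For readers wondering how these flags might be wired up, the following is a hedged sketch of an `argparse` parser that accepts the two command lines above. Only the flag names come from the usage examples; the function name `build_parser`, the help strings, and the `None` defaults are assumptions and may differ from the actual script.

```python
import argparse


def build_parser():
    """Build a CLI parser matching the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        description="Build a LightRAG knowledge base from arXiv papers."
    )
    # One or more arXiv IDs, e.g. --ids 2410.05779 1706.03762
    parser.add_argument("--ids", nargs="+", required=True,
                        help="arXiv paper IDs to ingest")
    # Model names are optional; None means "use the backend's default"
    # (an assumption of this sketch, not confirmed by the PR).
    parser.add_argument("--llm-model", default=None,
                        help="LLM model name")
    parser.add_argument("--embed-model", default=None,
                        help="embedding model name")
    # Boolean switch selecting the local Ollama backend instead of OpenAI.
    parser.add_argument("--ollama", action="store_true",
                        help="use a local Ollama server")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```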

Key features

  • No extra dependencies beyond lightrag-hku — arXiv fetching uses stdlib urllib
  • Retry logic with exponential backoff for environments with unstable arXiv API access
  • Supports both OpenAI and local Ollama backends
  • Cleanly separates paper ingestion from the query demo, which makes it easy to extend
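
The retry behaviour mentioned in the list above can be illustrated as follows. This is a generic sketch of exponential backoff around a stdlib `urllib` request, not the PR's implementation: the names `retry_with_backoff` and `fetch_url`, the default of four attempts, and the one-second base delay are all assumptions.

```python
import time
import urllib.error
import urllib.request


def retry_with_backoff(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a network error wait base_delay * 2**n, then retry.

    The last failure is re-raised once all attempts are used up.
    `sleep` is injectable so the delays can be observed in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except (urllib.error.URLError, TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


def fetch_url(url, timeout=30):
    """Download a URL and return its body as text (performs a network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Example: fetch one paper's metadata, retrying on transient failures.
    url = "http://export.arxiv.org/api/query?id_list=1706.03762"
    feed = retry_with_backoff(lambda: fetch_url(url))
    print(feed[:200])
```

Doubling the delay after each failure keeps early retries quick while backing off when the API stays unreachable.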

Testing

Syntax-checked with python -m py_compile. Full integration test requires LLM credentials.

Demonstrates building a multi-paper knowledge base from arXiv using
LightRAG's graph-enhanced retrieval.

Features:
- Fetches paper metadata via free arXiv API (no API key needed)
- Supports both OpenAI and local Ollama (qwen2.5:7b + nomic-embed-text)
- Shows Local / Global / Hybrid query modes on academic content
- Includes retry logic for arXiv API in network-restricted environments

Usage:
  # OpenAI
  python examples/unofficial-sample/lightrag_arxiv_papers_demo.py \
      --ids 2410.05779 1706.03762

  # Ollama (free, no API key)
  python examples/unofficial-sample/lightrag_arxiv_papers_demo.py \
      --ids 2410.05779 1706.03762 --ollama
@moshierming moshierming closed this by deleting the head repository Mar 20, 2026
@danielaskdd
Collaborator

Thanks for the interest! The demo looks great. Please go ahead with the submission—I'd be happy to consider merging it.
