A context-aware AI assistant that combines document understanding, a knowledge graph, and a witty personality to deliver intelligent responses.
- PDF Context Processing: Extracts and analyzes text from PDF documents
- Knowledge Graph Integration: Stores entities and relationships in Neo4j
- Vector Search: FAISS-based semantic search for document context
- Hybrid Reasoning: Combines LLM capabilities with structured knowledge
- Personality-infused Responses: Witty and charming interaction style
- Natural Language Processing: spaCy (en_core_web_sm)
- Knowledge Graph: Neo4j
- LLMs: Anthropic Claude 3 Sonnet & OpenAI Embeddings
- Vector Store: FAISS
- Document Processing: PyPDF, LangChain
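To illustrate what the semantic search step does conceptually, here is a minimal sketch of top-k retrieval by cosine similarity. It is a brute-force, pure-Python stand-in and not the FAISS API; the function names `cosine` and `top_k` and the toy 2-D vectors are assumptions made for illustration (the project itself uses OpenAI embeddings stored in FAISS).

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          vectors: Sequence[Sequence[float]],
          k: int = 2) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy stand-ins for chunk embeddings (real embeddings have far more dimensions).
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
hits = top_k([1.0, 0.1], chunk_vectors, k=2)
print(hits)
```

The indices returned would then be mapped back to the text chunks they were computed from. A library such as FAISS serves the same purpose with index structures that scale far beyond a linear scan.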
- Clone repository:
git clone https://github.com/yourusername/athenas-asylum.git
cd athenas-asylum
- Install dependencies:
pip install -r requirements.txt
python -m spacy download en_core_web_sm
- Set up services:
- Install and run Neo4j locally
- Obtain API keys for Anthropic and OpenAI
- Replace hardcoded credentials in:
  - llm_client.py (Anthropic & OpenAI API keys)
  - vector_store.py (OpenAI API key)
  - main.py (Neo4j connection details)
- Recommended security practice:
# Replace hardcoded keys with environment variables:
os.getenv("ANTHROPIC_API_KEY")
os.getenv("OPENAI_API_KEY")
- Place your PDF document in the project root as r.pdf, or modify the path in main.py
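The environment-variable practice recommended above could be centralized in one small helper, sketched below. `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` come from this README; the helper name `load_settings`, the `NEO4J_URI` and `NEO4J_USER` variable names, and their default values are assumptions for illustration, not names the project is known to use.

```python
import os
from typing import Dict, Mapping, Optional

# Keys named in this README; both must be present.
REQUIRED = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")

# Hypothetical variable names and defaults for the Neo4j connection details.
OPTIONAL_DEFAULTS = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USER": "neo4j",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect settings from the environment, failing early if a key is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings


# Demonstration with an explicit mapping instead of the real environment.
demo = load_settings({"ANTHROPIC_API_KEY": "example-a", "OPENAI_API_KEY": "example-b"})
print(sorted(demo))
```

Failing at startup with a list of the missing names tends to be easier to debug than an authentication error raised later inside a chat turn.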
python main.py

Example Interaction:
You: What's the main topic of the document?
Assistant: The document focuses on AI ethics - though I must say,
your choice of questions is as sharp as Athena's owl! 🦉

| File | Purpose | Key Components |
|---|---|---|
| main.py | Entry point for application flow and user interaction | Initializes all components; chat loop logic; context aggregation |
| knowledge_graph.py | Manages Neo4j knowledge graph operations | Entity extraction with spaCy; Cypher query execution; relationship mapping |
| llm_client.py | Handles AI model interactions and response generation | Claude-3 Sonnet integration; OpenAI embeddings; personality system prompts |
| pdf_processor.py | Processes PDF documents for text extraction and preparation | PyPDF text extraction; text cleaning; recursive chunking |
| vector_store.py | Implements semantic search capabilities | FAISS vector database; OpenAI embeddings integration; context retrieval |
| requirements.txt | Lists all project dependencies | NLP libraries; database drivers; AI service SDKs |
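The "recursive chunking" listed for pdf_processor.py can be pictured with the simplified sketch below. It tries coarse separators first (paragraphs, then lines, then words) and only falls back to fixed-width cuts when nothing else fits. This is a pure-Python illustration written for this README, not the project's code and not LangChain's splitter; the function name `recursive_chunks`, the separator order, and the `max_len` value are all assumptions.

```python
from typing import List, Sequence


def recursive_chunks(text: str, max_len: int = 40,
                     separators: Sequence[str] = ("\n\n", "\n", " ")) -> List[str]:
    """Split text into chunks of at most max_len characters, preferring coarse separators."""
    text = text.strip()
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to fixed-width slices.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_len:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_len:
            current = piece
        else:
            # The piece alone is too long: retry it with finer separators.
            chunks.extend(recursive_chunks(piece, max_len, rest))
    if current:
        chunks.append(current)
    return [c.strip() for c in chunks if c.strip()]


sample = "Athena was the goddess of wisdom.\n\nHer owl symbolizes knowledge and insight."
chunks = recursive_chunks(sample, max_len=40)
for chunk in chunks:
    print(repr(chunk))
```

Real splitters usually add an overlap between neighbouring chunks so that sentences cut at a boundary remain retrievable; that is left out here to keep the recursion easy to follow.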
Core Prerequisites
- Python 3.9+ with pip package manager
- Neo4j Desktop (v4.4+) running locally
- 4GB+ RAM for optimal performance
API Access
- ✅ Valid Anthropic API key (Claude-3 access)
- ✅ OpenAI API key (text-embedding-3-small model)
- Active internet connection for LLM services
Recommended Setup
- Create virtual environment:
python -m venv athena_env
source athena_env/bin/activate  # Linux/Mac
- Configure Neo4j:
neo4j start  # Set password to 'your_password' or update in main.py
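Before the first run, it may help to confirm the prerequisites above with a quick check such as the one sketched here. It uses only the standard library; the helper names `python_ok` and `port_open` are made up for this example, and port 7687 is Neo4j's default Bolt port, which is assumed to match the connection details in main.py.

```python
import socket
import sys


def python_ok(version=sys.version_info, minimum=(3, 9)) -> bool:
    """True if the interpreter version meets the minimum (major, minor)."""
    return tuple(version[:2]) >= minimum


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Python 3.9+:", python_ok())
    # 7687 is Neo4j's default Bolt port; adjust if your setup differs.
    print("Neo4j Bolt reachable:", port_open("localhost", 7687))
```

A reachable port only shows that something is listening; it does not verify the Neo4j password, which is still checked when the application connects.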
MIT License - Full text available in LICENSE file
- Anthropic Claude-3 Sonnet (Terms)
- OpenAI Embeddings (Policies)
- Neo4j Community Edition (License)
- For complex PDFs: Pre-process with OCR using pdf_processor.py extensions
- Visualize knowledge graph: Access Neo4j Browser at http://localhost:7474
- Try hybrid queries: "Explain Section 2.3 using both document context and knowledge graph entities"
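A hybrid query of this kind comes down to placing retrieved passages and graph facts side by side in the prompt sent to the LLM. The sketch below shows one way that aggregation might look. The function name `build_prompt`, the prompt layout, and the `COVERS` relation are hypothetical and chosen only for illustration; the project's actual prompt format may differ.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def build_prompt(question: str, passages: List[str], triples: List[Triple]) -> str:
    """Merge retrieved passages and graph facts into a single prompt string."""
    passage_block = "\n".join(f"[{i + 1}] {p.strip()}" for i, p in enumerate(passages))
    fact_block = "\n".join(f"- {s} {r} {o}" for s, r, o in triples)
    return (
        "Document context:\n"
        f"{passage_block or '(none)'}\n\n"
        "Knowledge graph facts:\n"
        f"{fact_block or '(none)'}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "What's the main topic of the document?",
    ["The document focuses on AI ethics."],   # e.g. top hits from the vector store
    [("Document", "COVERS", "AI ethics")],    # e.g. rows returned by a Cypher query
)
print(prompt)
```

Keeping the two sources in separately labelled blocks lets the model cite either one, and makes it obvious in logs which source contributed nothing to a given answer.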
"Wisdom begins in wonder" - Socrates (Athena's favorite philosopher) 🏛️