A Retrieval-Augmented Generation (RAG) system for answering questions over psychology literature, developed for the CASML Generative AI Hackathon. It retrieves relevant passages from the source text (OpenStax Psychology (2e), CC BY 4.0) and passes them to a generative model, so that answers are grounded in the textbook and can cite where they came from.
Key Features:
- PDF document parsing with page-aware chunking
- Hybrid vector database integration (ChromaDB)
- Metadata-enriched text embeddings
- LLM-powered response generation with source attribution
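As an illustration of what page-aware chunking can look like, here is a minimal sketch. It is not the repository's actual pipeline: it assumes the page texts have already been extracted (for example with pdfplumber) into a list of strings, and the names `chunk_pages`, `chunk_size`, `overlap`, and `chunk_id` are hypothetical.

```python
def chunk_pages(pages, chunk_size=500, overlap=50):
    """Split each page's text into overlapping character windows.

    Every chunk keeps its 1-based page number so that an answer
    can later point back to the page it was drawn from.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # distance between window starts
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        text = text.strip()
        for start in range(0, len(text), step):
            chunks.append({
                "page": page_number,
                "chunk_id": f"p{page_number}-c{start // step}",
                "text": text[start:start + chunk_size],
            })
            # Stop once this window has reached the end of the page.
            if start + chunk_size >= len(text):
                break
    return chunks
```

Chunks never cross a page boundary in this sketch, which keeps the page metadata exact at the cost of splitting sentences that run over a page break.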
Tech Stack:
- Embeddings: TogetherAI/m2-bert-80M-2k-retrieval
- Vector DB: ChromaDB
- LLM: meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
- Text Processing: pdfplumber, custom chunking pipeline
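To make the idea of source attribution concrete, the sketch below shows one way retrieved chunks could be turned into a prompt with numbered, page-tagged context. This is an assumption about the general approach, not the project's actual prompt: `build_prompt` and the chunk dictionary keys (`page`, `text`) are hypothetical, and no call to the LLM or to ChromaDB is shown.

```python
def build_prompt(question, retrieved):
    """Assemble a prompt whose context entries carry numbered page citations."""
    context_lines = [
        f"[{i}] (page {chunk['page']}) {chunk['text']}"
        for i, chunk in enumerate(retrieved, start=1)
    ]
    return (
        "Answer the question using only the numbered context below. "
        "Cite sources by their bracketed number.\n\n"
        "Context:\n" + "\n".join(context_lines) + "\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Because each context entry is numbered, the model's bracketed citations can be mapped back to page numbers after generation.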
Prerequisites:
- Python 3.9+
- Together API key
Setup:
- Clone the repository
git clone https://github.com/yash4agr/Psych-LLM.git
cd Psych-LLM
- Create a virtual environment
python -m venv .venv
- Activate the virtual environment
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
- Install dependencies
pip install -r requirements.txt
- Register and obtain your API key
  - Visit Together API Key Settings
  - Generate an API key
- Export your Together API key
export TOGETHER_API_KEY=<your_api_key> # On macOS/Linux
$env:TOGETHER_API_KEY="<your_api_key>" # On Windows (PowerShell)
- Run the application
python main.py
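If the application fails on startup, a missing or empty `TOGETHER_API_KEY` is a likely cause. The following is a small sketch of a check that could be run before `main.py`. The helper name `get_together_api_key` is hypothetical and is not taken from the repository.

```python
import os


def get_together_api_key(env=None):
    """Return the Together API key from the environment, or fail with a clear message."""
    env = os.environ if env is None else env  # allow a dict to be injected for testing
    key = env.get("TOGETHER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "TOGETHER_API_KEY is not set; export it before running main.py"
        )
    return key
```

Calling `get_together_api_key()` with no arguments reads the real process environment, so it reflects whatever was exported in the current shell session.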