A fine-tuned legal assistant powered by Qwen 2.5, Unsloth, and Agentic RAG.
This project implements a legal AI assistant that retrieves case law and answers legal questions with citation-backed accuracy. It uses "Agentic RAG": the model is fine-tuned to explicitly request a search (`<search>query</search>`) whenever it needs external information.
- Agentic RAG: Model autonomously generates search queries during generation.
- RAG Pipeline: Ingestion and semantic search using ChromaDB and `all-MiniLM-L6-v2`.
- Flexible Fine-tuning: Uses Unsloth for efficient LoRA fine-tuning of Qwen models.
- Synthetic Data: Includes tools to generate high-quality, multi-turn training data from raw case text using OpenAI.
- `src/rag`: Data ingestion and retrieval logic.
- `src/data_gen`: Data generation scripts for fine-tuning.
- `src/finetune`: Unsloth training scripts.
- `src/inference`: End-to-end inference script with tool-use loop.
- `cases/`: Raw case law data (JSON/HTML).
This project uses `uv` for dependency management.
```bash
# Clone the repository
git clone <repo_url>
cd law-assistant

# Install dependencies
uv sync
```

Requirements:
- Linux environment (recommended)
- GPU with at least 12GB VRAM (for Qwen-7B 4-bit fine-tuning)
- `OPENAI_API_KEY` (for synthetic data generation)
Parse raw case files and build the vector database.
```bash
uv run src/rag/ingest.py
```

Create a fine-tuning dataset formatted for Agentic RAG (Question -> Search -> Context -> Answer).
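A generated sample might be shaped like the following multi-turn conversation. Field names, roles, and content here are illustrative; the actual schema emitted by `src/data_gen/generate.py` may differ:

```python
import json

# Hypothetical Agentic RAG training sample: the assistant first emits a
# search call, receives retrieved context, then answers with a citation.
sample = {
    "messages": [
        {"role": "user", "content": "What is the holding in the Araujo case?"},
        {"role": "assistant", "content": "<search>Araujo holding</search>"},
        {"role": "tool", "content": "Retrieved case excerpt for Araujo ..."},
        {"role": "assistant", "content": "In Araujo, the court held ... [citation]."},
    ]
}

print(json.dumps(sample, indent=2))
```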
```bash
# Ensure OPENAI_API_KEY is set in .env
uv run src/data_gen/generate.py --num_samples 50
```

Train the Qwen model using Unsloth on the generated data.
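For context on what the model sees during training: Qwen instruct models use ChatML-style prompts. The sketch below flattens a conversation into that format by hand; in practice the training script would rely on the tokenizer's own `apply_chat_template`, so treat this as illustrative only:

```python
def to_chatml(messages: list[dict]) -> str:
    """Render messages in ChatML style (<|im_start|>role ... <|im_end|>),
    the format used by Qwen instruct models. A sketch -- real code should
    use tokenizer.apply_chat_template instead."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    return "\n".join(parts)

text = to_chatml([
    {"role": "user", "content": "What is the holding in the Araujo case?"},
    {"role": "assistant", "content": "<search>Araujo holding</search>"},
])
print(text)
```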
```bash
uv run src/finetune/train.py
```

Chat with the fine-tuned assistant. It will search for cases if needed.
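The tool-use loop behind this can be sketched as: generate until a search call appears, run retrieval, inject the results, and continue. Here `generate` and `retrieve` are stubs and the `<results>` tag is an assumed convention; the real script wires in the fine-tuned model and ChromaDB:

```python
import re

def run_agentic_chat(query: str, generate, retrieve, max_turns: int = 3) -> str:
    """Generate; whenever the model emits a <search> call, fetch results,
    append them to the transcript, and generate again."""
    transcript = query
    for _ in range(max_turns):
        output = generate(transcript)
        match = re.search(r"<search>(.*?)</search>", output, re.DOTALL)
        if not match:
            return output  # no tool call: this is the final answer
        results = retrieve(match.group(1).strip())
        transcript += f"\n{output}\n<results>{results}</results>\n"
    return output  # give up after max_turns searches

# Stubbed demo: the "model" searches once, then answers.
calls = iter([
    "<search>Araujo holding</search>",
    "In Araujo, the court held that the claim was preempted.",
])
answer = run_agentic_chat(
    "What is the holding in the Araujo case?",
    generate=lambda transcript: next(calls),
    retrieve=lambda q: "Retrieved excerpt for: " + q,
)
print(answer)  # -> In Araujo, the court held that the claim was preempted.
```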
```bash
uv run src/inference/chat.py --query "What is the holding in the Araujo case?"
```

- Model: `unsloth/Qwen2.5-7B-Instruct-bnb-4cit`
- Vector DB: ChromaDB (persisted in `chroma_db/`)
- Embedding: `sentence-transformers/all-MiniLM-L6-v2`
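Under the hood, the semantic search that ChromaDB performs reduces to nearest-neighbor lookup over embedding vectors. A stdlib-only toy of the idea — the hand-made 3-d vectors stand in for the 384-dimensional embeddings `all-MiniLM-L6-v2` actually produces:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for MiniLM's 384-d vectors.
docs = {
    "Araujo excerpt": [0.9, 0.1, 0.0],
    "Unrelated case": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

best = max(docs, key=lambda name: cosine(query_vec, docs[name]))
print(best)  # -> Araujo excerpt (closest document by cosine similarity)
```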