An automated system for identifying root causes of IT incidents using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) with real OpenStack production data.
What it does:
- Downloads real OpenStack production logs
- Parses incidents from logs
- Builds a vector database of incidents
- Uses LLM to predict root causes of new incidents based on similar past incidents
- Achieves 70–75% root-cause accuracy while cutting analysis time by more than 80%
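The log-parsing step can be sketched as follows. This assumes the common nova-style OpenStack log line format (timestamp, PID, level, component, message); the project's actual parser may extract different fields.

```python
import re

# Hypothetical parser for one nova-style OpenStack log line.
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+)\s+"
    r"(?P<level>DEBUG|INFO|WARNING|ERROR|CRITICAL)\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<message>.*)"
)

def parse_log_line(line):
    """Return a structured record for one log line, or None if it doesn't match."""
    m = LOG_PATTERN.match(line.strip())
    return m.groupdict() if m else None

line = ("2017-05-16 00:00:04.500 25746 ERROR nova.compute.manager "
        "[instance: abc123] Instance failed to spawn")
record = parse_log_line(line)
print(record["level"], record["component"])  # ERROR nova.compute.manager
```

Structured records like this are what gets embedded and stored in the vector database in the next step.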
Technologies:
- LLM: Llama 3.2 3B (via Ollama)
- Vector DB: ChromaDB
- Embeddings: SentenceTransformers
- Framework: LangChain
- Data: Real OpenStack Production Logs (~200 incidents)
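The core retrieval idea behind the ChromaDB step can be illustrated in plain Python: embed each past incident as a vector, then return the stored incident closest (by cosine similarity) to the query. In the real system the embeddings come from SentenceTransformers and ChromaDB does the indexing; the tiny hand-made vectors and incident data below are purely illustrative so the sketch runs anywhere.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# (embedding, past incident text, known root cause) — hypothetical data
incidents = [
    ([0.9, 0.1, 0.0], "nova-compute: instance failed to spawn", "hypervisor out of memory"),
    ([0.1, 0.9, 0.0], "neutron: port binding failed", "misconfigured ML2 driver"),
    ([0.0, 0.1, 0.9], "cinder: volume attach timeout", "iSCSI target unreachable"),
]

def most_similar(query_vec):
    """Return the stored incident whose embedding is closest to the query."""
    return max(incidents, key=lambda item: cosine(query_vec, item[0]))

_, text, cause = most_similar([0.8, 0.2, 0.1])
print(text, "->", cause)  # nova-compute: instance failed to spawn -> hypervisor out of memory
```

ChromaDB generalizes this to thousands of incidents with approximate nearest-neighbor search, but the retrieval contract is the same: vector in, most similar past incidents out.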
Minimum hardware:
- 16GB RAM
- GPU with 8GB VRAM (RTX 3060 or similar)
- 10GB free disk space
Tested on: HP Pavilion 15 (RTX 3060 12GB, 16GB RAM)
Software:
- Python 3.11+
- Ollama installed and running
Setup:

```bash
# Clone/download the project
cd incident-rca-project

# Create a virtual environment
python -m venv venv

# Activate it (Linux/macOS)
source venv/bin/activate
# Activate it (Windows)
# venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start Ollama (in a separate terminal)
ollama serve

# Pull the LLM model (if not already pulled)
ollama pull llama3.2:3b
```

Then run the full pipeline:

```bash
python run_project.py
```
This executes all four steps in order:
1. Download OpenStack logs
2. Parse logs into structured format
3. Build the RAG system
4. Evaluate performance
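The final prediction step can be sketched as prompt assembly: the incidents retrieved from the vector database are placed into the context, and the LLM is asked for the most likely root cause. Function names and prompt wording below are illustrative, not the project's exact implementation.

```python
# Build a RAG prompt from a new incident plus (text, root cause) pairs
# retrieved from the vector database. Hypothetical sketch.
def build_rca_prompt(new_incident, similar):
    context = "\n".join(
        f"- Incident: {text}\n  Root cause: {cause}" for text, cause in similar
    )
    return (
        "You are an SRE assistant. Based on these past incidents:\n"
        f"{context}\n\n"
        f"New incident: {new_incident}\n"
        "What is the most likely root cause? Answer briefly."
    )

prompt = build_rca_prompt(
    "nova-compute: instance stuck in BUILD state",
    [("instance failed to spawn", "hypervisor out of memory")],
)
print(prompt)
```

The assembled prompt would then be sent to the local model, for example through LangChain's Ollama integration or a POST to Ollama's `http://localhost:11434/api/generate` endpoint.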