Production-grade AI platform for engineering and customer intelligence.
Built by Waqar Azim | GitHub
SentinelOS is an AI platform that analyzes codebases and customer data to give engineering teams actionable intelligence.
| Feature | Technology | Description |
|---|---|---|
| Repository Analysis | GitPython, PyDriller, Radon | Clone repos, extract metrics, analyze commits |
| Bug Risk Prediction | Scikit-learn, Gradient Boosting | Predict which files will have bugs |
| Churn Prediction | ML Classification | Identify customers likely to leave |
| Sentiment Analysis | NLP, TF-IDF | Analyze customer feedback tone |
| Knowledge Graph | NetworkX | Link code, bugs, customers, and feedback |
| RAG Q&A | OpenAI, ChromaDB | Answer questions with grounded sources |
| Report Generation | LLM Prompts | Auto-generate CTO weekly reports |
| REST API | FastAPI, Pydantic | Production-grade API with OpenAPI docs |
Tested on real repositories from Google and Microsoft.
| Metric | Result |
|---|---|
| Files Analyzed | 75 |
| Commits Processed | 100 |
| Lines of Code | 6,321 |
| Risk Score | 3.4% |
| Knowledge Graph | 189 nodes |
| Metric | Result |
|---|---|
| Files Analyzed | 337 |
| Commits Processed | 50 |
| C++ Lines | 31,066 |
| C# Lines | 14,836 |
| Risk Score | 0.1% |
| Knowledge Graph | 431 nodes |
┌─────────────────────────────────────────────────────────────────┐
│                          Data Sources                           │
│            GitHub Repos │ CSV Feedback │ System Logs            │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Ingestion Layer                         │
│        GitPython │ PyDriller │ Pandas │ Async Processing        │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Feature Engineering                       │
│        Code Metrics │ Embeddings │ TF-IDF │ Time-Series         │
└──────────────────────────────┬──────────────────────────────────┘
                               │
              ┌────────────────┼────────────────┐
              ▼                ▼                ▼
  ┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
  │    ML Models     │  │  Knowledge   │  │    LLM Engine    │
  │                  │  │    Graph     │  │                  │
  │ • Bug Predictor  │  │              │  │ • RAG Pipeline   │
  │ • Churn Model    │  │ • NetworkX   │  │ • OpenAI GPT-4   │
  │ • Anomaly Det.   │  │ • 7 node     │  │ • Confidence     │
  │                  │  │   types      │  │   Scoring        │
  └────────┬─────────┘  └──────┬───────┘  └────────┬─────────┘
           │                   │                   │
           └───────────────────┼───────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                            API Layer                            │
│          FastAPI │ Pydantic │ JWT Auth │ Rate Limiting          │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                            Frontend                             │
│       Next.js │ React │ TypeScript │ Dashboard Components       │
└─────────────────────────────────────────────────────────────────┘
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Service health check |
| `/api/v1/ingest/github` | POST | Ingest GitHub repository |
| `/api/v1/ingest/csv` | POST | Ingest customer feedback |
| `/api/v1/dashboard` | GET | Dashboard metrics |
| `/api/v1/ask` | POST | Natural language Q&A |
| `/api/v1/reports/generate` | POST | Generate reports |
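
As a rough illustration of calling the Q&A endpoint from the table above, the sketch below builds a `POST` request for `/api/v1/ask` using only the Python standard library. The JSON field name `question`, the `X-API-Key` header, and the helper names `build_ask_request` and `send` are assumptions made for this example; the actual request schema is whatever the generated OpenAPI docs at `/docs` describe.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # local address used in the quick start


def build_ask_request(question: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/ask.

    Assumption: the body is {"question": ...} and API keys travel in an
    "X-API-Key" header. Check /docs for the real schema.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/ask", data=body, headers=headers, method="POST"
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request to a running server and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running locally, `send(build_ask_request("Which files are most likely to contain bugs?"))` would return the decoded response body.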
# Clone
git clone https://github.com/Waqar53/SentinelOS.git
cd SentinelOS
# Docker
docker-compose up -d
# Or local
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python -m uvicorn backend.api.main:app --port 8000
API: http://localhost:8000
Docs: http://localhost:8000/docs
SentinelOS/
├── backend/
│ ├── api/ # FastAPI REST endpoints
│ ├── ingestion/ # GitHub + CSV data connectors
│ ├── analyzers/ # Feature extraction
│ ├── ml/models/ # Bug + Churn predictors
│ ├── llm/ # RAG + OpenAI integration
│ ├── graph/ # Knowledge graph (NetworkX)
│ ├── evaluation/ # Model accuracy tracking
│ └── core/ # Config, logging, security
├── frontend/ # Next.js dashboard
├── docker/ # Dockerfiles
├── docs/ # Documentation
└── data/ # Sample data
- Gradient Boosting for bug risk prediction
- Random Forest for churn classification
- TF-IDF + K-Means for feedback clustering
- Feature engineering from 10+ code metrics
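
To make the TF-IDF step in the list above concrete, here is a minimal pure-Python sketch of how feedback text can be turned into weighted term vectors before clustering. It is not the project's code: the project lists scikit-learn, and this stand-in only mirrors scikit-learn's documented defaults (smoothed IDF, L2 normalization) with a simplified whitespace tokenizer. The function name `tfidf` and the sample feedback strings are made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term count * smoothed IDF, where
    IDF = ln((1 + n_docs) / (1 + doc_freq)) + 1,
    followed by L2 normalization of each document vector.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


feedback = ["login page crashes", "login is slow", "billing page slow"]
vectors = tfidf(feedback)
# "crashes" appears in one document, "login" in two, so "crashes"
# carries more weight in the first vector.
```

Vectors like these are what a K-Means step would then group into feedback clusters.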
- 7 node types: code_file, commit, customer, feedback, region, feature, module
- Graph traversal for contextual retrieval
- Relationship queries for impact analysis
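
The relationship queries above can be pictured as a bounded breadth-first traversal over a typed graph. The sketch below uses a plain adjacency dictionary as a stand-in for the NetworkX graph, so it runs without dependencies. The class name `KnowledgeGraph`, the `impacted` method, the hop limit, and the example node IDs are all hypothetical; only the seven node types come from this README.

```python
from collections import defaultdict, deque

NODE_TYPES = {"code_file", "commit", "customer", "feedback", "region", "feature", "module"}


class KnowledgeGraph:
    """Tiny undirected typed graph (illustrative stand-in for NetworkX)."""

    def __init__(self):
        self.node_type = {}
        self.adj = defaultdict(set)

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_type[node_id] = node_type

    def add_edge(self, a, b):
        if a not in self.node_type or b not in self.node_type:
            raise KeyError("add both nodes before linking them")
        self.adj[a].add(b)
        self.adj[b].add(a)

    def impacted(self, start, target_type, max_hops=3):
        """Breadth-first search for nodes of target_type within max_hops of start."""
        seen = {start}
        queue = deque([(start, 0)])
        hits = []
        while queue:
            node, depth = queue.popleft()
            if node != start and self.node_type[node] == target_type:
                hits.append(node)
            if depth < max_hops:
                for neighbor in sorted(self.adj[node]):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        queue.append((neighbor, depth + 1))
        return hits


kg = KnowledgeGraph()
kg.add_node("auth.py", "code_file")
kg.add_node("login", "feature")
kg.add_node("fb-1", "feedback")
kg.add_node("acme", "customer")
kg.add_edge("auth.py", "login")
kg.add_edge("login", "fb-1")
kg.add_edge("fb-1", "acme")
customers = kg.impacted("auth.py", "customer")  # customers reachable from a risky file
```

Capping the number of hops keeps an impact query local to the file in question instead of walking the whole graph.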
- RAG (Retrieval Augmented Generation) for grounded answers
- Confidence scoring with uncertainty detection
- Source attribution for every response
- Prompt versioning for audit trails
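
One way to read the retrieval, confidence, and source-attribution bullets above is sketched below: rank stored chunks by cosine similarity to the query embedding, return their sources alongside the context, and flag the answer as uncertain when the best match is weak. This README does not describe how confidence is actually computed, so using the top similarity score with a `min_confidence` threshold is an assumption, as are the names `Chunk` and `retrieve`. In the real system the vectors would come from an embedding model and the search from ChromaDB.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str          # where the text came from, for attribution
    text: str
    vector: List[float]  # embedding of the text (toy values here)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, chunks, k=2, min_confidence=0.5):
    """Return the top-k chunks, their sources, and a confidence flag.

    Assumption: confidence is the best cosine similarity among the results.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c.vector), reverse=True)[:k]
    confidence = cosine(query_vec, ranked[0].vector) if ranked else 0.0
    return {
        "context": [c.text for c in ranked],
        "sources": [c.source for c in ranked],
        "confidence": round(confidence, 3),
        "uncertain": confidence < min_confidence,
    }


chunks = [
    Chunk("auth.py", "Login handler with retry logic", [1.0, 0.0]),
    Chunk("feedback.csv#12", "Customer reports slow billing page", [0.0, 1.0]),
    Chunk("commit abc123", "Fix login timeout", [0.7, 0.7]),
]
result = retrieve([1.0, 0.0], chunks)
```

The `context` list would be placed in the LLM prompt, while `sources` and `uncertain` travel with the response so the caller can see what the answer was grounded on.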
- Multi-tenant architecture with data isolation
- Structured JSON logging with request tracing
- JWT authentication + API key support
- Rate limiting per tenant
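
Per-tenant rate limiting, the last item above, is often implemented as a token bucket keyed by tenant ID. The sketch below shows that technique in isolation. The README does not say which algorithm the project uses, so the token bucket, the class name `TenantRateLimiter`, and its parameters are assumptions for illustration.

```python
import time


class TenantRateLimiter:
    """Token bucket per tenant: `burst` tokens at most, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock      # injectable so tests can control time
        self._buckets = {}      # tenant_id -> (tokens, last_refill_time)

    def allow(self, tenant_id):
        """Consume one token for the tenant if available; return True if allowed."""
        now = self.clock()
        tokens, last = self._buckets.get(tenant_id, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[tenant_id] = (tokens - 1, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False


limiter = TenantRateLimiter(rate_per_sec=5, burst=10)
if not limiter.allow("tenant-a"):
    print("429 Too Many Requests")
```

Because each tenant has its own bucket, one tenant exhausting its quota does not affect requests from another.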
| Category | Technologies |
|---|---|
| Backend Development | Python, FastAPI, Async/Await, REST API Design |
| Machine Learning | Scikit-learn, Feature Engineering, Model Evaluation |
| NLP | Sentiment Analysis, TF-IDF, Text Clustering |
| LLM/AI | OpenAI API, RAG, Prompt Engineering, Embeddings |
| Data Engineering | Pandas, Data Pipelines, ETL |
| Graph Databases | NetworkX, Knowledge Graphs |
| Frontend | React, Next.js, TypeScript |
| DevOps | Docker, Docker Compose, CI/CD |
| Database | PostgreSQL, Redis, ChromaDB |
| Software Architecture | Clean Architecture, SOLID, DDD |
MIT License - Waqar Azim
Built with passion for AI and engineering excellence.