LLM Systems Engineer building production-grade AI infrastructure in CUDA & Python
Currently: M.Sc. AI Student | Specializing in scalable inference engines & advanced RAG systems
- 📚 Graph-based RAG with advanced reranking strategies
- 🤖 Multi-agent orchestration with LangGraph & CrewAI
- ⚡ LLM optimization via quantization, distillation & efficient fine-tuning
- 🔬 RLHF pipelines for specialized domain models
Active on Hugging Face, sharing:
- Fine-tuned LLM configurations
- RAG system implementations & benchmarks
- Optimized inference setups for production use
Explore my repositories for practical implementations of recent AI research.
Open to collaborating on:
- Open-source AI infrastructure projects
- Research in efficient LLM systems
- Production-grade RAG implementations
Reach out: [email protected]