AI Engineer focused on Large Language Model (LLM) training, inference optimization, and agentic AI systems. Currently developing expertise in parameter-efficient and preference-based fine-tuning (LoRA/DPO), GPU kernel optimization (CUDA/Triton), and agentic AI frameworks (LangChain, Coze, AutoGen). Combines solid ML theory, PyTorch engineering, and system-level optimization to build scalable, high-efficiency AI solutions.
"Building a full-stack understanding, from model math to the GPU execution pipeline, and from fine-tuning to intelligent agent orchestration."
| Domain | Skills / Tools |
|---|---|
| LLM Training | PyTorch, HuggingFace Transformers, PEFT (LoRA), DeepSpeed, Megatron-LM |
| Alignment & RLHF | DPO, PPO, Reward Modeling, OpenRLHF, TRL |
| Inference Optimization | vLLM, TensorRT-LLM, AutoGPTQ, CUDA, Triton, Nsight Systems |
| Agentic AI | LangChain Agents, FAISS, Coze Studio, CrewAI, AutoGen |
| GPU & Systems | CUDA C++, PTX profiling, memory hierarchy tuning, quantization |
| Data Engineering | Python, Pandas, SentencePiece, FastAPI, Docker |
| Math Foundations | Linear Algebra, Probability, Information Theory (KL, CE), Optimization |
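The core idea behind the LoRA fine-tuning listed above can be sketched in plain PyTorch: freeze the pretrained weight matrix and learn only a low-rank update `B·A`, scaled by `alpha / r`. The class and parameter names below are illustrative, not taken from the PEFT library.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A is (r x in) and B is (out x r)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay fixed
        # A gets a small random init; B starts at zero so the adapter
        # is a no-op before training begins.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / total: {total}")
```

With `r=8` on a 768x768 layer, only `2 * 8 * 768 = 12288` of roughly 600k parameters are trainable, which is why LoRA makes fine-tuning large models tractable on modest GPUs.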



