🎯 DevOps / MLOps / AIOps Engineer with over 7 years of total technical experience (2+ years dedicated to DevOps and cloud). I transitioned from a mechanical R&D and HPC background to DevOps and AI infrastructure engineering, blending deep analytical skills with modern platform engineering practices.
- 🤖 Hands-on MLOps and LLM infrastructure engineer: fine-tuned open-source LLMs (Qwen, Gemma), performed NVFP4 and Gemma4 model quantization, and deployed production AI workloads on NVIDIA DGX Blackwell (GB10) GPU servers.
- ☸️ Expert in Kubernetes platform engineering: self-managed on-premises clusters (kubeadm), EKS, GKE, DMZ cluster architecture, RBAC, network policies, and production incident resolution.
- 🔧 Passionate about automating everything, from provisioning infrastructure to setting up monitoring, CI/CD pipelines, and ML model lifecycle management.
- 🐧 Expert in Linux systems (RHEL & Debian) with experience automating complex batch processes, HPC server management, and GPU compute resource scheduling.
- ☁️ Skilled across cloud platforms (AWS, GCP, and Azure), with hands-on Terraform IaC across all three.
- Designed robust CI/CD pipelines using Jenkins, GitLab CI, and GitHub Actions, reducing deployment time by more than 60% and enabling fully automated rollouts.
- 📦 Built and deployed applications using Docker, Kubernetes (self-managed and EKS), Helm, and ArgoCD GitOps.
- Advocate of DevSecOps: integrated Vault, Trivy, SonarQube, and Istio mTLS into delivery pipelines and authored security architecture documents for enterprise clients.
- Built production observability stacks (Prometheus, Grafana, ELK) with SLO/SLI dashboards, reducing MTTD by 40% and MTTR by 35%.
- Forward-deployed engineer: visited client sites to deliver infrastructure upgrades, lead technical discussions with client leadership, and architect on-premises AI platforms.
- 🛠️ Tools I love: Terraform, Ansible, Python, Go, Bash, ArgoCD, Helm, MLflow, DVC, vLLM, Ollama
- Languages: Python, Go, Bash, Shell, JavaScript, C
- AI / MLOps: LLM Fine-tuning (Qwen, Gemma), NVFP4 Quantization, Gemma4, vLLM, Ollama, MLflow, DVC, ClickHouse, NVIDIA GPU Scheduling, DGX Blackwell (GB10)
- Infra as Code: Terraform, Ansible, Bicep (learning), GitOps
- Containers: Docker, Kubernetes (On-Prem kubeadm, EKS, GKE), Helm, containerd
- CI/CD: Jenkins, GitHub Actions, GitLab CI, ArgoCD, Cloud Build, Spinnaker
- Monitoring / SRE: Prometheus, Grafana, ELK Stack, New Relic, AWS CloudWatch, GCP Monitoring, SLO/SLI design
- Security: HashiCorp Vault, Istio (mTLS), Trivy, SonarQube, AWS/GCP IAM, Kubernetes RBAC, firewalld
- Networking: HAProxy, Nginx, Calico CNI, VPC design, DNS (Route 53), Load Balancing
- Cloud Platforms: AWS, GCP, Azure, IBM Cloud, OCI, DigitalOcean
- Data / Streaming: Apache Kafka, MongoDB, PostgreSQL, Redis, BigQuery, ClickHouse
- Build Tools: Maven, npm, Uvicorn
- Version Control: Git, GitHub, GitLab
- LLM Serving: vLLM, Ollama
- LLM Fine-tuning: Qwen series, Gemma series (supervised fine-tuning for domain adaptation)
- Quantization: NVFP4, Gemma4 quantization (VRAM optimization)
- GPU Infra: NVIDIA DGX Blackwell (GB10), GPU Kubernetes node scheduling
- ML Lifecycle: MLflow (experiment tracking), DVC (data versioning)
- Metadata Store: ClickHouse
- Orchestration: Kubernetes GPU workloads, resource limits and requests, node affinity
- Frameworks: LangChain, LangGraph (learning)
- Vision AI: NVIDIA DeepStream (hands-on learning), CCTV-based analytics platforms
"Automate what you can. Monitor what you can't. Improve what matters."
Portfolio Website
Feel free to connect with me on LinkedIn.
I enjoy exploring new technologies and building cool projects in my free time.
Web development, robotics, and aquarium keeping.
Thanks for visiting my GitHub profile!

