Platform engineer specializing in building and operating resilient cloud infrastructure on AWS and GCP. Expertise in Kubernetes orchestration, infrastructure as code with Terraform, and observability engineering with Prometheus, Grafana, and the Elastic Stack.
Proficient in Python and Bash for automation. Focused on reliability engineering, scalable architectures, and data-driven incident response.
- Architected and deployed production-grade infrastructure on AWS and GCP using Terraform and Kubernetes
- Implemented end-to-end CI/CD pipelines with GitHub Actions, ArgoCD, Flux, and Ansible for automated deployments
- Designed and operated observability platforms using Prometheus and Grafana for metrics and dashboards, Jaeger for distributed tracing, and the Elastic Stack with Fluentd for log aggregation
- Deployed and managed service mesh architectures (Istio, Linkerd) for traffic management, security, and observability
- Optimized Kubernetes workloads with Helm charts, resource management, health checks, HPA, and cluster autoscaling strategies
- Led containerization initiatives for legacy applications, improving deployment velocity and system reliability
- Established GitOps practices with ArgoCD and Flux for declarative, version-controlled infrastructure management
- Implemented security best practices with HashiCorp Vault for secrets management, Falco for runtime security, and OPA for policy enforcement
- Handled incident response, documented root cause analyses, and ran blameless postmortems
- Built custom monitoring solutions with alerting rules, SLO tracking, and automated remediation workflows

