vishalgunjalSWE/vishalgunjalswe


Site Reliability Engineer | Platform Engineering | Distributed Systems

Early-career engineer with senior-level systems thinking. I build production-grade cloud platforms that demonstrate the reliability, observability, and automation principles used at Google, Netflix, and Uber.

Current: Production-grade Kubernetes platforms, SRE observability, GitOps at scale
Approach: Build systems that teach industry patterns rather than follow tutorials
Philosophy: Infrastructure should be boring (reliable), not exciting (breaking)

💼 Open to opportunities: DevOps Engineer | SRE | Platform Engineer | Cloud Engineer
📍 Location: Pune, India (Open to Remote & Relocation)

Gmail    LinkedIn    Phone


🚀 What I Do

I specialize in cloud-native infrastructure and platform engineering, with hands-on experience building:

  • Microservices platforms on AWS EKS with event-driven architecture (RabbitMQ, Kafka)
  • Infrastructure as Code using Terraform with modular, reusable patterns
  • GitOps workflows with ArgoCD for declarative, drift-free deployments
  • Full observability stacks (Prometheus, Grafana, ELK, Jaeger)
  • DevSecOps pipelines with automated security scanning (SonarQube, Trivy)

💼 Current Focus

  • 🔨 Building production-grade DevOps projects demonstrating enterprise patterns
  • 📚 Deep-diving into Kubernetes (RBAC, Network Policies, Security, Operators)
  • 🔐 Implementing DevSecOps practices (shift-left security, policy-as-code)
  • 📊 Designing SRE observability systems (SLIs, SLOs, error budgets)
  • 🤖 Exploring MLOps and AI-driven infrastructure automation

🧠 Engineering Philosophy (Borrowed from Google SRE)

| Principle | What It Means | How I Apply It |
|---|---|---|
| Everything Fails | Design for failure, not success | Multi-AZ, circuit breakers, graceful degradation |
| Toil is the Enemy | Automate repetitive work | GitOps, drift detection, self-healing |
| Observability ≠ Monitoring | Understand unknowns | Distributed tracing, correlation IDs, SLOs |
| Security by Default | Zero trust | RBAC, Network Policies, no hardcoded secrets |
| Error Budgets | Balance velocity and reliability | SLI/SLO tracking, controlled risk |
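
To make the error budget row concrete, here is a minimal sketch of how a request-based budget can be computed from an SLO. The function name `error_budget` and its parameters are hypothetical and chosen only for illustration; a real setup would pull the request counts from a metrics backend such as Prometheus.

```python
def error_budget(slo: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize the error budget for a request-based SLO over one window.

    slo is the target success ratio, e.g. 0.999 for "three nines".
    All names here are illustrative, not taken from any library.
    """
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")

    # The budget is the share of requests that are allowed to fail.
    allowed_failures = (1 - slo) * total_requests
    remaining_failures = allowed_failures - failed_requests

    if allowed_failures > 0:
        budget_consumed = failed_requests / allowed_failures
    else:
        # No traffic in the window: any failure would overspend the budget.
        budget_consumed = float("inf") if failed_requests else 0.0

    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": remaining_failures,
        "budget_consumed": budget_consumed,
    }


if __name__ == "__main__":
    report = error_budget(slo=0.999, total_requests=1_000_000, failed_requests=250)
    print(report)
```

With a 99.9% SLO and one million requests, roughly a thousand failures are allowed, so 250 failures consume about a quarter of the budget. A negative `remaining_failures` is the signal to slow down releases and spend time on reliability.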

I don't believe in:

  • Manual deployments ("works on my machine" syndrome)
  • Infrastructure without monitoring
  • Code without tests or automation without guardrails

🛠️ Tech Stack


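  • Cloud: AWS, Azure, GCP
  • Containers and orchestration: Kubernetes, Docker, Helm
  • Infrastructure as Code and configuration: Terraform, Ansible
  • CI/CD and GitOps: Jenkins, GitHub Actions, ArgoCD
  • Observability: Grafana, Prometheus, ELK
  • Languages and tooling: Go, Python, Git, Linux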

🌟 What Sets Me Apart (Early Career with Senior Thinking)

1. I Think in Systems, Not Tools

Most engineers: "I know Docker, Kubernetes, Terraform"

Me: "I understand distributed systems failure modes and design infrastructure that degrades gracefully. I use Kubernetes for declarative state reconciliation and self-healing, not because it's trendy."

2. I Design for Failure

Most engineers: "My app works in testing"

Me: "I've tested:

  • What happens when RabbitMQ goes down? (DLQ prevents message loss)
  • What if Redis crashes? (Cache-aside handles misses)
  • What if AWS loses an AZ? (Multi-AZ with auto-failover)"
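
The Redis case above can be sketched as a cache-aside read path that falls back to the source of truth when the cache misses or is unreachable. This is an illustration under stated assumptions, not the code from my projects: the `CacheAside` class, its `available` flag, and the `load` callback are hypothetical, and an in-memory dict stands in for Redis so the example runs on its own.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside reader with an in-memory dict standing in for Redis."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0):
        self._load = load              # reads from the source of truth (e.g. a database)
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}
        self.available = True          # set to False to simulate a cache outage

    def _cache_get(self, key: str) -> Optional[str]:
        if not self.available:
            raise ConnectionError("cache unavailable")
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]       # expired entries count as a miss
            return None
        return value

    def _cache_set(self, key: str, value: str) -> None:
        if not self.available:
            raise ConnectionError("cache unavailable")
        self._store[key] = (time.monotonic() + self._ttl, value)

    def get(self, key: str) -> str:
        # 1. Try the cache; treat an outage exactly like a miss.
        try:
            cached = self._cache_get(key)
            if cached is not None:
                return cached
        except ConnectionError:
            pass

        # 2. Miss or outage: read from the source of truth.
        value = self._load(key)

        # 3. Best-effort write-back; a failed cache write must not fail the read.
        try:
            self._cache_set(key, value)
        except ConnectionError:
            pass
        return value
```

The design choice worth noting is that the cache is never on the critical path for correctness: when it is down, reads get slower because every request hits the backing store, but they still succeed.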

3. I Document Decisions

Most engineers: "I built it"

Me: "I documented:

  • WHY I chose RabbitMQ over Kafka (trade-off analysis)
  • Architecture diagrams (system design)
  • Runbooks (production operations)
  • What I learned from failures"

🧠 Systems Thinking & Engineering

I explore the trade-offs in distributed systems, documenting my journey from "how it works" to "why it breaks."

"I'm fascinated by systems that scale, self-heal, and never go down."

I document the "why" behind my code through deep dives into engineering systems, FinOps, scalability, and SRE practices.

Medium


🌐 Community Involvement

Active Participation:

  • Google Developer Group (GDG) Pune - Cloud-native discussions, hands-on labs
  • CNCF Community - Kubernetes, service mesh, observability
  • AWS User Group Pune - Best practices, architecture patterns
  • Atlassian Community Pune - CI/CD, DevOps automation

🎯 2026 Goals

Technical:

  • ✅ Build 4 production-grade cloud platforms (End-to-End)
  • 🔄 Contribute to CNCF projects (Kubernetes, Prometheus, ArgoCD)
  • 📚 Deep-dive into Kubernetes operators and CRDs
  • 🔐 Master service mesh (Istio/Linkerd) and zero-trust networking
  • 🤖 Explore MLOps and infrastructure for ML workloads

Professional:

  • 📝 Publish 25+ in-depth technical articles
  • 🎤 Present at CNCF Pune and AWS User Group
  • 💼 Land first DevOps/SRE role as an early-career engineer
  • 🌟 Contribute to open-source (Kubernetes, Terraform providers, Helm charts)

Learning:

  • 📖 Complete the AWS Certified DevOps Engineer - Professional certification
  • 📖 Complete CKA and CKS certifications before 2027
  • 📖 Study distributed systems papers (Raft, Paxos, CAP theorem)

💡 Questions That Keep Me Up at Night

→ How does Kubernetes handle split-brain in etcd?
→ What's the optimal error budget for a new service?
→ How do you design alerts that don't cause fatigue?
→ What's the CAP theorem trade-off in my architecture?
→ How would Netflix design this system?
→ What's the failure mode I haven't considered?
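
One way I reason about the alert-fatigue question is multi-window burn-rate alerting, as described in the Google SRE Workbook. The sketch below is only an illustration: the function names are hypothetical, and the default threshold of 14.4 is the Workbook's example value for a 30-day SLO window paired with 1-hour and 5-minute lookback windows, not a universal constant.

```python
def burn_rate(error_ratio: float, slo: float) -> float:
    """How fast the error budget is being spent.

    A burn rate of 1.0 means the budget lasts exactly the SLO window;
    higher values mean it runs out proportionally sooner.
    """
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")
    return error_ratio / (1 - slo)


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo: float,
    threshold: float = 14.4,  # assumed value, see the lead-in above
) -> bool:
    """Page only when both windows burn faster than the threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, so resolved incidents stop paging.
    """
    return (
        burn_rate(long_window_error_ratio, slo) >= threshold
        and burn_rate(short_window_error_ratio, slo) >= threshold
    )


if __name__ == "__main__":
    # Sustained 2% errors against a 99.9% SLO, still ongoing: page.
    print(should_page(0.02, 0.03, slo=0.999))
    # Same hour-long average, but the last few minutes are healthy: no page.
    print(should_page(0.02, 0.0005, slo=0.999))
```

Requiring both windows to agree is what cuts the noise: a brief spike never trips the long window, and an incident that has already recovered never trips the short one.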

I don't just want to use tools. I want to understand the engineering decisions behind them.

Currently Reading:

  • 📖 Site Reliability Engineering (Google SRE Book)
  • 📖 Designing Data-Intensive Applications (Martin Kleppmann)
  • 📖 Kubernetes Patterns (Bilgin Ibryam and Roland Huß)
  • 📖 Raft Consensus Paper (understanding distributed systems)

📈 Contribution Graph

Vishal's GitHub activity graph


⚡ "Automation is not about replacing humans, it's about freeing them to do what they do best."

