vishalgunjalswe/README.md


Site Reliability Engineer | Platform Engineering | Distributed Systems

Early-career engineer with senior-level systems thinking. I build production-grade cloud platforms that apply the reliability, observability, and automation principles used at Google, Netflix, and Uber.

Current: Production-grade Kubernetes platforms, SRE observability, GitOps at scale
Approach: Build systems that teach industry patterns, not tutorials
Philosophy: Infrastructure should be boring (reliable), not exciting (breaking)

💼 Open to opportunities: DevOps Engineer | SRE | Platform Engineer | Cloud Engineer
📍 Location: Pune, India (Open to Remote & Relocation)

Gmail | LinkedIn | Phone


🚀 What I Do

I specialize in cloud-native infrastructure and platform engineering, with hands-on experience building:

  • Microservices platforms on AWS EKS with event-driven architecture (RabbitMQ, Kafka)
  • Infrastructure as Code using Terraform with modular, reusable patterns
  • GitOps workflows with ArgoCD for declarative, drift-free deployments
  • Full observability stacks (Prometheus, Grafana, ELK, Jaeger)
  • DevSecOps pipelines with automated security scanning (SonarQube, Trivy)

💼 Current Focus

  • 🔨 Building production-grade DevOps projects demonstrating enterprise patterns
  • 📚 Deep-diving into Kubernetes (RBAC, Network Policies, Security, Operators)
  • 🔐 Implementing DevSecOps practices (shift-left security, policy-as-code)
  • 📊 Designing SRE observability systems (SLIs, SLOs, error budgets)
  • 🤖 Exploring MLOps and AI-driven infrastructure automation

🧠 Engineering Philosophy (Borrowed from Google SRE)

| Principle | What It Means | How I Apply It |
| --- | --- | --- |
| Everything Fails | Design for failure, not success | Multi-AZ, circuit breakers, graceful degradation |
| Toil is the Enemy | Automate repetitive work | GitOps, drift detection, self-healing |
| Observability ≠ Monitoring | Understand unknowns | Distributed tracing, correlation IDs, SLOs |
| Security by Default | Zero trust | RBAC, Network Policies, no hardcoded secrets |
| Error Budgets | Balance velocity and reliability | SLI/SLO tracking, controlled risk |
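
To make the Error Budgets row concrete, here is a minimal Python sketch of how the remaining budget could be computed from an SLO target and request counts. The function name, the 99.9% target, and the request numbers are hypothetical and chosen only for illustration; they are not taken from any project listed here.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    slo_target is the availability objective as a fraction, e.g. 0.999.
    All values here are illustrative assumptions, not real measurements.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"{remaining:.0%} of the error budget remains")
```

A negative result means the budget is already overspent, which is the usual signal to slow feature work in favor of reliability work.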

I don't believe in:

  • Manual deployments ("works on my machine" syndrome)
  • Infrastructure without monitoring
  • Code without tests or automation without guardrails

🛠️ Tech Stack

AWS · Azure · GCP · Kubernetes · Docker · Terraform · Ansible · Helm · Linux · Jenkins · GitHub Actions · ArgoCD · Grafana · Prometheus · ELK · Go · Python · Git

🌟 What Sets Me Apart (Early Career with Senior Thinking)

1. I Think in Systems, Not Tools

Most engineers: "I know Docker, Kubernetes, Terraform"

Me: "I understand distributed systems failure modes and design infrastructure that degrades gracefully. I use Kubernetes for declarative state reconciliation and self-healing, not because it's trendy."

2. I Design for Failure

Most engineers: "My app works in testing"

Me: "I've tested:

  • What happens when RabbitMQ goes down? (DLQ prevents message loss)
  • What if Redis crashes? (Cache-aside handles misses)
  • What if AWS loses an AZ? (Multi-AZ with auto-failover)"
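
The Redis case above can be sketched roughly as follows. This is a minimal illustration of the cache-aside pattern with a fallback to the source of truth, not code from any repository linked here: `InMemoryCache`, `CacheUnavailable`, and `get_with_cache_aside` are hypothetical names, and the in-memory class only stands in for a real cache client.

```python
from typing import Callable, Dict, Optional


class CacheUnavailable(Exception):
    """Raised by the hypothetical cache client when the cache cannot be reached."""


class InMemoryCache:
    """Stand-in for a Redis-like cache; set `available = False` to simulate a crash."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.available = True

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise CacheUnavailable(key)
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.available:
            raise CacheUnavailable(key)
        self._data[key] = value


def get_with_cache_aside(key: str, cache: InMemoryCache,
                         load_from_source: Callable[[str], str]) -> str:
    """Read via the cache, falling back to the source on a miss or an outage."""
    cache_up = True
    try:
        cached = cache.get(key)
    except CacheUnavailable:
        # Treat an unreachable cache like a miss instead of failing the request.
        cached, cache_up = None, False
    if cached is not None:
        return cached
    value = load_from_source(key)
    if cache_up:
        try:
            cache.set(key, value)  # Populate the cache for later reads.
        except CacheUnavailable:
            pass  # The cache went down between read and write; serve the value anyway.
    return value
```

The trade-off is that every request reaches the backing store while the cache is down, so the store has to be sized (or rate-limited) for that load.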

3. I Document Decisions

Most engineers: "I built it"

Me: "I documented:

  • WHY I chose RabbitMQ over Kafka (trade-off analysis)
  • Architecture diagrams (system design)
  • Runbooks (production operations)
  • What I learned from failures"

🧠 Systems Thinking & Engineering

I explore the trade-offs in distributed systems, documenting my journey from "how it works" to "why it breaks."

"I'm fascinated by systems that scale, self-heal, and never go down."

I document the "why" behind my code: deep dives into Engineering Systems, FinOps, Scalability, and SRE practices.

Medium


🌐 Community Involvement

Active Participation:

  • Google Developer Group (GDG) Pune - Cloud-native discussions, hands-on labs
  • CNCF Community - Kubernetes, service mesh, observability
  • AWS User Group Pune - Best practices, architecture patterns
  • Atlassian Community Pune - CI/CD, DevOps automation

🎯 2026 Goals

Technical:

  • ✅ Build 4 production-grade cloud platforms (End-to-End)
  • 🔄 Contribute to CNCF projects (Kubernetes, Prometheus, ArgoCD)
  • 📚 Deep-dive into Kubernetes operators and CRDs
  • 🔐 Master service mesh (Istio/Linkerd) and zero-trust networking
  • 🤖 Explore MLOps and infrastructure for ML workloads

Professional:

  • 📝 Publish 25+ in-depth technical articles
  • 🎤 Present at CNCF Pune and AWS User Group
  • 💼 Land first DevOps/SRE role as an early-career engineer
  • 🌟 Contribute to open-source (Kubernetes, Terraform providers, Helm charts)

Learning:

  • 📖 Complete AWS DevOps Professional certification
  • 📖 Complete CKA and CKS certifications before 2027
  • 📖 Study distributed systems papers (Raft, Paxos, CAP theorem)

💡 Questions That Keep Me Up at Night

→ How does Kubernetes handle split-brain in etcd?
→ What's the optimal error budget for a new service?
→ How do you design alerts that don't cause fatigue?
→ What's the CAP theorem trade-off in my architecture?
→ How would Netflix design this system?
→ What's the failure mode I haven't considered?

I don't just want to use tools. I want to understand the engineering decisions behind them.

Currently Reading:

  • 📖 Site Reliability Engineering (Google SRE Book)
  • 📖 Designing Data-Intensive Applications (Martin Kleppmann)
  • 📖 Kubernetes Patterns (Bilgin Ibryam)
  • 📖 Raft Consensus Paper (understanding distributed systems)

📈 Contribution Graph

Vishal's GitHub activity graph



⚡ "Automation is not about replacing humans, it's about freeing them to do what they do best."


Pinned

  1. Kubernetes-Directive-A-DevOps-Journey (Public, JavaScript)
  2. Docker-Directive-A-DevOps-Journey (Public, Roff)
  3. Terraform-Directive-An-DevOps-Automation-Journey (Public, HCL)
  4. LLD-Low-Level-Design (Public, Java)
  5. Linux-For-DevOps-SRE-Cloud (Public)