|
| 1 | +--- |
| 2 | +title: Kafka on Kubernetes - The Hard Way |
| 3 | +shorttitle: E2E Tutorial |
| 4 | +weight: 990 |
| 5 | +--- |
| 6 | + |
| 7 | +Inspired by Kelsey Hightower's [kubernetes-the-hard-way](https://github.com/kelseyhightower/kubernetes-the-hard-way), this tutorial walks you through setting up a complete Kafka environment on Kubernetes from scratch using Koperator. |
| 8 | + |
| 9 | +## What You'll Learn |
| 10 | + |
| 11 | +This tutorial will teach you how to: |
| 12 | + |
| 13 | +- Set up a multi-node Kubernetes cluster using kind |
| 14 | +- Install and configure all required dependencies manually |
| 15 | +- Deploy a production-like Kafka cluster with monitoring |
| 16 | +- Test and validate your Kafka deployment |
| 17 | +- Handle disaster recovery scenarios |
| 18 | +- Troubleshoot common issues |
| 19 | + |
| 20 | +## Why "The Hard Way"? |
| 21 | + |
| 22 | +This tutorial is called "the hard way" because it walks through each step manually rather than using automated scripts or simplified configurations. This approach helps you understand: |
| 23 | + |
| 24 | +- How each component works and interacts with others |
| 25 | +- The dependencies and relationships between services |
| 26 | +- How to troubleshoot when things go wrong |
| 27 | +- The complete architecture of a Kafka deployment on Kubernetes |
| 28 | + |
| 29 | +## Prerequisites |
| 30 | + |
| 31 | +Before starting this tutorial, you should have: |
| 32 | + |
| 33 | +- Basic knowledge of Kubernetes concepts (pods, services, deployments) |
| 34 | +- Familiarity with Apache Kafka fundamentals |
| 35 | +- A local development machine with Docker installed |
| 36 | +- At least 8 GB of RAM and 4 CPU cores available for the kind cluster |
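
The hardware and Docker items above can be checked before you start. The following is a minimal sketch, not part of the tutorial's tooling: the function names and thresholds are hypothetical and simply mirror the list above. It assumes a Unix-like host where `os.sysconf` exposes physical memory. It inspects the host only, so a Docker Desktop VM with lower limits will not be detected.

```python
import os
import shutil

# Thresholds taken from the prerequisites list above.
MIN_CPU_CORES = 4
MIN_RAM_GIB = 8


def total_ram_gib():
    """Return physical RAM in GiB, or None if the platform does not expose it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1024**3


def check_prerequisites(cpu_cores, ram_gib, docker_path):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if cpu_cores is None or cpu_cores < MIN_CPU_CORES:
        problems.append(f"need at least {MIN_CPU_CORES} CPU cores, found {cpu_cores}")
    # Note: a machine sold as "8 GB" may report slightly less than 8 GiB.
    if ram_gib is None or ram_gib < MIN_RAM_GIB:
        problems.append(f"need at least {MIN_RAM_GIB} GiB of RAM, found {ram_gib}")
    if docker_path is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(os.cpu_count(), total_ram_gib(), shutil.which("docker"))
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Host looks suitable for the kind cluster.")
```

Treat any warning as a hint rather than a hard failure; the tutorial's numbers are guidelines for a comfortable local run.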
| 37 | + |
| 38 | +## Tutorial Structure |
| 39 | + |
| 40 | +This tutorial is organized into the following sections: |
| 41 | + |
| 42 | +1. **[Prerequisites and Setup]({{< relref "prerequisites.md" >}})** - Install required tools and prepare your environment |
| 43 | +2. **[Kubernetes Cluster Setup]({{< relref "cluster-setup.md" >}})** - Create a multi-node kind cluster with proper labeling |
| 44 | +3. **[Dependencies Installation]({{< relref "dependencies.md" >}})** - Install cert-manager, the ZooKeeper operator, and the Prometheus operator |
| 45 | +4. **[Koperator Installation]({{< relref "koperator-install.md" >}})** - Install the Kafka operator and its CRDs |
| 46 | +5. **[Kafka Cluster Deployment]({{< relref "kafka-deployment.md" >}})** - Deploy and configure a Kafka cluster with monitoring |
| 47 | +6. **[Testing and Validation]({{< relref "testing.md" >}})** - Create topics, run producers/consumers, and performance tests |
| 48 | +7. **[Disaster Recovery Scenarios]({{< relref "disaster-recovery.md" >}})** - Test failure scenarios and recovery procedures |
| 49 | +8. **[Troubleshooting]({{< relref "troubleshooting.md" >}})** - Common issues and debugging techniques |
| 50 | + |
| 51 | +## Architecture Overview |
| 52 | + |
| 53 | +By the end of this tutorial, you'll have deployed the following architecture: |
| 54 | + |
| 55 | +``` |
| 56 | +┌─────────────────────────────────────────────────────────────────┐ |
| 57 | +│ Kubernetes Cluster (kind) │ |
| 58 | +├─────────────────────────────────────────────────────────────────┤ |
| 59 | +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ |
| 60 | +│ │ Control Plane │ │ Worker AZ1 │ │ Worker AZ2 │ │ |
| 61 | +│ │ │ │ │ │ │ │ |
| 62 | +│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ |
| 63 | +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ |
| 64 | +│ │ Worker AZ3 │ │ Worker AZ1 │ │ Worker AZ2 │ │ |
| 65 | +│ │ │ │ │ │ │ │ |
| 66 | +│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ |
| 67 | +├─────────────────────────────────────────────────────────────────┤ |
| 68 | +│ Applications │ |
| 69 | +├─────────────────────────────────────────────────────────────────┤ |
| 70 | +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ |
| 71 | +│ │ Kafka Cluster │ │ ZooKeeper │ │ Monitoring │ │ |
| 72 | +│ │ (6 brokers) │ │ (3 nodes) │ │ Stack │ │ |
| 73 | +│ │ │ │ │ │ │ │ |
| 74 | +│ │ ┌─────────────┐│ │ ┌─────────────┐│ │ ┌─────────────┐│ │ |
| 75 | +│ │ │ Broker 101 ││ │ │ ZK-0 ││ │ │ Prometheus ││ │ |
| 76 | +│ │ │ Broker 102 ││ │ │ ZK-1 ││ │ │ Grafana ││ │ |
| 77 | +│ │ │ Broker 201 ││ │ │ ZK-2 ││ │ │ AlertMgr ││ │ |
| 78 | +│ │ │ Broker 202 ││ │ └─────────────┘│ │ └─────────────┘│ │ |
| 79 | +│ │ │ Broker 301 ││ └─────────────────┘ └─────────────────┘ │ |
| 80 | +│ │ │ Broker 302 ││ │ |
| 81 | +│ │ └─────────────┘│ │ |
| 82 | +│ └─────────────────┘ │ |
| 83 | +├─────────────────────────────────────────────────────────────────┤ |
| 84 | +│ Infrastructure │ |
| 85 | +├─────────────────────────────────────────────────────────────────┤ |
| 86 | +│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ |
| 87 | +│ │ cert-manager │ │ Koperator │ │ Cruise │ │ |
| 88 | +│ │ │ │ │ │ Control │ │ |
| 89 | +│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ |
| 90 | +└─────────────────────────────────────────────────────────────────┘ |
| 91 | +``` |
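
The six broker IDs listed in the diagram suggest a numbering scheme in which the hundreds digit identifies the availability zone (1xx in AZ1, 2xx in AZ2, 3xx in AZ3). That reading is an assumption based on the diagram alone. The sketch below uses hypothetical helper names and is not Koperator code. It groups the brokers by assumed zone and shows which brokers remain if one zone is lost.

```python
from collections import defaultdict

# Broker IDs copied from the architecture diagram.
BROKER_IDS = [101, 102, 201, 202, 301, 302]


def zone_for_broker(broker_id):
    """Return a hypothetical zone name derived from the hundreds digit.

    Assumption: the diagram's numbering encodes the zone (1xx -> az1, and so on).
    """
    return f"az{broker_id // 100}"


def brokers_by_zone(broker_ids):
    """Group broker IDs by their assumed zone."""
    zones = defaultdict(list)
    for broker_id in sorted(broker_ids):
        zones[zone_for_broker(broker_id)].append(broker_id)
    return dict(zones)


def survivors_after_zone_loss(broker_ids, lost_zone):
    """List the brokers that remain if one whole zone goes offline."""
    return [b for b in sorted(broker_ids) if zone_for_broker(b) != lost_zone]


if __name__ == "__main__":
    for zone, brokers in brokers_by_zone(BROKER_IDS).items():
        print(zone, brokers)
    print("after losing az2:", survivors_after_zone_loss(BROKER_IDS, "az2"))
```

With two brokers per zone, losing any single zone still leaves four brokers in two zones, which is the property rack awareness relies on when it spreads partition replicas across zones.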
| 92 | + |
| 93 | +## Key Features Demonstrated |
| 94 | + |
| 95 | +This tutorial demonstrates: |
| 96 | + |
| 97 | +- **Multi-AZ deployment** with rack awareness |
| 98 | +- **SSL/TLS encryption** for secure communication |
| 99 | +- **Monitoring and alerting** with Prometheus and Grafana |
| 100 | +- **Automated rebalancing and graceful scaling** with Cruise Control |
| 101 | +- **Persistent storage** with proper volume management |
| 102 | +- **External access** configuration |
| 103 | +- **Disaster recovery** and failure handling |
| 104 | + |
| 105 | +## Time Commitment |
| 106 | + |
| 107 | +Plan to spend approximately 2-3 hours completing this tutorial, depending on your familiarity with the tools and concepts involved. |
| 108 | + |
| 109 | +## Getting Started |
| 110 | + |
| 111 | +Ready to begin? Start with the [Prerequisites and Setup]({{< relref "prerequisites.md" >}}) section. |
| 112 | + |
| 113 | +--- |
| 114 | + |
| 115 | +> **Note**: This tutorial is designed for learning and development purposes. For production deployments, consider using automated deployment tools and following your organization's security and operational guidelines. |