IaC for going from empty disks to running HA homelab cluster managed using GitOps within 2(-ish) clicks

anokfireball/homelab-as-code

Bootstrap and GitOps sources to get my baremetal homelab set up consistently.

🏡 Homelab-as-Code (HaC™)

This repository was born out of the need to better manage an ever-growing homelab environment. After starting with a simple single-node Docker Compose setup, the increasing number of services began to make maintenance and updates more challenging.

As the complexity grew, it became clear that a more structured, Infrastructure-as-Code approach was needed to:

  • keep configurations versioned and thus better documented
  • make deployments more consistently repeatable and reliable
  • simplify the process of adding new services without losing track of the overall state
  • enable easier backup and disaster recovery that is centrally managed
  • provide better scalability and resilience beyond a single node

I decided to take this opportunity to properly learn Kubernetes hands-on, embracing the complexity and "feeling the pain" that comes with it rather than settling for theoretical knowledge. This repo serves both as documentation of my setup and as a real-world learning experience in managing, as code, infrastructure that I rely upon.

PS: This setup is mature enough to be girlfriend-approved. 😉

🔰 Overview

At the highest level, this repo and the HaC workflow consist of three parts:

  • cloud-init contains the stage 1 bootstrapping for the cluster nodes. This covers only the basic OS-level configuration required by the later stages of this workflow. The included shell script creates all files required to install the OS via network boot without user interaction. Triggering the network-boot installation is out of scope for the moment. After the cloud-init autoinstall completes, all nodes reboot and are ready to accept SSH connections.
  • ansible/cluster contains the stage 2 system configuration for the cluster nodes. This covers a range of tasks including power management, networking setup, and, most importantly, bootstrapping the Kubernetes cluster using kubeadm. The included Ansible playbook performs the required tasks on the nodes via SSH, using a dedicated Ansible user created in the previous stage. After completion of this stage, the Kubernetes cluster is set up with HA control planes, joined worker nodes, dual-stack CNI, almost working OIDC authn, and, last but not least, a bootstrapped GitOps setup that is ready to start reconciling.
  • flux contains the final stage 3 GitOps cluster configuration. This covers everything running inside Kubernetes, ranging from basic system infrastructure such as load balancer, ingress, and CSI to more user-facing applications such as password manager and file management apps. The included Flux kustomizations are automatically installed and/or reconciled on the cluster without* user interaction. This process is staggered because some components inherently depend on others. After completion of this stage, the cluster is fully set up and ready for use.
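
The staggered reconciliation in stage 3 boils down to dependency ordering. In Flux this kind of ordering is typically declared with dependsOn on the Kustomization resources; the sketch below only models the ordering logic in plain Python. The component names are taken from the tables in this README, but the dependency edges are assumptions made for illustration and do not reflect the actual configuration in this repo.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the components it waits for.
# The edges are illustrative assumptions, not this repo's real Flux setup.
DEPENDS_ON = {
    "metallb": set(),
    "cert-manager": set(),
    "longhorn": set(),
    "ingress-nginx": {"metallb", "cert-manager"},
    "nextcloud": {"ingress-nginx", "longhorn"},
}


def reconcile_waves(depends_on):
    """Group components into waves; each wave only depends on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything whose dependencies have all been marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(reconcile_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Components within one wave have no dependencies on each other, so they could be reconciled in parallel, while later waves have to wait for earlier ones to become ready.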

In addition to the core homelab IaC, there is one more loosely related stage:

  • ansible/gateway contains system configuration for a remotely hosted ingress gateway used to expose select services publicly. This covers setup of GitOps outside of Kubernetes, mesh networking outside of Kubernetes, and a reverse proxy with ACME support. The included Ansible playbook performs the required tasks on a manually provisioned gateway via SSH, using a dedicated Ansible user that is also created manually. After completion of this stage, the public gateway is set up and ready to reverse proxy connections to the cluster.

Tech Stack

| Component | Purpose | Notes |
| --- | --- | --- |
| Ubuntu Server 24.04 | Base Operating System | |
| cloud-init | Headless OS Installation | see cloud-init/README.md |
| Ansible | OS Configuration | |
| kubeadm | k8s Distribution / Install Mechanism | stacked HA controlplanes |
| containerd | OCI Runtime | |
| Calico | CNI | dual-stack nodes and services |
| kube-vip | Virtual IP for controlplane Nodes | used in L2/ARP mode |
| Flux2 | GitOps Automation inside the Cluster | |
| SOPS | Secrets Management | age rather than PGP, but not any more user-friendly |

📱 Applications

🤖 System-Level

| Name | Purpose | Notes |
| --- | --- | --- |
| metallb | Cloud-Native Service LoadBalancer | used in L2/ARP mode, so only VIP rather than true LB |
| external-dns | DNS Management Automation | split-horizon realized using opnsense webhook |
| cert-manager | Automated Certificate Management | Let's Encrypt via ACME DNS |
| ingress-nginx | Ingress Controller | |
| Kyverno | Policy Engine | |
| Spegel | Cluster-Internal P2P Container Image Distribution | basically mandates the use of digests or good pinning |
| longhorn | Cloud-Native Distributed Block Storage CSI | |
| democratic-csi | CSI for Common External Storage Systems | using the freenas-nfs implementation |
| Renovate Bot | Dependency Update Automation | used for multiple repos, not just this one |
| k8up | Cloud-Native Backup/Restore | |
| CloudNativePG | Cloud-Native PostgreSQL Operator | |
| Grafana | Monitoring and Observability | |
| Prometheus | Metrics Aggregation and Storage | |
| Loki | Log Aggregation and Storage | |
| Scrutiny | Drive Health Monitoring via SMART | |
| descheduler | Pod Eviction for Node Balancing | |
| reloader | Hot-Reload for ALL Workloads | |
| Dex | OIDC Provider | used for API server authentication |
| Tailscale | Overlay Mesh VPN Operator | |
| metrics-server | Metrics API | |
| Goldilocks | Resource Recommendation Engine | |
| Vertical Pod Autoscaler | Workload Resource Scaler | used exclusively for Goldilocks recommendations |
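
The Spegel note in the table above says that it basically mandates digests or good pinning. As a rough illustration of what "pinned by digest" means, the following sketch checks whether an image reference ends in a sha256 digest. The helper names and the example image references are hypothetical and not part of this repo.

```python
import re

# An image reference counts as digest-pinned here when it ends in
# "@sha256:<64 lowercase hex characters>".
DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the reference carries a sha256 digest suffix."""
    return DIGEST_SUFFIX.search(image_ref) is not None


def unpinned(image_refs):
    """Return the references that rely on a tag only."""
    return [ref for ref in image_refs if not is_digest_pinned(ref)]


if __name__ == "__main__":
    # Hypothetical image references, used for illustration only.
    examples = [
        "ghcr.io/example/app:1.2.3",
        "ghcr.io/example/app:1.2.3@sha256:" + "0" * 64,
        "ghcr.io/example/app:latest",
    ]
    for ref in unpinned(examples):
        print("not pinned by digest:", ref)
```

A check like this could run over rendered manifests before they are committed; how references are actually pinned and updated in this repo is not covered by the sketch.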

👨‍💻 User-Level

| Name | Purpose | Notes |
| --- | --- | --- |
| Pi-hole | Filtering DNS Proxy | |
| Nextcloud | File Storage and Management | |
| Vaultwarden | API-compatible Password Manager | |
| Immich | Photo/Video Storage and Management | |
| Paperless-ngx | Document Management System | |
| Firefly III | Personal Finance Manager | including importer and pico |
| KitchenOwl | Recipe and Grocery Manager | |
| Homepage | Application Dashboard | |
| Fresh-RSS | RSS Aggregator | |
| RSS-Bridge | Unofficial RSS Feeds of ANY Source | any as long as you know some PHP |
| Soundcloud Scraper | Parser + Webhook for my Soundcloud Feed | |
| Stirling PDF | Swiss-Army Knife for PDFs | |
| Excalidraw | Virtual Whiteboard | |
| UniFi Network Application | AP Administration and Management | |
| OPNsense Prefix Updater | Update Network Configs with Latest Non-Static IPv6 | |
| n8n | Workflow Automation | freemium/open core |
| Jellyfin | Media Streaming and Management | |
| Gluetun | VPN Gateway | |
| qBittorrent | Torrent Client | |
| SABnzbd | Usenet Client | |
| Prowlarr | Torrent & Usenet Indexer Engine | |
| Radarr | Movie Management | |
| Sonarr | TV Show Management | |
| Lidarr | Music Management | |
| FlareSolverr | Cloudflare Protection Bypass | |

☁️ Cloud Dependencies

While the ultimate goal is to have as self-sufficient a setup as possible, some external services are still required for proper operation.

| Service | Purpose | Notes |
| --- | --- | --- |
| GitHub | Git Repository Hosting, GitOps Source | |
| INWX | Domain Registrar | |
| Cloudflare | Public DNS Auth Hosting | |
| Let's Encrypt | SSL Certificates | |
| netcup | Public Reverse-Proxy for Select Services | |
| Backblaze | Cloud Storage for Backups | the "3" in 3-2-1 for the really important data |
| Tailscale | Overlay Mesh VPN | used for split-horizon and a direct route back home |
| VPN Provider | VPN Gateway | unassociated external IP for all the Linux ISOs |
