Declarative Talos Linux machine configs for the homelab clusters.
Talos runs standard Kubernetes (not k3s) as system containers managed by the Talos API.
talos/
├── patches/
│ ├── network.yaml # Shared: DNS, NTP
│ ├── podsecurity.yaml # Cluster-wide privileged PodSecurity admission (matches k3s behavior)
│ └── sops-age.sops.yaml # SOPS/Age key for Flux secret decryption (SOPS-encrypted)
└── README.md
Base configs are generated dynamically by scripts/provision-node.sh using talosctl gen config and patched with the files above.
To wipe a node and prepare it for reprovisioning (destroys all data and reboots):
# After provisioning, the talosconfig is automatically merged into ~/.talos/config
# No need to pass --talosconfig for subsequent commands
talosctl --nodes <node-ip> reset --graceful=false --rebootAfter reset, boot from Talos ISO and run provision-node.sh again. The script will generate fresh CAs, encrypt the new configs, and update ~/.talos/config.
If you leave the Talos USB plugged in, you can reprovision remotely without physical access. However, boot order matters:
- BIOS set to disk first → Normal reboots boot into the installed Talos OS. USB is ignored unless disk fails.
- BIOS set to USB first → Every reboot (including
talosctl reboot) boots into the ISO, not the installed OS.
Recommended setup: Set BIOS boot order to disk first, keep USB plugged in as fallback. To reprovision remotely:
- Run
talosctl reset --graceful=false --reboot— this wipes the disk and reboots - Since disk is now empty, BIOS falls through to USB and boots the ISO
- Run
provision-node.shto reinstall
If your BIOS is set to USB first, a normal talosctl reboot will boot back into the ISO. To boot into the installed OS instead, you must either change the BIOS boot order (via IPMI/BMC if available) or remove the USB.
# Generate a new config with secrets filled in
talosctl gen config <cluster-name> https://<node-ip>:6443 \
--output-dir /tmp/talos-gen \
--with-cluster-discovery=false \
--with-docs=false \
--with-examples=false \
--kubernetes-version v1.32.4
# Apply machine config with patches (network + hostname + optional secrets)
talosctl apply-config \
--insecure \
--nodes <node-ip> \
--file /tmp/talos-gen/controlplane.yaml \
--config-patch @talos/patches/network.yaml \
--config-patch @<(cat <<EOF
machine:
network:
hostname: <cluster-name>
EOF
)
# Bootstrap Kubernetes
talosctl bootstrap
# Get kubeconfig
talosctl kubeconfig /tmp/<cluster-name>-kubeconfig
kubectx <cluster-name>=admin@<cluster-name>
# Bootstrap Flux
export GITHUB_TOKEN=$(gh auth token)
flux bootstrap github \
--owner=nimser \
--repository=homelab \
--branch=main \
--path=clusters/<cluster-name># Once Tailscale extension is active, access node via Tailscale IP
talosctl --nodes <tailscale-ip> get membersAfter provisioning, each cluster's context is automatically merged into ~/.talos/config. Switch between clusters using talosctl config context:
talosctl config context rammus
talosctl config context karmaTo easily switch between the rammus and karma Kubernetes clusters, you can use kubectx.
Since the kubeconfigs are generated dynamically in /tmp and reset when your environment is rebuilt, we recommend managing the KUBECONFIG path in a local .envrc file (which is ignored by git).
Create or edit your .envrc file at the root of the project to look like this:
export KUBECONFIG=/tmp/rammus-kubeconfig:/tmp/karma-kubeconfigIf you lose your local dev environment, you do not need to reprovision the cluster. You can recover access using the SOPS-encrypted talosconfig backed up in this repository.
# 1. Decrypt the cluster's talosconfig using your SOPS Age key
sops -d clusters/karma/talosconfig.sops.yaml > /tmp/talosconfig
# 2. Fetch a fresh kubeconfig from the node
talosctl --talosconfig /tmp/talosconfig kubeconfig /tmp/karma-kubeconfig --nodes <node-ip>
# 3. Rename the context for easier switching
kubectx karma=admin@karma| Patch | Purpose |
|---|---|
network.yaml |
DNS servers (1.1.1.1, 1.0.0.1, 8.8.8.8), NTP (time.cloudflare.com) |
podsecurity.yaml |
Cluster-wide PodSecurity admission set to privileged — matches k3s behavior, no per-namespace labels needed |
sops-age.sops.yaml |
SOPS Age key for Flux secret decryption |
| Setting | Value |
|---|---|
| Kubernetes | Standard k8s v1.32.4 (NOT k3s) |
| Pod CIDR | 10.42.0.0/16 |
| Service CIDR | 10.43.0.0/16 |
| DNS Domain | cluster.local |
| Upstream DNS | 1.1.1.1, 1.0.0.1, 8.8.8.8 |
- Tolerations required: Single-node clusters need
node-role.kubernetes.io/control-planetolerations on all workloads. The Flux tolerations patch is inclusters/karma/flux-system/patches/controller-tolerations.yaml.
When reprovisioning a cluster, Tailscale machines can multiply (old entries persist). To prevent this:
- Automatic cleanup:
provision-node.shrunscleanup-tailscale.shbefore provisioning to remove stale machines with matching names. - Consistent hostnames: The
TS_HOSTNAMEenvironment variable is injected during provisioning, ensuring machines always register with the cluster name (e.g.,rammus,karma). - Stable DNS via MagicDNS: Tailscale IPs change on reprovision. Use MagicDNS names instead of IPs for stable DNS bindings:
- Talos nodes:
<cluster-name>.<tailnet>.ts.net(e.g.,rammus.example.ts.net) - Soft-serve service:
soft-serve.<tailnet>.ts.net - Find your tailnet name at https://login.tailscale.com/admin/dns
- Talos nodes:
No environment variables are strictly required to be exported for provision-node.sh to run the Tailscale cleanup.
Cleanup authentication uses OAuth only — credentials are automatically extracted
from infrastructure/configs/<cluster>/tailscale-operator/oauth-credentials.sops.yaml
and used to generate a short-lived API token. No manual API key management needed.
The OAuth client requires these scopes:
- Devices > Devices write — to list and delete machines
- Auth Keys > Auth Keys write — to generate auth keys (used by the operator)
Create the OAuth client at: https://login.tailscale.com/admin/settings/keys
Select scopes: devices:write, auth_keys:write. Tag it with tag:k8s-operator.
./scripts/cleanup-tailscale.sh <cluster-name>