This cluster uses [1Password Connect](https://developer.1password.com/docs/connect) and [External Secrets Operator](https://external-secrets.io/) to manage secrets.
-
-1. **Generate 1Password Connect Credentials**: Follow the [1Password documentation](https://developer.1password.com/docs/connect/get-started#step-2-deploy-the-1password-connect-server) to generate your `1password-credentials.json` file and your access token.
# Watch ArgoCD sync status (STATUS should show 'Synced')
+kubectl get applications -n argocd -w

-### 6. Bootstrap ArgoCD & Deploy The Stack
-This final step uses our "App of Apps" pattern to bootstrap the entire cluster. This is a multi-step process to avoid race conditions with CRD installation.
+# Verify all pods are running (may take 10-15 minutes)
+kubectl get pods -A

-```bash
-# 1. Apply the ArgoCD main components and CRDs
-# This deploys the ArgoCD Helm chart, which creates the CRDs and controller.
kubectl get applications -n argocd -o custom-columns=NAME:.metadata.name,WAVE:.metadata.annotations.argocd\.argoproj\.io/sync-wave,STATUS:.status.sync.status
```
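The custom-columns query above prints one row per Application. As a minimal sketch, assuming a header row comes first and the STATUS value is the last whitespace-separated column, a small filter can list only the applications that are not yet `Synced`. The function name `unsynced_apps` is hypothetical and not part of this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Reads `kubectl get applications` table output on stdin and prints the
# NAME of every data row whose last column (STATUS) is not "Synced".
unsynced_apps() {
  awk 'NR > 1 && NF > 0 && $NF != "Synced" { print $1 }'
}

# Example (assumes kubectl access to the argocd namespace):
#   kubectl get applications -n argocd \
#     -o custom-columns=NAME:.metadata.name,STATUS:.status.sync.status | unsynced_apps
```

An empty result suggests every listed application has reached `Synced`; it says nothing about application health, which would need a separate check.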
-**That's it!** You have successfully and reliably bootstrapped the cluster.
-
-### What Happens Next Automatically?

-1. **ArgoCD Syncs Itself**: The `root` Application tells ArgoCD to sync the contents of `infrastructure/argocd/apps/`.
-2. **Projects & AppSets Created**: ArgoCD creates the `AppProject`s and the three `ApplicationSet`s (`infrastructure`, `monitoring`, `my-apps`).
-3. **Applications Discovered**: The `ApplicationSet`s scan the repository for any directories matching their defined paths (e.g., `infrastructure/*`, `monitoring/*`, `my-apps/*/*`) and create the corresponding ArgoCD `Application` resources.
-4. **Cluster Reconciliation**: ArgoCD syncs all discovered applications, building the entire cluster state declaratively from Git.
+**Full verification steps in [BOOTSTRAP.md](BOOTSTRAP.md#verification)**

-## 🔍 Verification
-After the final step, you can monitor the deployment and verify that everything is working correctly.
+## 🛡️ Talos OS Features
+- **No SSH**: All management via API (Omni UI or `talosctl`)
+- **Immutable OS**: No package manager, no shell access
+- **Declarative**: All config stored in Git or Omni
+- **System Extensions**: GPU, storage drivers enabled at boot
+- **Secure by Default**: Minimal attack surface

-```bash
-# Check Talos node health (run for each node)
-talosctl health --nodes <node-ip>
+### Node Management

-# Watch ArgoCD sync status
-# The `STATUS` column should eventually show `Synced` for all applications
-kubectl get applications -n argocd -w
+**Using Omni (Recommended):**
+- Manage all nodes through Omni web UI
+- Automated Talos upgrades
+- Visual cluster health monitoring
+- No manual `talosctl` commands needed

-# Verify all pods are running across the cluster
-# It may take 10-15 minutes for all images to pull and pods to become Ready.
-kubectl get pods -A
-
-# Check that secrets have been populated by External Secrets
-kubectl get externalsecret -A
-# You should see secrets like `cloudflare-api-credentials` in the `cert-manager` namespace
-
-# Verify the Longhorn UI is accessible and backups are configured
-kubectl get backuptarget -n longhorn-system
-```
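Because the removed verification block notes that pods may need 10-15 minutes to become Ready, a polling loop can stand in for re-running the checks by hand. The following is a minimal sketch only: `wait_until` is a hypothetical helper, not part of this repository, and the timeout and interval values in the example are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical polling helper (not part of the repository).
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds; returns 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
  local timeout=$1
  local interval=$2
  shift 2
  local deadline
  deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    # Give up when the deadline has passed; otherwise pause and retry.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example: poll every 15 seconds for up to 15 minutes until every pod in
# longhorn-system reports Ready.
#   wait_until 900 15 kubectl wait --for=condition=Ready pods --all \
#     -n longhorn-system --timeout=10s
```

Pods that belong to completed Jobs never report Ready, so a cluster-wide `--all` check may time out even when the cluster is fine; scoping the check to one namespace, as in the example, is one way around that.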
+**Manual Talos:**
+- See `iac/talos/` directory for configuration
+- Use `talosctl` for node operations
+- Requires `talhelper` for config generation

-## 🛡️ Talos-Specific Notes
-- **No SSH**: All management via `talosctl` API
-- **Immutable OS**: No package manager, no shell
-- **Declarative**: All config in Git, applied via Talhelper/Talosctl
-- **System Extensions**: GPU, storage, and other drivers enabled via config
-- **SOPS**: Used for encrypting Talos secrets
-- **No plaintext secrets in Git**
-
-#### Upgrading Nodes
-When a new version of Talos is released or system extensions in `iac/talos/talconfig.yaml` are changed, follow this process to upgrade your nodes. This method uses the direct `upgrade` command to ensure the new system image is correctly applied, which is more reliable than `apply-config` for image changes.
-
-**Important:** Always upgrade control plane nodes **one at a time**, waiting for each node to successfully reboot and rejoin the cluster before proceeding to the next. This prevents losing etcd quorum. Worker nodes can be upgraded in parallel after the control plane is healthy.
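The one-at-a-time rule for control plane nodes can be sketched as a loop that stops at the first failure. This is an illustration under stated assumptions, not the repository's procedure: `upgrade_control_plane` is a hypothetical function, the installer image argument is a placeholder the caller must supply, and `talosctl health --nodes` is used as the rejoin check only because this README already uses it for verification.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Usage: upgrade_control_plane INSTALLER_IMAGE NODE_IP [NODE_IP...]
# Upgrades the given nodes strictly one at a time and stops at the first
# failure, so the next node is never touched while one is unhealthy.
# Set DRY_RUN=1 to print the commands instead of running them.
upgrade_control_plane() {
  local image=$1
  shift
  local ip
  for ip in "$@"; do
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} talosctl upgrade --nodes "$ip" --image "$image" || return 1
    # Do not move on until the health check passes for this node.
    ${DRY_RUN:+echo} talosctl health --nodes "$ip" || return 1
  done
}

# Example (dry run; the image reference and IPs are placeholders):
#   DRY_RUN=1 upgrade_control_plane "<installer-image>" 192.168.10.100 192.168.10.101
```

Running it with `DRY_RUN=1` first is a cheap way to review the exact command sequence before touching a live control plane.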
-
-1. **Update Configuration**:
-Modify `iac/talos/talconfig.yaml` with the new `talosVersion` or changes to `systemExtensions`.
-
-2. **Ensure Environment is Set**:
-Make sure your `TALOSCONFIG` variable is pointing to your generated cluster configuration file as described in the Quick Start.
-
-3. **Upgrade a Control Plane Node**:
-Run the following commands from the root of the repository. Replace `<node-name>` and `<node-ip>` with the target node's details. Run this for each control plane node sequentially.
-
-```bash
-# Example for the first control plane node
-NODE_NAME="talos-cluster-control-00"
-NODE_IP="192.168.10.100" # Replace with your node's IP