Commit 2e3adfd

committed
off f
1 parent aff1bcb commit 2e3adfd

2 files changed

Lines changed: 96 additions & 81 deletions

my-apps/ai/comfyui/README.md

Lines changed: 91 additions & 76 deletions
Original file line number | Diff line number | Diff line change
@@ -1,18 +1,18 @@
1-
# ComfyUI on Kubernetes (Talos)
1+
# ComfyUI on Kubernetes (Talos) with ArgoCD
22

3-
This directory contains Kubernetes manifests to deploy ComfyUI with GPU support on Talos Linux.
3+
This directory contains Kubernetes manifests to deploy ComfyUI with GPU support on Talos Linux via ArgoCD.
44

55
## Files Structure
66

77
- `namespace.yaml` - ComfyUI namespace
8-
- `pvc.yaml` - Persistent Volume Claim for models and outputs (100GB)
8+
- `pvc.yaml` - Persistent Volume Claim for models and outputs (180GB, Longhorn single replica)
99
- `configmap.yaml` - Configuration for model paths
1010
- `deployment.yaml` - Main ComfyUI deployment with GPU support
1111
- `service.yaml` - ClusterIP and NodePort services
1212
- `httproute.yaml` - HTTPRoute configuration for Gateway API
1313
- `kustomization.yaml` - Kustomize configuration
14-
- `setup-comfyui.sh` - Automated setup script
15-
- `comfyui-manifests.yaml` - Single file with all manifests (for reference)
14+
- `setup-comfyui.sh` - Post-deployment setup script for models and workflows
15+
- `README.md` - This documentation
1616

1717
## Prerequisites for Talos
1818

@@ -21,34 +21,38 @@ This directory contains Kubernetes manifests to deploy ComfyUI with GPU support
2121
```bash
2222
kubectl label nodes <your-gpu-node> accelerator=nvidia-gpu
2323
```
24-
3. **Storage**: Configure appropriate storage class (default: `local-path`)
24+
3. **Storage**: Longhorn configured (single replica for space efficiency)
2525
4. **Gateway API**: Ensure Gateway API is installed and configured in your cluster
26+
5. **ArgoCD**: This setup assumes deployment via ArgoCD
2627

27-
## Quick Deployment
28+
## Deployment Workflow
2829

29-
### Option 1: Using the Setup Script (Recommended)
30+
### 1. ArgoCD Deployment
31+
ArgoCD will automatically deploy the manifests. Ensure your ArgoCD application points to this directory.
32+
33+
### 2. Post-Deployment Setup (Models & Workflows)
34+
After ArgoCD deploys ComfyUI, run the setup script to install models and custom nodes:
3035

3136
```bash
37+
# Navigate to the ComfyUI directory
38+
cd my-apps/ai/comfyui
39+
3240
# Make the script executable
3341
chmod +x setup-comfyui.sh
3442

35-
# Run the complete setup
43+
# Run post-deployment setup
3644
./setup-comfyui.sh
3745
```
3846

39-
### Option 2: Manual Deployment
47+
### 3. Manual Deployment (Alternative)
48+
If you need to deploy manually without ArgoCD:
4049

4150
```bash
4251
# Apply all manifests using kustomize
4352
kubectl apply -k .
4453

45-
# Or apply individual files
46-
kubectl apply -f namespace.yaml
47-
kubectl apply -f pvc.yaml
48-
kubectl apply -f configmap.yaml
49-
kubectl apply -f deployment.yaml
50-
kubectl apply -f service.yaml
51-
kubectl apply -f httproute.yaml
54+
# Then run the setup script
55+
./setup-comfyui.sh
5256
```
5357

5458
## Features
@@ -64,88 +68,73 @@ kubectl apply -f httproute.yaml
6468
- ComfyUI Essentials
6569
- Custom Scripts
6670
- RGThree Comfy
67-
68-
### Pre-downloaded Models
69-
- **SDXL Base 1.0** - Main diffusion model
70-
- **SDXL VAE** - Variational autoencoder
71-
- **ControlNet Models** - Canny and OpenPose
72-
- **RealESRGAN 4x** - Upscaling model
73-
74-
### Default Workflow
75-
- Basic SDXL generation workflow
76-
- Located at: `/opt/ComfyUI/user/default/workflows/basic_sdxl.json`
71+
- Flux-specific nodes (FluxTrainer, GGUF)
72+
- WAS Node Suite & Efficiency Nodes
73+
74+
### Pre-downloaded Models (via setup script)
75+
- **Flux Dev BF16 & FP8** - Latest 12B parameter models for exceptional quality
76+
- **CyberRealistic Pony v11** - Popular photorealistic model
77+
- **SDXL Base 1.0** - Stable foundation model
78+
- **Flux & SDXL VAEs** - High-quality decoders
79+
- **Flux ControlNet** - Canny and Depth control
80+
- **Traditional ControlNet** - Canny and OpenPose
81+
- **RealESRGAN Upscalers** - General and Anime variants
82+
- **CyberRealistic Embeddings** - Optimized prompt tokens
83+
84+
### Pre-configured Workflows
85+
- **Flux Dev Workflow** - High-quality photorealistic generation
86+
- **CyberRealistic Pony Workflow** - Versatile realistic content
7787

7888
## Resource Requirements
7989

80-
- **CPU**: 2-4 cores
81-
- **Memory**: 8-16 GB
82-
- **GPU**: 1x NVIDIA GPU
83-
- **Storage**: 100GB for models and outputs
90+
- **CPU**: 2-8 cores
91+
- **Memory**: 8-32 GB
92+
- **GPU**: 1x NVIDIA GPU (optimized for 24GB VRAM)
93+
- **Storage**: 180GB Longhorn (single replica for efficiency)
8494

8595
## Access Methods
8696

87-
### 1. NodePort (Direct Access)
97+
### 1. HTTPRoute (Primary - Domain Access)
98+
- URL: `https://comfyui.vanillax.me`
99+
- Uses Gateway API with `gateway-internal`
100+
101+
### 2. NodePort (Direct Access)
88102
```bash
89103
# Access via node IP on port 30188
90104
http://<NODE_IP>:30188
91105
```
92106

93-
### 2. Port Forward (Local Development)
107+
### 3. Port Forward (Local Development)
94108
```bash
95109
kubectl port-forward -n comfyui service/comfyui-service 8188:8188
96110
# Access at: http://localhost:8188
97111
```
98112

99-
### 3. HTTPRoute (Domain Access)
100-
Update `httproute.yaml` with your domain and gateway configuration:
101-
```yaml
102-
hostnames:
103-
- comfyui.your-domain.com
104-
parentRefs:
105-
- name: your-gateway-name
106-
namespace: gateway-system
107-
```
108-
Then access via: `http://comfyui.your-domain.com`
109-
110113
## Configuration
111114

112-
### Gateway Configuration
113-
Update the HTTPRoute in `httproute.yaml`:
114-
```yaml
115-
spec:
116-
parentRefs:
117-
- name: your-gateway-name
118-
namespace: gateway-system
119-
hostnames:
120-
- comfyui.your-domain.com
121-
```
122-
123-
### Storage Class
124-
Default uses `local-path`. Update in `pvc.yaml`:
125-
```yaml
126-
storageClassName: your-storage-class
127-
```
115+
### Storage (Longhorn)
116+
- Single replica for space efficiency
117+
- 180GB capacity for models and outputs
118+
- Persistent across pod restarts
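
For illustration only (this block is not part of the commit's diff): a minimal sketch of what a single-replica Longhorn setup could look like. The StorageClass name `longhorn-single-replica` and the PVC name `comfyui-data` are assumptions, as is the use of `180Gi` for the README's "180GB"; the actual `pvc.yaml` is not shown in this diff and may differ.

```yaml
# Hypothetical StorageClass; the name is an assumption, not taken from the repo.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-single-replica
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"   # one copy of the data: saves space, gives up redundancy
---
# Hypothetical PVC; metadata.name is assumed, the namespace comes from the README.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: comfyui-data
  namespace: comfyui
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-single-replica
  resources:
    requests:
      storage: 180Gi
```

With a single replica, the volume lives on one node's disk only, so losing that disk means losing the downloaded models.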
128119

129120
### Resource Limits
130-
Adjust in `deployment.yaml`:
121+
Optimized for 24GB GPU systems:
131122
```yaml
132123
resources:
133124
requests:
134125
memory: "8Gi"
135126
cpu: "2"
136127
nvidia.com/gpu: 1
137128
limits:
138-
memory: "16Gi"
139-
cpu: "4"
129+
memory: "32Gi"
130+
cpu: "8"
140131
nvidia.com/gpu: 1
141132
```
142133
143-
### Node Selection
144-
Update node selector in `deployment.yaml`:
145-
```yaml
146-
nodeSelector:
147-
accelerator: nvidia-gpu
148-
```
134+
### Gateway Configuration
135+
HTTPRoute configured for:
136+
- Gateway: `gateway-internal` in `gateway` namespace
137+
- Domain: `comfyui.vanillax.me`
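
For illustration only (not part of the commit's diff): a sketch of an HTTPRoute matching the values listed above. The route name `comfyui` is an assumption; the backend service name and port are taken from the port-forward command in this README. The real `httproute.yaml` is not shown here and may differ.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: comfyui            # assumed name
  namespace: comfyui
spec:
  parentRefs:
    - name: gateway-internal
      namespace: gateway   # the Gateway lives in a different namespace than the route
  hostnames:
    - comfyui.vanillax.me
  rules:
    - backendRefs:
        - name: comfyui-service
          port: 8188
```

Because the route and the Gateway are in different namespaces, the Gateway's listener has to allow routes from the `comfyui` namespace (via `allowedRoutes`) for the route to attach.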
149138

150139
## Monitoring
151140

@@ -160,10 +149,11 @@ kubectl get httproute -n comfyui
160149
## Troubleshooting
161150

162151
1. **Pod not starting**: Check GPU node labels and availability
163-
2. **Storage issues**: Verify storage class and PVC status
164-
3. **Model download issues**: Check pod logs for download progress
152+
2. **Storage issues**: Verify Longhorn status and PVC binding
153+
3. **Model download issues**: Check pod logs during setup script execution
165154
4. **GPU not detected**: Ensure NVIDIA device plugin is running
166155
5. **HTTPRoute not working**: Check Gateway API installation and gateway configuration
156+
6. **ArgoCD sync issues**: Verify kustomization.yaml and resource files
167157

168158
## Customization
169159

@@ -180,10 +170,35 @@ Use ComfyUI Manager web interface or install manually:
180170
kubectl exec -n comfyui $POD_NAME -- bash -c "cd /opt/ComfyUI/custom_nodes && git clone <repo-url>"
181171
```
182172

173+
### Updating Models via Script
174+
Re-run the setup script to add new models:
175+
```bash
176+
./setup-comfyui.sh
177+
```
178+
179+
## ArgoCD Integration
180+
181+
### Application Configuration
182+
Ensure your ArgoCD application configuration includes:
183+
```yaml
184+
spec:
185+
source:
186+
path: my-apps/ai/comfyui
187+
repoURL: <your-repo>
188+
targetRevision: HEAD
189+
destination:
190+
namespace: comfyui
191+
server: https://kubernetes.default.svc
192+
```
193+
194+
### Sync Strategy
195+
- **Automatic Sync**: Recommended for seamless updates
196+
- **Manual Sync**: For controlled deployments
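
For illustration only (not part of the commit's diff): a sketch of a complete Argo CD `Application` with automatic sync enabled, built around the `spec` fragment shown above. The application name, the `argocd` namespace, and the `default` project are assumptions; `<your-repo>` is left as in the README.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: comfyui          # assumed name
  namespace: argocd      # assumed: the namespace where Argo CD is installed
spec:
  project: default       # assumed project
  source:
    repoURL: <your-repo>
    path: my-apps/ai/comfyui
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: comfyui
  syncPolicy:
    automated:
      prune: true        # delete cluster resources that were removed from Git
      selfHeal: true     # revert manual changes made directly in the cluster
```

For manual sync, omit the `syncPolicy.automated` block and trigger syncs from the Argo CD UI or CLI instead.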
197+
183198
## Notes
184199

185-
- First startup takes time for model downloads
186-
- ComfyUI Manager provides easy model and node management
187-
- Models persist in the PVC across pod restarts
188-
- The setup script handles initial configuration and model downloads
189-
- HTTPRoute requires Gateway API to be installed in your cluster
200+
- **First startup**: Takes time for model downloads (~50GB+)
201+
- **ComfyUI Manager**: Provides easy model and node management via web UI
202+
- **Model persistence**: All models persist in Longhorn storage across restarts
203+
- **Setup script**: Only handles post-deployment configuration, not manifest deployment
204+
- **ArgoCD friendly**: All manifests are properly structured for GitOps workflows

my-apps/ai/khoj/kustomization.yaml

Lines changed: 5 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -3,10 +3,10 @@ kind: Kustomization
33

44
resources:
55
- namespace.yaml
6-
- pvc.yaml
7-
- deployment.yaml
8-
- service.yaml
9-
- httproute.yaml
10-
- externalsecret.yaml
6+
# - pvc.yaml
7+
# - deployment.yaml
8+
# - service.yaml
9+
# - httproute.yaml
10+
# - externalsecret.yaml
1111

1212
namespace: khoj
