Commit a1f0340

fix: make HuggingFace token optional for non-gated models
Only set spec.secrets.huggingFaceToken on ModelDeployment CRs when the model is gated. Previously, the Web UI hardcoded hfTokenSecret for all deployments, causing non-gated models (Qwen, DeepSeek, TinyLlama, Phi-3) to fail with CreateContainerConfigError when the referenced K8s Secret did not exist.

Changes:
- Conditional hfTokenSecret default based on model.gated
- Guard in handleSubmit to strip secret for non-gated models
- Remove KAITO exception from needsHfAuth button disable logic
- Add clarifying comments to sample YAMLs

Fixes #43
1 parent aff3083 commit a1f0340

3 files changed

Lines changed: 12 additions & 3 deletions

controller/config/samples/kubeairunway_v1alpha1_modeldeployment.yaml

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,7 @@ spec:
 gpu:
 count: 1
 memory: "32Gi"
+# Required: Llama is a gated model requiring HuggingFace authentication
 secrets:
 huggingFaceToken: "hf-token"
 ---
@@ -77,5 +78,6 @@ spec:
 gpu:
 count: 2
 memory: "64Gi"
+# Required: Llama is a gated model requiring HuggingFace authentication
 secrets:
 huggingFaceToken: "hf-token"

controller/config/samples/kubeairunway_v1alpha1_modeldeployment_llmd.yaml

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,7 @@ spec:
 gpu:
 count: 1
 memory: "24Gi"
+# Required: Llama is a gated model requiring HuggingFace authentication
 secrets:
 huggingFaceToken: "llm-d-hf-token"
 ---
@@ -54,5 +55,6 @@ spec:
 gpu:
 count: 4
 memory: "96Gi"
+# Required: Llama is a gated model requiring HuggingFace authentication
 secrets:
 huggingFaceToken: "llm-d-hf-token"
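
The sample YAMLs above keep `secrets.huggingFaceToken` because Llama is gated. For comparison, here is a minimal TypeScript sketch of how a spec could be built so that the `secrets` block is present only for gated models. The `ModelInfo` and `ModelDeploymentSpec` shapes and the `buildSpec` helper are hypothetical simplifications for illustration; only the `spec.secrets.huggingFaceToken` path comes from this commit.

```typescript
// Hypothetical, simplified shapes; the real CRD carries many more fields.
interface ModelInfo {
  id: string;
  gated: boolean;
}

interface ModelDeploymentSpec {
  model: { id: string }; // assumed field layout, for illustration only
  secrets?: { huggingFaceToken: string };
}

// Attach the secret reference only when the model is gated and a secret
// name was actually provided. Non-gated models get no `secrets` key at all,
// so the pod never references a Kubernetes Secret that may not exist.
function buildSpec(model: ModelInfo, hfTokenSecret?: string): ModelDeploymentSpec {
  const spec: ModelDeploymentSpec = { model: { id: model.id } };
  if (model.gated && hfTokenSecret) {
    spec.secrets = { huggingFaceToken: hfTokenSecret };
  }
  return spec;
}

// Usage with placeholder model ids:
const gatedSpec = buildSpec({ id: "example/gated-model", gated: true }, "hf-token");
const openSpec = buildSpec({ id: "example/open-model", gated: false }, "hf-token");
console.log(JSON.stringify(gatedSpec), JSON.stringify(openSpec));
```

Omitting the key entirely, rather than sending an empty string, avoids relying on how the controller treats an empty secret name.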

frontend/src/components/deployments/DeploymentForm.tsx

Lines changed: 8 additions & 3 deletions
@@ -248,7 +248,7 @@ export function DeploymentForm({ model, detailedCapacity, autoscaler, runtimes }
 provider: getDefaultRuntime(),
 routerMode: 'none',
 replicas: 1,
-hfTokenSecret: import.meta.env.VITE_DEFAULT_HF_SECRET || 'hf-token-secret',
+hfTokenSecret: model.gated ? (import.meta.env.VITE_DEFAULT_HF_SECRET || 'hf-token-secret') : '',
 enforceEager: true,
 enablePrefixCaching: false,
 trustRemoteCode: false,
@@ -370,6 +370,11 @@ export function DeploymentForm({ model, detailedCapacity, autoscaler, runtimes }
 // Build the deployment config, adding KAITO-specific fields if needed
 let deployConfig = { ...config }
 
+// Only include hfTokenSecret for gated models
+if (!model.gated) {
+  delete deployConfig.hfTokenSecret;
+}
+
 if (selectedRuntime === 'kaito') {
   // Add kaitoResourceType to all KAITO deployments
   deployConfig = { ...deployConfig, kaitoResourceType }
@@ -592,7 +597,7 @@ export function DeploymentForm({ model, detailedCapacity, autoscaler, runtimes }
 
 // Status-aware button content
 const getButtonContent = () => {
-  if (needsHfAuth && selectedRuntime !== 'kaito') {
+  if (needsHfAuth) {
     return 'HuggingFace Auth Required'
   }
 
@@ -1558,7 +1563,7 @@ export function DeploymentForm({ model, detailedCapacity, autoscaler, runtimes }
 </Button>
 <Button
   type="submit"
-  disabled={createDeployment.isProcessing || (needsHfAuth && selectedRuntime !== 'kaito') || !isRuntimeInstalled || !isKaitoConfigValid}
+  disabled={createDeployment.isProcessing || needsHfAuth || !isRuntimeInstalled || !isKaitoConfigValid}
   loading={createDeployment.isProcessing}
   className={cn(
     "flex-1 gap-2",

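The three behavioural changes in DeploymentForm.tsx can be read as pure functions, which makes them easier to reason about and test outside React. The sketch below is an illustration under assumptions, not the component's actual code: the `Model`, `DeployConfig`, and `SubmitState` types and the function names are hypothetical, and `envDefault` stands in for `import.meta.env.VITE_DEFAULT_HF_SECRET`.

```typescript
// Hypothetical, trimmed-down types for illustration.
interface Model {
  gated: boolean;
}

interface DeployConfig {
  provider: string;
  replicas: number;
  hfTokenSecret?: string;
}

interface SubmitState {
  isProcessing: boolean;
  needsHfAuth: boolean;
  isRuntimeInstalled: boolean;
  isKaitoConfigValid: boolean;
}

const FALLBACK_SECRET = "hf-token-secret";

// Change 1: the default secret name is only populated for gated models.
function defaultHfTokenSecret(model: Model, envDefault?: string): string {
  return model.gated ? (envDefault || FALLBACK_SECRET) : "";
}

// Change 2: on submit, strip the secret for non-gated models so the
// request never references a Secret the cluster may not have.
function prepareForSubmit(model: Model, config: DeployConfig): DeployConfig {
  const deployConfig: DeployConfig = { ...config }; // copy, do not mutate form state
  if (!model.gated) {
    delete deployConfig.hfTokenSecret;
  }
  return deployConfig;
}

// Change 3: the submit button is disabled whenever auth is needed,
// with no special case for the KAITO runtime.
function isSubmitDisabled(s: SubmitState): boolean {
  return s.isProcessing || s.needsHfAuth || !s.isRuntimeInstalled || !s.isKaitoConfigValid;
}

// Usage: a non-gated model submitted with a stale secret name in the form.
const submitted = prepareForSubmit(
  { gated: false },
  { provider: "kaito", replicas: 1, hfTokenSecret: "hf-token-secret" },
);
console.log(Object.keys(submitted));
```

The submit-time guard looks redundant next to the conditional default, but it covers the case where the form state already holds a secret name, for example one set before this change or typed in by the user.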