Commit f6c4dd8

[None][chore] Update AutoDeploy model list (NVIDIA#10505)
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
1 parent 6ab996d commit f6c4dd8

2 files changed

Lines changed: 5 additions & 6 deletions

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+attn_backend: triton

examples/auto_deploy/model_registry/models.yaml

Lines changed: 4 additions & 6 deletions
@@ -65,7 +65,7 @@ models:
 - name: bigcode/starcoder2-7b
   yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml']
 - name: bigcode/starcoder2-15b-instruct-v0.1
-  yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml']
+  yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml', 'attn_backend_triton.yaml']
 - name: deepseek-ai/DeepSeek-Prover-V1.5-SFT
   yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml', 'compile_backend_torch_cudagraph.yaml']
 - name: deepseek-ai/DeepSeek-Prover-V2-7B
@@ -118,8 +118,6 @@ models:
   yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml', 'multimodal.yaml']
 - name: google/gemma-3-27b-it
   yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml', 'multimodal.yaml']
-- name: google/gemma-3-2b-it
-  yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml']
 - name: deepseek-ai/DeepSeek-V2.5
   yaml_extra: ['dashboard_default.yaml', 'world_size_2.yaml']
 # DISABLED: Network timeout downloading from Hugging Face
@@ -145,8 +143,6 @@ models:
 # DISABLED: Graph transformation error in auto-deploy
 # - name: neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8
 # yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml']
-- name: TheBloke/falcon-40b-instruct-GPTQ
-  yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml']
 - name: Qwen/QwQ-32B
   yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml', 'compile_backend_torch_cudagraph.yaml']
 - name: google/gemma-2-27b-it
@@ -159,7 +155,7 @@ models:
   yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml']
 - name: Qwen/QwQ-32B-Preview
   yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml', 'compile_backend_torch_cudagraph.yaml']
-- name: Qwen/Qwen3-Coder-32B-Instruct
+- name: Qwen/Qwen3-Coder-30B-A3B-Instruct
   yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml']
 - name: Qwen/Qwen3-235B-A22B-Instruct-2507
   yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml']
@@ -222,3 +218,5 @@ models:
   yaml_extra: ['dashboard_default.yaml', 'world_size_8.yaml', 'multimodal.yaml', 'llama4_scout.yaml']
 - name: meta-llama/Llama-4-Maverick-17B-128E-Instruct
   yaml_extra: ['dashboard_default.yaml', 'world_size_8.yaml', 'multimodal.yaml', 'llama4_maverick_lite.yaml']
+- name: nvidia/NVIDIA-Nemotron-3-Super-120B-BF16-BF16KV-010726
+  yaml_extra: ['dashboard_default.yaml', 'world_size_4.yaml','super_v3.yaml']
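
Note on the starcoder2-15b change: the Python sketch below illustrates one way a `yaml_extra` list could be layered into a single config, with later files overriding earlier ones. This is an assumption for illustration, not AutoDeploy's confirmed merge logic. The helper `merge_extras` and the `EXTRA_FILES` table are hypothetical, and only `attn_backend: triton` comes from this commit; the other file contents are placeholders. It also assumes the one-line file added in this commit is the `attn_backend_triton.yaml` referenced in the registry.

```python
from typing import Any, Dict, List

# Hypothetical stand-ins for the YAML files named in yaml_extra.
# Only "attn_backend: triton" is taken from this commit; the rest are placeholders.
EXTRA_FILES: Dict[str, Dict[str, Any]] = {
    "dashboard_default.yaml": {"attn_backend": "placeholder"},
    "world_size_2.yaml": {"world_size": 2},
    "attn_backend_triton.yaml": {"attn_backend": "triton"},
}


def merge_extras(yaml_extra: List[str]) -> Dict[str, Any]:
    """Layer the listed files in order; keys in later files win (assumed behavior)."""
    merged: Dict[str, Any] = {}
    for file_name in yaml_extra:
        merged.update(EXTRA_FILES[file_name])
    return merged


# Registry entry for bigcode/starcoder2-15b-instruct-v0.1 before and after this commit.
before = merge_extras(["dashboard_default.yaml", "world_size_2.yaml"])
after = merge_extras(
    ["dashboard_default.yaml", "world_size_2.yaml", "attn_backend_triton.yaml"]
)
```

Under this assumed ordering, appending `attn_backend_triton.yaml` to the end of the list is what lets it override whatever attention backend the earlier files select.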
