Skip to content

Commit 4c9e90a

Browse files
authored
Add serving commands for ministral 3, smollm3, eurollm, and trinity (#9)
1 parent 6e909a6 commit 4c9e90a

1 file changed

Lines changed: 194 additions & 0 deletions

File tree

serving/README.md

Lines changed: 194 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,78 @@ python serving/submit_job.py \
5454

5555
</details>
5656

57+
#### `Ministral-3-3B-Instruct-2512`
58+
59+
<details>
60+
<summary>vLLM (tested ✅)</summary>
61+
62+
```bash
63+
python serving/submit_job.py \
64+
--slurm-nodes 1 \
65+
--serving-framework vllm \
66+
--slurm-environment $(pwd)/serving/envs/vllm.toml \
67+
--framework-args "--model mistralai/Ministral-3-3B-Instruct-2512\
68+
--served-model-name mistralai/Ministral-3-3B-Instruct-2512-$(whoami) \
69+
--host 0.0.0.0 \
70+
--port 8080 \
71+
--data-parallel-size 4 \
72+
--tokenizer_mode mistral \
73+
--load_format mistral \
74+
--config_format mistral \
75+
--tool-call-parser mistral \
76+
--enable-auto-tool-choice"
77+
```
78+
79+
</details>
80+
81+
#### `Ministral-3-8B-Instruct-2512`
82+
83+
<details>
84+
<summary>vLLM (tested ✅)</summary>
85+
86+
```bash
87+
python serving/submit_job.py \
88+
--slurm-nodes 1 \
89+
--serving-framework vllm \
90+
--slurm-environment $(pwd)/serving/envs/vllm.toml \
91+
--framework-args "--model mistralai/Ministral-3-8B-Instruct-2512 \
92+
--served-model-name mistralai/Ministral-3-8B-Instruct-2512-$(whoami) \
93+
--host 0.0.0.0 \
94+
--port 8080 \
95+
--data-parallel-size 4 \
96+
--tokenizer_mode mistral \
97+
--load_format mistral \
98+
--config_format mistral \
99+
--tool-call-parser mistral \
100+
--enable-auto-tool-choice"
101+
```
102+
103+
</details>
104+
105+
#### `Ministral-3-14B-Instruct-2512`
106+
107+
<details>
108+
<summary>vLLM (tested ✅)</summary>
109+
110+
```bash
111+
python serving/submit_job.py \
112+
--slurm-nodes 1 \
113+
--serving-framework vllm \
114+
--slurm-environment $(pwd)/serving/envs/vllm.toml \
115+
--framework-args "--model mistralai/Ministral-3-14B-Instruct-2512 \
116+
--served-model-name mistralai/Ministral-3-14B-Instruct-2512-$(whoami) \
117+
--host 0.0.0.0 \
118+
--port 8080 \
119+
--data-parallel-size 4 \
120+
--tokenizer_mode mistral \
121+
--load_format mistral \
122+
--config_format mistral \
123+
--tool-call-parser mistral \
124+
--enable-auto-tool-choice"
125+
```
126+
127+
</details>
128+
57129
### Snowflake
58130

59131
#### `snowflake-arctic-embed-l-v2.0`
@@ -329,6 +401,128 @@ python serving/submit_job.py \
329401

330402
</details>
331403

404+
### Hugging Face
405+
406+
#### `SmolLM3-3B`
407+
408+
<details>
409+
<summary>SGLang (tested ✅)</summary>
410+
411+
```bash
412+
python serving/submit_job.py \
413+
--slurm-nodes 1 \
414+
--serving-framework sglang \
415+
--slurm-environment $(pwd)/serving/envs/sglang.toml \
416+
--framework-args "--model HuggingFaceTB/SmolLM3-3B \
417+
--served-model-name HuggingFaceTB/SmolLM3-3B-$(whoami) \
418+
--dp-size 4 \
419+
--host 0.0.0.0 \
420+
--port 8080"
421+
```
422+
423+
</details>
424+
425+
### Utter
426+
427+
#### `EuroLLM-1.7B-Instruct`
428+
429+
<details>
430+
<summary>SGLang (tested ✅)</summary>
431+
432+
```bash
433+
python serving/submit_job.py \
434+
--slurm-nodes 1 \
435+
--serving-framework sglang \
436+
--slurm-environment $(pwd)/serving/envs/sglang.toml \
437+
--framework-args "--model utter-project/EuroLLM-1.7B-Instruct \
438+
--served-model-name utter-project/EuroLLM-1.7B-Instruct-$(whoami) \
439+
--dp-size 4 \
440+
--host 0.0.0.0 \
441+
--port 8080"
442+
```
443+
444+
</details>
445+
446+
#### `utter-project/EuroLLM-9B-Instruct-2512`
447+
448+
<details>
449+
<summary>SGLang (tested ✅)</summary>
450+
451+
```bash
452+
python serving/submit_job.py \
453+
--slurm-nodes 1 \
454+
--serving-framework sglang \
455+
--slurm-environment $(pwd)/serving/envs/sglang.toml \
456+
--framework-args "--model utter-project/EuroLLM-9B-Instruct-2512 \
457+
--served-model-name utter-project/EuroLLM-9B-Instruct-2512-$(whoami) \
458+
--dp-size 4 \
459+
--host 0.0.0.0 \
460+
--port 8080"
461+
```
462+
463+
</details>
464+
465+
#### `utter-project/EuroLLM-22B-Instruct-2512`
466+
467+
<details>
468+
<summary>SGLang (tested ✅)</summary>
469+
470+
```bash
471+
python serving/submit_job.py \
472+
--slurm-nodes 1 \
473+
--serving-framework sglang \
474+
--slurm-environment $(pwd)/serving/envs/sglang.toml \
475+
--framework-args "--model utter-project/EuroLLM-22B-Instruct-2512 \
476+
--served-model-name utter-project/EuroLLM-22B-Instruct-2512-$(whoami) \
477+
--dp-size 4 \
478+
--host 0.0.0.0 \
479+
--port 8080"
480+
```
481+
482+
</details>
483+
484+
485+
### Arcee AI
486+
487+
#### `Trinity-Mini`
488+
489+
<details>
490+
<summary>vLLM (tested ✅)</summary>
491+
492+
```bash
493+
python serving/submit_job.py \
494+
--slurm-nodes 1 \
495+
--serving-framework vllm \
496+
--slurm-environment $(pwd)/serving/envs/vllm.toml \
497+
--framework-args "--model arcee-ai/Trinity-Mini \
498+
--served-model-name arcee-ai/Trinity-Mini-$(whoami) \
499+
--host 0.0.0.0 \
500+
--port 8080 \
501+
--enable-auto-tool-choice \
502+
--reasoning-parser deepseek_r1 \
503+
--tool-call-parser hermes"
504+
```
505+
506+
</details>
507+
508+
#### `Trinity-Nano-Preview`
509+
510+
<details>
511+
<summary>vLLM (tested ✅)</summary>
512+
513+
```bash
514+
python serving/submit_job.py \
515+
--slurm-nodes 1 \
516+
--serving-framework vllm \
517+
--slurm-environment $(pwd)/serving/envs/vllm.toml \
518+
--framework-args "--model arcee-ai/Trinity-Nano-Preview\
519+
--served-model-name arcee-ai/Trinity-Nano-Preview-$(whoami) \
520+
--host 0.0.0.0 \
521+
--port 8080"
522+
```
523+
524+
</details>
525+
332526
## Parameters
333527

334528
### Required

0 commit comments

Comments
 (0)