
Commit 5bd5309

Authored by Xinyao Wang
Update readme for supporting deepseek and phi4 (#1522)
Signed-off-by: Xinyao Wang <[email protected]>
1 parent 8002e8b · commit 5bd5309


comps/llms/src/text-generation/README.md

Lines changed: 21 additions & 12 deletions
@@ -8,18 +8,21 @@ Overall, this microservice offers a streamlined way to integrate large language

## Validated LLM Models

-| Model                                       | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | OVMS     |
-| ------------------------------------------- | --------- | -------- | ---------- | -------- |
-| [Intel/neural-chat-7b-v3-3]                 | ✓         | ✓        | ✓          | ✓        |
-| [meta-llama/Llama-2-7b-chat-hf]             | ✓         | ✓        | ✓          | ✓        |
-| [meta-llama/Llama-2-70b-chat-hf]            | ✓         | -        | ✓          | -        |
-| [meta-llama/Meta-Llama-3-8B-Instruct]       | ✓         | ✓        | ✓          | ✓        |
-| [meta-llama/Meta-Llama-3-70B-Instruct]      | ✓         | -        | ✓          | -        |
-| [Phi-3]                                     | x         | Limit 4K | Limit 4K   | Limit 4K |
-| [deepseek-ai/DeepSeek-R1-Distill-Llama-70B] | ✓         | -        | ✓          | -        |
-| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B]  | ✓         | -        | ✓          | -        |
-| [mistralai/Mistral-Small-24B-Instruct-2501] | ✓         | -        | ✓          | -        |
-| [mistralai/Mistral-Large-Instruct-2411]     | x         | -        | ✓          | -        |
+| Model                                       | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | OVMS     | Optimum-Habana |
+| ------------------------------------------- | --------- | -------- | ---------- | -------- | -------------- |
+| [Intel/neural-chat-7b-v3-3]                 | ✓         | ✓        | ✓          | ✓        | ✓              |
+| [meta-llama/Llama-2-7b-chat-hf]             | ✓         | ✓        | ✓          | ✓        | ✓              |
+| [meta-llama/Llama-2-70b-chat-hf]            | ✓         | -        | ✓          | -        | ✓              |
+| [meta-llama/Meta-Llama-3-8B-Instruct]       | ✓         | ✓        | ✓          | ✓        | ✓              |
+| [meta-llama/Meta-Llama-3-70B-Instruct]      | ✓         | -        | ✓          | -        | ✓              |
+| [Phi-3]                                     | x         | Limit 4K | Limit 4K   | Limit 4K | ✓              |
+| [Phi-4]                                     | x         | x        | x          | x        | ✓              |
+| [deepseek-ai/DeepSeek-R1-Distill-Llama-8B]  | ✓         | -        | ✓          | -        | ✓              |
+| [deepseek-ai/DeepSeek-R1-Distill-Llama-70B] | ✓         | -        | ✓          | -        | ✓              |
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B]  | ✓         | -        | ✓          | -        | ✓              |
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B]  | ✓         | -        | ✓          | -        | ✓              |
+| [mistralai/Mistral-Small-24B-Instruct-2501] | ✓         | -        | ✓          | -        | ✓              |
+| [mistralai/Mistral-Large-Instruct-2411]     | x         | -        | ✓          | -        | ✓              |

### System Requirements for LLM Models

@@ -31,7 +34,10 @@ Overall, this microservice offers a streamlined way to integrate large language
| [meta-llama/Meta-Llama-3-8B-Instruct]       | 1 |
| [meta-llama/Meta-Llama-3-70B-Instruct]      | 2 |
| [Phi-3]                                     | x |
+| [Phi-4]                                     | x |
+| [deepseek-ai/DeepSeek-R1-Distill-Llama-8B]  | 1 |
| [deepseek-ai/DeepSeek-R1-Distill-Llama-70B] | 8 |
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B]  | 2 |
| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B]  | 4 |
| [mistralai/Mistral-Small-24B-Instruct-2501] | 1 |
| [mistralai/Mistral-Large-Instruct-2411]     | 4 |
@@ -192,8 +198,11 @@ curl http://${host_ip}:${TEXTGEN_PORT}/v1/chat/completions \
[meta-llama/Meta-Llama-3-8B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
[meta-llama/Meta-Llama-3-70B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct
[Phi-3]: https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3
+[Phi-4]: https://huggingface.co/collections/microsoft/phi-4-677e9380e514feb5577a40e4
[HuggingFace]: https://huggingface.co/
+[deepseek-ai/DeepSeek-R1-Distill-Llama-8B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
[deepseek-ai/DeepSeek-R1-Distill-Llama-70B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
[deepseek-ai/DeepSeek-R1-Distill-Qwen-32B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+[deepseek-ai/DeepSeek-R1-Distill-Qwen-14B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
[mistralai/Mistral-Small-24B-Instruct-2501]: https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501
[mistralai/Mistral-Large-Instruct-2411]: https://huggingface.co/mistralai/Mistral-Large-Instruct-2411
