Commit c5f96c9

[skip ci] Update tt_transformers/README.md (#23866)
### Ticket
[Add huggingface org/model names of tt-transformers supported models #23858](#23858)

### Problem description
The document did not list the supported models' org/name handles needed to set `HF_MODEL` for automatic weight download, so users had to search huggingface for the proper handle.

### What's changed
- Added a table with the verified model names, supported hardware, and huggingface org/name for easier setup and an improved user experience.

### Checklist
- [x] [All post commit](https://github.com/tenstorrent/tt-metal/actions/runs/15789101701) CI passes
- [x] New/Existing tests provide coverage for changes

Co-authored-by: Mark O'Connor <moconnor@tenstorrent.com>
1 parent: 48e916b

File tree

1 file changed: +23 −16 lines changed

models/tt_transformers/README.md

Lines changed: 23 additions & 16 deletions
````diff
@@ -1,19 +1,21 @@
 # TT-Transformers
 
-This code can run large language models that are similar to the Llama3 family and other similar models such as Qwen2.5, Mistral and DeepSeek-R1-Distill variants. Tensor-parallelism is automatically used to parallelize workloads across all available chips.
+This code can run large language models such as the Llama3 family, Qwen2.5, Mistral, DeepSeek-R1-Distill variants and similar. Tensor-parallelism automatically distributes workloads across all available chips.
 
 The current version is verified to work with the following models:
-- Llama3.2-1B
-- Llama3.2-3B
-- Llama3.1-8B
-- Llama3.2-11B
-- Llama3.1-70B (LoudBox / QuietBox and Galaxy)
-- Llama3.2-90B (LoudBox / QuietBox)
-- Qwen2.5-7B (N300)
-- Qwen2.5-72B (LoudBox / QuietBox)
-- Qwen3-32B (LoudBox / QuietBox)
-- DeepSeek R1 Distill Llama 3.3 70B (LoudBox / QuietBox and Galaxy)
-- [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
+| Model | Hardware | `<org/name>` |
+|-------|----------|--------------|
+| [DeepSeek R1 Distill Llama 70B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) | LoudBox / QuietBox / Galaxy | `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` |
+| [Llama 3.1 8B](https://huggingface.co/meta-llama/Llama-3.1-8B) | n150 / p100 / p150 | `meta-llama/Llama-3.1-8B` |
+| [Llama 3.1 70B](https://huggingface.co/meta-llama/Llama-3.1-70B) | LoudBox / QuietBox / Galaxy | `meta-llama/Llama-3.1-70B` |
+| [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B) | n150 | `meta-llama/Llama-3.2-1B` |
+| [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B) | n150 | `meta-llama/Llama-3.2-3B` |
+| [Llama 3.2 11B Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision) | n300 | `meta-llama/Llama-3.2-11B-Vision` |
+| [Llama 3.2 90B Vision](https://huggingface.co/meta-llama/Llama-3.2-90B-Vision) | LoudBox / QuietBox | `meta-llama/Llama-3.2-90B-Vision` |
+| [Mistral 7B Instruct v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) | n150 | `mistralai/Mistral-7B-Instruct-v0.3` |
+| [Qwen 2.5 7B](https://huggingface.co/Qwen/Qwen2.5-7B) | n300 | `Qwen/Qwen2.5-7B` |
+| [Qwen 2.5 72B](https://huggingface.co/Qwen/Qwen2.5-72B) | LoudBox / QuietBox | `Qwen/Qwen2.5-72B` |
+| [Qwen 3 32B](https://huggingface.co/Qwen/Qwen3-32B) | LoudBox / QuietBox | `Qwen/Qwen3-32B` |
 
 ## Dependencies
````
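Editor's note, not part of the commit: the org/name handles in the table are standard huggingface repo ids, and `huggingface_hub` caches each one under a predictable directory when `HF_MODEL` triggers an automatic download. A minimal sketch, assuming the default hub cache layout (`HF_HOME` or `~/.cache/huggingface`):

```python
import os

def hf_cache_path(repo_id: str) -> str:
    """Directory where huggingface_hub caches a repo such as
    meta-llama/Llama-3.2-1B (assumes the default hub cache layout)."""
    base = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    # Repos are stored as models--<org>--<name> under <cache>/hub/
    return os.path.join(base, "hub", "models--" + repo_id.replace("/", "--"))

print(hf_cache_path("meta-llama/Llama-3.2-1B"))
```

This only illustrates where the weights land after the automatic download described below; deleting the cached directory forces a re-download.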

````diff
@@ -26,15 +28,20 @@ pip install -r models/tt_transformers/requirements.txt
 ## Run a demo
 
-### 1. Specify which model you want to run
+To run a demo, choose one of the methods below for downloading the model weights:
 
-The easiest way to do this is to set the `HF_MODEL` environment variable to the Huggingface org/name of the model you want to run:
+### 1. Automatic download
+
+Set the `HF_MODEL` environment variable to the Huggingface org/name of the model you want to run. This will automatically download the weights into your HuggingFace cache directory and run the model directly.
+
+Check the model table at the top of the page and substitute the `<org/name>` handle into the following command:
 ```
-export HF_MODEL=deepseek-ai/DeepSeek-R1-Distill-Llama-70B
+export HF_MODEL=<org/name>
 ```
 
-This will automatically download the weights into your HuggingFace cache directory and run the model directly. If you wish, you can manually download the weights either from Huggingface or from Meta as described by the two following sections:
+### 2. Manual download
+
+If you wish, you can manually download the weights either from Huggingface or from Meta as described in the two following sections:
 
 #### Option 1: download Llama weights from Meta
````
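To make the automatic-download step concrete (an editor's sketch, not part of the commit), here is the substitution for one model from the table; the handle is used verbatim as the huggingface repo id:

```shell
# Pick the org/name handle from the model table, e.g. Llama 3.2 1B (n150):
export HF_MODEL=meta-llama/Llama-3.2-1B

# The same handle identifies the repo page on huggingface.co:
echo "https://huggingface.co/${HF_MODEL}"
# prints https://huggingface.co/meta-llama/Llama-3.2-1B
```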

0 commit comments