README.md (+18 -14)
@@ -4,10 +4,10 @@
items that are not factual. If you find an item that is incorrect, please tag as an issue, so we can triage and determine whether to fix,
or drop from our initial release.*

-# TorchAt*NORTHSTAR*
+# torchat*NORTHSTAR*
A repo for building and using llama on servers, desktops and mobile.

-The TorchAt repo enables model inference of llama models (and other LLMs) on servers, desktop and mobile devices.
+The torchat repo enables model inference of llama models (and other LLMs) on servers, desktop and mobile devices.
For a list of devices, see below, under *SUPPORTED SYSTEMS*.

A goal of this repo, and the design of the PT2 components was to offer seamless integration and consistent workflows.
@@ -29,12 +29,12 @@ Featuring:
and backend-specific mobile runtimes ("delegates", such as CoreML and Hexagon).

The model definition (and much more!) is adopted from gpt-fast, so we support the same models. As new models are supported by gpt-fast,
-bringing them into TorchAt should be straight forward. In addition, we invite community contributions
+bringing them into torchat should be straightforward. In addition, we invite community contributions

# Getting started

Follow the `gpt-fast` [installation instructions](https://github.com/pytorch-labs/gpt-fast?tab=readme-ov-file#installation).
-Because TorchAt was designed to showcase the latest and greatest PyTorch 2 features for Llama (and related llama-style) models, many of the features used in TorchAt are hot off the press. [Download PyTorch nightly](https://pytorch.org/get-started/locally/) with the latest steaming hot PyTorch 2 features.
+Because torchat was designed to showcase the latest and greatest PyTorch 2 features for Llama (and related llama-style) models, many of the features used in torchat are hot off the press. [Download PyTorch nightly](https://pytorch.org/get-started/locally/) with the latest steaming hot PyTorch 2 features.
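As a concrete illustration of that nightly install (the CPU index URL below is just one example; pick the CUDA/ROCm/MPS variant recommended for your platform on the PyTorch "get started" page):

```
# Install a current PyTorch nightly build. The CPU wheel index is used here as
# an example; substitute the variant that matches your hardware.
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
```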
@@ ... @@
-First cd into TorchAt. We first create a directory for stories15M and download the model and tokenizers.
+First cd into torchat. We then create a directory for stories15M and download the model and tokenizers.
We show how to download @Andrej Karpathy's stories15M tiny llama-style model that was used in llama2.c. Advantageously,
stories15M is both a great example and quick to download and run across a range of platforms, ideal for introductions like this
-README and for [testing](https://github.com/pytorch-labs/TorchAt/blob/main/.github/workflows). We will be using it throughout
+README and for [testing](https://github.com/pytorch-labs/torchat/blob/main/.github/workflows). We will be using it throughout
this introduction as our running example.

```
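A sketch of what that download step might look like (the directory layout follows the checkpoints convention used later; the exact hosting URLs for Karpathy's stories15M checkpoint and the llama2.c tokenizer are assumptions, not taken from this diff):

```
# Assumed locations for the stories15M checkpoint and llama2.c tokenizer;
# adjust the URLs if the hosting paths differ.
mkdir -p checkpoints/stories15M
cd checkpoints/stories15M
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.pt
wget https://github.com/karpathy/llama2.c/raw/master/tokenizer.model
cd ../..
```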
@@ -127,7 +131,7 @@ We use several variables in this example, which may be set as a preparatory step
or any other directory you already use to store model information.

* `MODEL_PATH` describes the location of the model. Throughout the description
-herein, we will assume that MODEL_PATH starts with a subdirectory of the TorchAt repo
+herein, we will assume that MODEL_PATH starts with a subdirectory of the torchat repo
named checkpoints, and that it will contain the actual model. In this case, the MODEL_PATH will thus
be of the form ${MODEL_OUT}/model.{pt,pth}. (Both the extensions `pt` and `pth`
are used to describe checkpoints. In addition, model may be replaced with the name of the model.)
@@ -144,7 +148,7 @@ You can set these variables as follows for the exemplary model15M model from And
MODEL_NAME=stories15M
MODEL_DIR=checkpoints/${MODEL_NAME}
MODEL_PATH=${MODEL_OUT}/stories15M.pt
-MODEL_OUT=~/TorchAt-exports
+MODEL_OUT=~/torchat-exports
```

When we export models with AOT Inductor for servers and desktops, and Executorch for mobile and edge devices,
@@ -193,7 +197,7 @@ Add option to load tiktoken

Model definition in model.py, generation code in generate.py. The
model checkpoint may have extensions `pth` (checkpoint and model definition) or `pt` (model checkpoint).
-At present, we always use the TorchAt model for export and import the checkpoint into this model definition
+At present, we always use the torchat model for export and import the checkpoint into this model definition
because we have tested that model with the export descriptions described herein.

```
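For orientation only, a generate invocation in the gpt-fast style might look like the sketch below; the flag names are assumptions and should be checked against generate.py.

```
# Hypothetical invocation; flag names follow the gpt-fast convention and are
# not confirmed by this diff -- check generate.py for the actual interface.
python generate.py --checkpoint_path ${MODEL_PATH} --prompt "Once upon a time" --max_new_tokens 128
```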
@@ -231,7 +235,7 @@ quantization to achieve this, as described below.

We export the model with the export.py script. Running this script requires you first install executorch with pybindings, see [here](#setting-up-executorch-and-runner-et).
At present, when exporting a model, the export command always uses the
-xnnpack delegate to export. (Future versions of TorchAt will support additional
+xnnpack delegate to export. (Future versions of torchat will support additional
delegates such as Vulkan, CoreML, MPS, HTP in addition to Xnnpack as they are released for Executorch.)
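As a rough, hypothetical sketch of that export step (the flag names are illustrative assumptions; the real options are defined in export.py):

```
# Hypothetical export command -- flag names are assumptions for illustration,
# not taken from export.py. Per the text, the xnnpack delegate is applied by default.
python export.py --checkpoint-path ${MODEL_PATH} --output-pte-path ${MODEL_OUT}/model.pte
```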
@@ -292,7 +296,7 @@ AOTI). The basic model build for mobile surfaces two issues: Models
quickly run out of memory and execution can be slow. In this section,
we show you how to fit your models in the limited memory of a mobile
device, and optimize execution speed -- both using quantization. This
-is the `TorchAt` repo after all!
+is the `torchat` repo after all!

For high-performance devices such as GPUs, quantization provides a way
to reduce the memory bandwidth required to and take advantage of the
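To make the memory pressure concrete, a quick back-of-envelope calculation of weight storage for a 7B-parameter llama-style model at different precisions (illustrative numbers, not measurements from this repo):

```
# Approximate checkpoint size for 7e9 parameters at fp16, int8, and int4.
python -c "p = 7e9; [print(f'{name}: {p*b/2**30:.1f} GiB') for name, b in [('fp16', 2), ('int8', 1), ('int4', 0.5)]]"
```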
@@ -534,7 +538,7 @@ To run your pte model, use the following command (assuming you already generated

### Android

-Check out the [tutorial on how to build an Android app running your PyTorch models with Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your TorchAt models a spin.
+Check out the [tutorial on how to build an Android app running your PyTorch models with Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your torchat models a spin.
@@ -643,15 +647,15 @@ List dependencies for these backends
Set up ExecuTorch by following the instructions [here](https://pytorch.org/executorch/stable/getting-started-setup.html#setting-up-executorch).
For convenience, we provide a script that does this for you.

-From the TorchAt root directory, run the following
+From the torchat root directory, run the following
```
export LLAMA_FAST_ROOT=${PWD}
./scripts/install_et.sh
```

This will create a build directory, clone ExecuTorch into ./build/src, apply some patches to the ExecuTorch source code, install the ExecuTorch Python libraries with pip, and install the required ExecuTorch C++ libraries to ./build/install. This will take a while to complete.

-After ExecuTorch is installed, you can build runner-et from the TorchAt root directory with the following
+After ExecuTorch is installed, you can build runner-et from the torchat root directory with the following
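The diff ends before the actual build command. Purely as a placeholder sketch (the ./build/install path comes from the description above; the CMake options are assumptions), a runner-et build could resemble:

```
# Placeholder sketch only -- consult the torchat README and scripts for the real
# runner-et build command. ./build/install is where install_et.sh places the
# ExecuTorch C++ libraries, per the text above.
cmake -S ./runner-et -B ./build/cmake-out -DCMAKE_PREFIX_PATH=${PWD}/build/install
cmake --build ./build/cmake-out
```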