Commit 179a6db

Michael Gschwind authored and malfet committed
repo name change
1 parent 3b32288 commit 179a6db

File tree

1 file changed (+14 / -14 lines)

README.md (+14 / -14)
@@ -4,10 +4,10 @@
 items that are not factual. If you find an item that is incorrect, please tag as an issue, so we can triage and determine whether to fix,
 or drop from our initial release.*
 
-# llama-fast *NORTHSTAR*
+# TorchAt *NORTHSTAR*
 A repo for building and using llama on servers, desktops and mobile.
 
-The llama-fast repo enables model inference of llama models (and other LLMs) on servers, desktop and mobile devices.
+The TorchAt repo enables model inference of llama models (and other LLMs) on servers, desktop and mobile devices.
 For a list of devices, see below, under *SUPPORTED SYSTEMS*.
 
 A goal of this repo, and the design of the PT2 components was to offer seamless integration and consistent workflows.
@@ -29,12 +29,12 @@ Featuring:
 and backend-specific mobile runtimes ("delegates", such as CoreML and Hexagon).
 
 The model definition (and much more!) is adopted from gpt-fast, so we support the same models. As new models are supported by gpt-fast,
-bringing them into llama-fast should be straight forward. In addition, we invite community contributions
+bringing them into TorchAt should be straight forward. In addition, we invite community contributions
 
 # Getting started
 
 Follow the `gpt-fast` [installation instructions](https://github.com/pytorch-labs/gpt-fast?tab=readme-ov-file#installation).
-Because llama-fast was designed to showcase the latest and greatest PyTorch 2 features for Llama (and related llama-style) models, many of the features used in llama-fast are hot off the press. [Download PyTorch nightly](https://pytorch.org/get-started/locally/) with the latest steaming hot PyTorch 2 features.
+Because TorchAt was designed to showcase the latest and greatest PyTorch 2 features for Llama (and related llama-style) models, many of the features used in TorchAt are hot off the press. [Download PyTorch nightly](https://pytorch.org/get-started/locally/) with the latest steaming hot PyTorch 2 features.
 
 
 Install sentencepiece and huggingface_hub
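The actual install command sits just past this hunk's context and is not shown in the diff; as a hedged sketch, installing the two packages named above would typically be:

```
pip install sentencepiece huggingface_hub
```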
@@ -89,10 +89,10 @@ mistralai/Mistral-7B-Instruct-v0.2 | - | ✅ | ✅ | ✅ | ✅ | ❹ |
 ### More downloading
 
 
-First cd into llama-fast. We first create a directory for stories15M and download the model and tokenizers.
+First cd into TorchAt. We first create a directory for stories15M and download the model and tokenizers.
 We show how to download @Andrej Karpathy's stories15M tiny llama-style model that were used in llama2.c. Advantageously,
 stories15M is both a great example and quick to download and run across a range of platforms, ideal for introductions like this
-README and for [testing](https://github.com/pytorch-labs/llama-fast/blob/main/.github/workflows). We will be using it throughout
+README and for [testing](https://github.com/pytorch-labs/TorchAt/blob/main/.github/workflows). We will be using it throughout
 this introduction as our running example.
 
 ```
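The code block that opens at the end of this hunk (the download commands themselves) falls outside the diff context. A hedged sketch of the step the prose describes, with download URLs assumed from Andrej Karpathy's tinyllamas and llama2.c releases rather than taken from this commit:

```
# create a directory for stories15M and fetch the checkpoint and tokenizer
# (URLs are assumptions, not part of this commit)
mkdir -p checkpoints/stories15M
wget -P checkpoints/stories15M https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.pt
wget -P checkpoints/stories15M https://github.com/karpathy/llama2.c/raw/master/tokenizer.model
```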
@@ -126,7 +126,7 @@ We use several variables in this example, which may be set as a preparatory step
 or any other directory you already use to store model information.
 
 * `MODEL_PATH` describes the location of the model. Throughput the description
-herein, we will assume that MODEL_PATH starts with a subdirectory of the llama-fast repo
+herein, we will assume that MODEL_PATH starts with a subdirectory of the TorchAt repo
 named checkpoints, and that it will contain the actual model. In this case, the MODEL_PATH will thus
 be of the form ${MODEL_OUT}/model.{pt,pth}. (Both the extensions `pt` and `pth`
 are used to describe checkpoints. In addition, model may be replaced with the name of the model.)
@@ -143,7 +143,7 @@ You can set these variables as follows for the exemplary model15M model from And
 MODEL_NAME=stories15M
 MODEL_DIR=checkpoints/${MODEL_NAME}
 MODEL_PATH=${MODEL_OUT}/stories15M.pt
-MODEL_OUT=~/llama-fast-exports
+MODEL_OUT=~/TorchAt-exports
 ```
 
 When we export models with AOT Inductor for servers and desktops, and Executorch for mobile and edge devices,
@@ -185,7 +185,7 @@ environment:
 
 Model definition in model.py, generation code in generate.py. The
 model checkpoint may have extensions `pth` (checkpoint and model definition) or `pt` (model checkpoint).
-At present, we always use the llama-fast model for export and import the checkpoint into this model definition
+At present, we always use the TorchAt model for export and import the checkpoint into this model definition
 because we have tested that model with the export descriptions described herein.
 
 ```
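The generation command belongs to the code block opening at the end of this hunk and is not part of the diff context. Since the model and generation code are adopted from gpt-fast, a hypothetical invocation might look like the following; the flag names are assumptions, not confirmed by this commit:

```
# hypothetical generate.py invocation (flag names assumed from gpt-fast conventions)
python generate.py --checkpoint_path ${MODEL_PATH} --prompt "Hello, my name is"
```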
@@ -223,7 +223,7 @@ quantization to achieve this, as described below.
 
 We export the model with the export.py script. Running this script requires you first install executorch with pybindings, see [here](#setting-up-executorch-and-runner-et).
 At present, when exporting a model, the export command always uses the
-xnnpack delegate to export. (Future versions of llama-fast will support additional
+xnnpack delegate to export. (Future versions of TorchAt will support additional
 delegates such as Vulkan, CoreML, MPS, HTP in addition to Xnnpack as they are released for Executorch.)
 
 
@@ -260,7 +260,7 @@ AOTI). The basic model build for mobile surfaces two issues: Models
 quickly run out of memory and execution can be slow. In this section,
 we show you how to fit your models in the limited memory of a mobile
 device, and optimize execution speed -- both using quantization. This
-is the `llama-fast` repo after all!
+is the `TorchAt` repo after all!
 
 For high-performance devices such as GPUs, quantization provides a way
 to reduce the memory bandwidth required to and take advantage of the
@@ -468,7 +468,7 @@ To run your pte model, use the following command (assuming you already generated
 
 ### Android
 
-Check out the [tutorial on how to build an Android app running your PyTorch models with Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your llama-fast models a spin.
+Check out the [tutorial on how to build an Android app running your PyTorch models with Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your TorchAt models a spin.
 
 ![Screenshot](https://pytorch.org/executorch/main/_static/img/android_llama_app.png "Android app running Llama model")
 
@@ -552,15 +552,15 @@ List dependencies for these backends
 Set up ExecuTorch by following the instructions [here](https://pytorch.org/executorch/stable/getting-started-setup.html#setting-up-executorch).
 For convenience, we provide a script that does this for you.
 
-From the llama-fast root directory, run the following
+From the TorchAt root directory, run the following
 ```
 export LLAMA_FAST_ROOT=${PWD}
 ./scripts/install_et.sh
 ```
 
 This will create a build directory, git clone ExecuTorch to ./build/src, applies some patches to the ExecuTorch source code, install the ExecuTorch python libraries with pip, and install the required ExecuTorch C++ libraries to ./build/install. This will take a while to complete.
 
-After ExecuTorch is installed, you can build runner-et from the llama-fast root directory with the following
+After ExecuTorch is installed, you can build runner-et from the TorchAt root directory with the following
 
 ```
 export LLAMA_FAST_ROOT=${PWD}
