Implementing a Chatbot with Llama 3.3 70B on AIC100 Ultra

Overview
Features
Install Platform and App SDK
Enable Root Access
Activate the qeff Environment
Download the Pre-compiled Llama 3.3 70B Model
Extract the Model
Download the code demo.py in this repository
Demo
Example

1. Overview

This guide outlines the steps to implement a chatbot using the Llama 3.3 70B model on the AIC100 Ultra platform.

2. Features

Pre-compiled Llama 3.3 70B model with an 8k context length (ctx_len).
Python API `QEfficient.generation.text_generation_inference.cloud_ai_100_exec_kv`.
Streaming enabled, so the response is displayed as it is generated.
A while loop keeps the chat session running, so the model does not have to be reloaded onto the AIC100 Ultra card for each prompt.
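
The while-loop idea from the list above can be sketched as follows. This is a minimal illustration, not the contents of demo.py: `chat_loop` and its `generate` argument are hypothetical names, and `generate` stands in for whatever call into the QEfficient API the demo actually makes.

```python
from typing import Callable, Iterable, List


def chat_loop(generate: Callable[[str], str], prompts: Iterable[str]) -> List[str]:
    """Answer prompts one at a time, reusing the same `generate` callable.

    `generate` is a hypothetical stand-in for the model call; it receives
    the user's text and returns the reply.
    """
    replies = []
    prompt_iter = iter(prompts)
    while True:  # stay in one process instead of restarting the script per prompt
        try:
            text = next(prompt_iter).strip()
        except StopIteration:
            break  # no more input
        if text.lower() in {"exit", "quit"}:
            break  # user asked to end the session
        if not text:
            continue  # ignore empty lines
        replies.append(generate(text))
    return replies
```

For an interactive session, `prompts` could be a generator that yields lines read with `input()`, and `generate` could print tokens as they arrive before returning the full reply.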

3. Install Platform and App SDK

Follow the official installation guide from Efficient Transformers:
👉 https://quic.github.io/efficient-transformers/source/installation.html

4. Enable Root Access

sudo -i

5. Activate the `qeff` Environment

source /opt/qti-aic/dev/python/qeff/bin/activate
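
To confirm the environment is active, one option is to check that the `QEfficient` package can be imported by the current interpreter. The helper below is a small sketch; `module_available` is a hypothetical name, not part of the SDK.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # With the qeff environment active, the interpreter path is expected
    # to point inside the virtual environment directory.
    print("interpreter:", sys.executable)
    print("QEfficient importable:", module_available("QEfficient"))
```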

6. Download the Pre-compiled Llama 3.3 70B Model

Download from:
https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.19.6/meta-llama/Llama-3.3-70B-Instruct/qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz
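
The archive name appears to encode the compile settings. The sketch below splits it into fields; the helper names are hypothetical. Reading `pl` as prompt length, `cl` as context length, and `fbs` as full batch size is an assumption based on the 8k ctx_len mentioned under Features, not something this guide states.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

URL = (
    "https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.19.6/"
    "meta-llama/Llama-3.3-70B-Instruct/"
    "qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL path."""
    return PurePosixPath(urlparse(url).path).name


def parse_qpc_name(name: str) -> dict:
    """Split a qpc_* archive name into numeric fields and boolean flags."""
    suffix = ".tar.gz"
    stem = name[: -len(suffix)] if name.endswith(suffix) else name
    fields = {}
    for token in stem.split("_")[1:]:  # skip the leading "qpc"
        match = re.fullmatch(r"(\d+)([a-z]+)", token)
        if match:
            fields[match.group(2)] = int(match.group(1))  # e.g. "8192cl"
        else:
            fields[token] = True  # e.g. "mxfp6", "mxint8"
    return fields
```

The same `archive_name(URL)` value can serve as the local file name if the download is scripted, for example with `urllib.request.urlretrieve(URL, archive_name(URL))`.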

7. Extract the Model

Extract the tarball to your desired directory:

tar -xzvf qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz -C /your/target/folder
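
If the extraction needs to be scripted, the standard-library `tarfile` module gives a rough equivalent of the `tar` command above. This is a sketch only; `extract_archive` is a hypothetical helper and the paths are placeholders.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: str, target_dir: str) -> list:
    """Extract a .tar.gz archive into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # like creating the -C folder first
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(path=target, **kwargs)
    return names
```

The returned names show the folder layout inside the archive, which helps when locating the extracted model directory afterwards.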

8. Download the code demo.py in this repository

Download demo.py from this repository, or copy its contents into a local file named demo.py.

9. Demo

python demo.py
