- 1. Overview
- 2. Features
- 3. Requirements
- 4. Setup Instructions
- 5. Environment Setup
- 6. File Structure
- 7. Run the Application
## 1. Overview

A chat application for Windows on Snapdragon® demonstrating a large language model (LLM, e.g., Llama 3.1 8B) using the Genie SDK. The app shows how to use the Genie APIs from the QAIRT SDK to run and accelerate LLMs on the Snapdragon® Neural Processing Unit (NPU).
## 2. Features

- Chat with Local LLM: Conversational interface powered by LLMs running locally via the Genie framework.
- Genie Bundle Integration: Seamlessly loads and uses `genie_bundle` definitions from `genie_config.json` for model inference.
- Local RAG (Retrieval-Augmented Generation): Supports RAG over local PDFs.
## 3. Requirements

- Snapdragon® Platform (e.g., X Elite)
- Windows 11+
- Please follow this tutorial to generate the `genie_bundle` required by ChatApp. If you use any of the Llama 3 models, the app will work without modifications. If you use another model, you will need to update the prompt format in `src\handlers\prompt_handler.py` first.
- Copy the bundle assets from step 1 to `NPU-Chatbot_LLM_RAG\genie_bundle`. You should see `NPU-Chatbot_LLM_RAG\genie_bundle\*.bin` context binary files.

  Note: if the `genie_bundle` is saved in another location, please update the `genie_bundle` path in `src\config.yaml`.
- QAIRT SDK: Qualcomm AI Runtime SDK (see QNN SDK for older versions). Refer to Setup QAIRT SDK to install a QAIRT SDK compatible with models downloaded from AI Hub.

  Note: please update the QAIRT SDK root path in `src\config.yaml`.
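Several of the steps above reference entries in `src\config.yaml`. As a rough sketch of what that file might contain, the fragment below collects the paths this README asks you to keep in sync; the key names and values are assumptions for illustration, so check the actual file in the repository for the real schema.

```yaml
# Hypothetical sketch of src\config.yaml entries referenced in this README.
# Key names and values are illustrative, not the repository's actual schema.
genie_bundle: ..\genie_bundle                     # path to the generated context binaries
qairt_sdk_root: C:\Qualcomm\AIStack\QAIRT         # QAIRT SDK root path
pyarm: .venv_arm64\Scripts\python.exe             # ARM64 Python used for NPU inference
```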
## 4. Setup Instructions

Before proceeding, ensure that all setup steps outlined below are completed in the specified order. These steps configure the tools and dependencies needed to run the application. Each section references internal documentation or external guides for detailed guidance; please follow them carefully to avoid setup issues.
This application requires two different Python environments:

- Python (64-bit) Windows installer (64-bit) for Streamlit, ChromaDB, and the RAG implementation
- Python (ARM64) Windows installer (ARM64) for loading the generated context binaries on the NPU using the Gen AI Inference Extensions (GENIE) libraries
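Because the two interpreters look alike from the command line, it helps to confirm which one a given environment uses. The small check below is an illustrative helper (not part of the repository); on Windows on Snapdragon, the 64-bit x64 build typically reports `AMD64` (running under emulation) and the native build reports `ARM64`.

```python
import platform
import sys

def interpreter_info() -> str:
    # Report the CPU architecture and version of the current interpreter.
    return f"{platform.machine()} / Python {sys.version.split()[0]}"

print(interpreter_info())
```

Run it once inside each virtual environment to verify you created them with the intended installers.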
Git is required for version control and collaboration. Proper configuration ensures seamless integration with repositories and development workflows. For detailed steps, refer to the internal documentation (or a public link if applicable): Setup Git
## 5. Environment Setup

To set up the environments required for running the application, follow the steps below.

- Create a working directory (if not already done):

  ```shell
  mkdir <your_working_dir>
  cd <your_working_dir>
  ```
- Download the application:

  ```shell
  git clone -n --depth=1 --filter=tree:0 https://github.com/qualcomm/Startup-Demos.git
  cd Startup-Demos
  git sparse-checkout set --no-cone /GenAI/AI_PC/NPU-Chatbot_LLM_RAG
  git checkout
  ```
- Navigate to the application directory:

  ```shell
  cd ./GenAI/AI_PC/NPU-Chatbot_LLM_RAG
  ```
- Create the `venv` environments:

  - With Python (64-bit), to create a virtual environment using a specific Python version (e.g., Python 3.10), run:

    ```shell
    "C:\path\to\AppData\Local\Programs\Python\Python310\python.exe" -m venv .venv
    ```

  - With Python (ARM64), to create a virtual environment using a specific Python version (e.g., Python 3.13 ARM64), run:

    ```shell
    "C:\path\to\AppData\Local\Programs\Python\Python313-arm64\python.exe" -m venv .venv_arm64
    ```

  The environments will be created in directories named `.venv` and `.venv_arm64` at `NPU-Chatbot_LLM_RAG\`.

  Note: we recommend creating both virtual environments at `NPU-Chatbot_LLM_RAG\`. If you change the path or the name of `.venv_arm64`, please update the `pyarm` path in `src\config.yaml` accordingly.
- Activate the environment:

  ```shell
  .venv\Scripts\activate
  ```
- Install the required dependencies for the app:

  ```shell
  pip install -r requirements.txt
  ```
- To download the embedding model locally:

  ```shell
  py \src\handlers\initial_setup.py
  ```
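The download step above is a one-time setup that caches the RAG embedding model on disk. A rough sketch of the idea, with a hypothetical model name, cache location, and helper (the repository's actual `initial_setup.py` may differ), is:

```python
from pathlib import Path

def ensure_embedding_model(model_name: str = "all-MiniLM-L6-v2",
                           cache_dir: str = "models") -> Path:
    # Return the local path for the embedding model, downloading it only
    # if it is not already cached (one-time setup; network required once).
    target = Path(cache_dir) / model_name
    if not target.exists():
        # Hypothetical download path; the real script's mechanism may differ.
        from sentence_transformers import SentenceTransformer
        SentenceTransformer(model_name).save(str(target))
    return target
```

On later runs the cached copy is found and no download is attempted, so the app can run fully offline.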
## 6. File Structure

```
requirements.txt
src/
    config.yaml          # Application configuration file
    streamlitchat.py     # Main entry point for the Streamlit app
    handlers/
        __init__.py
        genie_loader.py      # Loads the model from genie_bundle onto the NPU
        initial_setup.py     # Downloads the embedding model locally (one-time setup)
        logger.py            # Custom logging utility; handles logging across the handler files and writes logs to `src\logs\debug.log`
        prompt_handler.py    # Manages prompt formatting
        rag_pipeline.py      # Manages the RAG pipeline
```
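The requirements above note that `prompt_handler.py` must be updated when you use a model other than Llama 3, because each model family expects its own chat template. As an illustration only (a hypothetical helper, not the app's actual code), the Llama 3 chat format wraps each turn in special header and end-of-turn tokens:

```python
def build_llama3_prompt(system: str, user: str) -> str:
    # Llama 3 chat template: each turn is delimited by role headers and
    # <|eot_id|>, and the prompt ends with an open assistant header so
    # the model generates the reply.
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```

A model with a different template (e.g., one using `[INST]`-style markers) would need a correspondingly different builder, which is what the prompt-handler update amounts to.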
## 7. Run the Application

To run the Streamlit application:
- Activate the virtual environment:

  Note: please ensure that the `.venv` environment is activated.

  ```shell
  .venv\Scripts\activate
  ```
- Run the Streamlit application:

  ```shell
  cd src\
  streamlit run streamlitchat.py
  ```
This will launch the application in your default web browser. You can then navigate to the provided URL (usually http://localhost:8501) to chat with the LLM and upload PDFs to interact with.
✅ Once all configurations are complete, you can begin interacting with the application through the chat interface.
