Thomisch/Esp32-LLM

Running an LLM on the ESP32

Optimizing Llama2.c for the ESP32

With the following changes to llama2.c, I am able to achieve 19.13 tok/s:

  1. Using both cores of the ESP32 during math-heavy operations.
  2. Using dot product functions from the ESP-DSP library that are designed for the ESP32-S3. These functions use some of the few SIMD instructions the ESP32-S3 has.
  3. Maxing out the CPU speed at 240 MHz and the PSRAM speed at 80 MHz, and increasing the instruction cache size.
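
The first two changes can be illustrated with a small, portable sketch. It is not the code from this repository: the names matmul_split, matmul_rows, and dot_f32 are hypothetical, and the two halves run one after the other here so the example compiles on any host. On the device, each half would presumably be handed to a task pinned to its own core, and dot_f32 could be swapped for an ESP-DSP dot product routine such as dsps_dotprod_f32.

```c
#include <assert.h>
#include <stddef.h>

/* Plain dot product of two float vectors of length n.
 * Assumption: on the ESP32-S3 this is the spot where an ESP-DSP
 * routine (for example dsps_dotprod_f32) could be used instead. */
static float dot_f32(const float *a, const float *b, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Compute rows [row_start, row_end) of xout = W @ x.
 * W is d x n, stored row-major, as in llama2.c's matmul. */
static void matmul_rows(float *xout, const float *x, const float *w,
                        int n, int row_start, int row_end) {
    for (int r = row_start; r < row_end; r++) {
        xout[r] = dot_f32(w + (size_t)r * n, x, n);
    }
}

/* Hypothetical two-way split of the matrix-vector product.
 * The two calls touch disjoint output rows, so on a dual-core chip
 * each call could run on its own core without locking. Here they
 * run sequentially to keep the sketch portable. */
void matmul_split(float *xout, const float *x, const float *w, int n, int d) {
    assert(n >= 0 && d >= 0);
    int mid = d / 2;
    matmul_rows(xout, x, w, n, 0, mid);  /* share for the first core */
    matmul_rows(xout, x, w, n, mid, d);  /* share for the second core */
}
```

Splitting by output row keeps the two workers independent: each row of the result depends only on one row of W and the shared input vector x, so the only synchronization needed is waiting for both halves to finish.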

Setup

This requires the ESP-IDF toolchain to be installed. Replace {DEVICE_PORT} with the serial port your board is attached to.

idf.py build
idf.py -p /dev/{DEVICE_PORT} flash

About

DaveBben/esp32-llm for Esp32 test
