
STM32 Cloud Latency Lookup Table

This repository runs inference of deep learning models on STMicroelectronics MCU boards and analyzes the results. Using the API provided by the STM32Cube.AI Developer Cloud service, it supports model deployment, inference, and result profiling.

With the code in this repository, you can create a latency look-up table used in Once for All: Train One Network and Specialize it for Efficient Deployment (ICLR 2020).

1. Structure

├── .benchmark/                  # Benchmark results on the target MCU (provided by the STM32 Cloud).
├── .lut/                        # Latency look-up table generated by analyzing benchmark results.
├── .models/                     # DNN models and their configuration files.
├── latency_lookup_table/        # Scripts to build the latency look-up table.
│   ├── ops/     
│   ├── tables/              
│   └── helper.py
├── stm32_api/                   # A set of scripts to interact with the STM32 Cloud API.                     
│   ├── benchmark/ 
│   ├── file/
│   ├── login/
│   ├── utils/
│   ├── analyze.py
│   └── helper.py
├── benchmark.ipynb              # Notebook to run benchmarks on the STM32 Cloud and save the results.
└── build_latency_table.py       # Script to build the latency look-up table.

2. Prerequisites

Registration for STM32Cube.AI Developer Cloud is required.

Requirements

  • Python

Model Preparation

This repository uses a model from 'MCUNet: Tiny Deep Learning on IoT Devices': [GitHub]

  • model file (mcunet-512kb-2mb_imagenet.tflite) [Link]

  • model configuration (mcunet-512kb-2mb_imagenet.json) [Link]


3. Step-by-Step

STEP 1. Prepare the model file and its configuration file.

Download the model file and its configuration file from the links provided above. Place them in the ./.models/ directory.

  • Q. How do I create a configuration file?

    A. You can use the following script to generate a configuration file for the model.

    import json
    # ... (assumes `ofa_network` is an OFA super-network and `subnet_name` is defined)
    ofa_network.sample_active_subnet()
    subnet = ofa_network.get_active_subnet(preserve_weight=True)

    # Save the sampled subnet's architecture configuration as JSON.
    with open(f'./.models/{subnet_name}.json', 'w') as f:
        json.dump(subnet.config, f, indent=2)

STEP 2. Benchmark the model on the STM32 Cloud.

Run the benchmark.ipynb notebook to perform the benchmark on the STM32 Cloud. The results will be saved in the ./.benchmark/ directory.
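The notebook's overall flow can be sketched as below; the helper name and result fields are hypothetical placeholders, not the repository's actual API:

```python
import json
from pathlib import Path

def run_cloud_benchmark(model_path):
    """Placeholder for the STM32Cube.AI Developer Cloud benchmark call (hypothetical)."""
    return {"model": model_path.name, "latency_ms": None}

# Benchmark every .tflite model placed under ./.models/.
models_dir = Path("./.models")
models_dir.mkdir(parents=True, exist_ok=True)
results = [run_cloud_benchmark(p) for p in sorted(models_dir.glob("*.tflite"))]

# Save the collected results under ./.benchmark/.
out_dir = Path("./.benchmark")
out_dir.mkdir(parents=True, exist_ok=True)
with open(out_dir / "results.json", "w") as f:
    json.dump(results, f, indent=2)
```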

STEP 3. Build the latency look-up table.

python build_latency_table.py --model-class mobilenetv2 --input-shape 160 160

The latency LUT will be saved in the ./.lut/ directory.
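The resulting table maps a layer configuration to its measured latency, and a subnet's total latency can then be estimated by summing its layers' entries. A minimal illustration follows; the key format and latency values are assumptions for the example, not the repository's actual schema:

```python
# Hypothetical latency look-up table: layer-config key -> measured latency in ms.
latency_lut = {
    "conv3x3-in160x160x3-out80x80x16": 1.92,
    "conv1x1-in80x80x16-out80x80x24": 0.41,
}

def predict_latency(lut, layer_keys):
    """Estimate whole-network latency (ms) by summing per-layer LUT entries."""
    return sum(lut[k] for k in layer_keys)

estimate = predict_latency(latency_lut, latency_lut.keys())
print(round(estimate, 2))  # 2.33
```

This per-layer summation is the same latency-prediction strategy used by Once for All during architecture search.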

4. License

Portions of this project are available under separate license terms.


5. Contributing

We welcome any contributions to the project! Please submit a pull request or open an issue.