This repository is the course deliverable for CSE421: a collection of embedded signal-processing and machine-learning projects developed for the STM32F7 Discovery family. The work demonstrates end-to-end pipelines: dataset preparation and training in Python, model conversion/quantization, and resource-aware inference on an STM32F746G-DISCO (Cortex-M7, FPU).
Status: Each homework (HW1..HW6) is a self-contained project with reports, firmware, host-side tools, and conversion artifacts. See the per-homework READMEs for full details.
Table of Contents
- Overview
- Supported Hardware & Prerequisites
- Repository Structure
- Quick Start (Build & Run)
- HIL Communication Protocol
- Model Conversion & Quantization
- Testing & Validation
- Contributing
- Licenses & Attribution
Overview
- Purpose: Provide reproducible examples of deploying classical ML and small neural networks to resource-constrained MCUs. Key topics: MFCC extraction, Bayes/k-NN/SVM, MLPs/CNNs, TFLite quantization, and MCU-side inference harnesses.
Supported Hardware & Prerequisites
- Target board: STM32F746G-DISCO (DISCO_F746NG). All projects target an FPU-capable Cortex-M7.
- Host tools (recommended): Python 3.8+, numpy, scipy, scikit-learn, tensorflow, librosa, matplotlib.
- MCU toolchains & IDEs:
  - HW1: developed and built in Mbed Studio (see `HW1/HW1-part1/mbed_app.json`).
  - HW2..HW6 and Project: developed and built with Keil Studio integrated into VS Code (Keil + VS Code extensions). Ensure the Keil/ARM toolchain, the VS Code C/C++ extensions, and the Keil integration are installed and configured. For float-based builds, enable hardware FPU support in the project settings.
Repository Structure
- HW1/ — Temperature sensor demo (Part-1) and MFCC feature-extraction firmware (Part-2). The minimal MCU demonstration was implemented with Mbed Studio. See `HW1/HW1-part1/mbed_app.json` for the FPU config and `HW1/HW1-part2/script.py` for the host-side MFCC streaming tool.
- HW2/ — Classical ML pipeline: Bayes (QDA), k-NN, and SVM. Contains training notebooks, `sklearn2c`-style model exports, and Keil/VS Code MCU projects for inference and HIL validation. See `HW2/Readme.md` for the architecture and comms flowchart.
- HW3/ — Lightweight time-series regression (temperature forecasting) and a compact HAR single-neuron classifier. Firmware and host tests are Keil/VS Code projects; notebooks and MCU glue live in `HW3/`.
- HW4/ — Focused MLP work: Human Activity Recognition, Keyword Spotting, MNIST (Hu moments), and temperature regression. This folder contains training notebooks, exported `.h5` models (e.g., `Part-1/mlp_har_model.h5`), and conversion notes. HW4 is a strong example of end-to-end model design → quantization → MCU integration using Keil/VS Code.
- HW5/ — MCU test harnesses, TFLite verification, and validation scripts that exercise the deployed models across HAR, MFCC, MNIST, and temperature tasks. Keil/VS Code builds and host-side test drivers are included.
- HW6/ — Comparative CNN experiments (Custom CNN, EfficientNet, MobileNet, ResNet, SqueezeNet) with a full quantization and MCU deployment pipeline. This is one of the most complete sections for production-style optimization and benchmarking; results are saved under `HW6/results/`, and per-architecture subfolders include notebooks and conversion scripts.
- Project/ — Independent project work and capstone-like experiments (e.g., FOMO, rice classifier, regression studies). These are Keil/VS Code based MCU integrations plus supporting training notebooks and dataset artifacts.
Project (Detailed)
The Project/ folder contains the repository's capstone-style work. Two subprojects are the primary contributions: Part-1 (FOMO) and Part-2 (Rice classifier). These two parts combine end-to-end model development, quantization/optimization, and Keil/VS Code MCU deployment with Hardware-in-the-Loop validation.
- Part-1 — FOMO (`Project/Part-1`)
  - Objective: build and validate a compact MCU classifier (FOMO) that detects a specified event or pattern from sensor-derived features.
  - Contents: `Project/Part-1/MCU_FOMO/` (firmware and Keil project files), `Project/Part-1/Train_FOMO/` (training notebooks and conversion scripts), and `Project/Part-1/Results-FOMO/` (validation logs, CSVs, and accuracy results).
  - Training & conversion: the offline workflow trains a reference model in Python (notebooks in `Train_FOMO`), then converts and quantizes the model (TFLite INT8 or C export) using a representative dataset generator. Conversion scripts and calibration datasets reside alongside the training notebooks.
  - MCU integration: the MCU project in `MCU_FOMO` is organized for Keil/VS Code. Large model arrays are declared `const` to place them in Flash. The Keil project uses a tensor arena or a small inference routine, depending on whether TFLite Micro or a hand-written inference function is used.
  - How to run: create a Keil project using the Blinky template (see Quick Start), copy the files from `Project/Part-1/MCU_FOMO` into the Blinky project's `src/` folder, build, and flash. Host-side validation scripts in `Results-FOMO/` show the expected serial protocol and CSV output format.
  - Inspiration & attribution: Part-1 (FOMO) was inspired by and references the upstream FOMO repository: https://github.com/bhoke/FOMO — consult that project for additional design ideas and reference implementations.
- Part-2 — Rice Classifier (`Project/Part-2`)
  - Objective: classify rice grain types (or other agricultural varieties) using a compact model suitable for MCU deployment. This part demonstrates image/feature processing, lightweight model architecture design, and the full conversion/validation pipeline.
  - Contents: `Project/Part-2/MCU-Rice/` (Keil project and MCU inference code), `Project/Part-2/Train-Rice/` (training notebooks, preprocessing scripts), and `Project/Part-2/Result-Rice/` (evaluation outputs and inference logs).
  - Training & conversion: training notebooks implement preprocessing (feature extraction or image resizing), model training, and representative dataset creation. The conversion step typically outputs a quantized `.tflite` file and a C array for inclusion in the MCU project.
  - MCU integration: `MCU-Rice` contains the Keil/VS Code project configured for the `DISCO_F746NG` target. The source organizes the preprocessing code, inference engine, and UART HIL handlers. The README inside `Project/Part-2` contains detailed build notes and expected serial packet formats for host testing.
  - How to run: copy the MCU source from `Project/Part-2/MCU-Rice` into a Keil Blinky project, ensure the project configuration matches the required heap/stack for model tensors, build, and flash. Use the host-side scripts in `Result-Rice` to stream test images/features and collect results.
Common notes for both parts:
- Keil/VS Code: Both parts are designed for Keil Studio integrated into VS Code. Follow the Quick Start steps to install the Keil extension and create a Blinky-based Keil project; then copy the MCU sources into that project.
- Representative data & calibration: each `Train_*` folder contains the scripts to generate the representative datasets required for post-training quantization — keep these consistent with the MCU preprocessing (scaler means/std, input order).
- Validation: results folders contain CSV logs and example serial transcripts. Use these to confirm parity between the Python reference implementation and MCU inference.
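To illustrate the calibration-consistency point above, here is a hedged sketch (not a script from this repo; the function and symbol names are assumptions) of exporting training-time normalization constants into a C header, so the MCU applies exactly the same scaler means/std as the Python pipeline:

```python
# Illustrative sketch: export a scaler's mean/std into a C header so MCU
# preprocessing matches training. Names (export_scaler_header, scaler_mean)
# are assumptions, not symbols from this repository.
import numpy as np

def export_scaler_header(mean, std, name="scaler"):
    """Return C source declaring the scaler constants as const float arrays."""
    fmt = lambda arr: ", ".join(f"{v:.6f}f" for v in arr)
    return (
        f"#define {name.upper()}_DIM {len(mean)}\n"
        f"const float {name}_mean[{len(mean)}] = {{ {fmt(mean)} }};\n"
        f"const float {name}_std[{len(std)}] = {{ {fmt(std)} }};\n"
    )

# Example: derive constants from a small feature matrix
X = np.random.rand(100, 3).astype(np.float32)
print(export_scaler_header(X.mean(axis=0), X.std(axis=0)))
```

The generated header can be compiled into the MCU project alongside the feature-extraction code, which removes one common source of Python/MCU mismatch.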
Quick Start (Keil Studio on VS Code)
This repository's MCU projects (HW2..HW6 and Project/) are developed for Keil Studio integrated
into Visual Studio Code. Follow these steps to create a new Keil project and reuse the MCU sources
from any homework.
- Install the Keil Studio extension for VS Code.
- Create a new Keil project using the Blinky pre-project template:
- In VS Code, open the Keil Studio extension commands and select "Create Solution".
- Select the target board: STM32F746G-DISCO (DISCO_F746NG).
- Choose the Blinky pre-project/template when prompted.
- Copy the MCU firmware from a homework into your new project:
  - Locate the MCU sources for the homework you want to run (for example, `HW4/Part-1/` or `HW6/customCNN/mcu`).
  - Copy the C/C++ source files, headers, and any `mbed_app.json` or project config snippets into the Keil project folder (keep the folder structure tidy under `src/` or `app/`).
  - Update the Keil project include paths and linker settings if required (especially if the homework needs extra heap/stack or places large arrays in Flash).
- Build and flash:
  - Use the Keil Studio build and flash commands from within VS Code. Confirm the target remains `DISCO_F746NG` and that the hardware FPU is enabled for float-based projects.
This approach lets you reuse the MCU implementation from any homework by copying the MCU portion into a standard Blinky-based Keil project and building with Keil/VS Code.
HIL Communication Protocol
- A Stop-and-Wait UART handshake is used across many assignments:
  - MCU sends `[READY]` to the host.
  - Host transmits one data chunk (raw samples, features, or an image).
  - MCU replies `[SAMPLE_OK]` after reception.
  - MCU runs feature extraction / inference and returns a label or vector.
- Packet formats vary by homework. See the host scripts in each homework (e.g., `HW1/HW1-part2/script.py`) and the MCU `main.cpp` implementation for precise formats.
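The handshake above can be sketched from the host side as follows. The token strings come from the protocol description; the chunk size and the `read`/`write` callables are assumptions (a real host script would wrap a pyserial `Serial` object):

```python
# Host-side sketch of the Stop-and-Wait handshake. Token strings match the
# protocol above; chunk_size and the read/write callables are assumptions --
# in practice they would wrap pyserial's Serial.read/Serial.write.

READY = b"[READY]"
SAMPLE_OK = b"[SAMPLE_OK]"

def wait_for(read, token, max_reads=1000):
    """Poll the read() callable until the token appears (or give up)."""
    buf = b""
    for _ in range(max_reads):
        buf += read()
        if token in buf:
            return True
    return False

def send_chunks(read, write, payload, chunk_size=64):
    """Stop-and-Wait: send one chunk, then block until the MCU acks it."""
    if not wait_for(read, READY):
        raise TimeoutError("MCU never sent [READY]")
    for i in range(0, len(payload), chunk_size):
        write(payload[i:i + chunk_size])
        if not wait_for(read, SAMPLE_OK):
            raise TimeoutError("MCU did not ack chunk")
```

Because the transport is injected as callables, the same logic can be exercised against a simulated MCU in host-side tests before touching hardware.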
Model Conversion & Quantization
- Typical pipeline used in HW4/HW6:
  - Train the model in Python and save it as `.h5`.
  - Create a small representative dataset generator for calibration.
  - Convert to TFLite using `tf.lite.TFLiteConverter` with `Optimize.DEFAULT`.
  - Target INT8 quantization and produce a `.tflite` file.
  - Generate C arrays for TFLite Micro and integrate them into the MCU project.
Example conversion snippet:
import tensorflow as tf
import numpy as np

def representative_dataset():
    # `calibration_samples` is a small list of representative model inputs.
    for sample in calibration_samples:
        yield [np.expand_dims(sample, 0).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
open('model_quant.tflite', 'wb').write(tflite_model)

Memory & Performance Notes
- Declare large constant arrays as `const` in C/C++ so they are placed in Flash, preserving SRAM.
- Increase the linker heap/stack size when MFCC buffers or convolutional arenas require more memory.
- Enable hardware floating point in the project config when running floating-point inference.
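As a concrete illustration of the `const`-in-Flash note, here is a hedged sketch of the "generate C arrays" step: a small Python helper that formats a `.tflite` blob as a `const` C array, similar to `xxd -i` output. The symbol names are assumptions, not identifiers from this repo:

```python
# Hedged sketch: emit a C header from a .tflite blob, declaring the array
# const so the linker places it in Flash rather than SRAM. Symbol names
# (model_data, model_data_len) are assumptions mirroring `xxd -i` style.

def tflite_to_c_header(data: bytes, name: str = "model_data") -> str:
    """Format raw model bytes as a const C array plus a length constant."""
    hex_bytes = ", ".join(f"0x{b:02x}" for b in data)
    return (
        f"const unsigned char {name}[] = {{ {hex_bytes} }};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# Usage sketch:
# header = tflite_to_c_header(open('model_quant.tflite', 'rb').read())
```

Dropping the `const` qualifier from such an array is a common mistake: the model then lands in SRAM initialization data and can exhaust the MCU's memory at startup.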
Testing & Validation
- Many homeworks include host-side validation scripts that stream datasets and write CSV logs (see `HW6/results/` and `HW1/mfcc_results0.txt`). Use these to verify parity between Python reference outputs and MCU predictions.
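A parity check of the kind described above can be sketched as follows. The CSV column name and the helper names are assumptions, not taken from the repo's scripts:

```python
# Illustrative parity check: compare labels from the Python reference run
# against labels parsed from the MCU's serial CSV log. The "label" column
# name and function names are assumptions.
import csv

def load_labels(path, column="label"):
    """Read one column of a CSV log into a list of strings."""
    with open(path, newline="") as f:
        return [row[column] for row in csv.DictReader(f)]

def parity(ref_labels, mcu_labels):
    """Fraction of samples where reference and MCU predictions agree."""
    pairs = list(zip(ref_labels, mcu_labels))
    return sum(r == m for r, m in pairs) / max(len(pairs), 1)
```

A parity of 1.0 means the quantized MCU model reproduces every reference prediction; values noticeably below 1.0 usually point at preprocessing mismatches or quantization error.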
Contributing
- To contribute: open an issue describing the change, then submit a PR against this repo.
- When adding firmware, preserve the folder layout and provide conversion scripts rather than committing large binary model blobs when possible.
Licenses & Attribution
- Several homework subfolders contain `LICENSE` files. Respect each subproject's license before reuse.
Contact & Authors
- Author and report metadata are included in each homework README. For questions about a specific homework, start by reading the per-homework README (linked in the Repository Structure).