Set up Streaming version of model for Hardware deployment #9

## Summary

Based on previous deployment attempts to the ESP32, I'm going to try mimicking the streaming conversion shown in the original [GTCRN](https://github.com/Xiaobin-Rong/gtcrn) repo to see how it affects latency and Tensor Arena allocation on hardware.

-- -  
UPDATE: This will need to be revisited at a later date. The current model performs very poorly when converted to streaming because of its padding setup; specifically, the padding set in the GTConv decoder block. The best path forward looks like retraining with a changed model architecture. For the sake of testing deployment and benchmarking the current offline quantized model, I will be tabling this until later. Therefore, the deployed version of GTCRN-Micro is expected to have fairly poor latency.
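To illustrate the kind of padding mismatch described above, here is a minimal pure-Python sketch. It is a generic single-channel example, not the GTConv block itself: `conv1d_offline` and `StreamingConv1d` are hypothetical helpers, and the kernel values are arbitrary. The assumption is that streaming conversion replaces time-axis padding with a cache of past frames, so a layer padded only on the left (causal) can be reproduced exactly, while a layer padded on both sides needs a future frame that the cache does not have at the same time step.

```python
def conv1d_offline(x, kernel, pad_left, pad_right, dilation=1):
    """Single-channel 1-D convolution (cross-correlation) over a whole signal."""
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * padded[t + k * dilation] for k in range(len(kernel)))
        for t in range(len(padded) - span)
    ]


class StreamingConv1d:
    """Frame-by-frame convolution that only sees past inputs via a cache."""

    def __init__(self, kernel, dilation=1):
        self.kernel = list(kernel)
        self.dilation = dilation
        # The cache holds the (kernel_size - 1) * dilation most recent inputs,
        # initialised to zeros (equivalent to left zero-padding).
        self.cache = [0.0] * ((len(kernel) - 1) * dilation)

    def step(self, sample):
        window = self.cache + [sample]
        out = sum(
            self.kernel[k] * window[k * self.dilation]
            for k in range(len(self.kernel))
        )
        self.cache = window[1:]  # drop the oldest input, keep the newest
        return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, -1.0, 2.0]  # arbitrary illustrative weights

# Offline, causal: all padding on the left.
causal = conv1d_offline(x, kernel, pad_left=2, pad_right=0)
# Offline, symmetric: padding split across both sides (needs one future sample).
symmetric = conv1d_offline(x, kernel, pad_left=1, pad_right=1)

stream = StreamingConv1d(kernel)
streamed = [stream.step(s) for s in x]

print(streamed == causal)     # streaming matches the causally padded layer
print(streamed == symmetric)  # but not the symmetrically padded one
```

In this toy case the symmetric output is the causal output shifted by one frame, so weights trained with one alignment see differently aligned features once the layer is run from a cache. That is consistent with retraining being the cleaner fix, though the exact padding in the GTConv decoder block would need to be checked against the model code.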
- - - 
## To-dos

- [x] Create a converter script for any of the Conv ops (including the TCN ones most likely)
- [ ] Run inference with streaming version
- [ ] Quantize the streaming variant of the model 
- [ ] Do sample deployment to hardware (either ESP32-S3 or STM32H7) to report streaming latency results on hardware
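
For the latency reporting in the last to-do, a small host-side helper for turning a measured per-frame inference time into a real-time factor could look like the sketch below. The function names are hypothetical, and the defaults (a 256-sample hop at 16 kHz) are an assumption about the STFT settings rather than something stated in this issue; substitute the values the deployed model actually uses.

```python
def frame_budget_ms(hop_size=256, sample_rate=16000):
    """Time available to process one frame before the next one arrives."""
    return 1000.0 * hop_size / sample_rate


def real_time_factor(per_frame_ms, hop_size=256, sample_rate=16000):
    """Ratio of processing time to frame duration; below 1.0 keeps up in real time."""
    return per_frame_ms / frame_budget_ms(hop_size, sample_rate)


# Example with a made-up measurement of 8 ms per frame (not a real result).
budget = frame_budget_ms()
rtf = real_time_factor(8.0)
print(f"budget per frame: {budget} ms, real-time factor: {rtf}")
```

The per-frame time itself would come from timing a single streaming inference call on the target board; the helper only does the arithmetic for the report.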
