Summary
Based on previous deployment attempts to the ESP32, I'm going to try mimicking the streaming conversion shown in the original GTCRN repo to see how it affects latency and Tensor Arena allocation on hardware.
UPDATE: Will need to revisit at a later date. Due to its padding setup, the current model performs really poorly when converted to streaming. I know what the issue is: the padding set in the GTConv decoder block. It looks like the best path forward is to retrain with a changed model architecture. For the sake of testing deployment and benchmarking the current offline quantized model, I will be tabling this until later. Therefore, the deployed version of GTCRN-Micro is expected to have pretty poor latency.
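For reference, here is a minimal sketch of the general way padding can break a streaming conversion. It is not the actual GTConv code and not a diagnosis of GTCRN-Micro: it assumes, purely for illustration, a 1-D convolution along the time axis, and the names `conv1d_offline` and `StreamingConv1d` are hypothetical. A frame-by-frame version with a cache of past frames reproduces the offline output only when all of the padding sits on the past (left) side; any padding on the future side means the offline model saw frames the streaming model does not have yet.

```python
# Illustrative only: a toy 1-D convolution, not the GTConv block itself.

def conv1d_offline(x, w, pad_left, pad_right):
    """Full-sequence 1-D convolution (cross-correlation) with zero padding."""
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    k = len(w)
    return [sum(w[j] * padded[i + j] for j in range(k))
            for i in range(len(padded) - k + 1)]


class StreamingConv1d:
    """Frame-by-frame convolution that only sees past frames via a cache."""

    def __init__(self, w):
        self.w = list(w)
        # The zero-initialised cache plays the role of left (causal) padding.
        self.cache = [0.0] * (len(self.w) - 1)

    def step(self, frame):
        window = self.cache + [frame]
        out = sum(wj * xj for wj, xj in zip(self.w, window))
        self.cache = window[1:]  # drop the oldest frame, keep the newest
        return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy input frames
w = [0.5, 0.25, 0.25]           # toy kernel of size 3

causal = conv1d_offline(x, w, pad_left=2, pad_right=0)     # past-only padding
symmetric = conv1d_offline(x, w, pad_left=1, pad_right=1)  # needs 1 future frame

stream = StreamingConv1d(w)
streamed = [stream.step(f) for f in x]

print(streamed == causal)     # streaming matches the causally padded model
print(streamed == symmetric)  # but not the symmetrically padded one
```

If the padding in the trained model behaves like the symmetric case here, the trained weights expect lookahead that streaming cannot supply, which is consistent with retraining on a changed architecture being the cleaner fix.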
To-dos