fix multinet model for v2 #1208
Merged
This pull request adds support for configuring custom wake words and multinet speech recognition models through a new JSON-based configuration, improving flexibility and maintainability. The main changes include parsing a new `index.json` asset for model and command configuration, updating the build script to extract and generate this configuration from `sdkconfig`, and refactoring the asset build pipeline to support both wakenet and multinet models. These improvements make it easier to customize speech recognition features without code changes.

**Custom Wake Word and Multinet Model Configuration:**
- Added a `ParseWakenetModelConfig` method in `CustomWakeWord` to load the language, duration, threshold, and commands from `index.json` at runtime, allowing dynamic configuration of the wake word and commands. (`main/audio/wake_words/custom_wake_word.cc`, `main/audio/wake_words/custom_wake_word.h`) [1] [2]
- Introduced a `Command` structure and related logic to support multiple commands and actions, not just a single wake word. (`main/audio/wake_words/custom_wake_word.h`, `main/audio/wake_words/custom_wake_word.cc`) [1] [2] [3]

**Asset Build Script Enhancements:**
- Updated `build_default_assets.py` to process both wakenet and multinet models, extract their configuration from `sdkconfig`, and generate a comprehensive `index.json` containing model and command metadata. (`scripts/build_default_assets.py`) [1] [2] [3] [4] [5]
- Added logic to extract the relevant speech recognition settings from `sdkconfig`. (`scripts/build_default_assets.py`) [1] [2]
- Included `multinet_model_info` in the generated `index.json`, enabling the firmware to access all relevant SR model and command configuration at runtime. (`scripts/build_default_assets.py`) [1] [2] [3]

These changes significantly improve the flexibility of speech recognition model configuration and asset management, making it easier to deploy and update custom wake words and command sets.
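To illustrate the runtime side, here is a minimal Python sketch of the kind of `index.json` document described above and the fields `CustomWakeWord::ParseWakenetModelConfig` is said to read (language, duration, threshold, commands). The exact schema and field names are assumptions for illustration; the real layout is whatever `scripts/build_default_assets.py` emits, and the actual parser is C++ firmware code, not Python.

```python
import json

# Hypothetical index.json layout -- field names are illustrative,
# not necessarily the exact schema generated by build_default_assets.py.
SAMPLE_INDEX_JSON = """
{
  "wakenet_model_info": {
    "language": "cn",
    "duration": 500,
    "threshold": 0.5
  },
  "multinet_model_info": {
    "commands": [
      {"command": "turn on the light", "action": "light_on"},
      {"command": "turn off the light", "action": "light_off"}
    ]
  }
}
"""

def parse_model_config(text: str) -> dict:
    """Extract the values the PR description says are loaded at runtime:
    language, duration, threshold, and the multinet command list."""
    data = json.loads(text)
    wakenet = data.get("wakenet_model_info", {})
    commands = data.get("multinet_model_info", {}).get("commands", [])
    return {
        "language": wakenet.get("language"),
        "duration": wakenet.get("duration"),
        "threshold": wakenet.get("threshold"),
        "commands": [(c["command"], c["action"]) for c in commands],
    }

config = parse_model_config(SAMPLE_INDEX_JSON)
print(config["language"], config["threshold"], len(config["commands"]))
```

Because both the wake-word and command metadata live in one JSON asset, adding or renaming a voice command becomes a data change rather than a firmware code change.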
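The build-script side can be sketched similarly: read `CONFIG_*` options from `sdkconfig`, pick out the selected wakenet and multinet models, and serialize them into an `index.json` with a `multinet_model_info` section. The `CONFIG_SR_*` option names and the output structure below are placeholders, not the exact symbols or schema the real `build_default_assets.py` uses.

```python
import json
import re

# Hypothetical sdkconfig fragment; option names are illustrative.
SDKCONFIG_TEXT = """\
CONFIG_SR_WN_WN9_HILEXIN=y
CONFIG_SR_MN_CN_MULTINET7_QUANT=y
CONFIG_IDF_TARGET="esp32s3"
"""

def parse_sdkconfig(text: str) -> dict:
    """Collect CONFIG_* options that are set, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        m = re.match(r"(CONFIG_\w+)=(.*)", line)
        if m:
            key, value = m.groups()
            options[key] = value.strip('"')
    return options

def build_index_json(options: dict) -> str:
    """Assemble an index.json-style document carrying both the wakenet
    and multinet model selections extracted from sdkconfig."""
    wakenet = [k for k, v in options.items()
               if k.startswith("CONFIG_SR_WN_") and v == "y"]
    multinet = [k for k, v in options.items()
                if k.startswith("CONFIG_SR_MN_") and v == "y"]
    index = {
        "wakenet_model_info": {"models": wakenet},
        "multinet_model_info": {"models": multinet},
    }
    return json.dumps(index, indent=2)

print(build_index_json(parse_sdkconfig(SDKCONFIG_TEXT)))
```

Generating the JSON at build time keeps `sdkconfig` as the single source of truth while letting the firmware consume one uniform asset at runtime.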