Cache ParakeetModule across transcription calls to avoid repea…#223
psiddh wants to merge 3 commits into meta-pytorch:main
…ted model loading

Summary: Currently, runParakeetOnWavFile() creates a new ParakeetModule and calls .close() on it for every transcription request (L235-251). As a result, the model is loaded from disk on every inference call, and the reported latency includes model load time. This change caches the ParakeetModule at the class level and recreates it only when the user changes model settings.
Pull request overview
This PR improves transcription latency in the Android Parakeet demo by caching a single ParakeetModule instance across transcription calls, instead of loading/closing the model for every request.
Changes:
- Add a cached `ParakeetModule` plus bookkeeping for which model/tokenizer/data paths are currently loaded.
- Introduce `getOrCreateModule()` to reuse the existing module unless model settings changed.
- Close the cached module in `onDestroy()`.
    private fun getOrCreateModule(settings: ModelSettings): ParakeetModule {
        val dataPath = settings.dataPath.ifBlank { null }
        if (parakeetModule != null &&
            loadedModelPath == settings.modelPath &&
            loadedTokenizerPath == settings.tokenizerPath &&
            loadedDataPath == dataPath
        ) {
            return parakeetModule!!
getOrCreateModule() reads/writes parakeetModule and the loaded*Path fields without any synchronization. Since runParakeetOnWavFile() can be invoked from different threads (UI thread via runParakeet() and a background Thread via runParakeetFromFile()), this can race with module creation/close and/or concurrent transcribe() calls. Consider guarding module access with a lock/single-thread executor, and ensure only one transcription can run at a time across all entry points.
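
One way to address this concern is to put module creation, replacement, and use behind a single lock. The sketch below is only an illustration under assumptions: `GuardedModule` and its `use` method are hypothetical names, `M` stands in for `ParakeetModule`, and `K` stands in for whatever identifies the loaded model (for example, the three paths).

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical holder, not part of the PR: serializes every access to the
// cached module so loading, closing, and transcription cannot interleave.
class GuardedModule<K, M : AutoCloseable>(private val load: (K) -> M) {
    private val lock = ReentrantLock()
    private var key: K? = null
    private var module: M? = null

    // Runs block with the module for the given key while holding the lock.
    // Only one caller at a time can be inside block.
    fun <R> use(key: K, block: (M) -> R): R = lock.withLock {
        val current = module
        val ready = if (current != null && this.key == key) {
            current
        } else {
            // Drop the stale module before loading, so a failed load
            // never leaves a closed module in the cache.
            current?.close()
            module = null
            load(key).also {
                module = it
                this.key = key
            }
        }
        block(ready)
    }

    fun close() {
        lock.withLock {
            module?.close()
            module = null
            key = null
        }
    }
}
```

Because the lock is held for the whole transcription, a caller on the UI thread would block while another transcription is running. Routing every entry point through a single-thread executor, as the comment also suggests, avoids that at the cost of making calls asynchronous.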
parakeet/android/ParakeetApp/app/src/main/java/com/example/parakeetapp/MainActivity.kt
…akeetapp/MainActivity.kt Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…akeetapp/MainActivity.kt Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>