Fix Kokoro language pipeline: multi-lang dispatch, null audio guard, warning fixes #123
Open
Naitik120gupta wants to merge 1 commit into sugarlabs:main
speech.py — 5 fixes
1. Multi-language pipeline architecture (__init__ + setup_kokoro)
- Added self.kokoro_pipelines = {} (dict: lang_code → KPipeline) and self.kokoro_model = None (shared neural model)
- setup_kokoro() now passes repo_id='hexgrad/Kokoro-82M' explicitly (suppresses the noisy startup warning) and registers the created pipeline into kokoro_pipelines['a'], storing the
shared KModel reference
- self.kokoro_pipeline is kept as-is for backward compatibility with activity.py code that reads kokoro_pipeline.repo_id
2. Language-aware lazy pipeline initialization (set_kokoro_voice + new _init_pipeline_for_lang)
- When a voice from a different language group is selected (e.g. jf_alpha → Japanese, bf_alice → British English), set_kokoro_voice() extracts lang_code = voice_name[0] and spins up
a background thread to create a KPipeline for that language, reusing the already-loaded KModel (no redundant model downloads)
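A minimal sketch of the lazy initialization described in item 2, not the PR's actual code. `FakePipeline` and `PipelineRegistry` are hypothetical stand-ins so the example runs without Kokoro installed; in speech.py the factory would presumably be `KPipeline`, called with the shared `KModel` (that call shape is an assumption here).

```python
import threading


class FakePipeline:
    """Hypothetical stand-in for KPipeline; it only records its arguments."""

    def __init__(self, lang_code, model=None):
        self.lang_code = lang_code
        self.model = model


class PipelineRegistry:
    """Creates one pipeline per language code, lazily, on a worker thread."""

    def __init__(self, shared_model, pipeline_factory=FakePipeline):
        self.shared_model = shared_model          # loaded once, reused by every pipeline
        self.pipeline_factory = pipeline_factory
        self.pipelines = {}                       # lang_code -> ready pipeline
        self._pending = {}                        # lang_code -> worker thread still running
        self._lock = threading.Lock()

    def ensure_pipeline(self, voice_name):
        """Start background init if needed; return the worker thread or None."""
        lang_code = voice_name[0]                 # 'jf_alpha' -> 'j'
        with self._lock:
            if lang_code in self.pipelines:
                return None                       # already initialized
            if lang_code in self._pending:
                return self._pending[lang_code]   # init already in progress
            worker = threading.Thread(
                target=self._init_pipeline_for_lang,
                args=(lang_code,),
                daemon=True,
            )
            self._pending[lang_code] = worker
            worker.start()
            return worker

    def _init_pipeline_for_lang(self, lang_code):
        # Build outside the lock so a slow init does not block other callers.
        pipeline = self.pipeline_factory(lang_code=lang_code, model=self.shared_model)
        with self._lock:
            self.pipelines[lang_code] = pipeline
            self._pending.pop(lang_code, None)
```

Tracking pending workers separately from ready pipelines keeps repeated voice selections from spawning duplicate threads for the same language.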
3. Correct pipeline dispatch in _stream_kokoro_audio
- Now looks up self.kokoro_pipelines.get(lang_code) to route to the right G2P pipeline. Falls back gracefully to the default 'a' pipeline (with a warning) if the language-specific
one hasn't finished initializing yet
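The dispatch-with-fallback behaviour in item 3 could look roughly like the following. `select_pipeline` is a hypothetical helper name, and the pipelines are plain strings here purely for illustration.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_LANG = "a"  # the default pipeline key named in the description


def select_pipeline(pipelines, voice_name, default_lang=DEFAULT_LANG):
    """Return the pipeline for the voice's language, or the default one."""
    lang_code = voice_name[0]
    pipeline = pipelines.get(lang_code)
    if pipeline is None:
        # The language-specific pipeline may still be initializing.
        logger.warning(
            "No pipeline for lang_code %r yet; falling back to %r",
            lang_code,
            default_lang,
        )
        pipeline = pipelines.get(default_lang)
    return pipeline
```

With only the default registered, `select_pipeline({"a": "en"}, "jf_alpha")` returns `"en"` and logs a warning; once a `"j"` entry exists, the same call returns that entry instead.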
4. Null-check audio_chunk before .numpy()
- Added if audio_chunk is None: continue. A None audio chunk is valid (quiet pipeline mode or an inference skip), but calling .numpy() on it previously raised AttributeError
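As an illustration of the guard in item 4, the sketch below skips `None` chunks before converting. `FakeTensor` and `iter_audio_arrays` are hypothetical, and the three-element result tuple is an assumption about what the pipeline yields.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor that exposes .numpy()."""

    def __init__(self, samples):
        self.samples = samples

    def numpy(self):
        return list(self.samples)


def iter_audio_arrays(results):
    """Yield converted audio for each result, skipping empty chunks."""
    for _graphemes, _phonemes, audio_chunk in results:
        if audio_chunk is None:
            # Nothing to play for this chunk; calling .numpy() would raise
            # AttributeError, so move on to the next result.
            continue
        yield audio_chunk.numpy()
```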
5. appsrc = None before the try block
- Moving initialization before try ensures the except block's if appsrc: guard doesn't raise NameError when the exception fires before appsrc is assigned
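The pattern in item 5 can be shown in isolation. `FakeAppSrc`, `stream_once`, and `broken_factory` are hypothetical names used only to make the sketch runnable; they are not GStreamer or speech.py APIs.

```python
class FakeAppSrc:
    """Hypothetical source object whose push() always fails."""

    def __init__(self):
        self.closed = False

    def push(self, data):
        raise RuntimeError("push failed")

    def close(self):
        self.closed = True


def broken_factory():
    """Simulates a failure before the source object is ever created."""
    raise RuntimeError("no pipeline")


def stream_once(make_appsrc):
    appsrc = None  # bound before try, so the except block can always test it
    try:
        appsrc = make_appsrc()
        appsrc.push(b"\x00\x00")
        return "ok"
    except Exception as exc:
        if appsrc:
            # Only clean up if setup got far enough to create the source.
            appsrc.close()
            return f"failed after setup: {exc}"
        return f"failed before setup: {exc}"
```

Without the `appsrc = None` line, the `broken_factory` case would reach `if appsrc:` with the name unbound, and the resulting error would mask the original exception.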
---
kokoro/pipeline.py — 1 fix
6. Wrong LANG_CODES key in language-mismatch warning
- LANG_CODES.get(voice, voice) → LANG_CODES.get(voice[0], voice)
- LANG_CODES is keyed by single characters ('j', 'z', 'a', …). voice is a full name like 'jf_alpha', which is never a key, so the lookup always silently fell back to the raw voice
name. Using voice[0] correctly resolves e.g. 'j' → 'Japanese' in the warning message.
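The difference between the two lookups is easy to reproduce. The `LANG_CODES` dict below is an illustrative subset written for this sketch, not copied from kokoro/pipeline.py, and `describe_voice_language` is a hypothetical helper.

```python
# Illustrative subset only; the real table lives in kokoro/pipeline.py.
LANG_CODES = {
    "a": "American English",
    "b": "British English",
    "j": "Japanese",
    "z": "Mandarin Chinese",
}


def describe_voice_language(voice):
    """Resolve a voice name such as 'jf_alpha' to a readable language name."""
    # The first character of the voice name is the language code.
    return LANG_CODES.get(voice[0], voice)


# Old lookup: the full voice name is never a key, so the default is returned.
old_result = LANG_CODES.get("jf_alpha", "jf_alpha")

# New lookup: 'j' is a key, so the language name is returned.
new_result = describe_voice_language("jf_alpha")
```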
d42031b to 9925943
Author: Hey, @chimosky @mebinthattil
Problem
- setup_kokoro() hardcoded lang_code='a': selecting a Japanese, Chinese, Spanish, etc. voice still ran text through the English G2P, producing bad output
- set_kokoro_voice() only updated the voice name, never switched the underlying pipeline
- audio_chunk was dereferenced without a None check, causing AttributeError on quiet chunks
- appsrc was referenced in the except block before being assigned → potential NameError
- LANG_CODES.get(voice, voice) used the full voice name (e.g. jf_alpha) as a dict key; it should be voice[0] (j), which is the actual key

Changes
- speech.py: add kokoro_pipelines dict and shared kokoro_model; lazy-init per-language pipelines in background threads reusing the loaded KModel; dispatch to correct pipeline in _stream_kokoro_audio; fix null guard and appsrc init; pass repo_id explicitly
- kokoro/pipeline.py: fix LANG_CODES.get(voice[0], voice) in language-mismatch warning