Commit f87dc3b

Merge pull request #146 from danielferr85/main
Fix #116 for multispeaker for Piper voice
2 parents 1836cf4 + d0e75a3 commit f87dc3b

3 files changed

Lines changed: 8 additions & 3 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # RKLLama: LLM Server and Client for Rockchip 3588/3576

-### [Version: 0.0.66](#New-Version)
+### [Version: 0.0.67](#New-Version)

 Video demo ( version 0.0.1 ):

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "rkllama"
-version = "0.0.66"
+version = "0.0.67"
 authors = [
   { name="NotPunchnox", email="punchnoxpro@gmail.com" },
   { name="TomJacobsUK", email="tom@tomjacobs.co.uk" },

src/rkllama/api/models/audio/piper.py

Lines changed: 6 additions & 1 deletion
@@ -206,8 +206,9 @@ def phoneme_ids_to_audio(
         )

         # Get the encoder outputs
+        g = None  # In case of Multispeaker Voice
         if speaker_id is not None:
-            z, y_mask, _ = encoder_output
+            z, y_mask, g = encoder_output
         else:
             z, y_mask = encoder_output

@@ -241,6 +242,10 @@ def phoneme_ids_to_audio(
             # Construct inputs for RKNN decoder model
             inputs_chunk = [zc.astype(np.float32), yc.astype(np.float32)]

+            # For multispeaker models, we need to add the channel info generated by the encoder
+            if g is not None:
+                inputs_chunk.append(g)
+
             # Inference RKNN (decoder) of the chunk
             result = self.session_rknn.inference(inputs=inputs_chunk, data_format="nchw")
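The two hunks in `piper.py` work together: the first keeps the third encoder output instead of discarding it, and the second forwards it to the decoder only when it exists. A minimal standalone sketch of that pattern follows. The helper names `unpack_encoder_output` and `build_decoder_inputs` are hypothetical and do not appear in the repository, placeholder strings stand in for the real arrays, and the `astype(np.float32)` conversion and the RKNN inference call are left out.

```python
from typing import Any, List, Optional, Sequence, Tuple


def unpack_encoder_output(
    encoder_output: Sequence[Any], speaker_id: Optional[int]
) -> Tuple[Any, Any, Optional[Any]]:
    """Return (z, y_mask, g); g stays None for single-speaker voices."""
    g = None  # default, as in the diff: no third output without a speaker
    if speaker_id is not None:
        # Multispeaker case: the encoder is assumed to return three outputs.
        z, y_mask, g = encoder_output
    else:
        z, y_mask = encoder_output
    return z, y_mask, g


def build_decoder_inputs(zc: Any, yc: Any, g: Optional[Any]) -> List[Any]:
    """Assemble the decoder input list, appending g only when it exists."""
    inputs_chunk = [zc, yc]
    if g is not None:
        inputs_chunk.append(g)
    return inputs_chunk


# Multispeaker voice: three encoder outputs, three decoder inputs.
z, y_mask, g = unpack_encoder_output(("z", "mask", "g"), speaker_id=0)
multi_inputs = build_decoder_inputs(z, y_mask, g)

# Single-speaker voice: two encoder outputs, two decoder inputs.
z, y_mask, g = unpack_encoder_output(("z", "mask"), speaker_id=None)
single_inputs = build_decoder_inputs(z, y_mask, g)
```

Initialising `g = None` before the branch is what lets the later `if g is not None` check run safely for single-speaker voices, where the third output is never assigned.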
