Toucan Questions

I have a few questions that I hope will not take much of your time.

- Is there support for IPA or some other phonetic notation, for words that are mispronounced or that need a specific pronunciation?
- Is there a way to add specific lengths of silence, e.g. [[slnc 2000]], where 2000 is the duration in milliseconds?
- How does one add emphasis or emotion?
- Are there any scripts that can be used for fine-tuning a single voice? Ideally a simple one with a GUI.
- Can you utilize TensorBoard to view the training logs and see what is likely the best checkpoint?
- Can you provide some guidance on what is different about my voice such that **no cloning software is effective** (I've tried many)? The only exception is post-processing with RVC, but again, that is post-processing.
[My_Voice_mp3.zip](https://github.com/user-attachments/files/17217703/My_Voice_mp3.zip)
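
To make the silence question concrete, here is a minimal sketch of the behavior I have in mind. It is not ToucanTTS code: `split_silence_markers` and `silence_samples` are hypothetical helpers, and the 24000 Hz sample rate is only an assumed default. The idea is to split the input at each `[[slnc N]]` marker, synthesize the text segments as usual, and insert N milliseconds of silence between them.

```python
import re

# Matches markers such as [[slnc 2000]]; the number is a duration in milliseconds.
SLNC = re.compile(r"\[\[slnc\s+(\d+)\]\]")


def split_silence_markers(text):
    """Split text into ("text", str) and ("silence", milliseconds) segments.

    Hypothetical helper for illustration only, not part of any TTS toolkit.
    """
    segments = []
    pos = 0
    for match in SLNC.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            segments.append(("text", chunk))
        segments.append(("silence", int(match.group(1))))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


def silence_samples(ms, sample_rate=24000):
    """Number of zero-valued samples needed for `ms` milliseconds of silence.

    The default sample rate is an assumption, not a documented value.
    """
    return ms * sample_rate // 1000
```

Each text segment would then go through the synthesizer, with a block of zeros of length `silence_samples(ms)` concatenated wherever a silence segment appears.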

I'm ultimately looking for a clone close enough that I could fool myself.  I get pretty close when I use RVC.

I tried the HuggingFace space (https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS), but the clone from every sample I uploaded sounded like a different person.
I have approximately 1.5 hours of good-quality audio, similar to the attached sample, so I can fine-tune if needed.


Toucan Questions #195
