Description
Self Checks
- I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find any relevant information that meets my needs.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- Please do not modify this template :) and fill in all the required fields.
1. Is this request related to a challenge you're experiencing? Tell us your story.
Thanks for sharing this amazing work!
I have to say that I was lured into this repo by the video on X and thought about using it for my own avatar chat. However, I can't find any references to the model returning visemes as well as sound.
Visemes (phoneme + duration pairs) are key to building real-time avatars, at least to my knowledge.
Is there any chance this data could be obtained from the model during generation?
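To make the request concrete, here is a minimal sketch of the kind of data I have in mind. Everything in it is an assumption for illustration: the `TimedPhoneme` and `VisemeEvent` types, the `to_viseme_timeline` helper, and the tiny `PHONEME_TO_VISEME` table are hypothetical and not part of this repo. It only shows how timed phonemes, if the model could emit them, might be turned into a viseme timeline for an avatar.

```python
from dataclasses import dataclass

# Hypothetical phoneme-to-viseme table (ARPAbet-style labels, illustrative only).
PHONEME_TO_VISEME = {
    "HH": "open", "AH": "open", "L": "tongue", "OW": "round",
    "M": "closed", "B": "closed", "P": "closed",
}


@dataclass
class TimedPhoneme:
    """A phoneme with its duration, as a TTS model might hypothetically report it."""
    phoneme: str
    duration_ms: int


@dataclass
class VisemeEvent:
    """A mouth shape held over a time window on the audio timeline."""
    viseme: str
    start_ms: int
    end_ms: int


def to_viseme_timeline(phonemes):
    """Convert timed phonemes into viseme events, merging adjacent repeats."""
    timeline = []
    t = 0
    for p in phonemes:
        # Unknown phonemes fall back to a neutral mouth shape.
        viseme = PHONEME_TO_VISEME.get(p.phoneme, "neutral")
        end = t + p.duration_ms
        if timeline and timeline[-1].viseme == viseme:
            # Same mouth shape as the previous event: extend it.
            timeline[-1].end_ms = end
        else:
            timeline.append(VisemeEvent(viseme, t, end))
        t = end
    return timeline


# Made-up timings for the word "hello".
hello = [
    TimedPhoneme("HH", 60),
    TimedPhoneme("AH", 90),
    TimedPhoneme("L", 70),
    TimedPhoneme("OW", 140),
]

for event in to_viseme_timeline(hello):
    print(event)
```

An avatar renderer would then only need to play the audio and switch mouth shapes at each `start_ms`.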
2. What is your suggested solution?
Actually, I am not sure this can be achieved, since, as far as I can tell, this model does not work from explicit phonemes.
3. Additional context or comments
No response
4. Can you help us with this feature?
- I am interested in contributing to this feature.