I have tried your models (voicebox and this one), and vall-e-2 sounds more natural, but there are a lot of misspellings in the generated speech. Is that because of the dataset? Have you tried training voicebox on Libriheavy?