Cannot reshape tensor of 0 elements error during parsing part of transcribe & process #183

Would be super appreciative of any insight. I am very out of my depth with this and feel out of place even posting on GitHub. A while back I blindly followed along with one of Jarod's tutorials to clone a voice. I copied all the settings and used a pretty good 10-minute piece of speaking audio, and the results were amazing, as good as ElevenLabs (just slower!).

I tried to reproduce this with a different voice, but now I'm running into problems. When I do the initial transcription part of training in Whisper, I keep getting this error after it reaches 100% and begins to parse: "RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous"

I've managed to get past this a few times, but I'm not sure how: basically just by resaving my wav file and trying different sample rates. From what I've read on the wiki pages this shouldn't actually matter, so maybe I've just been brute forcing it. The audio isn't corrupt as far as I can tell: it plays back fine in media players, the waveform looks healthy, etc.
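Since resaving the wav sometimes changes the outcome, one thing that may be worth ruling out is that the file header reports something unexpected (zero frames, an odd sample rate, or a non-PCM encoding). Below is a minimal sketch of such a check using only Python's standard `wave` module. The helper names `describe_wav` and `looks_empty` and the 0.1 second threshold are made up for illustration, and the idea that the 0-element tensor comes from empty audio is only an assumption, not something confirmed for this error.

```python
import wave


def describe_wav(path):
    """Return basic header facts for a PCM WAV file (standard library only)."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            "frames": frames,
            "duration_s": frames / rate if rate else 0.0,
        }


def looks_empty(info, min_seconds=0.1):
    """Flag files whose header reports little or no audio.

    min_seconds is an arbitrary illustrative threshold, not a value
    taken from Whisper or Tortoise.
    """
    return info["frames"] == 0 or info["duration_s"] < min_seconds


if __name__ == "__main__":
    import sys

    # Usage: python check_wav.py voice1.wav voice2.wav ...
    for path in sys.argv[1:]:
        info = describe_wav(path)
        status = "possibly empty" if looks_empty(info) else "ok"
        print(path, info, status)
```

Note that `wave` only reads uncompressed PCM data and raises `wave.Error` for other encodings, so an exception here would itself be a hint that the original and resaved files are encoded differently.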

When I do manage to get beyond this point, all my models are terrible, nothing like the successful model I made with a different voice the first time. Is this tensor error likely to be the reason? Could something still be affecting the way Tortoise interacts with the wav files and transcription txt when it actually trains?
