I used the pre-trained model to synthesize the test sentences, but the output sample WAVs sound like noise. Is that expected?