Question about MelSpectrogram #287
Replies: 1 comment
-
Hi @ductho9799, I think this was a mistake made when training the HiFi-GAN model: torchaudio.transforms.MelSpectrogram defaults to sample_rate=16000 (16 kHz), and that default is now baked into the pipeline. The correct approach would of course be to compute the mel spectrograms from the 24 kHz audio using the true sample rate of 24 kHz, so that HiFi-GAN outputs a 24 kHz waveform.
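To illustrate why the setting matters, here is a minimal stdlib-only sketch of where the 80 mel filter centers land for each sample rate. It assumes the HTK mel formula with f_min = 0 and f_max = sample_rate / 2, which I believe are the torchaudio defaults, but treat that as an assumption. The helper names (hz_to_mel, mel_to_hz, mel_centers) are made up for this example and are not part of torchaudio or the StyleTTS2 code.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel scale (assumed to match torchaudio's default mel_scale).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centers(sample_rate, n_mels=80, f_min=0.0):
    # Center frequencies of n_mels triangular filters spaced evenly on the
    # mel scale between f_min and the Nyquist frequency (sample_rate / 2).
    f_max = sample_rate / 2
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * i) for i in range(1, n_mels + 1)]


# Nominal centers for each setting of sample_rate.
centers_16k = mel_centers(16000)
centers_24k = mel_centers(24000)

# A filterbank built for 16 kHz assumes the FFT bins span 0 to 8 kHz.
# Applied to audio that is really sampled at 24 kHz, the same bins span
# 0 to 12 kHz, so every center is effectively stretched by 24000 / 16000.
effective_centers = [c * 24000 / 16000 for c in centers_16k]

for name, centers in [
    ("sample_rate=16000 (nominal)", centers_16k),
    ("sample_rate=16000 on 24 kHz audio (effective)", effective_centers),
    ("sample_rate=24000", centers_24k),
]:
    print(f"{name}: first={centers[0]:.1f} Hz, last={centers[-1]:.1f} Hz")
```

Under these assumptions, the filters the pretrained models saw during training sit at different frequencies than the ones sample_rate=24000 would produce. So switching the value only at inference would presumably feed the model features it was not trained on, and the change would need to be paired with retraining.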
-
In the StyleTTS2 paper, the datasets were resampled to 24 kHz. But in the StyleTTS2 code, the mel spectrogram is computed with:
to_mel = torchaudio.transforms.MelSpectrogram(n_mels=80, n_fft=2048, win_length=1200, hop_length=300)
This uses the default sample_rate=16000. If I change sample_rate to 24000, will it affect the model results?