❓ Questions and Help
I am encountering consistent issues where Silero VAD fails to detect speech or exhibits significantly lower sensitivity when processing female voices compared to male voices, specifically when the audio is sampled at 8kHz.
I am working with telephone audio: PCM int16 at 8kHz. When I resample each chunk from 8kHz to 16kHz before passing it to the model, detection works adequately for both genders. Running the model directly on the 8kHz audio, however, seems to disproportionately hurt detection of female speech, likely due to the loss of higher-frequency components, which are more prominent in female vocal ranges.
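
For reference, here is a minimal sketch of the kind of per-chunk 8kHz to 16kHz upsampling described above. It is not my exact code: it uses plain linear interpolation from the standard library purely for illustration, and the function name `upsample_2x_linear` and the `prev_sample` state are hypothetical. A proper polyphase resampler (for example `scipy.signal.resample_poly`) would probably be a better choice in practice.

```python
from array import array
from typing import Tuple


def upsample_2x_linear(chunk: bytes, prev_sample: int = 0) -> Tuple[bytes, int]:
    """Upsample one chunk of int16 PCM from 8kHz to 16kHz (illustrative only).

    Assumes the bytes are in the machine's native byte order (little-endian
    on most hardware). `prev_sample` is the last input sample of the previous
    chunk, so interpolation stays continuous across chunk boundaries.
    """
    samples = array("h")
    samples.frombytes(chunk)

    out = array("h")
    prev = prev_sample
    for s in samples:
        # Insert the midpoint between the previous and current sample,
        # then the current sample itself: 2 output samples per input sample.
        out.append((prev + s) // 2)
        out.append(s)
        prev = s

    # Return the upsampled bytes and the state needed for the next chunk.
    return out.tobytes(), prev


if __name__ == "__main__":
    state = 0
    # Stand-in for chunks arriving from the telephone stream.
    for raw in (array("h", [0, 100, 200]).tobytes(), array("h", [400]).tobytes()):
        upsampled, state = upsample_2x_linear(raw, state)
        print(len(raw) // 2, "samples in ->", len(upsampled) // 2, "samples out")
```

Each output chunk has twice as many samples as its input, and it is this upsampled chunk that gets passed to the VAD with the sampling rate set to 16000.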