HumAware-VAD: Humming-Aware Voice Activity Detection

Overview

HumAware-VAD is a fine-tuned version of the Silero-VAD model, trained to distinguish humming from actual speech. Standard Voice Activity Detection (VAD) models, including Silero-VAD, often misclassify humming as speech, which leads to inaccurate speech segmentation. HumAware-VAD addresses this by fine-tuning on a custom dataset, HumSpeechBlend, so that speech is detected more accurately when humming is present.

Demo

demo2.mp4

🎯 Purpose

The primary goals of HumAware-VAD are to:

Reduce false positives where humming is mistakenly detected as speech.
Enhance speech segmentation accuracy in real-world applications.
Improve VAD performance for tasks involving music, background noise, and vocal sounds.
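
As an illustration of what speech segmentation means here, the sketch below turns per-frame speech probabilities into time segments. It is not code from this repository: the function name probs_to_segments, the 32 ms frame length, the 0.5 threshold, and the 96 ms minimum duration are all assumptions chosen for the example. With a humming-aware model, hummed frames would be expected to score below the threshold and so stay out of the segments.

```python
from typing import List, Sequence, Tuple


def probs_to_segments(
    probs: Sequence[float],
    frame_ms: float = 32.0,       # assumed frame length, not taken from the repo
    threshold: float = 0.5,       # assumed decision threshold
    min_speech_ms: float = 96.0,  # assumed minimum duration for a segment
) -> List[Tuple[float, float]]:
    """Group consecutive frames scored at or above `threshold` into
    (start_seconds, end_seconds) segments, dropping runs shorter than
    `min_speech_ms`."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    # A trailing -inf sentinel closes a run that reaches the end of the input.
    for i, p in enumerate(list(probs) + [float("-inf")]):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            if (i - start) * frame_ms >= min_speech_ms:
                segments.append((start * frame_ms / 1000.0, i * frame_ms / 1000.0))
            start = None
    return segments


# Frames 1 to 3 form one run; the single frame at index 5 is too short to keep.
example = [0.1, 0.9, 0.8, 0.95, 0.2, 0.7, 0.1]
print(probs_to_segments(example))
```

The minimum-duration filter is a common post-processing step that removes very short blips. It does not replace the model's own ability to score humming low.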

🗂️ Model Details

Base Model: Silero-VAD
Fine-tuning Dataset: HumSpeechBlend
Format: JIT (TorchScript)
Framework: PyTorch
Inference Speed: Real-time
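
Because the model ships as a TorchScript (JIT) file, it can in principle be loaded with torch.jit.load and scored chunk by chunk. The sketch below is a hedged example, not the repository's own loader. It assumes the fine-tuned model keeps Silero-VAD's call convention, model(chunk, sample_rate), with 512-sample chunks at 16 kHz. The helper names and the model path are hypothetical.

```python
from typing import Callable, Iterator, List, Sequence


def iter_chunks(samples: Sequence[float], chunk_size: int = 512) -> Iterator[List[float]]:
    """Yield fixed-size chunks of audio samples, zero-padding the last one."""
    for start in range(0, len(samples), chunk_size):
        chunk = list(samples[start:start + chunk_size])
        chunk.extend([0.0] * (chunk_size - len(chunk)))
        yield chunk


def load_torchscript_scorer(model_path: str, sample_rate: int = 16000) -> Callable[[List[float]], float]:
    """Wrap a TorchScript VAD file as a function mapping a chunk to a probability.

    Assumption: the model accepts (chunk_tensor, sample_rate) like Silero-VAD
    and returns a single-element tensor.
    """
    import torch  # imported lazily so the chunking helper works without PyTorch

    model = torch.jit.load(model_path, map_location="cpu")
    model.eval()

    def score(chunk: List[float]) -> float:
        with torch.no_grad():
            out = model(torch.tensor(chunk, dtype=torch.float32), sample_rate)
        return float(out.item())

    return score


def speech_probabilities(
    samples: Sequence[float],
    score: Callable[[List[float]], float],
    chunk_size: int = 512,
) -> List[float]:
    """Score every chunk of `samples` with the given scoring function."""
    return [score(chunk) for chunk in iter_chunks(samples, chunk_size)]
```

A possible use, with a placeholder path: score = load_torchscript_scorer("path/to/model.jit"), then probs = speech_probabilities(samples, score), where samples is mono 16 kHz audio as floats in the range -1 to 1.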

🚀 Using HumAware-VAD with FastRTC

You can integrate HumAware-VAD with FastRTC for real-time voice activity detection in streaming applications.

Installation

pip install humaware-vad

Clone this Repository

git clone https://github.com/CuriousMonkey7/HumAwareVad.git
cd HumAwareVad

Run the script:

python app.py

⚠️ Limitations

The model may fail to detect speech if the user speaks too softly.
Works best for detecting "mhm" humming sounds.
May also work for sounds like "la la la" or "da da da", but with varying accuracy.

📄 Citation

If you use this model, please cite it accordingly.

@misc{HumAwareVAD2025,
  author = {Sourabh Saini},
  title = {HumAware-VAD: Humming-Aware Voice Activity Detection},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/CuriousMonkey7/HumAware-VAD}
}
