Speech enhancement pipeline removing background noise from audio signals, optimized for edge devices.
- Removes background noise from audio recordings
- Optimized for CPU-only inference (<100ms latency)
- Works on resource-constrained devices (earbuds, wearables, IoT)
- Converts to ONNX for edge deployment
Speech enhancement is crucial for:
- Clear voice calls in noisy environments
- Voice assistant reliability
- Hearing aids and audio devices
- IoT and embedded systems
- Spectral subtraction algorithm
- Spectrogram visualization
- ONNX conversion (75% size reduction)
- CPU-only real-time inference
- PyTorch + Librosa pipeline
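
The list above does not say how the 75% size reduction is obtained. One reading that fits the number is storing float32 weights as int8 (4 bytes down to 1 byte per value), which is what a quantization step typically does. The sketch below illustrates only that arithmetic with NumPy; `quantize_int8` and `dequantize` are hypothetical helpers, not functions from this repository.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float32 weights to int8 (illustrative)."""
    scale = float(np.max(np.abs(weights))) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate float32 weights."""
    return q.astype(np.float32) * scale


# Stand-in weight matrix; a real model would supply its own tensors.
weights = np.random.default_rng(0).standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(weights)

reduction = 1.0 - q.nbytes / weights.nbytes
print(f"size reduction: {reduction:.0%}")  # prints "size reduction: 75%"
```

The rounding error per weight is bounded by half the scale, so the trade-off is a small loss of precision for a fourfold smaller weight buffer.
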
pip install -r requirements.txt

Or:

pip install torch torchaudio librosa soundfile numpy matplotlib

jupyter notebook notebook_inference.ipynb

python scripts/inference.py --input input_audio/noisy_speech.wav --output output_audio/clean_speech.wav

python scripts/convert_to_onnx.py

| Metric | Value |
|---|---|
| Inference Latency | <100ms |
| Model Size Reduction | 75% |
| Platform | CPU-only |
| Target Devices | Earbuds, IoT, Wearables |
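
The table does not state how the latency figure was measured or for what audio length. A minimal way to check a number like this on a target CPU is to time repeated calls after a short warm-up and report the median. The helper below is a sketch; `measure_latency_ms` is a hypothetical name, and `fn` stands in for whatever enhancement function is being timed.

```python
import statistics
import time


def measure_latency_ms(fn, *args, warmup=3, runs=20):
    """Return the median wall-clock time of fn(*args) in milliseconds."""
    # Warm-up calls absorb one-off costs such as lazy imports and cache fills.
    for _ in range(warmup):
        fn(*args)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        timings.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to scheduler hiccups than the mean.
    return statistics.median(timings)
```

For example, `measure_latency_ms(enhance, chunk)` would time a hypothetical `enhance` function on one audio chunk; the result only means something alongside the chunk length and the hardware it ran on.
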
Noisy Audio → STFT → Spectrogram → Spectral Subtraction → ISTFT → Clean Audio
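
The flow above can be sketched in a few lines of NumPy. This is an illustration, not the code in `scripts/inference.py`: it uses plain NumPy rather than PyTorch and Librosa, estimates the noise spectrum from the first few frames (assumed to contain no speech), and subtracts magnitudes with a small spectral floor. The parameter values (`n_fft`, `hop`, `noise_frames`, `floor`) are assumptions chosen for the example.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """STFT with a periodic Hann window; x must be longer than n_fft samples."""
    window = np.hanning(n_fft + 1)[:-1]
    x = np.pad(x, n_fft, mode="reflect")  # so edge samples are fully covered
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (frames, n_fft // 2 + 1)


def istft(spec, length, n_fft=512, hop=128):
    """Inverse STFT by weighted overlap-add, trimmed to `length` samples."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft:n_fft + length]  # drop the padding added in stft()


def spectral_subtraction(noisy, noise_frames=10, floor=0.02, n_fft=512, hop=128):
    """Subtract an average noise magnitude spectrum, keeping the noisy phase."""
    spec = stft(noisy, n_fft, hop)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the leading frames are noise only.
    noise_mag = mag[:noise_frames].mean(axis=0, keepdims=True)
    # The floor keeps magnitudes from going negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return istft(clean_mag * np.exp(1j * phase), len(noisy), n_fft, hop)
```

Calling `spectral_subtraction(samples)` on a mono float array returns an array of the same length. The leading-frames noise estimate is the simplest choice and breaks down when speech starts immediately or the noise changes over time.
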
edge-ai-speech-enhancement/
├── notebook_inference.ipynb # Demo notebook
├── README.md # This file
├── requirements.txt # Dependencies
├── LICENSE # MIT License
└── scripts/
    ├── inference.py # Main enhancement
    ├── convert_to_onnx.py # ONNX conversion
    └── quantize_model.py # Model quantization
- Audio signal processing (FFT/STFT)
- Spectral subtraction algorithms
- Model optimization for edge devices
- ONNX conversion and quantization
- Low-latency inference
Together these show a complete audio AI pipeline!
GitHub: github.com/satzgits/edge-ai-speech-enhancement