Audio-WestlakeU repositories

FS-EEND

Public

The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 20…

end-to-end pytorch speaker-diarization self-attention long-form online-inference frame-wise

Python • 17 • 181 • 9 • 0 • Updated May 7, 2026

VING

Public

Official implementation of VING: Variational Bayesian Inference with Multi-Aspect Neural Guidance for Speech Dereverberation

MIT License • 1 • 4 • 0 • 0 • Updated Apr 17, 2026

FN-SSL

Public

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]

speech narrow-band sound-source-localization microphone-array-generalization

Python • 19 • 156 • 6 • 0 • Updated Mar 10, 2026

VINP

Public

Official PyTorch implementation of 'VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR I…

speech rir asr dereverberation

Python • MIT License • 6 • 35 • 1 • 0 • Updated Feb 23, 2026

CleanMel

Public

PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".

Python • Apache License 2.0 • 12 • 94 • 2 • 0 • Updated Feb 2, 2026

audiossl

Public

A library built for easier audio self-supervised training and downstream task evaluation

pytorch audio-classification audioset nsynth speech-commands audio-datasets self-supervised-learning voxceleb1 urbansound8k pytorch-lightning

Python • Other • 11 • 141 • 6 • 1 • Updated Sep 25, 2025

RealMAN

Public

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]

multi-channel speech-enhancement microphone-array-processing doa-estimation audio-datasets sound-source-localization microphone-audio-capture real-world-datasets

Python • 16 • 173 • 4 • 0 • Updated Apr 29, 2025

RVAE-EM

Public

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [I…

vae bayesian-inference speech-processing speech-enhancement dereverberation

Python • MIT License • 5 • 52 • 1 • 0 • Updated Mar 6, 2025

NBSS

Public

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation

speech pytorch multi-channel enhancement denoising separation dereverberation narrow-band full-band

Python • MIT License • 44 • 354 • 28 • 0 • Updated Jan 1, 2025

UMA-ASR

Public

This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).

speech-recognition asr

Shell • 6 • 35 • 1 • 0 • Updated Dec 17, 2024

SAR-SSL

Public

A Python implementation of "Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer" [T…

multi-channel microphone-array real-world-data room-acoustics conformer fine-tuning tdoa array-signal-processing self-supervised-learning downstream-tasks

Python • MIT License • 1 • 40 • 2 • 0 • Updated Oct 11, 2024

ATST-RCT

Public

ATST-RCT model for DCASE 2022 task4.

Python • 0 • 3 • 0 • 0 • Updated Sep 19, 2024

Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement-

Public

Python • MIT License • 0 • 5 • 0 • 0 • Updated Mar 12, 2024

pytorch_lightning_template_for_beginners

Public

A PyTorch template for beginners based on pytorch_lightning

Python • 7 • 50 • 0 • 0 • Updated Feb 1, 2024

FullSubNet

Public

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

audio reproducible-research paper speech pytorch band speech-processing noise-reduction denoising speech-separation

Python • MIT License • 161 • 607 • 41 • 3 • Updated Aug 19, 2023

McNet

Public

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

signal-processing pytorch speech-enhancement array-signal-processing multi-channel-speech-enhancement pytorch-lightning

Python • 17 • 130 • 3 • 0 • Updated Mar 24, 2023

RCT

Public

This repo gives the code for the official implementation of RCT.

Python • 1 • 14 • 0 • 0 • Updated Jun 28, 2022

Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

Public

Python • MIT License • 9 • 7 • 0 • 0 • Updated Sep 30, 2021

Audio-WestlakeU.github.io

Public

The Audio and Signal Information Processing Lab at Westlake University focuses on speech processing algorithms

MIT License • 1 • 3 • 0 • 0 • Updated Jul 8, 2021

Narrowband_DeepFiltering

Public

Python • 6 • 19 • 0 • 0 • Updated Apr 1, 2020

RTF_InterFrameSpecSub

Public

MATLAB • 3 • 1 • 0 • 0 • Updated Apr 1, 2020

RS_noisePSD

Public

MATLAB • 0 • 1 • 0 • 0 • Updated Apr 1, 2020

DP_RTF_SSL

Public

MATLAB • 3 • 4 • 1 • 0 • Updated Apr 1, 2020

bss_ctf_lasso

Public

MATLAB • 3 • 5 • 0 • 0 • Updated Apr 1, 2020

dereverb_ctf_nonneg

Public

MATLAB • 2 • 1 • 0 • 0 • Updated Apr 1, 2020

BSS_CTF_EM

Public

MATLAB • 1 • 0 • 0 • 0 • Updated Apr 1, 2020

LSTM-noisePSD

Public

Python • 2 • 8 • 0 • 0 • Updated Apr 1, 2020

33 repositories

ATST-SED

Rec-RIR

Mel-McNet

FS-EEND

VING

FN-SSL

VINP

CleanMel

audiossl

RealMAN

RVAE-EM

NBSS

UMA-ASR

SAR-SSL

ATST-RCT

Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement-

pytorch_lightning_template_for_beginners

FullSubNet

McNet

RCT

Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

Audio-WestlakeU.github.io

Narrowband_DeepFiltering

RTF_InterFrameSpecSub

RS_noisePSD

DP_RTF_SSL

bss_ctf_lasso

dereverb_ctf_nonneg

BSS_CTF_EM

LSTM-noisePSD
