
v0.1.0-alpha

Pre-release
@a-n-rose a-n-rose released this 20 Jul 14:28
· 417 commits to master since this release

An experimental Python framework for sound visualization, analysis, augmentation, and filtering, as well as machine learning.

This release provides basic functionality for preparing audio datasets (e.g. formatting them), filtering audio, visualizing audio and its features (signal, stft, powspec, fbank, mfcc), augmenting audio for machine learning, and building and applying basic neural networks for simple speech recognition, speech classification (e.g. by language, gender or sex, or emotion), and denoising.
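
To give a rough idea of what two of the listed features (stft and powspec) represent, here is a minimal NumPy sketch. It is not this framework's API: the function name `stft_powspec`, its parameters, and the 25 ms window / 10 ms hop defaults are assumptions chosen only for illustration.

```python
import numpy as np

def stft_powspec(signal, sr, win_ms=25, hop_ms=10):
    """Hypothetical helper (not part of this framework): returns the
    short-time Fourier transform and power spectrum of a 1-D signal."""
    win = int(sr * win_ms / 1000)   # window length in samples
    hop = int(sr * hop_ms / 1000)   # hop length in samples
    window = np.hanning(win)        # Hann window to reduce spectral leakage
    # Number of full frames that fit in the signal (no padding)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([
        signal[i * hop : i * hop + win] * window
        for i in range(n_frames)
    ])
    stft = np.fft.rfft(frames, axis=1)  # complex spectrum per frame
    powspec = np.abs(stft) ** 2         # power per frame and frequency bin
    return stft, powspec

# Example: a one-second 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
stft, powspec = stft_powspec(tone, sr)

# The strongest frequency bin, averaged over frames, converted to Hz
win = stft.shape[1] * 2 - 2
peak_bin = int(powspec.mean(axis=0).argmax())
peak_hz = peak_bin * sr / win
```

Features such as fbank and mfcc are typically derived from a power spectrum like this one by applying a mel filterbank and, for mfcc, a discrete cosine transform; those steps are omitted here.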

This release may still be a bit buggy.

keywords:
audio file format conversion, dataset preparation, wiener filter, convolutional neural networks, cnn, conv, lstm, long short-term memory network, cnn+lstm, cnnlstm, convlstm, autoencoder, denoiser, speech recognition, environment classification, scene classification, language classification, denoising, augmentation, feature extraction, mel-filterbank energies, fbank, mel-frequency cepstral coefficients, mfcc, short-time fourier transform, stft, raw signal.