DillWave

DillWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech through iterative refinement. The speech can be controlled by providing a conditioning signal (e.g. a log-scaled Mel spectrogram). The model and architecture are described in the paper "DiffWave: A Versatile Diffusion Model for Audio Synthesis".

Credit to the original repo here.

Recommended Requirements

An NVIDIA GPU in the RTX 30XX to 40XX range.

For training, 16+ GB of VRAM is recommended. For inference, at least 4 GB of VRAM is recommended.
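To see how a given machine compares with these recommendations, a small check along the following lines may help. This is only a sketch: `classify_vram` and `detect_vram_bytes` are hypothetical helper names, not part of dillwave, and the thresholds are simply the 16 GB and 4 GB figures above, compared after rounding to the nearest GiB because cards usually report slightly less than their nominal size.

```python
from typing import Optional

GIB = 1024 ** 3


def classify_vram(total_bytes: int) -> str:
    """Compare a VRAM size with the recommendations above (hypothetical helper)."""
    # Round to the nearest GiB: a nominal 16 GB card often reports a bit less.
    gib = round(total_bytes / GIB)
    if gib >= 16:
        return "training and inference"
    if gib >= 4:
        return "inference only"
    return "below recommendation"


def detect_vram_bytes() -> Optional[int]:
    """Return total memory of CUDA device 0, or None if it cannot be queried."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    vram = detect_vram_bytes()
    if vram is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    else:
        print(f"{vram / GIB:.1f} GiB VRAM: {classify_vram(vram)}")
```

Only the first CUDA device is inspected; on multi-GPU machines the device index would need to be adjusted.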

Install

First install PyTorch (the GPU build is recommended). You also need Python; version 3.10.X is recommended for dillwave.

From GitHub:

git clone https://github.com/dillfrescott/dillwave
pip install -e dillwave

or

pip install git+https://github.com/dillfrescott/dillwave

You need Git installed for either of these "From GitHub" install methods to work.

Training

python -m dillwave.preprocess /path/to/dir/containing/wavs # expects 48000 Hz, 1 channel WAVs (clips of about 8 seconds each recommended)
python -m dillwave /path/to/model/dir /path/to/dir/containing/wavs

# in another shell to monitor training progress:
tensorboard --logdir /path/to/model/dir --bind_all
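
Before running the preprocess step, it can be useful to confirm that the clips match the expected format (48000 Hz, 1 channel). The sketch below uses only Python's standard `wave` module; `check_wav`, `duration_seconds`, and `check_dir` are hypothetical helper names, not part of dillwave, and searching subdirectories recursively is an assumption of this sketch rather than documented behavior of the preprocess step.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 48000      # Hz, from the training note above
EXPECTED_CHANNELS = 1      # mono


def check_wav(path):
    """Return a list of format problems for one WAV file (empty list means OK)."""
    problems = []
    try:
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            channels = wav.getnchannels()
    except (wave.Error, EOFError) as exc:
        # The stdlib reader may reject non-PCM WAV variants.
        return [f"unreadable by the wave module: {exc}"]
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE}")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    return problems


def duration_seconds(path):
    """Clip length in seconds (about 8 seconds per clip is recommended)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_dir(wav_dir):
    """Map each problematic file under wav_dir to its list of problems."""
    report = {}
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[str(path)] = problems
    return report
```

An empty dictionary from `check_dir("/path/to/dir/containing/wavs")` means every readable clip matched the expected sample rate and channel count.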

Inference CLI

python -m dillwave.inference /path/to/model --spectrogram_path /path/to/spectrogram -o output.wav [--fast]
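
For batch use, the same CLI call can be driven from a script. The following is a minimal sketch that only wraps the command shown above with `subprocess`; `build_inference_command` and `run_inference` are hypothetical helper names, not part of dillwave, and no flags beyond those in the command above are assumed.

```python
import subprocess
import sys


def build_inference_command(model, spectrogram, output="output.wav", fast=False):
    """Assemble the argument list for the inference CLI shown above."""
    cmd = [
        sys.executable, "-m", "dillwave.inference", str(model),
        "--spectrogram_path", str(spectrogram),
        "-o", str(output),
    ]
    if fast:
        cmd.append("--fast")
    return cmd


def run_inference(model, spectrogram, output="output.wav", fast=False):
    """Run the CLI in a subprocess; raises CalledProcessError on failure."""
    subprocess.run(build_inference_command(model, spectrogram, output, fast), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.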
