marduk191/ComfyUI-LavaSR
ComfyUI-LavaSR

ComfyUI custom nodes for LavaSR — a fast speech enhancement and audio super-resolution model that upsamples degraded audio to 48 kHz with noise reduction.

Key LavaSR specs:

  • 5000× real-time on GPU, ~60× on CPU
  • ~50 MB model, ~500 MB VRAM
  • Accepts input sample rates from 8 kHz to 48 kHz
  • Outputs 48 kHz enhanced audio
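
To put the real-time factors above in concrete terms, here is a small arithmetic sketch. It takes the quoted figures at face value and ignores model loading and file I/O, so treat the results as rough estimates; `processing_seconds` is a hypothetical helper, not part of LavaSR or these nodes.

```python
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Estimate wall-clock processing time from a real-time factor.

    A factor of 60 means 60 seconds of audio are processed per second.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


one_hour = 3600.0
# Figures quoted in the spec list: 5000x on GPU, about 60x on CPU.
print(f"GPU: {processing_seconds(one_hour, 5000):.2f} s")  # GPU: 0.72 s
print(f"CPU: {processing_seconds(one_hour, 60):.0f} s")    # CPU: 60 s
```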

Nodes

LavaSR Model Loader

Loads the LavaSR model from Hugging Face (YatharthS/LavaSR) or a local path. Downloads and caches the model on first use (~50 MB).

| Widget | Description |
| --- | --- |
| model_name | HF repo ID or local folder path |
| device | auto picks CUDA → MPS → CPU |

Output: LAVASR_MODEL
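
The `auto` device order can be pictured with the sketch below. It is an illustration of the CUDA → MPS → CPU fallback described above, not the node's actual code; `pick_device` and its boolean parameters are hypothetical. In a PyTorch environment the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Resolve a device name, falling back CUDA -> MPS -> CPU when 'auto'."""
    if requested != "auto":
        # An explicit choice is passed through unchanged.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Example: an Apple Silicon machine without CUDA.
print(pick_device("auto", cuda_available=False, mps_available=True))  # mps
```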


LavaSR Enhance Audio

Enhances a ComfyUI AUDIO tensor. Accepts any sample rate and channel count — audio is downmixed to mono and resampled to 16 kHz internally. Output is 48 kHz mono.

| Widget | Description |
| --- | --- |
| denoise | Run the denoiser stage before enhancement |
| batch_mode | Split audio into 1-second chunks (use for long files) |
| lr_cutoff_hz | Linkwitz-Riley crossover frequency in Hz (default 8000). Set to roughly half your source sample rate |

Inputs: LAVASR_MODEL, AUDIO
Output: AUDIO at 48 kHz
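
To illustrate what splitting into 1-second chunks means for `batch_mode`, the sketch below computes chunk boundaries as sample indices. It only shows the general idea; how LavaSR actually chunks, pads, or stitches audio is not documented here, and `chunk_bounds` is a hypothetical helper.

```python
from typing import List, Tuple


def chunk_bounds(num_samples: int, sample_rate: int,
                 chunk_seconds: float = 1.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for fixed-length chunks.

    The last chunk is shorter when the audio length is not a multiple
    of the chunk length.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


# 2.5 seconds of audio at the 16 kHz internal rate -> three chunks.
print(chunk_bounds(40000, 16000))
# [(0, 16000), (16000, 32000), (32000, 40000)]
```

Processing one chunk at a time keeps peak memory tied to the chunk length rather than the file length, which is why the option is suggested for long files.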


LavaSR Enhance Audio File

Convenience node that loads a file by path and enhances it in one step. It uses the model's native load_audio(), which configures the Linkwitz-Riley (LR) crossover automatically.

| Widget | Description |
| --- | --- |
| audio_file | Absolute path to a .wav file |
| denoise | Run the denoiser stage |
| batch_mode | Split into 1-second chunks for long files |
| input_sr | Sample rate of the source file (default 16000) |
| cutoff_hz | LR crossover in Hz; 0 = auto (half of input_sr) |

Inputs: LAVASR_MODEL
Output: AUDIO at 48 kHz
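
The `cutoff_hz = 0` convention can be sketched as follows. This mirrors the rule stated in the table (auto means half of `input_sr`, i.e. the Nyquist frequency of the source); it is not taken from the node's source code, and `resolve_cutoff_hz` is a hypothetical name.

```python
def resolve_cutoff_hz(cutoff_hz: int, input_sr: int) -> int:
    """Resolve the crossover frequency, treating 0 as 'auto'.

    In auto mode the cutoff is half of the input sample rate.
    """
    if input_sr <= 0:
        raise ValueError("input_sr must be positive")
    if cutoff_hz == 0:
        return input_sr // 2
    return cutoff_hz


print(resolve_cutoff_hz(0, 16000))  # 8000, matching the default input_sr
print(resolve_cutoff_hz(0, 8000))   # 4000, e.g. phone/call audio
```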


Installation

ComfyUI Manager (recommended)

Search for ComfyUI-LavaSR in the Manager and install.

Manual

cd ComfyUI/custom_nodes
git clone https://github.com/marduk191/ComfyUI-LavaSR.git
cd ComfyUI-LavaSR
pip install -r requirements.txt

Restart ComfyUI. The nodes appear under audio/LavaSR.


Example Workflows

Three ready-to-use workflows are in the workflows/ folder:

| File | Description |
| --- | --- |
| basic_enhance_file.json | Simplest: file path → enhance → preview |
| enhance_loaded_audio.json | Use ComfyUI's LoadAudio picker → enhance → preview |
| enhance_and_save.json | LoadAudio → enhance → SaveAudio (FLAC) |

Tips

  • TTS output: connect any TTS node's AUDIO output directly to LavaSR Enhance Audio. Leave lr_cutoff_hz at 8000.
  • Phone/call audio (8 kHz source): set lr_cutoff_hz to 4000.
  • Long files: enable batch_mode to avoid memory spikes.
  • Denoise only: not available through these nodes. Even with denoise = true, the enhance stage still runs, because LavaSR always applies bandwidth extension. To skip enhancement entirely, use the underlying Python API directly.

Credits

  • LavaSR by Yatharth Sharma — model, training, and architecture
  • Vocos — vocoder backbone
  • ComfyUI nodes by marduk191
