phaserblast/ComfyUI-DGXSparkSafetensorsLoader

ComfyUI-DGXSparkSafetensorsLoader

A ComfyUI model loader that uses the fastsafetensors library to perform a very fast, zero-copy load from storage to VRAM.
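As a rough illustration of the underlying library (not this node's actual code), the sketch below opens a checkpoint with the `fastsafe_open` context manager described in the fastsafetensors README and lists each tensor's shape and dtype. The function name `summarize_checkpoint`, the default device string, and the `nogds=True` setting are assumptions made for this example; check the call against the version of fastsafetensors you have installed.

```python
from typing import Dict, List, Tuple


def summarize_checkpoint(
    paths: List[str], device: str = "cuda:0"
) -> Dict[str, Tuple[Tuple[int, ...], str]]:
    """Return {tensor_name: (shape, dtype)} for the given .safetensors files.

    Hypothetical helper for illustration only; it is not part of this node.
    """
    # Imported lazily so this module can be imported without fastsafetensors.
    from fastsafetensors import fastsafe_open

    summary = {}
    # Assumption: nogds=True (no GPUDirect Storage) is the right setting here.
    with fastsafe_open(
        filenames=paths, nogds=True, device=device, debug_log=False
    ) as f:
        for key in f.keys():
            # Tensors are views into the loader's buffer, so they are only
            # inspected while the context manager is still open.
            tensor = f.get_tensor(key)
            summary[key] = (tuple(tensor.shape), str(tensor.dtype))
    return summary
```

A call such as `summarize_checkpoint(["model.safetensors"])` (the file name is a placeholder) would return the tensor summary without keeping any tensor alive after the `with` block exits.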

This is very experimental, and may destroy the universe. So please don't use it in a production environment under any circumstances.

On DGX Spark, fastsafetensors is a massive improvement over the Hugging Face safetensors library for loading AI models. The Hugging Face library works poorly on the DGX Spark because of the machine's architecture and memory design: models load very slowly and sometimes use up to 2x their size in memory while loading. As a result, large models can exceed the Spark's RAM capacity and fail to load, even when the model itself would fit in under half of the machine's RAM.
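To make the 2x figure concrete, here is a small back-of-the-envelope calculation. It assumes 128 GB of total memory on the DGX Spark (a figure not stated in this README) and uses the 60 GB model size from the example further down; `peak_load_fraction` is a hypothetical helper, and real peak usage also depends on whatever else is resident in memory.

```python
def peak_load_fraction(model_gb: float, total_gb: float, overhead_factor: float) -> float:
    """Fraction of total memory needed at the peak of loading."""
    return model_gb * overhead_factor / total_gb


TOTAL_GB = 128.0  # assumption: total memory of the machine
MODEL_GB = 60.0   # model size taken from the FLUX.2-dev example

for factor in (1.0, 2.0):
    fraction = peak_load_fraction(MODEL_GB, TOTAL_GB, factor)
    print(f"{factor:.0f}x load overhead: {fraction:.0%} of memory at peak")
```

Under these assumptions, a loader that holds one copy stays below half of memory, while a 2x peak leaves almost no headroom for anything else.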

This node doesn't require ComfyUI to be launched with the --cache-none or --disable-mmap options. The default options should work fine.

Here's an example of memory usage during and after loading the 60GB FLUX.2-dev BF16 model to the GPU and the 17GB Mistral FP8 text encoder to the CPU. As you can see, model loading happens extremely fast and memory usage never goes over 60%:

[Image: FLUX.2-dev memory usage]

How to Install

Clone this repository into your ComfyUI/custom_nodes folder:

cd ComfyUI/custom_nodes
git clone https://github.com/phaserblast/ComfyUI-DGXSparkSafetensorsLoader.git

Install the fastsafetensors Python package. If you use a Python venv, remember to activate it first:

source venv/bin/activate
pip install fastsafetensors
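If you want to confirm that the package landed in the environment ComfyUI will actually use, a quick check along these lines can help. It relies only on the Python standard library; `package_status` is a hypothetical helper name, and it should be run with the same interpreter (or activated venv) that launches ComfyUI.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def package_status(name: str) -> Optional[str]:
    """Return the installed version, "unknown" if unversioned, or None if missing."""
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. a stdlib module).
        return "unknown"


print("fastsafetensors:", package_status("fastsafetensors") or "not installed")
```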

Restart ComfyUI, and search for the "DGX Spark Safetensors Loader" node. It should also be in the "loaders" category. Use this node in place of ComfyUI's built-in "Load Diffusion Model" node.

Known Issues

  • Memory management is broken, as there is no way to free the memory allocated by fastsafetensors. This is due to the custom memory management used by fastsafetensors, which bypasses ComfyUI's built-in memory management. The workaround is to just quit and restart ComfyUI to clear VRAM.

  • Quantized models don't work. Use the unquantized FP16 or BF16 versions. If you need to use a quantized model, just use ComfyUI's built-in "Load Diffusion Model" node instead.

  • Only minimal testing has been done on machines with discrete GPUs.
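Because quantized checkpoints are not supported, it can be handy to inspect a file before choosing a loader. The sketch below reads only the JSON header of a .safetensors file (an 8-byte little-endian length followed by the header itself) and counts tensors per dtype. Treating every dtype outside F16, BF16, and F32 as "quantized" is an assumption made for this example, not a rule taken from this node, and both function names are hypothetical.

```python
import json
import struct
from collections import Counter

# Assumption: any dtype outside this set indicates a quantized checkpoint.
UNQUANTIZED_DTYPES = {"F16", "BF16", "F32"}


def read_dtype_counts(path: str) -> Counter:
    """Count tensors per dtype using only the safetensors JSON header."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )


def looks_quantized(path: str) -> bool:
    """True if any tensor uses a dtype outside UNQUANTIZED_DTYPES."""
    return any(dtype not in UNQUANTIZED_DTYPES for dtype in read_dtype_counts(path))
```

No tensor data is read, so the check is fast even for very large files. If `looks_quantized` returns True for a checkpoint, ComfyUI's built-in loader is likely the safer choice.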
