Commit 1fca339

[2/N]: config yaml support svdq dq/few-shot
1 parent 3a33e7d commit 1fca339

6 files changed

Lines changed: 26 additions & 9 deletions

docs/user_guide/LOAD_CONFIGS.md

Lines changed: 9 additions & 9 deletions
@@ -235,7 +235,7 @@ cache_dit.set_compile_configs()
 pipe.transformer = torch.compile(pipe.transformer)
 ```
 
-For <span style="color:#c77dff;">SVDQuant W4A4 DQ</span> workflow, you can define a yaml file `quantize_svdquant.yaml` that contains:
+For <span style="color:#c77dff;">SVDQuant W4A4 DQ</span> workflow, you can define a yaml file `quantize_svdq.yaml` that contains:
 
 ```yaml
 # Please install Cache-DiT with SVDQuant support (Experimental) before using the
@@ -331,16 +331,16 @@ pipe.transformer = torch.compile(pipe.transformer)
 ## Quick Examples
 
 ```bash
-# recommend: install latest pytorch for better compile compatiblity.
-pip3 install torch==2.11.0 torchvision torchaudio triton --upgrade
-# recommend: install latest torchao nightly due to issue: https://github.com/pytorch/ao/issues/3670
-pip3 install --pre torchao --index-url https://download.pytorch.org/whl/cu130
-pip3 install transformers accelerate opencv-python-headless einops imageio-ffmpeg ftfy
-pip3 install git+https://github.com/huggingface/diffusers.git # latest or >= 0.36.0
-pip3 install git+https://github.com/vipshop/cache-dit.git # latest
-git clone https://github.com/vipshop/cache-dit.git && cd cache-dit/examples/configs
+pip install -U uv # use uv for faster installation
+uv pip install torch==2.11.0 torchvision torchaudio triton \
+transformers diffusers accelerate torchao opencv-python-headless \
+einops imageio-ffmpeg ftfy numpy
+uv pip install -U cache-dit # stable release from PyPI.
+git clone https://github.com/vipshop/cache-dit.git
+cd cache-dit/examples/configs # Preset yaml configs for quick test.
 
 python3 -m cache_dit.generate flux --config cache.yaml
+python3 -m cache_dit.generate flux --config quantize.yaml --compile
 torchrun --nproc_per_node=4 -m cache_dit.generate flux --config hybrid.yaml
 torchrun --nproc_per_node=4 -m cache_dit.generate flux --config parallel.yaml
 torchrun --nproc_per_node=4 -m cache_dit.generate flux --config parallel_2d.yaml
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+quantize_config:
+  quant_type: "svdq_nvfp4_r128_dq"
+  svdq_kwargs:
+    quantize_device: "cuda"
+  exclude_layers:
+    - "embedder"
+    - "embed"
+  verbose: false

examples/configs/blackwell/quantize_svdquant_sm120.yaml renamed to examples/configs/blackwell/quantize_svdq_few_shot.yaml

File renamed without changes.
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+quantize_config:
+  quant_type: "svdq_int4_r128_dq"
+  svdq_kwargs:
+    quantize_device: "cuda"
+  exclude_layers:
+    - "embedder"
+    - "embed"
+  verbose: false
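
Note on the two new config files: they differ only in `quant_type` (`svdq_nvfp4_r128_dq` vs. `svdq_int4_r128_dq`). The sketch below writes the same fields as a plain Python dict and checks their shape. It is illustrative only. The nesting, with `exclude_layers` and `verbose` placed directly under `quantize_config`, is an assumption, and `make_svdq_dq_config` and `check_config` are hypothetical helpers, not part of cache-dit.

```python
from typing import Any, Dict


# Hypothetical helper, not a cache-dit API: mirrors the fields of the two
# new YAML files as a nested dict. The nesting shown here is an assumption.
def make_svdq_dq_config(quant_type: str) -> Dict[str, Any]:
    return {
        "quantize_config": {
            "quant_type": quant_type,
            "svdq_kwargs": {"quantize_device": "cuda"},
            "exclude_layers": ["embedder", "embed"],
            "verbose": False,
        }
    }


# Hypothetical sanity check: verifies only keys that appear in the diff.
def check_config(config: Dict[str, Any]) -> bool:
    section = config.get("quantize_config")
    if not isinstance(section, dict):
        return False
    if not isinstance(section.get("quant_type"), str):
        return False
    layers = section.get("exclude_layers", [])
    return isinstance(layers, list) and all(isinstance(x, str) for x in layers)


nvfp4 = make_svdq_dq_config("svdq_nvfp4_r128_dq")
int4 = make_svdq_dq_config("svdq_int4_r128_dq")
```

Keeping both variants behind one constructor makes it obvious that only the `quant_type` string changes between the two files.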
File renamed without changes.

src/cache_dit/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
 from .caching import FoCaCalibratorConfig
 from .caching import supported_pipelines
 from .caching import get_adapter
+from .caching import BlockAdapterRegister
 from .distributed import ParallelismBackend
 from .distributed import ParallelismConfig
 from .compile import set_compile_configs
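
Note on this hunk: the added line re-exports `BlockAdapterRegister` from the package root, so `from cache_dit import BlockAdapterRegister` should resolve once a build containing this commit is installed. A minimal sketch for checking that a top-level name is exported, without assuming cache-dit is present in the environment; `has_top_level_export` is a hypothetical helper, not part of the library.

```python
import importlib


def has_top_level_export(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports and exposes `attr` at top level."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Package is not installed (or failed to import) in this environment.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Prints False when cache-dit is missing or predates this commit.
    print(has_top_level_export("cache_dit", "BlockAdapterRegister"))
```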
