
Running a 5090 and getting OOM errors at 720p resolution #52

@MrEdwards007

Description


I tried export PYTORCH_ALLOC_CONF=expandable_segments:True to see if it would resolve the OOM errors on my 5090, but it does not.
I've been at this for hours and haven't been able to find a combination of settings that avoids the OOM.
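One thing worth double-checking: the OOM message in the traceback below suggests setting PYTORCH_CUDA_ALLOC_CONF, while the command above exports PYTORCH_ALLOC_CONF; depending on the PyTorch version, only one of the two spellings may be honored. A minimal sketch (my own, not from this repo) to confirm the option is actually picked up, since the allocator reads it before the first CUDA allocation:

```python
# Sketch only: set the spelling the OOM message itself suggests, and do it
# before torch is imported / before the first CUDA allocation, which is when
# the caching allocator reads its config.
import os
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch  # import torch only after the variable is in place

torch.empty(1, device="cuda")        # force allocator initialization
print(torch.__version__)             # which env var name is honored is version-dependent
print(torch.cuda.memory_reserved())  # nonzero => CUDA context and allocator are live
```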

I am also using the "--quant_linear" argument.

export PYTHONPATH=turbodiffusion
export PYTORCH_ALLOC_CONF=expandable_segments:True
export TOKENIZERS_PARALLELISM=false

python turbodiffusion/inference/wan2.1_t2v_infer.py \
    --model Wan2.1-14B \
    --dit_path checkpoints/TurboWan2.1-T2V-14B-720P-quant.pth \
    --prompt "A photo realistic stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about." \
    --resolution 720p \
    --aspect_ratio 16:9 \
    --num_steps 4 \
    --seed 297308 \
    --attention_type sagesla \
    --sla_topk 0.15 \
    --num_samples 1 \
    --num_frames 77 \
    --sigma_max 80 \
    --save_path /home/wedwards/Documents/turbo_diffusion_ouput/t2v_20251226_104112.mp4 \
    --quant_linear
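For context on why the decode step in particular blows up (sampling itself completes, per the log below): --quant_linear presumably shrinks the DiT's linear layers, but the crash happens in the VAE decode, whose activations scale with channels × frames × height × width, and 720p has roughly 2.25x the pixels of 480p. A rough back-of-envelope with illustrative numbers (the channel count here is my assumption, not the real Wan2.1 VAE figure):

```python
# Back-of-envelope only: size of ONE dense fp32 activation tensor in GiB.
# The 256-channel figure is illustrative, not taken from the model.
def activation_gib(channels: int, frames: int, h: int, w: int,
                   bytes_per_el: int = 4) -> float:
    return channels * frames * h * w * bytes_per_el / 2**30

print(activation_gib(256, 77, 720, 1280))  # ~67.7 GiB if all frames decoded at once
print(activation_gib(256, 1, 720, 1280))   # ~0.88 GiB for a single-frame slice
```

This is presumably why the decoder in the traceback processes one frame slice at a time (x[:, :, i : i + 1, :, :]), yet the single conv3d call here still requests 17.80 GiB.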

Console Information
...

[12-26 10:48:47|INFO|turbodiffusion/inference/wan2.1_t2v_infer.py:61:] Computing embedding for prompt: A photo realistic stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.
[12-26 10:48:47|INFO|turbodiffusion/rcm/utils/umt5.py:495:init] loading checkpoints/models_t5_umt5-xxl-enc-bf16.pth
[12-26 10:48:57|INFO|turbodiffusion/inference/wan2.1_t2v_infer.py:66:] Loading DiT model from checkpoints/TurboWan2.1-T2V-14B-720P-quant.pth
[12-26 10:48:57|INFO|turbodiffusion/rcm/networks/wan2pt1.py:829:enable_selective_checkpoint] Enable selective checkpoint with mm_only, for every 1 blocks. Total blocks: 40
[12-26 10:49:13|SUCCESS|turbodiffusion/inference/wan2.1_t2v_infer.py:69:] Successfully loaded DiT model.
[12-26 10:49:13|INFO|turbodiffusion/rcm/tokenizers/wan2pt1.py:590:_video_vae] loading checkpoints/Wan2.1_VAE.pth
[12-26 10:49:13|INFO|turbodiffusion/inference/wan2.1_t2v_infer.py:75:] Generating with prompt: A photo realistic stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.
Sampling: 100%|█████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:32<00:00, 8.11s/it]
Traceback (most recent call last):
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/inference/wan2.1_t2v_infer.py", line 132, in <module>
    video = tokenizer.decode(samples)
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 737, in decode
    return self.model.decode(
           ^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/utils/contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 685, in decode
    video_recon = self.model.decode(zs, self.scale)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 532, in decode
    out = self.decoder(x[:, :, i : i + 1, :, :], feat_cache=self._feat_map, feat_idx=self._conv_idx)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 416, in forward
    x = layer(x, feat_cache, feat_idx)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 202, in forward
    x = layer(x, feat_cache[idx])
        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/Documents/Programs/TurboDiffusion/turbodiffusion/rcm/tokenizers/wan2pt1.py", line 55, in forward
    return super().forward(x)
           ^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 717, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wedwards/anaconda3/envs/turbodiffusion/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 712, in _conv_forward
    return F.conv3d(
           ^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 17.80 GiB. GPU 0 has a total capacity of 31.35 GiB of which 2.64 GiB is free. Including non-PyTorch memory, this process has 27.65 GiB memory in use. Of the allocated memory 7.80 GiB is allocated by PyTorch, and 19.26 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
(turbodiffusion) wedwards@Inf-Imagine-Linux:~/Documents/Programs/TurboDiffusion$
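The telling part of the message is that only 7.80 GiB is actually allocated while 19.26 GiB is reserved but unallocated, i.e. fragmentation, which is exactly what expandable_segments is meant to address. If the env var route keeps failing, a blunt in-script workaround (untested here, just standard PyTorch calls, placed between the sampling loop and the decode call from the traceback) would be to hand the cached blocks back to the driver first:

```python
# Untested workaround sketch: drop cached allocator blocks between sampling
# and the VAE decode, so the 19.26 GiB reserved-but-unallocated pool is
# returned before the 17.80 GiB conv3d request is made.
import gc
import torch

gc.collect()                 # drop Python references to sampling temporaries
torch.cuda.synchronize()     # make sure pending kernels have finished
torch.cuda.empty_cache()     # release reserved-but-unallocated blocks

video = tokenizer.decode(samples)  # the call that OOMs at wan2.1_t2v_infer.py:132
```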
