Adds faster model loading args - RunPod loading takes at least 15-20 minutes without this, and with this it takes 30 seconds #681
Conversation
- Add fast_load parameter to load_transformer function in hunyuan_model/models.py
- When fast_load=True, sets disable_mmap=False to load entire model to RAM without memory mapping
- This speeds up model loading significantly, especially for block swap or LoRA training
- Increases RAM usage but eliminates the slow memory mapping overhead
- Add --fast_load argument to hv_train.py and hv_train_network.py command line parsers
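As a rough sketch of the wiring this commit describes (the names `--fast_load` and `load_transformer` come from the commit message; the actual signatures in hv_train.py, hv_train_network.py, and hunyuan_model/models.py may differ), the flag could be threaded through roughly like this:

```python
# Illustrative only: names are taken from the commit message, not the actual repo code.
import argparse


def setup_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--fast_load",
        action="store_true",
        help="read the whole DiT checkpoint into RAM instead of memory-mapping it "
        "(much faster on slow disks, at the cost of extra RAM)",
    )
    return parser


def load_transformer(dit_path: str, fast_load: bool = False):
    if fast_load:
        print("fast_load enabled: loading the full checkpoint into RAM without memory mapping")
    # ...pass the flag down to the safetensors loading helpers from here...
```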
…to load_qwen_image_model and propagate through loading chain
- Add disable_mmap support to MemoryEfficientSafeOpen for fast loading
- Update callers to pass args.fast_load
- Add informative log message when fast_load is enabled
…e load_transformer method in qwen_image_train.py to support fast_load
- Add fast_load log message when enabled
- Pass disable_mmap to load_safetensors and MemoryEfficientSafeOpen based on fast_load flag
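The underlying idea, independent of the repository's own helpers, is to replace the memory-mapped read that safetensors uses by default with a single sequential read of the whole file into RAM. A minimal sketch using the public safetensors API (the file path is hypothetical):

```python
from safetensors.torch import load, load_file

ckpt = "model.safetensors"  # hypothetical path

# Default route: load_file() memory-maps the checkpoint, so tensor data is
# paged in from disk on demand; on slow (e.g. network-backed) storage this
# turns into a very long loading phase.
state_dict_mmap = load_file(ckpt)

# No-mmap route: read the whole file into RAM in one sequential pass, then
# deserialize from the in-memory buffer. Uses more RAM, but avoids the
# random-access penalty of memory mapping.
with open(ckpt, "rb") as f:
    state_dict_ram = load(f.read())
```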
Hey @kohya-ss, please add this as well. I would recommend adding this universally, because it makes a huge speed difference on slow-disk systems with large RAM.
Thank you for this! I think what you're saying is that Numpy's memmap is the slow part. The option name --fast_load doesn't describe what the option actually does, so a more descriptive name would be better. Also, the variable name disable_mmap is ambiguous about whether it refers to Numpy's memmap or to Safetensors' memory mapping.
…load argument to --disable_numpy_memmap to be more descriptive
- Rename variable disable_mmap to disable_numpy_memmap where it controls numpy memmap
- Rename variable to safetensors_disable_mmap where it controls Safetensors library
- Update all log messages to reflect new naming
- Clarifies that this disables numpy memory mapping specifically, not Safetensors memory mapping
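For context on the distinction the rename makes explicit: Numpy's memmap and Safetensors' memory mapping are two separate mechanisms, and disabling one does not affect the other. A small Numpy-only illustration (the file name is hypothetical):

```python
import numpy as np

arr_path = "weights.npy"  # hypothetical file

# With mmap_mode set, np.load() memory-maps the array and pages data in from
# disk lazily, which is painful when the disk is slow.
lazy = np.load(arr_path, mmap_mode="r")

# With the default mmap_mode=None, the whole array is read into RAM in one
# sequential pass, which is the behavior --disable_numpy_memmap opts into.
eager = np.load(arr_path)
```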
@kohya-ss Sure, I just made the changes. I hope it's good now. Edit: I also updated the docs.
…w --disable_numpy_memmap option in qwen_image.md
- Document the new --disable_numpy_memmap option in hunyuan_video.md
- Add English and Japanese descriptions
- Explain benefits for block swap and LoRA training
- Include note about RAM usage increase
Thanks for the update! I'll format the code with ruff after the merge, but I'd appreciate it if you could format it yourself next time.
After merging, I noticed many flaws... It was my mistake. HunyuanVideo doesn't use Numpy's memmap in the first place, so it was inappropriate to apply it there. Also, this argument was ignored in many places. I'll review more carefully next time 😅
I have made some fixes in #687 and merged it, so I would appreciate it if you could check that it works.
Without this, it literally takes 15 to 20 minutes to load the Qwen DiT model.
Unbearable on RunPod: their machines have massive RAM but extremely slow disks.