Conversation

@gluttony-10

1. Added an empty file named .project-root (see the note after this list)
2. Updated the three yaml files so they point to the corresponding models
3. Fixed the README
    3.1 Corrected the CogVideoX link
    3.2 Added download instructions for the three models in the checkpoints folder
4. Fixed the requirements file
    4.1 torch==2.4.0 has a bug, so it was replaced with a newer release
    4.2 Updated torchvision and xformers to matching versions
    4.3 Added dependencies that may have been missing

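A note on item 1: an empty .project-root file is commonly used as a marker so scripts can locate the repository root regardless of the current working directory. A minimal sketch of that pattern, assuming the project resolves the root with pyrootutils (an assumption; this PR only adds the marker file itself):

# Hypothetical sketch: resolve the repo root from the .project-root marker
# so relative paths (configs, checkpoints) work from any working directory.
import pyrootutils

root = pyrootutils.setup_root(
    search_from=__file__,       # walk upward from this script
    indicator=".project-root",  # the empty marker file added in this PR
    pythonpath=True,            # put the root on sys.path for imports
)
print(root)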
ToolKami commented Apr 4, 2025

Hi @gluttony-10, thanks for the great work. I'm running into the following error at this step:

python inference_MLLM.py:

[2025-04-04 12:07:22,808] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards:   0%|                                                                                                  | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 92, in _call_target
    return _target_(*args, **kwargs)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3960, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4414, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 548, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/AnimeGamer2/inference_MLLM.py", line 80, in <module>
    llm = hydra.utils.instantiate(llm_cfg, torch_dtype=dtype)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 226, in instantiate
    return instantiate_node(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 342, in instantiate_node
    value = instantiate_node(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 347, in instantiate_node
    return _call_target(_target_, partial, args, kwargs, full_key)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 97, in _call_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error in call to target 'transformers.modeling_utils.PreTrainedModel.from_pretrained':
SafetensorError('Error while deserializing header: HeaderTooLarge')
full_key: model

@gluttony-10 (Author)

Hi @ToolKami, that's an issue I haven't run into myself.
Please ensure all checkpoint shards are fully downloaded (e.g., model-00001-of-00002.safetensors, model-00002-of-00002.safetensors).
Alternatively, you can download the models from here: https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1
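For reference, HeaderTooLarge generally means the shard is not a valid safetensors file at all, e.g. a truncated download or a Git LFS pointer text file. A minimal diagnostic sketch based on the safetensors layout (an 8-byte little-endian header length followed by a JSON header); the checkpoint path below is a placeholder:

import json, struct
from pathlib import Path

def check_shard(path):
    # Report whether a .safetensors file looks complete and well-formed.
    data = Path(path).read_bytes()
    if data.startswith(b"version https://git-lfs"):
        return "Git LFS pointer -- the real file was never downloaded"
    if len(data) < 8:
        return "file too small to contain a safetensors header"
    (header_len,) = struct.unpack("<Q", data[:8])  # 8-byte little-endian length
    if 8 + header_len > len(data):
        return f"header claims {header_len} bytes but the file is truncated"
    json.loads(data[8:8 + header_len].decode("utf-8"))  # raises if not valid JSON
    return "header parses OK"

# Placeholder path -- point this at each shard in your checkpoints folder:
print(check_shard("checkpoints/model-00001-of-00002.safetensors"))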

@OminousIndustries

I encountered the same issue as @ToolKami: the safetensors in the t5-v1_xxl directory were just pointer files, not the actual weights. Downloading the real files and placing them there manually resolved the issue.

Also check the yaml config files in MLLM and VDM_Decoder to make sure they point to the correct paths.
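If the shards do turn out to be pointer files, one way to fetch the real weights is huggingface_hub's snapshot_download. A sketch under assumptions: the repo id and local directory below are guesses based on the directory name mentioned above, not confirmed against this project's configs:

# Hypothetical sketch: download real weight files instead of LFS pointers.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="google/t5-v1_1-xxl",       # assumed source repo for the T5 weights
    local_dir="checkpoints/t5-v1_xxl",  # directory name taken from the comment above
    allow_patterns=["*.safetensors", "*.json", "*.model", "*.txt"],
)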
