Fix the project #1

Open · wants to merge 1 commit into main

Conversation

gluttony-10

1. Added an empty file named .project-root
2. Modified the three yaml files so they match the corresponding models
3. Fixed the README
    3.1 Fixed the link for CogVideoX
    3.2 Added download instructions for the three models in the checkpoints folder
4. Fixed the requirements file (a quick sanity check is sketched after this list)
    4.1 torch==2.4.0 has a bug, so it was replaced with a newer release
    4.2 Updated the torchvision and xformers versions to match
    4.3 Added dependencies that may have been missing

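A minimal sanity check after reinstalling from the fixed requirements file (a hedged sketch, not part of the PR): it only confirms that the upgraded packages import and prints their versions, so a torch/torchvision/xformers mismatch shows up before running inference.

# Hedged sketch: verify that the upgraded dependency stack imports cleanly.
import torch
import torchvision
import xformers

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("xformers:", xformers.__version__)
print("CUDA available:", torch.cuda.is_available())
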
@ToolKami

ToolKami commented Apr 4, 2025

Hi @gluttony-10, thanks for the great work. I'm running into the following error at this step:

python inference_MLLM.py:

[2025-04-04 12:07:22,808] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards:   0%|                                                                                                  | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 92, in _call_target
    return _target_(*args, **kwargs)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3960, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4414, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/transformers/modeling_utils.py", line 548, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/AnimeGamer2/inference_MLLM.py", line 80, in <module>
    llm = hydra.utils.instantiate(llm_cfg, torch_dtype=dtype)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 226, in instantiate
    return instantiate_node(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 342, in instantiate_node
    value = instantiate_node(
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 347, in instantiate_node
    return _call_target(_target_, partial, args, kwargs, full_key)
  File "/opt/conda/envs/animegamer/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 97, in _call_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error in call to target 'transformers.modeling_utils.PreTrainedModel.from_pretrained':
SafetensorError('Error while deserializing header: HeaderTooLarge')
full_key: model

@gluttony-10
Author

Hi @ToolKami, good question; I haven't run into this myself.
Please ensure all checkpoint shards are fully downloaded (e.g., model-00001-of-00002.safetensors, model-00002-of-00002.safetensors).
Alternatively, download the models from here: https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1
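
A quick way to check that (a hedged sketch, not from the PR): HeaderTooLarge typically means a .safetensors file is truncated or is only a Git LFS pointer stub rather than the real weights. The checkpoint directory below is an assumption; point it at wherever your shards actually live.

import glob
import os
from safetensors import safe_open

# Assumed location of the downloaded shards; adjust to your setup.
for path in sorted(glob.glob("checkpoints/Mistral-7B-Instruct-v0.1/*.safetensors")):
    size_mb = os.path.getsize(path) / 1e6
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(b"version https://git-lfs"):
        print(f"{path}: Git LFS pointer stub ({size_mb:.2f} MB), re-download the real file")
        continue
    try:
        with safe_open(path, framework="pt") as f:
            num_tensors = len(f.keys())
        print(f"{path}: OK ({num_tensors} tensors, {size_mb:.0f} MB)")
    except Exception as exc:  # an incomplete download also fails here
        print(f"{path}: FAILED ({exc})")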

@OminousIndustries

I encountered the same issue as @ToolKami: the safetensors in the t5-v1_xxl directory were just pointers to the files, not the files themselves. Downloading the actual files and placing them manually resolved it.

Also check the yaml config files in MLLM and VDM_Decoder and make sure they point to the correct paths.
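
For reference, one hedged way to pull the real files instead of the pointer stubs; the repo id and local path below are assumptions, so use whatever the yaml configs actually reference.

from huggingface_hub import snapshot_download

# Assumed source repo and target folder for the t5-v1_xxl weights; adjust both
# to match the paths referenced in the MLLM and VDM_Decoder yaml configs.
snapshot_download(
    repo_id="google/t5-v1_1-xxl",
    local_dir="checkpoints/t5-v1_xxl",
)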
