ERNIE-4.5-VL-28B-A3B-Paddle SFT training fails with AssertionError: weight_idxs' length should be greater than 0 #1408

@lx11

Environment, set up following the official instructions:
paddle_xpu 0.0.1
paddle2onnx 2.0.1
paddleformers 0.4.0
paddlepaddle-xpu 3.3.0.dev20251016

Launch command: erniekit train examples/configs/xpu/ERNIE-4.5-VL-28B-A3B/sft/run_sft_32k.yaml

[2025-12-23 12:07:45,066] [ INFO] pp_layers.py:854 - start segment network..
Traceback (most recent call last):
  File "/work/ERNIE/erniekit/launcher.py", line 58, in <module>
    launch()
  File "/work/ERNIE/erniekit/launcher.py", line 46, in launch
    run_tuner()
  File "/work/ERNIE/erniekit/train/tuner.py", line 82, in run_tuner
    _training_function(config={"args": args})
  File "/work/ERNIE/erniekit/train/tuner.py", line 59, in _training_function
    run_vl_sft(
  File "/work/ERNIE/erniekit/train/vl_sft/workflow.py", line 517, in run_vl_sft
    model = Ernie4_5_VLMoeForConditionalGenerationPipe.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/transformers/model_utils.py", line 2965, in from_pretrained
    model = cls(config, *init_args, **model_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/transformers/utils.py", line 290, in impl
    init_func(self, *args, **kwargs)
  File "/work/ERNIE/ernie/modeling_moe_vl_pp.py", line 1908, in __init__
    PipelineLayer.__init__(
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 506, in __init__
    self._segment_network(seg_method)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 858, in _segment_network
    self.segment_parts = seg.do_segment()
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 187, in do_segment
    weight_idxs = self._gen_layer_weight(layername)
  File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py", line 234, in _gen_layer_weight
    assert len(weight_idxs) > 0, (
AssertionError: weight_idxs' length should be greater than 0
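
For anyone reading the traceback: the failing frame is `_gen_layer_weight(layername)`, which suggests that pipeline segmentation looks for layers matching a name and found none. The sketch below is only a simplified illustration of that kind of check, not Paddle's actual source. The function `gen_layer_weight`, the list `layers`, and the layer names in it are hypothetical, and the assumption that matching is done by a case-insensitive regex search on layer names is mine.

```python
import re


def gen_layer_weight(layer_names, pattern):
    """Return the indices of layers whose name matches `pattern`.

    Hypothetical stand-in for a name-based segmentation check; it assumes
    case-insensitive regex matching against each layer's name.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    weight_idxs = [i for i, name in enumerate(layer_names) if regex.search(name)]
    # Same condition and message as the assertion in the traceback.
    assert len(weight_idxs) > 0, "weight_idxs' length should be greater than 0"
    return weight_idxs


# Hypothetical layer names, for illustration only.
layers = ["EmbeddingPipe", "DecoderLayerPipe", "DecoderLayerPipe", "LMHeadPipe"]

# A pattern that matches some layers yields their indices.
matched = gen_layer_weight(layers, "DecoderLayer")

# A pattern that matches nothing trips the assertion.
try:
    gen_layer_weight(layers, "TransformerBlock")
    failed = False
    message = ""
except AssertionError as err:
    failed = True
    message = str(err)
```

If the real check works along these lines, the error would mean that the layer name used for segmentation does not match any layer the model actually builds in this environment. That is a guess from the traceback, not a confirmed diagnosis.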
