
[Question]: Changes to the model config do not take effect, and FP16 training specified on the command line does not take effect #4332

@baichihgl-hue

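Problem

Changing torch_dtype at the top level of the model's config.json has no effect, whereas changing torch_dtype under vision_config does take effect.
I ran:
paddleformers-cli train full.yaml --fp16=True --fp16_full_eval=True
The output still contains: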
[2026-04-21 17:29:04,802] [ DEBUG] - fp16 : False
[2026-04-21 17:29:04,802] [ DEBUG] - fp16_full_eval : False
bf16: False has also been added to the full.yaml file.
Error message: OSError: (External) CUDNN error(3007), CUDNN_STATUS_NOT_SUPPORTED_ARCH_MISMATCH.
[Hint: Please search for the error code(3007) on website (https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnStatus_t) to get Nvidia's official solution and advice about CUDNN Error.] (at /paddle/paddle/phi/kernels/gpudnn/conv_cudnn_v7.h:114)
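
To confirm which precision settings the file on disk actually holds, the following minimal sketch lists every torch_dtype key in a nested config, including the one under vision_config. The sample JSON and the helper name find_dtype_keys are hypothetical and not part of PaddleFormers; the sketch only audits the file and does not explain why the top-level value is ignored.

```python
import json
from typing import Any, Iterator, Tuple


def find_dtype_keys(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every `torch_dtype` key in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == "torch_dtype":
                yield child, value
            else:
                # Recurse into sub-configs such as vision_config.
                yield from find_dtype_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_dtype_keys(item, f"{path}[{index}]")


# Hypothetical stand-in for the model's config.json; in practice, load the
# real file with json.load(open("config.json")).
sample = json.loads(
    '{"torch_dtype": "bfloat16",'
    ' "vision_config": {"torch_dtype": "bfloat16", "hidden_size": 1280}}'
)

found = list(find_dtype_keys(sample))
for dotted_path, value in found:
    print(dotted_path, "=", value)
```

If both entries show the intended value and the log still reports fp16 : False, the setting is probably being overridden somewhere after the config is loaded, which is the behavior this issue is asking about.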

Scenario: the GPU is a V100, which does not support bf16, so fp16 has to be specified.
GPU driver version 570, CUDA Toolkit 12.6, cuDNN 8.9.7.
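
The hardware constraint described above can be sketched as a small check. It assumes that native bf16 support on NVIDIA GPUs starts at compute capability 8.0 (Ampere), while the V100 reports 7.0. The helper pick_half_dtype is hypothetical and not a PaddleFormers API; it only illustrates why fp16 is the expected choice on this card.

```python
from typing import Tuple


def pick_half_dtype(compute_capability: Tuple[int, int]) -> str:
    """Return the half-precision dtype a GPU can run natively.

    Assumption: bf16 needs compute capability 8.0 or newer; older GPUs
    such as the V100 (7.0) fall back to fp16.
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"


# The tuples are hard-coded here. On a real machine they could come from
# paddle.device.cuda.get_device_capability(), assuming that call is
# available in the installed Paddle version.
V100 = (7, 0)
A100 = (8, 0)

print("V100 ->", pick_half_dtype(V100))
print("A100 ->", pick_half_dtype(A100))
```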
