
Why is deepspeed enabled in the Bloom training script? #71

@robertLiuLinFeng

Why is the ZeRO stage set to 0 when DeepSpeed is enabled in the Bloom training script? Can the Bloom model also be trained with DeepSpeed disabled, and does the loss curve remain aligned in that case? Thanks very much.

DEEPSPEED_ARGS=" \
    --deepspeed \
    --deepspeed_config ${config_json} \
    --zero-stage ${ZERO_STAGE} \
    --deepspeed-activation-checkpointing \
    "
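
For context, here is a minimal sketch of how `${ZERO_STAGE}` might reach the file passed as `--deepspeed_config`. In DeepSpeed's JSON config, `zero_optimization.stage` set to 0 means ZeRO partitioning is disabled, so the engine runs as plain data parallelism. The temp-file path and the batch-size value are placeholders I am assuming for illustration; they are not taken from the Bloom script.

```shell
# Hypothetical values for illustration only; not taken from the actual script.
ZERO_STAGE=0
config_json="$(mktemp)"

# Write a minimal DeepSpeed config fragment.
# Stage 0 leaves ZeRO partitioning off; stages 1-3 enable it progressively.
cat > "${config_json}" <<EOT
{
  "train_micro_batch_size_per_gpu": 1,
  "zero_optimization": {
    "stage": ${ZERO_STAGE}
  }
}
EOT

# Show the generated fragment.
cat "${config_json}"
```

If this matches what the script does, the `--deepspeed` flag would still route training through the DeepSpeed engine even though no ZeRO sharding takes place.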
