TP + FP8 - NotImplementedError for certain operations #2629

Open
pytorch/ao
#2154
@nathan-az

Description

FP8 training is now supported (#2546), but it does not yet work with tensor parallelism, so the combination is currently gated. An MVP for this feature should include:

  • Plug-and-play support for enable_fp8_training when a tensor_parallel_plan is set
  • Compatibility with torch.compile
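
For anyone unfamiliar with what "gated" means here, the current behavior can be pictured as a config validation step that rejects the combination up front. This is only a minimal sketch: `validate_fp8_tp_config` is a hypothetical helper written for illustration, not the actual code, and it assumes the two settings reach it as a boolean and an optional plan object.

```python
from typing import Any, Optional


def validate_fp8_tp_config(
    enable_fp8_training: bool,
    tensor_parallel_plan: Optional[Any],
) -> None:
    """Hypothetical gate: reject FP8 training combined with tensor parallelism.

    The name and signature are assumptions made for this sketch; only the two
    setting names come from the issue text.
    """
    if enable_fp8_training and tensor_parallel_plan is not None:
        # The unsupported combination fails early instead of deep in a matmul.
        raise NotImplementedError(
            "FP8 training is not yet supported together with tensor parallelism."
        )


# Either setting on its own passes the gate.
validate_fp8_tp_config(enable_fp8_training=True, tensor_parallel_plan=None)
validate_fp8_tp_config(enable_fp8_training=False, tensor_parallel_plan={"hypothetical": "plan"})
```

Under this picture, the MVP amounts to being able to remove such a check without hitting NotImplementedError further down, both in eager mode and under torch.compile.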

This issue tracks the request and the work needed to support it. The scope of what needs to be done is not yet clear to me. @andrewor14, feel free to comment if there are other requirements for an MVP of this feature, or if you want to clarify the scope.
