float8 upcoming feature tracker #556

Open
@vkuzo

Description

configurability

  • [planned] support rowwise/blockwise scaling granularity, configurable separately for each gemm
  • [planned] configure settings for each of the three gemms in linear fwd/bwd separately
  • [planned] support more fine-grained configuration of how to apply Float8Linear to individual modules
  • [planned] inference support (see [RFC] Float8 Inference pytorch-labs/float8_experimental#314)
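
To make the scaling-granularity and per-gemm items above concrete, here is a minimal pure-Python sketch of how tensorwise and rowwise scales differ, and how a setting could be chosen separately for each of the three gemms in a linear layer. It is an illustration only, not the library's API: `GemmScaling`, `LINEAR_GEMMS`, `scales_for` and the chosen granularities are hypothetical names and values. The only outside fact used is that the largest finite float8 e4m3fn value is 448.

```python
from dataclasses import dataclass

E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def tensorwise_scale(matrix, fp8_max=E4M3_MAX, eps=1e-12):
    """One scale for the whole tensor, derived from its global abs-max."""
    amax = max(abs(v) for row in matrix for v in row)
    return fp8_max / max(amax, eps)


def rowwise_scales(matrix, fp8_max=E4M3_MAX, eps=1e-12):
    """One scale per row, derived from each row's own abs-max."""
    return [fp8_max / max(max(abs(v) for v in row), eps) for row in matrix]


@dataclass
class GemmScaling:
    """Hypothetical per-gemm setting; not an existing config class."""
    granularity: str = "tensorwise"


# Hypothetical settings for the three gemms of a linear layer (fwd/bwd).
LINEAR_GEMMS = {
    "output": GemmScaling("rowwise"),          # input @ weight.T
    "grad_input": GemmScaling("rowwise"),      # grad_output @ weight
    "grad_weight": GemmScaling("tensorwise"),  # grad_output.T @ input
}


def scales_for(matrix, cfg):
    """Return a list of scales according to the gemm's configured granularity."""
    if cfg.granularity == "rowwise":
        return rowwise_scales(matrix)
    if cfg.granularity == "tensorwise":
        return [tensorwise_scale(matrix)]
    raise ValueError(f"unknown granularity: {cfg.granularity}")


x = [[1.0, -2.0], [0.5, 448.0]]
print(scales_for(x, LINEAR_GEMMS["output"]))       # per-row scales
print(scales_for(x, LINEAR_GEMMS["grad_weight"]))  # single tensor scale
```

In this example the large value in the second row forces the tensorwise scale down for every element, while the rowwise variant lets the first row keep a larger scale. That difference is the usual motivation for finer granularity.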

performance

distributed

  • [in progress] integrate with FSDP2, supporting 16-bit or 8-bit all-gather with delayed scaling for weights
    • POC is done, performance optimizations are ongoing
  • [planned] verify integration with pipeline parallelism (PP)
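
As a rough illustration of the delayed scaling mentioned above, the sketch below derives the scale for the current step from abs-max values recorded on earlier steps, so the scale is known before the current tensor is inspected. This is a plain-Python approximation under stated assumptions: `DelayedScaler`, `history_len` and the max-over-history policy are hypothetical choices for the example, not the project's implementation.

```python
from collections import deque

E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


class DelayedScaler:
    """Hypothetical helper: scale from past abs-max values, not the current one."""

    def __init__(self, history_len=16, fp8_max=E4M3_MAX, eps=1e-12):
        self.history = deque(maxlen=history_len)  # recent abs-max values
        self.fp8_max = fp8_max
        self.eps = eps

    def scale_for_step(self, current_amax):
        # Use the largest abs-max seen in earlier steps; on the very first
        # step there is no history, so fall back to the current value.
        reference = max(self.history) if self.history else current_amax
        # Record the current abs-max so that later steps can use it.
        self.history.append(current_amax)
        return self.fp8_max / max(reference, self.eps)


scaler = DelayedScaler(history_len=2)
for amax in [2.0, 4.0, 1.0, 8.0]:
    print(amax, scaler.scale_for_step(amax))
```

Because the scale does not depend on the current step's values, a weight shard could in principle be cast to 8 bits before it is communicated, which is one plausible reason delayed scaling pairs naturally with an 8-bit all-gather.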

other

copied from pytorch-labs/float8_experimental#187
