Description
Hello, I would like to quantize from the FP16 data type to the FP8 E4M3 data type. I am following the method at https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/experimental/gen_ai/src/quantize/quantize.cu#L629, but I have a question: why is min_scaling_factor computed by dividing by (FP8_E4M3_MAX::value * 512.f)? Could you please explain the basis for choosing 512.f? Thanks.
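
For context, here is how I understand the scaling logic in question, written as a minimal host-side C++ sketch rather than the actual FBGEMM CUDA kernel. The constant 448 (the largest finite FP8 E4M3 value) and the 512.f factor follow the linked code; the names compute_row_scale and quantize_row, the row-wise loop, and the clamping are my own illustration and may differ from the real implementation:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Largest finite FP8 E4M3 value (corresponds to FP8_E4M3_MAX::value in FBGEMM).
constexpr float kFp8E4m3Max = 448.0f;

// Row-wise scale selection as I understand it:
//   scale = max(|row|) / FP8_E4M3_MAX, but never below min_scaling_factor,
//   where min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f).
float compute_row_scale(const std::vector<float>& row) {
    float row_max = 0.0f;
    for (float v : row) {
        row_max = std::max(row_max, std::fabs(v));
    }
    const float min_scaling_factor = 1.0f / (kFp8E4m3Max * 512.0f);
    return std::max(row_max / kFp8E4m3Max, min_scaling_factor);
}

// Quantize one row: divide by the scale and clamp into the FP8 E4M3 range.
// (Actual FP8 bit conversion and rounding are omitted; values stay as float.)
std::vector<float> quantize_row(const std::vector<float>& row, float scale) {
    std::vector<float> out;
    out.reserve(row.size());
    for (float v : row) {
        const float q = v / scale;
        out.push_back(std::min(std::max(q, -kFp8E4m3Max), kFp8E4m3Max));
    }
    return out;
}
```

As far as I can tell, the max with min_scaling_factor keeps the scale from collapsing to zero for all-zero or near-zero rows, but I do not see why 512 in particular was chosen as the extra factor.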