Move EmbeddingQuantizer to torchao #9514

Open
@metascroy

Description

🚀 The feature, motivation and pitch

Move the EmbeddingQuantizer in the ExecuTorch llama code to torchao and rewrite it using torchao quant primitives. Then recombine the embedding Q/DQ ops into packed weights during to_executorch.
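For illustration, a minimal sketch of the kind of groupwise affine quantization an embedding quantizer performs, and of 4-bit weight packing of the resulting codes. This is written in plain numpy rather than against torchao's actual primitives, and all function names here (`quantize_embedding_groupwise`, `dequantize_embedding_groupwise`, `pack_int4`) are hypothetical, not existing torchao or ExecuTorch APIs:

```python
import numpy as np

def quantize_embedding_groupwise(weight, group_size=32, bits=4):
    """Groupwise asymmetric affine quantization of an embedding table.
    Each row is split into groups of `group_size` values; each group
    gets its own scale and zero point. Illustrative only -- not the
    actual torchao implementation."""
    qmin, qmax = 0, (1 << bits) - 1
    rows, cols = weight.shape
    assert cols % group_size == 0
    w = weight.reshape(rows, cols // group_size, group_size)
    wmin = w.min(axis=-1, keepdims=True)
    wmax = w.max(axis=-1, keepdims=True)
    scale = (wmax - wmin) / (qmax - qmin)
    scale = np.where(scale == 0.0, 1.0, scale)  # guard all-constant groups
    zero_point = np.round(qmin - wmin / scale)
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.uint8)
    return q.reshape(rows, cols), scale.squeeze(-1), zero_point.squeeze(-1)

def dequantize_embedding_groupwise(q, scale, zero_point, group_size=32):
    """Inverse of the transform above. This Q/DQ pair is the pattern that
    would be recombined into a single packed-weight op at export time."""
    rows, cols = q.shape
    g = q.reshape(rows, cols // group_size, group_size).astype(np.float64)
    w = (g - zero_point[..., None]) * scale[..., None]
    return w.reshape(rows, cols)

def pack_int4(q):
    """Pack two 4-bit codes per byte (low nibble first) -- one possible
    packed-weight layout for the recombined embedding op."""
    assert q.shape[-1] % 2 == 0
    return ((q[..., 0::2] & 0xF) | ((q[..., 1::2] & 0xF) << 4)).astype(np.uint8)
```

With this scheme the quantize/dequantize round trip is lossy only up to rounding, so the per-element error is bounded by half the group's scale, which is the invariant a recombination pass can rely on when it replaces the Q/DQ pair with a packed-weight lookup.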

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

Metadata

Assignees

No one assigned

    Labels

    triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
