Integrate input preprocessing into export flow #152

@larryliu0820

Description

When we export multimodal LLMs, we almost always have to call processor.apply_chat_template from HF transformers, which preprocesses the multimodal inputs under the hood.

This is quite annoying during export, since we can't export processor.apply_chat_template directly. One example is this logic: https://github.com/huggingface/transformers/blob/main/src/transformers/models/whisper/processing_whisper.py#L69, which relies heavily on numpy.
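To make the failure mode concrete, here is a minimal, hypothetical repro (not the actual Whisper code) showing how a numpy round-trip inside preprocessing escapes the torch.export tracer:

```python
import numpy as np
import torch

class NumpyPad(torch.nn.Module):
    """Illustrative stand-in for numpy-based preprocessing."""
    def forward(self, audio: torch.Tensor) -> torch.Tensor:
        # The tensor leaves the torch graph here, so the export
        # tracer cannot follow the computation.
        padded = np.pad(audio.numpy(), (0, 480000 - audio.shape[0]))
        return torch.from_numpy(padded)

# torch.export.export(NumpyPad(), (torch.randn(16000),))
# typically fails during tracing: the numpy round-trip is invisible
# to the graph, and .numpy() is not supported on traced tensors.
```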

Ideally we should ask transformers to write standard torch-based processors for all inputs (tracked by huggingface/transformers#40986). In the short term, I think optimum-executorch should host some of the common processors, such as the one for Whisper; a sketch is below.
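As a sketch of what such a hosted processor could look like, here is a torch-only log-mel feature extractor in the spirit of Whisper's. The class name is hypothetical, and the use of torchaudio's melscale_fbanks to approximate Whisper's mel filterbank is my assumption, not an existing optimum-executorch API:

```python
import torch
import torchaudio

class WhisperLogMel(torch.nn.Module):
    """Torch-only log-mel extractor mirroring Whisper's defaults:
    16 kHz audio padded to 30 s, n_fft=400, hop=160, 80 mel bins."""

    def __init__(self, n_fft: int = 400, hop: int = 160,
                 n_mels: int = 80, n_samples: int = 480000):
        super().__init__()
        self.n_fft, self.hop, self.n_samples = n_fft, hop, n_samples
        self.register_buffer("window", torch.hann_window(n_fft))
        # Precompute the mel filterbank once so forward() is pure torch.
        fbank = torchaudio.functional.melscale_fbanks(
            n_freqs=n_fft // 2 + 1, f_min=0.0, f_max=8000.0,
            n_mels=n_mels, sample_rate=16000,
            norm="slaney", mel_scale="slaney",
        )
        self.register_buffer("mel_fb", fbank)  # (n_freqs, n_mels)

    def forward(self, audio: torch.Tensor) -> torch.Tensor:
        # Pad to a fixed 30 s window, as Whisper expects.
        audio = torch.nn.functional.pad(
            audio, (0, self.n_samples - audio.shape[-1]))
        stft = torch.stft(audio, self.n_fft, self.hop,
                          window=self.window, return_complex=True)
        magnitudes = stft[..., :-1].abs() ** 2  # drop the last frame
        mel_spec = self.mel_fb.T @ magnitudes   # (n_mels, frames)
        log_spec = torch.clamp(mel_spec, min=1e-10).log10()
        log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)
        return (log_spec + 4.0) / 4.0

# Pure torch, so it can go through the export flow alongside the model
# (assuming the target backend supports the decomposed FFT ops):
ep = torch.export.export(WhisperLogMel(), (torch.randn(480000),))
```

Hosting a handful of extractors like this would let the preprocessing be exported and lowered together with the model instead of being reimplemented per runtime.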
