Support funasr/paraformer-zh to export OpenVINO IR model #1567

@10vliu13

Description

I would like to add support for exporting funasr/paraformer-zh to an OpenVINO IR model from optimum-intel. The status of the original model implementation:

  1. The model network is implemented in Python scripts (https://github.com/modelscope/FunASR/tree/main/funasr); the SANMEncoder, ParaformerSANMDecoder, and CifPredictor components live in separate files in that repo.
  2. The model.pt at https://huggingface.co/funasr/paraformer-zh contains only the model parameters.

As the original model in https://github.com/modelscope/FunASR/tree/main/funasr is complex, I reimplemented the code in a single Python file named modeling_paraformer.py. The implementation of modeling_paraformer.py is based on torch.nn.Module; please see the code below for details. I went through the whole optimum-intel repo and there seems to be no proper place to put modeling_paraformer.py. @rkazants, may I have your suggestion on how to handle this? Thanks.
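For context, the constructor below follows FunASR's configuration pattern: each sub-module (encoder, decoder, predictor) is described by a name plus a `*_conf` dict of keyword arguments. A minimal pure-Python sketch of that pattern (the registry and `DummyEncoder` class here are hypothetical stand-ins, not FunASR's real components):

```python
# Sketch of the "component name + conf dict" instantiation pattern used by
# Paraformer.__init__. DummyEncoder and COMPONENT_REGISTRY are illustrative
# stand-ins for FunASR's SANMEncoder/ParaformerSANMDecoder machinery.
from typing import Dict, Optional


class DummyEncoder:
    def __init__(self, input_size: int, output_size: int = 256):
        self.input_size = input_size
        self._output_size = output_size

    def output_size(self) -> int:
        return self._output_size


COMPONENT_REGISTRY = {"dummy_encoder": DummyEncoder}


def build_component(name: Optional[str], conf: Optional[Dict], **extra):
    """Instantiate a registered component from its name and conf dict."""
    if name is None:
        return None
    return COMPONENT_REGISTRY[name](**{**(conf or {}), **extra})


encoder = build_component("dummy_encoder", {"output_size": 320}, input_size=80)
print(encoder.output_size())  # -> 320
```

This is why the real constructor accepts both `encoder: str` and `encoder_conf: Dict` and then rebinds `encoder` to the instantiated module.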

```python
from typing import Dict, Optional

import torch

# SANMEncoder, ParaformerSANMDecoder, CifPredictorV2 and export_rebuild_model
# come from the FunASR repo (https://github.com/modelscope/FunASR).


class Paraformer(torch.nn.Module):
    """
    Author: Speech Lab of DAMO Academy, Alibaba Group
    Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive
    End-to-End Speech Recognition
    https://arxiv.org/abs/2206.08317
    """

    def __init__(
        self,
        specaug: Optional[str] = None,
        specaug_conf: Optional[Dict] = None,
        normalize: Optional[str] = None,
        normalize_conf: Optional[Dict] = None,
        encoder: Optional[str] = None,
        encoder_conf: Optional[Dict] = None,
        decoder: Optional[str] = None,
        decoder_conf: Optional[Dict] = None,
        ctc: Optional[str] = None,
        ctc_conf: Optional[Dict] = None,
        predictor: Optional[str] = None,
        predictor_conf: Optional[Dict] = None,
        ctc_weight: float = 0.5,
        input_size: int = 80,
        vocab_size: int = -1,
        ignore_id: int = -1,
        blank_id: int = 0,
        sos: int = 1,
        eos: int = 2,
        lsm_weight: float = 0.0,
        length_normalized_loss: bool = False,
        predictor_weight: float = 0.0,
        predictor_bias: int = 0,
        sampling_ratio: float = 0.2,
        share_embedding: bool = False,
        use_1st_decoder_loss: bool = False,
        **kwargs,
    ):
        super().__init__()
        # Rebind the name/conf pairs to instantiated sub-modules.
        encoder = SANMEncoder(input_size=input_size, **encoder_conf)
        encoder_output_size = encoder.output_size()

        if decoder is not None:
            decoder = ParaformerSANMDecoder(
                vocab_size=vocab_size,
                encoder_output_size=encoder_output_size,
                **decoder_conf,
            )

        if predictor is not None:
            predictor = CifPredictorV2(**predictor_conf)

        self.encoder = encoder
        self.decoder = decoder
        self.predictor = predictor

    def export(self, **kwargs):
        if "max_seq_len" not in kwargs:
            kwargs["max_seq_len"] = 512
        return export_rebuild_model(model=self, **kwargs)
```
