
import ImageGenerationModelsTable from './_components/image-generation-models-table';
import VideoGenerationModelsTable from './_components/video-generation-models-table';
import LLMModelsTable from './_components/llm-models-table';
import VLMModelsTable from './_components/vlm-models-table';
import WhisperModelsTable from './_components/whisper-models-table';
import TextEmbeddingsModelsTable from './_components/text-embeddings-models-table';
import SpeechGenerationModelsTable from './_components/speech-generation-models-table';
import TextRerankModelsTable from './_components/text-rerank-models-table';

# Supported Models

:::info Models Compatibility
Other models with similar architectures may also work successfully, even if not explicitly validated. Consider testing any unlisted models to verify compatibility with your specific use case.
:::

## Large Language Models (LLMs)

<LLMModelsTable />

:::tip LoRA Support
LLM pipeline supports LoRA adapters.
:::

::::info

The LLM pipeline can work with other similar topologies produced by optimum-intel with the same model signature. After conversion, the model is required to have the following inputs:

  1. `input_ids` contains the tokens.
  2. `attention_mask` is filled with 1.
  3. `beam_idx` selects beams.
  4. `position_ids` (optional) encodes the position of the currently generated token in the sequence.

The model must also have a single `logits` output.

:::note

Models should belong to the same family and use the same tokenizer.

:::

::::
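To make the required signature concrete, here is a minimal sketch of what these inputs could look like for one decoding step of a 2-beam batch. The token IDs and shapes are purely illustrative assumptions; this only builds the arrays and does not call any pipeline:

```python
import numpy as np

# Illustrative 2-beam batch of a 4-token prompt (token IDs are made up).
input_ids = np.array([[101, 7592, 2088, 102],
                      [101, 7592, 2088, 102]], dtype=np.int64)

# attention_mask is filled with 1 for every real token.
attention_mask = np.ones_like(input_ids)

# position_ids encodes each token's position in the sequence.
position_ids = np.tile(np.arange(input_ids.shape[1], dtype=np.int64), (2, 1))

# beam_idx selects which beams survive between steps (identity here).
beam_idx = np.array([0, 1], dtype=np.int32)

print(attention_mask.shape)  # (2, 4)
```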

## Image Generation Models

<ImageGenerationModelsTable />

## Video Generation Models

<VideoGenerationModelsTable />

## Visual Language Models (VLMs)

<VLMModelsTable />

:::tip LoRA Support
VLM pipeline supports LoRA adapters applied to the language-model (LLM) part. LoRA adapters targeting the vision encoder or other multimodal components are not supported.
:::

:::warning VLM Models Notes

### InternVL2 {#internvl2-notes}

To convert InternVL2 models, `timm` and `einops` are required:

```bash
pip install timm einops
```

### MiniCPMO {#minicpm-o-notes}

  1. openbmb/MiniCPM-o-2_6 doesn't support `transformers>=4.52`, which is required for `optimum-cli` export.
  2. `--task image-text-to-text` is required for `optimum-cli export openvino --trust-remote-code` because image-text-to-text isn't MiniCPM-o-2_6's native task.
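Combining the two notes above, a hypothetical export command could look like the following (the output directory name is an assumption; install a compatible transformers version first, per note 1):

```shell
optimum-cli export openvino \
  --model openbmb/MiniCPM-o-2_6 \
  --task image-text-to-text \
  --trust-remote-code \
  MiniCPM-o-2_6-ov
```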

### phi3_v {#phi3_v-notes}

The models' configs aren't consistent, so it's required to override the default `eos_token_id` with the one from the tokenizer:

```python
generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
```

### phi4mm {#phi4mm-notes}

Apply https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/78/files to fix the model export for `transformers>=4.50`.

:::

## Speech Recognition Models (Whisper-based)

<WhisperModelsTable />

:::info LoRA Support
Speech recognition pipeline does not support LoRA adapters.
:::

## Speech Generation Models

<SpeechGenerationModelsTable />

:::info LoRA Support
Speech generation pipeline does not support LoRA adapters.
:::

## Text Embeddings Models

<TextEmbeddingsModelsTable />

:::info LoRA Support
Text embeddings pipeline does not support LoRA adapters.
:::

:::warning Text Embeddings Models Notes
Qwen3 Embedding models require `--task feature-extraction` during the conversion with `optimum-cli`.
:::
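Embedding vectors produced by a text-embeddings model are commonly compared with cosine similarity. A self-contained sketch of that comparison (the 3-dimensional vectors are made up for illustration and do not come from any pipeline; real embeddings have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up "embeddings" for a query and a document.
query = np.array([0.1, 0.3, 0.5])
doc = np.array([0.2, 0.6, 1.0])  # exactly 2x the query, so similarity is 1.0

score = cosine_similarity(query, doc)
print(round(score, 6))  # 1.0
```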

## Text Rerank Models

<TextRerankModelsTable />

:::info LoRA Support
Text rerank pipeline does not support LoRA adapters.
:::

:::warning Text Rerank Models Notes
Text rerank models require an appropriate `--task` to be provided during the conversion with `optimum-cli`. The task for each model is listed in the table above.
:::


:::info Hugging Face Notes
Some models may require submitting an access request on their Hugging Face page before they can be downloaded.

If https://huggingface.co/ is down, the conversion step won't be able to download the models.
:::