OpenVINO™ Notebooks at GitHub Pages
- Video generation with ZeroScope and OpenVINO
- Text-to-image generation with Z-Image-Turbo and OpenVINO
- Convert and Optimize YOLOv9 with OpenVINO™
- Convert and Optimize YOLOv8 real-time object detection with OpenVINO™
- YOLOv8 Oriented Bounding Boxes Object Detection with OpenVINO™
- Convert and Optimize YOLOv8 keypoint detection model with OpenVINO™
- Convert and Optimize YOLOv8 instance segmentation model with OpenVINO™
- Convert and Optimize YOLOv12 real-time object detection with OpenVINO™
- Convert and Optimize YOLOv11 real-time object detection with OpenVINO™
- Convert and Optimize YOLOv11 keypoint detection model with OpenVINO™
- Convert and Optimize YOLOv11 instance segmentation model with OpenVINO™
- Convert and Optimize YOLOv10 with OpenVINO
- Video Subtitle Generation using Whisper and OpenVINO™
- Automatic speech recognition using Whisper and OpenVINO with Generate API
- Wav2Lip: Accurately Lip-syncing Videos and OpenVINO
- Text-Image to Video generation with Wan2.2 and OpenVINO
- Text to Video generation with Wan2.1 and OpenVINO
- Text to Image pipeline and OpenVINO with Generate API
- Line-level text detection with Surya
- Image to Video Generation with Stable Video Diffusion
- Image generation with Stable Diffusion XL and OpenVINO
- Image generation with Torch.FX Stable Diffusion v3 and OpenVINO
- Image generation with Stable Diffusion v3 and OpenVINO
- Infinite Zoom Stable Diffusion v2 and OpenVINO™
- Visual-Language Assistant with SmolVLM2 and OpenVINO
- Document conversion with SmolDocling and OpenVINO
- Zero-shot Image Classification with SigLIP2
- Object masks from prompts with SAM and OpenVINO
- Object masks from prompts with SAM2 and OpenVINO
- Object masks from prompts with SAM2 and OpenVINO for Images
- Visual-language assistant with Qwen3-VL and OpenVINO
- Qwen3-TTS Text-to-Speech with OpenVINO™
- Text Rerank with Qwen3 and OpenVINO
- Text Embedding with Qwen3 and OpenVINO
- Qwen3-ASR Speech Recognition with OpenVINO™
- Visual-language assistant with Qwen2.5VL and OpenVINO
- Omnimodal assistant with Qwen2.5-Omni and OpenVINO
- Visual-language assistant with Qwen2VL and OpenVINO
- Audio-language assistant with Qwen2Audio and OpenVINO
- Text-to-image generation with Qwen-Image and OpenVINO
- Visual-language assistant with Pixtral and OpenVINO
- Multimodal assistant with Phi-4-multimodal and OpenVINO
- Visual-language assistant with Phi3-Vision and OpenVINO
- Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO
- Voice tone cloning with OpenVoice and OpenVINO
- Screen Parsing with OmniParser-v2.0 and OpenVINO
- PDF converting with olmOCR model and OpenVINO
- Structure Extraction with NuExtract and OpenVINO
- Controllable Music Generation with MusicGen and OpenVINO
- Multimodal RAG for video analytics with LlamaIndex
- Multi LoRA Image Generation
- Visual Content Search using MobileCLIP and OpenVINO
- Visual-language assistant with Llama-3.2-11B-Vision and OpenVINO
- Visual-language assistant with MiniCPM-V and OpenVINO
- Omnimodal assistant with MiniCPM-o 2.6 and OpenVINO
- LTX Video and OpenVINO™
- Create a RAG system using OpenVINO and LlamaIndex
- Create a RAG system using OpenVINO and LangChain
- Create a RAG system using OpenVINO GenAI and LangChain
- RAG Performance & Fairness Evaluation Toolkit (OpenVINO + LangChain)
- LLM Instruction-following pipeline with OpenVINO
- Create an LLM-powered Chatbot using OpenVINO
- Create an LLM-powered Chatbot using OpenVINO Generate API
- Create ReAct Agent using OpenVINO and LangChain
- Create an Agentic RAG using OpenVINO and LlamaIndex
- Create MCP Agent using OpenVINO and Qwen-Agent
- Create Function-calling Agent using OpenVINO and Qwen-Agent
- Visual-language assistant with LLaVA Next and OpenVINO
- Visual-language assistant with LLaVA and OpenVINO Generative API
- Image generation with Latent Consistency Model and OpenVINO
- Multimodal understanding and generation with Janus-Pro and OpenVINO
- Visual-language assistant with InternVL2 and OpenVINO
- InstantID: Zero-shot Identity-Preserving Generation using OpenVINO
- Inpainting with OpenVINO GenAI
- Image-to-image generation using OpenVINO GenAI
- Image generation with HunyuanDIT and OpenVINO
- Object detection and masking from prompts with GroundedSAM (GroundingDINO + SAM) and OpenVINO
- Visual-language assistant with GLM-4.1V-9B-Thinking and OpenVINO
- Visual-language assistant with GLM4-V and OpenVINO
- Visual-language assistant with Gemma3 and OpenVINO
- End-to-End Speech Recognition with Fun-ASR-Nano and OpenVINO
- Image-to-image generation with Flux.1 Kontext and OpenVINO
- Image generation with Flux.1 and OpenVINO
- Image inpainting and outpainting with FLUX.1 Fill
- Florence-2: Open Source Vision Foundation Model
- Image generation with universal control using Flex.2 and OpenVINO
- Multi-speaker dialogue generation with FireRedTTS‑2 and OpenVINO
- Object segmentations with FastSAM and OpenVINO
- Object segmentations with EfficientSAM and OpenVINO
- Automatic speech recognition using Distil-Whisper and OpenVINO
- Depth estimation with DepthAnything and OpenVINO
- Visual-language assistant using DeepSeek-VL2 and OpenVINO
- LLM reasoning with DeepSeek-R1 distilled models
- Document Parsing using DeepSeek-OCR/DeepSeek-OCR-2 and OpenVINO
- Text-to-Speech (TTS) system with Fun-CosyVoice 3.0 and OpenVINO
- Text-to-Image Generation with ControlNet Conditioning
- Zero-shot Image Classification with OpenAI CLIP and OpenVINO™
- Virtual Try-On with CatVTON and OpenVINO
- Visual Question Answering and Image Captioning using BLIP and OpenVINO
- Text-to-speech generation using Bark and OpenVINO
- Image-to-Video synthesis with AnimateAnyone and OpenVINO
- Quantize Speech Recognition Models with accuracy control using NNCF PTQ API
- Post-Training Quantization of PyTorch models with NNCF
- Optimize Preprocessing
- OpenVINO Tokenizers: Incorporate Text Processing Into OpenVINO Pipelines
- Convert models from ModelScope to OpenVINO
- Text Generation with LoRA via OpenVINO GenAI
- Quantize NLP models with Post-Training Quantization in NNCF
- Quantization of Image Classification Models
- 🤗 Hugging Face Model Hub with OpenVINO™
- Hello NPU
- Working with GPUs in OpenVINO™
- OpenVINO™ Model conversion
- Automatic Device Selection with OpenVINO™
- Asynchronous Inference with OpenVINO™
- YOLOv11 quantization with accuracy control using NNCF
- Classification with ConvNeXt and OpenVINO
- Convert a TensorFlow Lite Model to OpenVINO™
- Using TensorFlow Object Detection API with OpenVINO™
- Convert a TensorFlow Model to OpenVINO™
- Convert a PyTorch Model to OpenVINO™ IR
- Document Parsing using PaddleOCR-VL/PaddleOCR-VL-1.5 and OpenVINO
- Convert a PaddlePaddle Model to OpenVINO™ IR
- Convert Detectron2 Models to OpenVINO™
- Quantize a Segmentation Model and Show Live Inference
- OpenVINO™ Explainable AI Toolkit (3/3): Saliency map interpretation
- OpenVINO™ Explainable AI Toolkit (2/3): Deep Dive
- OpenVINO™ Explainable AI Toolkit (1/3): Basic
- OpenVINO™ Runtime API Tutorial
- Hello Image Classification
- Hello Image Segmentation
- Hello Object Detection
- Style Transfer with OpenVINO™
- Live Human Pose Estimation with OpenVINO™
- Person Tracking with OpenVINO™
- Person Counting System using YOLOv8 and OpenVINO™
- PaddleOCR with OpenVINO™
- Live Object Detection with OpenVINO™
- CLIP model with Jina CLIP and OpenVINO
- Human Action Recognition with OpenVINO™
- Live 3D Human Pose Estimation with OpenVINO
- Monodepth Estimation with OpenVINO
- Image Background Removal with U^2-Net and OpenVINO™
- Vehicle Detection And Recognition with OpenVINO™
- Selfie Segmentation using TFLite and OpenVINO
- Text-to-Speech Generation with OpenVINO GenAI
- Text-to-Image Generation with Stable Diffusion and OpenVINO™
- Text Generation via Speculative Decoding using FastDraft and OpenVINO™
- Background removal with RMBG v1.4 and OpenVINO
- Text-to-speech (TTS) with Parler-TTS and OpenVINO
- Optical Character Recognition (OCR) with OpenVINO™
- Universal Segmentation with OneFormer and OpenVINO
- MMS: Scaling Speech Technology to 1000+ languages with OpenVINO™
- Industrial Meter Reader
- Text-to-Speech synthesis using Kokoro and OpenVINO
- Handwritten Chinese and Japanese OCR with OpenVINO™
- High-Quality Text-Free One-Shot Voice Conversion with FreeVC and OpenVINO™
- Low-Light Image Restoration with DarkIR model using OpenVINO™
- Imitation Learning - ACT
- Music generation using ACE Step and OpenVINO
- Part Segmentation of 3D Point Clouds with OpenVINO™
- PointPillar for 3D object detection
- Quantization-Sparsity Aware Training with NNCF, using PyTorch framework