Project ideas for 2026

1. Build a GUI Agent with local LLM/VLM and OpenVINO

Short description: You will build an agent application that takes a graphical user interface as its input. It should automatically operate your computer screen, or the UI of a specific application, based on user instructions, and accomplish complex multi-step goals. At least one model in this pipeline must be deployed locally using OpenVINO. During this project, you will get free access to AIPC cloud. You can refer to the following projects for ramp up.
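As a rough illustration of the control flow such an agent might follow, here is a minimal observe/decide/act loop. It is only a sketch: the `Action` type, its action vocabulary, and the `observe`, `decide`, and `act` callables are hypothetical placeholders (for example, `decide` would wrap a locally deployed LLM/VLM), not part of any existing OpenVINO API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step. The action vocabulary here is hypothetical."""
    kind: str           # e.g. "click", "type", or "done"
    argument: str = ""  # e.g. a button label or the text to type


def run_agent(
    instruction: str,
    observe: Callable[[], object],
    decide: Callable[[str, object, List[Action]], Action],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat observe -> decide -> act until the model says "done".

    observe: returns the current screen state (e.g. a screenshot).
    decide:  maps (instruction, screen, history) to the next Action;
             this is where a local model would be called.
    act:     performs the Action on the UI.
    """
    history: List[Action] = []
    for _ in range(max_steps):
        screen = observe()
        action = decide(instruction, screen, history)
        history.append(action)
        if action.kind == "done":
            break
        act(action)
    return history
```

Passing the three stages in as callables keeps the loop testable without a real screen or model, and the `max_steps` cap guards against an agent that never reports completion.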

Expected outcomes: a desktop application that provides a native GUI Agent based on local models

Skills required/preferred: Python, OpenVINO, prompt engineering, agentic workflows

Mentors: Ethan Yang, Zhuo Wu

Size of project: 350 hours

Difficulty: Hard

2. OpenVINO Deep Search AI Assistant on Multimodal Personal Database for AIPC

Short description: Deep Search is one of the core functions of a personal AI assistant. It enhances the user experience by extracting information from various file types (such as Word, PowerPoint, PDF, images, and videos) and supporting multi-dimensional information queries. A localized personal knowledge base not only improves the accuracy and relevance of answers but also protects data security and provides personalized search results based on the user's private data. This project aims to develop a desktop search assistant for AI PCs, backed by a localized personal knowledge base. By building a multimodal personal database and using Retrieval Augmented Generation (RAG) technology, the project uses this private multimodal data to enhance local large language models (LLMs). Users can interact with the OpenVINO instant messaging AI assistant, ask questions, and perform fuzzy searches using multimodal data.

Expected outcomes:

A standalone desktop application capable of building a personal knowledge base from multimodal data (Word documents, images, videos) in specified directories, and supporting information retrieval and summarization via API/application.
Localized deployment using OpenVINO, building a localized multimodal personal knowledge base using local multimodal LLMs and Retrieval Augmented Generation (RAG) technology.
Deployment on Intel AIPC, with flexible switching between GPU/NPU hardware based on task load.
The application will have a user interface allowing users to interact with the local LLM, perform fuzzy searches using multimodal information, and generate valuable output.
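To make the retrieval step of RAG concrete, the sketch below ranks indexed items by cosine similarity to a query embedding and builds a prompt from the best matches. It assumes embeddings have already been computed by some multimodal embedding model; the toy vectors, file names, and the `retrieve` and `build_prompt` helpers are illustrative assumptions, not part of the project specification.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the ids of the k indexed items most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble a prompt that grounds the local LLM in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


# Toy index: (file name, embedding). Real embeddings would come from a model.
index = [
    ("slides.pptx", [1.0, 0.0, 0.0]),
    ("photo.jpg", [0.0, 1.0, 0.0]),
    ("report.pdf", [0.9, 0.1, 0.0]),
]
top = retrieve([1.0, 0.0, 0.0], index, k=2)
```

A linear scan like this is only practical for small collections; a real knowledge base would likely use a vector index instead.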

Skills required/preferred: Python or C++, OpenVINO, OpenCV, ollama, llama.cpp, LLMs, RAG, OCR, UI

Mentors: Hongbo Zhao, Kunda Xu

Size of project: 350 hours

Difficulty: Hard

3. Object tracking in MP with OpenVINO inference

Short description: Tracking the objects in a video stream is an important use case. It combines an object detection model with a tracking algorithm that analyzes a whole sequence of images. The current state-of-the-art algorithm is ByteTrack.

The goal of the project is to implement the ByteTrack algorithm as a MediaPipe graph that delegates inference execution to the OpenVINO inference calculator. This graph could then be deployed in the OpenVINO Model Server for serving. A sample application using the KServe API would send the stream of images and receive information about the tracked objects in the stream.
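The core idea of ByteTrack is a two-stage association: high-confidence detections are matched to existing tracks first, then low-confidence detections are given a chance to match the tracks that remain. The sketch below illustrates only that idea in Python. It is a simplification under stated assumptions: it uses greedy IoU matching in place of the Hungarian algorithm, omits the Kalman-filter motion prediction used by the original algorithm, and its thresholds are illustrative values rather than tuned defaults.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], dets: List[Detection], min_iou: float):
    """Pair tracks with detections, best IoU first. Returns {track_id: det_index}."""
    pairs = sorted(
        ((iou(box, det[0]), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(dets)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for overlap, tid, di in pairs:
        if overlap < min_iou:
            break  # remaining pairs overlap even less
        if tid not in matches and di not in used:
            matches[tid] = di
            used.add(di)
    return matches


def associate(tracks, detections, high_thresh=0.6, low_thresh=0.1, min_iou=0.3):
    """Two-stage association. Returns (updated tracks, new boxes, lost track ids)."""
    high = [d for d in detections if d[1] >= high_thresh]
    low = [d for d in detections if low_thresh <= d[1] < high_thresh]

    # Stage 1: match all tracks against high-confidence detections.
    first = greedy_match(tracks, high, min_iou)
    updated = {tid: high[di][0] for tid, di in first.items()}

    # Stage 2: match the still-unmatched tracks against low-confidence detections.
    remaining = {tid: box for tid, box in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)
    updated.update({tid: low[di][0] for tid, di in second.items()})

    # Unmatched high-confidence detections start new tracks.
    matched_high = set(first.values())
    new_boxes = [d[0] for i, d in enumerate(high) if i not in matched_high]
    lost = [tid for tid in tracks if tid not in updated]
    return updated, new_boxes, lost
```

In the project itself this logic would live in a C++ MediaPipe calculator; the Python form here is only meant to show the data flow between the two matching stages.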

Expected outcomes: MediaPipe graphs with a calculator implementation of the ByteTrack algorithm, working with YOLO models.

Skills required/preferred: C++ (for writing the calculator), Python (for writing the client), MediaPipe

Mentors: Adrian Tobiszewski, Dariusz Trawinski

Size of project: 175 hours

Difficulty: Medium

4. OpenVINO GenAI: Add Image-to-Video Support to LTX Video Generation Pipeline

Short description: OpenVINO GenAI is a library of popular Generative AI pipelines, optimized execution methods, and samples built on top of the high-performance OpenVINO Runtime, focused on efficient deployment and easy integration. Currently, OpenVINO GenAI provides a text-to-video generation pipeline based on the LTX model - a diffusion-based video generator that creates videos from a text prompt via iterative denoising in latent space. This project extends the LTX pipeline with image-to-video (I2V) generation, enabling users to create short videos conditioned on an input image combined with a text prompt, running on Intel CPU and GPU. Adding image conditioning provides a strong visual anchor, improving control over composition and style. The project output includes C++ and Python API updates, runnable samples, validation tool updates (OpenVINO GenAI WWB and LLM Benchmarking), and basic tests to validate functionality.

Expected outcomes: Pull-request implementing image-to-video support in the OpenVINO GenAI API including:

1. Pipeline Architecture: Extension of the Text2VideoPipeline class to support image-to-video execution paths with minimal memory overhead.
2. API Parity: Full C++ and Python API support for image conditioning inputs.
3. Infrastructure: Updates to OpenVINO GenAI benchmarking tools to measure I2V throughput and latency.
4. Reproducibility: A comprehensive test suite ensuring output consistency between Python and C++ implementations.

Skills required/preferred: C++, Python, good understanding of Stable Diffusion architectures, experience with the Hugging Face and Diffusers libraries, experience with PyTorch (OpenVINO is a plus), Git.

Mentors: Anna Likholat, Stanislav Gonorovskii

Size of project: 350 hours

Difficulty: Medium

5. Optimize Quantized Model Inference Performance on ARM Devices with OpenVINO

Short description: The goal of this project is to design and implement a set of optimizations in the OpenVINO runtime focused on improving inference performance of quantized neural network models on ARM-based devices. The work will target commonly used quantization schemes and model types, with an emphasis on reducing inference latency, increasing throughput, improving compilation time, and minimizing memory footprint. Special attention will be given to efficiently leveraging ARM-specific features such as NEON and ARM Compute Library integrations.
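For readers new to the topic, the sketch below shows one common quantization scheme, symmetric per-tensor int8 quantization, in plain Python. It is an illustration of the general technique only: it is not how the OpenVINO runtime implements quantization, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to the range [-127, 127].

    The scale maps the largest absolute value onto 127, so zero stays
    exactly representable and no zero-point offset is needed.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`). That accuracy cost buys smaller weights and integer arithmetic, which is where ARM-specific instructions can help.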

Expected outcomes:

Improved adoption of quantized models in OpenVINO on ARM platforms
Reduced inference latency and increased throughput for quantized workloads
Faster model compilation and initialization times
Lower memory consumption for deploying quantized models on resource-constrained ARM devices

Skills required/preferred: C++; a Mac device with an ARM chip is a must-have

Mentors: Aleksandr Voron, Vladislav Golubev

Size of project: 350 hours

Difficulty: Medium


