GSOC PROJECT: Develop an OpenVINO-Domain Specialized Coder Model with SFT/GRPO/RAG #34299
Replies: 2 comments
Hi @Shi-pra-19 , Thank you for reaching out and sharing such a detailed and well-thought-out proposal! Regarding your proposed implementation and questions, here are my thoughts and recommendations:
Model Choice: Qwen 2.5 7B Coder is currently the state-of-the-art for this size, and DeepSeek-Coder-V2-Lite is also a fantastic choice. However, I would suggest dropping CodeLlama 7B from your list, as its architecture and performance are quite outdated compared to Qwen and DeepSeek.
An additional lightweight interface for users to interact with the trained model is also an important part of this project, so I recommend including it within the timeframe. (Of course, the core goal is still to train an excellent OpenVINO coder model; if you run short on time, you can treat the demo as optional.)
Using GitHub or email for future discussions and sharing proposal drafts is fine. My email is tao1.zhou@intel.com. Looking forward to your next step!
Hi @7taozhou7, Thank you for the detailed feedback and technical clarifications. I'll proceed with Qwen 2.5 7B Coder as the base model. For the training strategy, I will adopt QLoRA + GRPO (via Hugging Face TRL), using Unsloth for efficient fine-tuning.
Also, thank you for the correction regarding ONNX; I will use optimum-intel instead. For the user interface, I'll develop a terminal-based TUI as part of the deployment stage, while ensuring model quality remains the primary milestone. I'll prepare a more detailed technical proposal draft soon and share it via email or here for feedback. Looking forward to the next steps, and thanks again for the guidance! I'm very excited to move forward with this!
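As a back-of-the-envelope illustration of why LoRA/QLoRA shrinks the trainable footprint so sharply (a sketch with assumed layer dimensions, not taken from Qwen's actual config):

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on a d_in x d_out weight:
    two low-rank factors, A (d_in x rank) and B (rank x d_out)."""
    return d_in * rank + rank * d_out

# Assumed example dimensions for a single projection matrix;
# illustrative only, not any specific model's config.
d = 4096
full = d * d                                   # full fine-tuning updates every weight
lora = lora_trainable_params(d, d, rank=16)    # LoRA updates only the two factors
print(full, lora, full / lora)                 # here: 128x fewer trainable parameters
```

QLoRA then quantizes the frozen base weights (typically to 4-bit), so only these small adapter factors are kept in higher precision for the optimizer.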
Hello @yinquan251, @7taozhou7, and the OpenVINO community!
I hope this message finds you well. My name is Shipra, and I am currently a third-year student at IIT Madras.
A bit about me: I currently work as a Quantitative Research Consultant at WorldQuant, and prior to that I gained experience in LLM evaluation and dataset curation for code-generation models at Remotasks. I've fine-tuned large language models, including a Qwen2.5-Math model (https://github.com/Shi-pra-19/Qwen_2.5_Fine_Tuning), and I hold the Kaggle Competitions Expert rank with two bronze medals. I have also deployed a RAG pipeline indexing course content (including Discourse forum discussions).
This GSoC project caught my attention because it aligns closely with my background. I do have some questions regarding the implementation:
Dataset curation: I believe curating a high-quality dataset will be a crucial part of the project, so I am considering a combination of resources:
- The latest OpenVINO documentation, repositories, GenAI API references, tutorials, and notebooks
- Stack Overflow and other discussion forums (focusing on the OpenVINO 2.0 API)
- Specific migration examples from older API versions to the current one
- Relevant commits and issues scraped via the GitHub GraphQL API
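To keep records from these heterogeneous sources consistent, one possible schema for a supervised fine-tuning pair could look like the following (field names and the example content are illustrative, not a fixed format):

```python
from dataclasses import dataclass, field

@dataclass
class SFTRecord:
    """One instruction-response pair in the curated dataset (illustrative schema)."""
    instruction: str           # task prompt, e.g. a migration request
    response: str              # target code using the OpenVINO 2.0 API
    source: str                # provenance: "docs", "stackoverflow", "github", ...
    api_version: str = "2.0"   # API version the response targets
    tags: list = field(default_factory=list)

# Example: a migration pair from the legacy IECore API to the 2.0 ov.Core API.
record = SFTRecord(
    instruction="Migrate this inference snippet from the legacy IECore API "
                "to the OpenVINO 2.0 ov.Core API.",
    response="import openvino as ov\n"
             "core = ov.Core()\n"
             "model = core.read_model('model.xml')\n"
             "compiled = core.compile_model(model, 'CPU')",
    source="docs",
    tags=["migration"],
)
```

Tracking `source` and `api_version` per record also makes it easy to rebalance the mix (e.g. up-weight migration pairs) and to filter out samples that still use the legacy API.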
Model selection: I am thinking of using LLMs such as Qwen 2.5 7B Coder, DeepSeek 7B Instruct Coder, or CodeLlama 7B Instruct.
GRPO design: Positively rewarding signals such as compilation success, execution correctness, latency/performance, and code quality/structure, while favoring correct usage of newer APIs.
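A composite reward along these lines could be sketched as a weighted sum of per-signal scores. The weights and the signal interface below are placeholders to be tuned during experiments, not a final design:

```python
def grpo_reward(compiles: bool, tests_passed: float, latency_ms: float,
                uses_new_api: bool, latency_budget_ms: float = 100.0) -> float:
    """Combine per-sample reward signals into one scalar for GRPO.

    tests_passed is the fraction of unit tests the generated code passes.
    The weights below are illustrative starting points only.
    """
    if not compiles:
        return -1.0  # hard penalty: nothing else matters if it won't compile
    reward = 0.3                                                     # base credit for compiling
    reward += 0.4 * tests_passed                                     # execution correctness
    reward += 0.2 * max(0.0, 1.0 - latency_ms / latency_budget_ms)   # speed vs. budget
    reward += 0.1 * (1.0 if uses_new_api else 0.0)                   # prefer OpenVINO 2.0 API
    return reward
```

Since GRPO normalizes rewards within each sampled group, the relative ordering of completions matters more than the absolute scale; the gap between compiling and non-compiling completions is what the -1.0 penalty is for.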
Inference and deployment: My plan is to export the model to ONNX, then optimize it with OpenVINO, applying NNCF for quantization/compression. I will also select a suitable precision per device and monitor performance via OpenVINO's benchmark_app.
As part of the project, I aim to provide an additional lightweight interface for users to interact with the trained model. This could be a terminal-based TUI or Streamlit demo.
Would you recommend including this within the time frame, or should it be considered an optional demo feature?
Prerequisite Contribution:
I have contributed 10+ PRs implementing NumPy operations in Keras for the OpenVINO backend, including:
keras-team/keras#22078
keras-team/keras#22025
A few questions:
- Would you recommend any additional considerations for deployment optimization?
- Is the above model size appropriate, or should we consider larger models (e.g., Qwen 2.5 14B Coder, CodeLlama 13B)?
- Can I use LoRA, QLoRA, or libraries like Unsloth to speed up training and reduce memory usage?
- Are there other resources I should consider for dataset curation and the RAG knowledge base?
- Which platform do you prefer for future discussion and for sharing demos and proposal drafts? Email, Discord, or another medium?
I am excited to discuss the implementation further and explore how I can contribute.
Thank you very much for your time!