.gitignore
README.md
_copy.json.config
deepseek_ov_config.json
deepseek_ov_config.json.config
deepseek_ov_npu_config.json
deepseek_ov_npu_config.json.config
inference_model.json
inference_sample.ipynb
info.yml
model_project.config
requirements.txt
winml.py

DeepSeek-R1-Distill-Llama-8B Model Optimization

This repository demonstrates the optimization of the DeepSeek-R1-Distill-Llama-8B model using post-training quantization (PTQ) techniques. The optimization process is divided into these workflows:

OpenVINO for Intel® GPU/NPU
- This process uses OpenVINO-specific passes such as OpenVINOOptimumConversion, OpenVINOIoUpdate, and OpenVINOEncapsulation.

Intel® Workflows

These workflows perform quantization with Optimum Intel®, following this optimization pipeline:

HuggingFace Model -> Quantized OpenVINO model -> Quantized encapsulated ONNX OpenVINO IR model
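
The pipeline above can be pictured as an ordered set of passes applied to the Hugging Face model. The following is only a minimal sketch: the three pass type names come from this README, but the pass keys (`convert_and_quantize`, `update_io`, `encapsulate`), the `input_model` layout, and the assumption that the passes run in the order they are listed are illustrative and may not match `deepseek_ov_config.json` in this folder.

```python
import json

# Pass type names are taken from the README. The keys on the left and the
# assumption that passes run in this listed order are hypothetical.
PIPELINE = [
    ("convert_and_quantize", "OpenVINOOptimumConversion"),
    ("update_io", "OpenVINOIoUpdate"),
    ("encapsulate", "OpenVINOEncapsulation"),
]


def build_config(model_id: str) -> dict:
    """Build an illustrative workflow config; the field layout is assumed."""
    return {
        "input_model": {"type": "HfModel", "model_path": model_id},
        # Python dicts preserve insertion order, so the pass order is kept.
        "passes": {name: {"type": pass_type} for name, pass_type in PIPELINE},
    }


config = build_config("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")
pass_order = [entry["type"] for entry in config["passes"].values()]

print(" -> ".join(pass_order))
print(json.dumps(config, indent=2))
```

Consult the actual config files in this folder for the real pass options, such as quantization settings and target device, which this sketch deliberately leaves out.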
