DeepSeek-R1-Distill-Llama-8B Model Optimization

This repository demonstrates the optimization of the DeepSeek-R1-Distill-Llama-8B model using post-training quantization (PTQ) techniques. The optimization process is divided into these workflows:

OpenVINO for Intel® GPU/NPU
- This process uses OpenVINO-specific passes such as OpenVINOOptimumConversion, OpenVINOIoUpdate, and OpenVINOEncapsulation.

Intel® Workflows

These workflows perform quantization with Optimum Intel®, following this optimization pipeline:

HuggingFace Model -> Quantized OpenVINO model -> Quantized encapsulated ONNX OpenVINO IR model
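
As a rough illustration of how the three passes named above could be chained to realize this pipeline, here is a minimal sketch that builds a pass-based workflow configuration as a plain Python dictionary and serializes it to JSON. The pass names come from this README. Everything else is an assumption: the config layout (`input_model`, `passes`, `type`) is modeled on Olive-style JSON configs, the Hugging Face model ID is the public one for this model, and the helper names `build_workflow_config` and `pass_order` are hypothetical. The per-pass comments describe the presumed role of each stage, not confirmed behavior. Check the config shipped in this repository for the actual options.

```python
import json

# Assumed Hugging Face model ID for the model this README targets.
MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"


def build_workflow_config(model_id: str = MODEL_ID) -> dict:
    """Return a hypothetical workflow config chaining the three passes.

    The key layout is an assumption; only the pass type names are taken
    from the README. Pass-specific options are deliberately omitted.
    """
    return {
        "input_model": {"type": "HfModel", "model_path": model_id},
        # Python dicts preserve insertion order, so the passes are listed
        # in the order the pipeline is expected to apply them.
        "passes": {
            # Presumed: HuggingFace model -> quantized OpenVINO model
            "optimum_convert": {"type": "OpenVINOOptimumConversion"},
            # Presumed: adjust the inputs/outputs of the OpenVINO model
            "io_update": {"type": "OpenVINOIoUpdate"},
            # Presumed: wrap the OpenVINO IR in an encapsulated ONNX model
            "encapsulation": {"type": "OpenVINOEncapsulation"},
        },
    }


def pass_order(config: dict) -> list:
    """List the pass types in the order they appear in the config."""
    return [entry["type"] for entry in config["passes"].values()]


if __name__ == "__main__":
    # Print the config as JSON so it can be saved and adapted.
    print(json.dumps(build_workflow_config(), indent=2))
```

Keeping the passes in one ordered mapping makes the stage order explicit and easy to compare against the pipeline line above.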
