
Conversation

xufang-lisa

What does this PR do?

This PR adds conversion of the draft model in the EAGLE3 pipeline.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

)
return self.random_float_tensor(shape, framework=framework, dtype=float_dtype)

@register_in_tasks_manager("llamaeagle3", *["text-generation", "text-generation-with-past"], library_name="transformers")
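
For context, this decorator normally sits on top of an export config class; a minimal sketch of that pattern follows. The class name, base class, and body are illustrative and may not match what the PR actually adds.

class LlamaEagle3OpenVINOConfig(LlamaOpenVINOConfig):  # names are hypothetical
    # Sketch: reuse the llama text-generation export config and override only
    # what the EAGLE3 draft model needs, e.g. the extra hidden-state input
    # produced by the target model.
    pass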
Collaborator

What kind of model can we convert with this addition? I am asking because the original model has a different model type, llama3.
Can you only convert a local copy with a modified model type? I am not sure it is capable of converting the original EAGLE3 llama model.
Also, the implemented solution does not look scalable to other EAGLE3 models, such as https://huggingface.co/nvidia/gpt-oss-120b-Eagle3

Contributor

We have verified the conversion and the GenAI pipeline locally with yuhuili/EAGLE3-LLaMA3.1-Instruct-8B and Tengyunw/qwen3_8b_eagle3, and AngelSlim/Qwen3-1.7B_eagle3 will be added to the GenAI repo tests in openvinotoolkit/openvino.genai#2740. We checked the list on the EAGLE3 GitHub repo: most of the models are of llama type, so they can be converted in theory or with a limited update (an illustrative conversion sketch follows the list below). Can we merge this PR first and leave further verification to follow OpenVINO base model support progress and customer requirements?

AngelSlim/Qwen3-14B_eagle3/config.json:  "model_type": "qwen3",
AngelSlim/Qwen3-a3B_eagle3/config.json:  "model_type": "llama",
AngelSlim/Qwen3-32B_eagle3/config.json:  "model_type": "llama",
AngelSlim/Qwen3-4B_eagle3/config.json:  "model_type": "llama",
AngelSlim/Qwen3-8B_eagle3/config.json:  "model_type": "llama",
AngelSlim/Qwen3-1.7B_eagle3/config.json:  "model_type": "llama",
linglingdan/Eagle3_for_MiniCPM4/config.json:  "model_type": "llama", 
lmsys/EAGLE3-gpt-oss-120b-bf16/config.json:  "model_type": "llama",
lmsys/sglang-EAGLE3-Llama-4-Scout-17B-16E-Instruct-v1/config.json:  "model_type": "llama",
lmsys/Qwen3-235B-A22B-EAGLE3/config.json:  "model_type": "llama",
lmsys/sglang-EAGLE3-Llama-4-Maverick-17B-128E-Instruct-v1/config.json:  "model_type": "llama",
nvidia/gpt-oss-120b-Eagle3/config.json:  "model_type": "llama",
nvidia/Qwen3-235B-A22B-Eagle3/config.json:  "model_type": "llama",
nvidia/Llama-4-Maverick-17B-128E-Eagle3 ??,
Tengyunw/qwen3_30b_moe_eagle3/config.json:  "model_type": "llama",
Tengyunw/qwen3_8b_eagle3/config.json:  "model_type": "llama",
wantsleep/OLMoE_1B_7B_Eagle3/config.json:  "model_type": "olmoe",
yuhuili/EAGLE3-LLaMA3.3-Instruct-70B/config.json:  "model_type": "llama",
yuhuili/EAGLE3-DeepSeek-R1-Distill-LLaMA-8B/config.json:  "model_type": "llama",
yuhuili/EAGLE3-LLaMA3.1-Instruct-8B/config.json:  "model_type": "llama",
yuhuili/EAGLE3-Vicuna1.3-13B/config.json:  "model_type": "llama",
Zjcxy-SmartAI/Eagle3-Qwen3-4B-Instruct-2507-zh/config.json:  "model_type": "llama",
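
For reference, a minimal sketch of how such a conversion could be driven through the optimum-intel Python API once the PR is in place. The model id and output directory are only examples, and whether the draft checkpoint goes through this exact entry point depends on the PR itself.

from optimum.intel import OVModelForCausalLM

# Illustrative only: export an EAGLE3 draft checkpoint to OpenVINO IR.
draft = OVModelForCausalLM.from_pretrained(
    "yuhuili/EAGLE3-LLaMA3.1-Instruct-8B",  # one of the verified checkpoints above
    export=True,
)
draft.save_pretrained("eagle3-draft-ov")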

Collaborator

@rkazants Oct 14, 2025

Why don't we use the original model type? As implemented, this relies on a different model type that seems to have been modified manually by you. That is not how it should work; these changes should make it possible to convert the original model. Where does the llamaeagle3 model type come from?
Does it mean that users should re-create every EAGLE3 model and modify its model type, etc.?

Contributor

@rkazants Discussed with Fang; work is in progress to avoid modifying config.json by passing model_type="llamaeagle3" to AutoConfig.from_pretrained.
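
For illustration, one possible way to get the same effect without touching the checkpoint (a sketch only; the exact mechanism the work-in-progress change settles on may differ):

from transformers import AutoConfig

# Sketch: load the original config and override the model type in memory,
# so config.json on disk stays untouched.
config = AutoConfig.from_pretrained("yuhuili/EAGLE3-LLaMA3.1-Instruct-8B")
config.model_type = "llamaeagle3"
# ...the overridden config would then be handed to the export path.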

Contributor

@peterchen-intel Oct 15, 2025

"Why don't we use the original model type?" The llama modeling in transformers cannot support the EAGLE3 draft model; the modeling for the EAGLE3 draft model comes from https://github.com/SafeAILab/EAGLE/blob/main/eagle/model/cnets.py. The current PR should support converting an EAGLE3 draft model that has model_type: "llama" in its config.json.
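
As a rough illustration of why the stock llama modeling does not fit (a simplified paraphrase of the cnets.py design, not the PR's code): the draft layer consumes hidden states handed over from the target model in addition to token embeddings, which changes the forward signature compared to a plain LlamaModel.

import torch
import torch.nn as nn

class Eagle3DraftSketch(nn.Module):
    # Simplified illustration only; the real EAGLE3 modeling in cnets.py
    # differs in detail (number of fused target layers, decoder layer, etc.).
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # The draft model fuses target-model hidden states with its own
        # embeddings, so the first projection sees more than hidden_size features.
        self.fc = nn.Linear(2 * hidden_size, hidden_size)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, input_ids: torch.Tensor, target_hidden: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.embed(input_ids), target_hidden], dim=-1)
        return self.head(self.fc(fused))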

Collaborator

@rkazants left a comment

needs tests
