add support for draft model of eagle3 #1468
Conversation
)
return self.random_float_tensor(shape, framework=framework, dtype=float_dtype)

@register_in_tasks_manager("llamaeagle3", *["text-generation", "text-generation-with-past"], library_name="transformers")
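For context, a rough sketch of what such a registration usually looks like in optimum-intel is shown below. The class name, base class, and the extra hidden_states input are illustrative assumptions for discussion, not necessarily what this PR implements.

from optimum.exporters.onnx.model_configs import LlamaOnnxConfig
from optimum.exporters.openvino.model_configs import register_in_tasks_manager

@register_in_tasks_manager("llamaeagle3", *["text-generation", "text-generation-with-past"], library_name="transformers")
class LlamaEagle3OpenVINOConfig(LlamaOnnxConfig):
    @property
    def inputs(self):
        # Hypothetical extra input: an eagle3 draft model also consumes hidden states
        # from the target model, so the export config would have to declare it
        # (and a matching dummy input generator would be needed as well).
        common_inputs = super().inputs
        common_inputs["hidden_states"] = {0: "batch_size", 1: "sequence_length"}
        return common_inputs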
What kind of model can we convert with such an addition? I am asking because the original model has a different model type, llama3.
Can you convert only a local copy with a modified model type? I am not sure this is capable of converting the original eagle3 llama model.
Also, the implemented solution does not look scalable to other eagle3 models such as https://huggingface.co/nvidia/gpt-oss-120b-Eagle3
We have verified conversion and the GenAI pipeline locally with yuhuili/EAGLE3-LLaMA3.1-Instruct-8B and Tengyunw/qwen3_8b_eagle3, and AngelSlim/Qwen3-1.7B_eagle3 will be added to the GenAI repo tests in openvinotoolkit/openvino.genai#2740. We also checked the list of checkpoints on the EAGLE3 GitHub repo; most of them are llama type, so in theory they can be converted as-is or with a limited update. Can we merge this PR first and leave the remaining verification to follow OpenVINO base-model support progress and customer requirements?
AngelSlim/Qwen3-14B_eagle3/config.json: "model_type": "qwen3",
AngelSlim/Qwen3-a3B_eagle3/config.json: "model_type": "llama",
AngelSlim/Qwen3-32B_eagle3/config.json: "model_type": "llama",
AngelSlim/Qwen3-4B_eagle3/config.json: "model_type": "llama",
AngelSlim/Qwen3-8B_eagle3/config.json: "model_type": "llama",
AngelSlim/Qwen3-1.7B_eagle3/config.json: "model_type": "llama",
linglingdan/Eagle3_for_MiniCPM4/config.json: "model_type": "llama",
lmsys/EAGLE3-gpt-oss-120b-bf16/config.json: "model_type": "llama",
lmsys/sglang-EAGLE3-Llama-4-Scout-17B-16E-Instruct-v1/config.json: "model_type": "llama",
lmsys/Qwen3-235B-A22B-EAGLE3/config.json: "model_type": "llama",
lmsys/sglang-EAGLE3-Llama-4-Maverick-17B-128E-Instruct-v1/config.json: "model_type": "llama",
nvidia/gpt-oss-120b-Eagle3/config.json: "model_type": "llama",
nvidia/Qwen3-235B-A22B-Eagle3/config.json: "model_type": "llama",
nvidia/Llama-4-Maverick-17B-128E-Eagle3 ??,
Tengyunw/qwen3_30b_moe_eagle3/config.json: "model_type": "llama",
Tengyunw/qwen3_8b_eagle3/config.json: "model_type": "llama",
wantsleep/OLMoE_1B_7B_Eagle3/config.json: "model_type": "olmoe",
yuhuili/EAGLE3-LLaMA3.3-Instruct-70B/config.json: "model_type": "llama",
yuhuili/EAGLE3-DeepSeek-R1-Distill-LLaMA-8B/config.json: "model_type": "llama",
yuhuili/EAGLE3-LLaMA3.1-Instruct-8B/config.json: "model_type": "llama",
yuhuili/EAGLE3-Vicuna1.3-13B/config.json: "model_type": "llama",
Zjcxy-SmartAI/Eagle3-Qwen3-4B-Instruct-2507-zh/config.json: "model_type": "llama",
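For reference, a hedged example of how one of the checkpoints listed above would typically be exported through optimum-intel once this PR is merged; whether the eagle3 draft model actually goes through this exact path is what this thread is verifying.

from optimum.intel import OVModelForCausalLM

draft_id = "yuhuili/EAGLE3-LLaMA3.1-Instruct-8B"  # one of the locally verified checkpoints
ov_draft = OVModelForCausalLM.from_pretrained(draft_id, export=True)  # convert to OpenVINO IR on the fly
ov_draft.save_pretrained("eagle3-draft-ov")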
Why don't we use the original model type? Right now this relies on a different model type that appears to have been modified manually by you, and that is not how it should work. These changes should allow converting the original model. Where does the llamaeagle3 model type come from?
Does it mean that users should re-create all eagle3 models and modify their model type, etc.?
@rkazants Discussed with Fang; work is in progress to avoid modifying config.json by passing model_type="llamaeagle3" to AutoConfig.from_pretrained.
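A minimal sketch of one way that idea could look (the checkpoint path and the in-memory attribute override are illustrative assumptions; the actual mechanism in the PR may differ):

from transformers import AutoConfig

draft_path = "yuhuili/EAGLE3-LLaMA3.1-Instruct-8B"  # example draft checkpoint
config = AutoConfig.from_pretrained(draft_path)     # config.json on disk still says model_type: "llama"
config.model_type = "llamaeagle3"                   # in-memory override only, no file edit
# Export code can then dispatch on config.model_type without the user
# re-saving a modified config.json.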
Why don't we use the original model type? Because the llama modeling code in transformers can't support the eagle3 draft model; the modeling for the eagle3 draft model comes from https://github.com/SafeAILab/EAGLE/blob/main/eagle/model/cnets.py. This PR should support conversion of an eagle3 draft model that keeps model_type: "llama" in config.json.
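To make the difference concrete, here is a very rough sketch based on a reading of the EAGLE reference modeling (cnets.py), not on this PR: the draft model consumes hidden states taken from the target model and fuses them through an extra linear layer before its decoder layer, which the stock LlamaModel has no input or weights for. The dimensions and structure below are illustrative assumptions only.

import torch
import torch.nn as nn

class Eagle3DraftSketch(nn.Module):
    def __init__(self, hidden_size: int, num_target_layers_tapped: int = 3):
        super().__init__()
        # Projects concatenated target hidden states back to the model width
        # (layer count and bias setting are assumptions for illustration).
        self.fc = nn.Linear(hidden_size * num_target_layers_tapped, hidden_size, bias=False)

    def forward(self, inputs_embeds: torch.Tensor, target_hidden_states: torch.Tensor) -> torch.Tensor:
        # target_hidden_states: [batch, seq, hidden * num_target_layers_tapped]
        fused = self.fc(target_hidden_states)
        # A real draft model feeds (inputs_embeds, fused) into a modified decoder layer;
        # here their sum stands in as a placeholder.
        return inputs_embeds + fused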
needs tests
What does this PR do?
This PR adds conversion of the draft model in the eagle3 pipeline.
Before submitting