[Question] A little doubt about the draft training of VL model

<img width="1576" height="300" alt="Image" src="https://github.com/user-attachments/assets/f8121fbf-82a7-47dc-af6d-ec3e05202cc7" />

I noticed that when the draft model computes its input embeddings, the commented-out line actually looks like the correct one: it uses the pixel values to obtain the image embeddings. The current implementation, however, does not use the image features at all. Is there any experimental evidence that the current approach works better?
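To make the two options concrete, here is a minimal PyTorch sketch of what I mean. It is not the repository's code: `IMAGE_TOKEN_ID`, `text_only_embeds`, `merged_embeds`, and the toy shapes are all hypothetical names and values I made up for illustration, and I am assuming the image features have already been projected to the text hidden size.

```python
import torch

# Hypothetical id of the image placeholder token (an assumption for this sketch).
IMAGE_TOKEN_ID = 99


def text_only_embeds(embed_tokens, input_ids):
    """Variant A: look up token embeddings only; image features are ignored."""
    return embed_tokens(input_ids)


def merged_embeds(embed_tokens, input_ids, image_features):
    """Variant B: overwrite the placeholder positions with image features.

    image_features is assumed to have shape (num_image_tokens, hidden_size),
    i.e. the output of a vision encoder after projection to the text width.
    """
    embeds = embed_tokens(input_ids).clone()
    mask = input_ids == IMAGE_TOKEN_ID  # (batch, seq_len) boolean mask
    if int(mask.sum()) != image_features.shape[0]:
        raise ValueError("number of image tokens and image features differ")
    # Boolean indexing selects the placeholder rows in row-major order.
    embeds[mask] = image_features.to(embeds.dtype)
    return embeds


torch.manual_seed(0)
embed_tokens = torch.nn.Embedding(100, 4)    # toy vocabulary and hidden size
input_ids = torch.tensor([[1, 99, 99, 2]])   # two image placeholder tokens
image_features = torch.ones(2, 4)            # stand-in for vision encoder output

a = text_only_embeds(embed_tokens, input_ids)
b = merged_embeds(embed_tokens, input_ids, image_features)
```

In variant A the draft model only ever sees the learned embedding of the placeholder token at image positions, while in variant B it sees the actual image content there. My question is whether variant A was chosen deliberately.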

[Question] A little doubt about the draft training of VL model #260
