Customized position_ids not working #33938
Comments
Cc @gante
You can directly modify how `position_ids` are computed within your code before passing them to the model. Ensure that your custom `position_ids` are aligned with the expected shape and values.
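A minimal sketch of that suggestion for a single forward pass (the checkpoint name is only a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello world, how are you?", return_tensors="pt")
seq_len = inputs["input_ids"].shape[1]

# Custom positions: the first two tokens share position 0 and the rest
# continue sequentially, in the spirit of the [[0, 0, 1, 2, 2, 2]] example
# from the issue body below.
position_ids = torch.clamp(torch.arange(seq_len) - 1, min=0).unsqueeze(0)

# A single forward pass accepts custom position_ids without shape issues.
outputs = model(**inputs, position_ids=position_ids)
```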
Thanks for the reply. Since the model generates token-by-token, feeding customized `position_ids` causes a size mismatch error after the first token is generated. I am not sure whether I missed or misunderstood anything.
It has a tracker here (#29149) and we had a PR a few months ago. Unfortunately the PR was too big and needed to be decomposed into parts, after which it dropped in priority :(
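Until that refactor lands, one possible workaround (a sketch, not an official API; it assumes a recent transformers version where `prepare_inputs_for_generation` includes `cache_position` in its returned inputs) is to subclass the model and re-slice the full custom positions on every generation step:

```python
from transformers import LlamaForCausalLM

class CustomPositionLlama(LlamaForCausalLM):
    """Hypothetical subclass that re-slices user-supplied position ids.

    full_position_ids must be set by the caller, with shape
    (batch, prompt_length + max_new_tokens), on the model's device.
    """

    full_position_ids = None

    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        model_inputs = super().prepare_inputs_for_generation(input_ids, **kwargs)
        cache_position = model_inputs.get("cache_position")
        if self.full_position_ids is not None and cache_position is not None:
            # Keep only the positions of the tokens processed in this step,
            # so the shape matches cache_position after the first token.
            model_inputs["position_ids"] = self.full_position_ids[:, cache_position]
        return model_inputs
```

The caller would set `model.full_position_ids` before calling `model.generate()`, making sure it covers the prompt plus every token to be generated.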
System Info
Hello,
I am trying to feed customized position IDs to a Llama model.
If I feed a customized `position_ids` vector, for example `[[0, 0, 1, 2, 2, 2]]` (batch size = 1, where the first two tokens share position 0 and the last three tokens share position 2), this causes an error.
The error seems to be located in the function `prepare_inputs_for_generation` in `src/transformers/models/llama/modeling_llama.py`, where the `position_ids` does not change as the `cache_position` increases, so a shape inconsistency occurs. Is there any way to successfully feed customized position IDs to the model?
Thanks!
Who can help?
No response
Information

Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction

Pass `position_ids` as one of the inputs to `model.generate()`.
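A minimal reproduction sketch (not verbatim from the issue; the checkpoint name is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello world, how are you?", return_tensors="pt")
seq_len = inputs["input_ids"].shape[1]
position_ids = torch.clamp(torch.arange(seq_len) - 1, min=0).unsqueeze(0)

# As reported in the issue, this raises a size mismatch error once the
# second token is generated, because position_ids is not updated along
# with cache_position.
model.generate(**inputs, position_ids=position_ids, max_new_tokens=10)
```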
Expected behavior
Generation should succeed with the custom `position_ids`; instead, a size mismatch error is raised.