Refactor pixtal/loader.py and fix load_inputs function #402

Open

devisettymahidhar608 wants to merge 1 commit into main from devisettym/pixtral_reformated

Conversation

@devisettymahidhar608
Contributor

@devisettymahidhar608 devisettymahidhar608 commented Jan 13, 2026

Ticket

Link to GitHub Issue

Problem description

Debug PCC drop in the model

What's changed

Inspecting the load_inputs function shows that it returns only two keys: input_ids and attention_mask.

While running the model from Hugging Face in Google Colab, I observed a difference in the input keys being passed to the model.

There, the following keys are sent to the model: ['input_ids', 'attention_mask', 'pixel_values', 'image_sizes'].

loader.py only sends input_ids and attention_mask; pixel_values and image_sizes are missing.

I edited loader.py so that the full set of inputs is passed to the model.
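
For illustration, a minimal sketch of the corrected load_inputs, assuming the standard Hugging Face AutoProcessor API for Pixtral; the checkpoint id, prompt string, and image URL below are placeholders, not the repo's actual values:

```python
from transformers import AutoProcessor
from PIL import Image
import requests

def load_inputs():
    # Placeholder checkpoint id; the repo's loader may use a different one.
    processor = AutoProcessor.from_pretrained("mistral-community/pixtral-12b")

    url = "https://example.com/image.jpg"  # placeholder image
    image = Image.open(requests.get(url, stream=True).raw)
    prompt = "<s>[INST][IMG]Describe this image.[/INST]"  # assumed Pixtral chat format

    # Returning the full processor output preserves pixel_values and
    # image_sizes instead of forwarding only input_ids and attention_mask.
    return processor(text=prompt, images=[image], return_tensors="pt")
```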

While running the Mistral/pixtral model, I encountered the following error: loc("set-dimension-size.60"): Shardy propagation only supports ranked tensors with a static shape. The tensor type causing the issue is tensor<…>. Full log: pixtral.log

The issue originates from the masked_scatter op; see sanity.log.
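
For context, a toy sketch (my own illustration, not the model's actual code) of why masked_scatter introduces data-dependent shapes, with a static-shape torch.where formulation of the same splice:

```python
import torch

# Text embeddings (batch, seq, hidden) with [IMG] placeholder positions.
embeds = torch.zeros(1, 6, 4)
image_feats = torch.ones(3, 4)  # one embedding per image patch
mask = torch.tensor([[False, True, True, True, False, False]])

# masked_scatter copies mask.sum() rows from image_feats: how much data it
# consumes depends on the mask's contents, which a compiler sees as a
# dynamic shape.
dynamic = embeds.masked_scatter(mask.unsqueeze(-1), image_feats)

# Static-shape alternative: pre-place the image features into a tensor
# shaped like embeds, then select with torch.where, whose output shape
# never depends on the data.
padded = torch.zeros_like(embeds)
padded[mask] = image_feats  # a real static pipeline would place these with fixed indices
static = torch.where(mask.unsqueeze(-1), padded, embeds)
assert torch.equal(dynamic, static)
```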

Checklist

  • New/Existing tests provide coverage for changes

@devisettymahidhar608 devisettymahidhar608 changed the title Reformatted pixtal/loader.py and fixed the load_inputs function Refactor pixtal/loader.py and fix load_inputs function Jan 13, 2026
@devisettymahidhar608 devisettymahidhar608 marked this pull request as ready for review January 14, 2026 08:17
@devisettymahidhar608 devisettymahidhar608 marked this pull request as draft January 14, 2026 15:18
@devisettymahidhar608 devisettymahidhar608 force-pushed the devisettym/pixtral_reformated branch from ee0b92d to 0785910 on January 21, 2026 11:57
@sonalibaskaran2499 sonalibaskaran2499 marked this pull request as ready for review January 21, 2026 13:50
@devisettymahidhar608 devisettymahidhar608 force-pushed the devisettym/pixtral_reformated branch from 0785910 to 4a79156 on January 27, 2026 04:55
@kmabeeTT
Contributor

Adding @AleksKnezevic. Thanks for the changes, @devisettymahidhar608. I'd like some input from Aleks on what to do here: this model currently runs (though with low PCC lately, and an incorrect loader.py). Do we merge these changes, which would stop the model from running e2e until the issue you opened (tenstorrent/tt-xla#2924) gets assigned and debugged?

Side note: we should hold off a bit on merging this until the recent tt-forge-models uplift issues (a week out of date in tt-xla) get resolved.

@AleksKnezevic
Contributor

Commented on the other issue: we can't support any dynamic shapes through the compiler right now. We'll need to debug why we're seeing this dynamism.

@devisettymahidhar608 devisettymahidhar608 force-pushed the devisettym/pixtral_reformated branch 2 times, most recently from 3684d4a to 90a3dad on February 4, 2026 09:19
@devisettymahidhar608 devisettymahidhar608 force-pushed the devisettym/pixtral_reformated branch from 90a3dad to 0eb2335 on February 5, 2026 14:28