Open
odjuricicTT
requested changes
Jan 26, 2026
Collaborator
odjuricicTT
left a comment
Let's explore if there is a cleaner way to do this first.
benchmark/tt-xla/utils.py
Outdated
# TODO(vkovacevic): Issue #804
def patch_transformers_for_eager_attn(cls):
Collaborator
Is this the only way? Is it not possible to pass this as a param somewhere when loading the model?
Contributor
Author
For ViT and BERT we could pass attn_implementation="eager" in tt_forge_models here.
For BGE-M3 I think we need to monkey patch, since loading is done internally in the FlagEmbedding lib.
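To make the monkey-patching option concrete, here is a minimal sketch of the general technique: wrapping a class's `from_pretrained` so that `attn_implementation="eager"` is filled in when the caller does not set it. The helper `force_eager_attn` and the `StubModel` class are hypothetical names used only for illustration. This is not the body of `patch_transformers_for_eager_attn` from the PR, and it does not reflect how FlagEmbedding loads models internally.

```python
import functools


def force_eager_attn(cls):
    """Wrap cls.from_pretrained so attn_implementation defaults to "eager".

    Hypothetical helper for illustration; not the PR's actual patch.
    """
    # Plain function behind the classmethod, so it can be re-wrapped.
    original = cls.from_pretrained.__func__

    @functools.wraps(original)
    def patched(klass, *args, **kwargs):
        # Respect an explicit caller choice; only fill in the default.
        kwargs.setdefault("attn_implementation", "eager")
        return original(klass, *args, **kwargs)

    cls.from_pretrained = classmethod(patched)
    return cls


class StubModel:
    """Stand-in for a model class that a third-party library loads internally."""

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        # Return the received arguments instead of loading real weights.
        return {"name": name, **kwargs}


force_eager_attn(StubModel)
loaded = StubModel.from_pretrained("stub-checkpoint")
explicit = StubModel.from_pretrained("stub-checkpoint", attn_implementation="sdpa")
```

Because the wrapper uses `setdefault`, code that already passes its own `attn_implementation` keeps that value. The trade-off is that the patch is global to the class, which is presumably why a parameter-based approach was preferred where it is available.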
3e99519 to
568f090
Compare
Contributor
Author
Updated as discussed offline, @odjuricicTT.
568f090 to
6344b6a
Compare
vvukomanTT
approved these changes
Feb 4, 2026
nsumrakTT
approved these changes
Feb 4, 2026
383079f to
b4cf87f
Compare
vmilosevic
approved these changes
Feb 5, 2026
Closes #812
Closes #813
Closes #814
Long-term solution #821
Description
After the transformers uplift from 4.52.4 to 4.57.1, performance dropped significantly for ViT, BGE-M3-Encode, and BERT (sentence embedding).
What's changed
Uplifted third_party/tt_forge_models to include changes that allow passing **kwargs when loading a model.
Set attn_implementation="eager" for ViT and BERT.
BGE-M3-Encode is a special case that requires monkey patching, so it is skipped for now.
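As a rough illustration of the **kwargs change described above, the sketch below shows a loader that forwards caller-supplied options to the underlying load function. The `Loader` class, its `load_model` method, and `fake_from_pretrained` are assumptions made for this example; the real tt_forge_models interface may look different.

```python
from typing import Any, Callable


class Loader:
    """Hypothetical loader; the real tt_forge_models interface may differ."""

    def __init__(self, checkpoint: str, from_pretrained: Callable[..., Any]):
        self.checkpoint = checkpoint
        self._from_pretrained = from_pretrained

    def load_model(self, **kwargs: Any) -> Any:
        # Forward caller-supplied options (e.g. attn_implementation) unchanged.
        return self._from_pretrained(self.checkpoint, **kwargs)


def fake_from_pretrained(checkpoint: str, **kwargs: Any) -> dict:
    """Stub that records what it received instead of downloading weights."""
    return {"checkpoint": checkpoint, **kwargs}


loader = Loader("stub/vit", fake_from_pretrained)
model = loader.load_model(attn_implementation="eager")
```

With transformers installed, the stub could be swapped for `AutoModel.from_pretrained`, which accepts `attn_implementation` as a keyword argument. Keeping the option at the call site means each benchmark can opt in without a global patch.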