Fix BERT and ViT perf by vkovacevicTT · Pull Request #822 · tenstorrent/tt-forge

vkovacevicTT · 2026-01-19T17:17:27Z

Closes #812
Closes #813
Closes #814

Long-term solution: #821

Description

After the transformers uplift from 4.52.4 to 4.57.1, we saw a significant perf drop in ViT, BGE-M3-Encode, and BERT for sentence embedding.

What's changed

Uplifted third_party/tt_forge_models to include changes that allow passing **kwargs when loading a model.
Set attn_implementation="eager" for ViT and BERT.

BGE-M3-Encode is a special case that requires monkey patching, so it is skipped for now.
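A minimal sketch of the kwargs-forwarding idea described above, for illustration only. `ModelLoader`, `load_model`, `fake_from_pretrained`, and the model name are hypothetical stand-ins, not the actual tt_forge_models code; in real use the injected callable would be something like `transformers.AutoModel.from_pretrained`, which accepts `attn_implementation`.

```python
from typing import Any, Callable


class ModelLoader:
    """Hypothetical loader that forwards caller-supplied kwargs to the model factory."""

    def __init__(self, model_name: str, from_pretrained: Callable[..., Any]):
        self.model_name = model_name
        # Assumption: the factory has a from_pretrained-style signature
        # (name first, then keyword arguments).
        self._from_pretrained = from_pretrained

    def load_model(self, **kwargs: Any) -> Any:
        # Forward everything untouched, so a benchmark can opt in to
        # attn_implementation="eager" without the loader knowing about it.
        return self._from_pretrained(self.model_name, **kwargs)


def fake_from_pretrained(model_name: str, **kwargs: Any) -> dict:
    """Stand-in factory that records what it was called with (no download)."""
    return {"model_name": model_name, **kwargs}


loader = ModelLoader("vit-placeholder", fake_from_pretrained)
model = loader.load_model(attn_implementation="eager")
```

With this shape, the attention implementation is chosen at the call site, and loaders that do not need it are unaffected.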

odjuricicTT

Let's explore if there is a cleaner way to do this first.

odjuricicTT · 2026-01-26T15:04:59Z

benchmark/tt-xla/utils.py

+
+
+# TODO(vkovacevic): Issue #804
+def patch_transformers_for_eager_attn(cls):


Is this the only way? Is it not possible to pass this as a param somewhere when loading the model?

For ViT and BERT we could pass attn_implementation="eager" in tt_forge_models here.

For BGE-M3, I think we need to monkey patch, since loading is done internally in the FlagEmbedding library.
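For context, a rough sketch of what such a monkey patch could look like when the loading call is buried inside a third-party library. This is an assumption about the general technique, not the body of `patch_transformers_for_eager_attn` from this PR; `force_eager_attn` and `DummyModel` are hypothetical names used only for illustration.

```python
def force_eager_attn(cls):
    """Wrap cls.from_pretrained so attn_implementation defaults to "eager"."""
    # Grab the plain function behind the classmethod so subclasses still
    # receive themselves as the first argument.
    original = cls.from_pretrained.__func__

    def patched(klass, *args, **kwargs):
        # Only fill in a default; an explicit caller choice is respected.
        kwargs.setdefault("attn_implementation", "eager")
        return original(klass, *args, **kwargs)

    cls.from_pretrained = classmethod(patched)
    return cls


class DummyModel:
    """Stand-in for a model class whose from_pretrained is called internally."""

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        return {"cls": cls.__name__, "name": name, **kwargs}


force_eager_attn(DummyModel)
result = DummyModel.from_pretrained("dummy")
```

The patch changes global behavior for every caller of the patched class, which is why passing the parameter explicitly is preferable wherever the loading code is under our control.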

vkovacevicTT · 2026-02-03T16:26:01Z

Run: https://github.com/tenstorrent/tt-forge/actions/runs/21638859015

vkovacevicTT · 2026-02-03T16:26:39Z

Updated as discussed offline, @odjuricicTT.

vkovacevicTT requested review from odjuricicTT, rpavlovicTT and tt-mpantic as code owners January 19, 2026 17:17

odjuricicTT requested changes Jan 26, 2026

vkovacevicTT requested a review from odjuricicTT January 26, 2026 15:45

vkovacevicTT force-pushed the vkovacevic/attn_implementation branch from 3e99519 to 568f090 Compare February 3, 2026 15:05

vkovacevicTT changed the title ~~Set attn_implementation="eager" for ViT, BERT and BGE-M3~~ Fix BERT and ViT perf Feb 3, 2026

vkovacevicTT force-pushed the vkovacevic/attn_implementation branch from 568f090 to 6344b6a Compare February 3, 2026 16:45

vkovacevicTT mentioned this pull request Feb 3, 2026

ViT PCC drop #851

Open

vkovacevicTT requested review from nsumrakTT, vmilosevic and vvukomanTT as code owners February 3, 2026 17:27

vvukomanTT approved these changes Feb 4, 2026

nsumrakTT approved these changes Feb 4, 2026

vkovacevicTT added 3 commits February 4, 2026 18:00

Add attn_implementation="eager" to ViT and BERT

3fe77a9

Test with fix

a94247f

wip

b4cf87f

vkovacevicTT force-pushed the vkovacevic/attn_implementation branch from 383079f to b4cf87f Compare February 4, 2026 18:00

vmilosevic approved these changes Feb 5, 2026

vkovacevicTT wants to merge 3 commits into main from vkovacevic/attn_implementation
5 participants