
Conversation

@lanchongyizu commented Nov 2, 2025

Ticket

#1045

Problem description

See the details in the above ticket.

What's changed

* Add a `disable_load_params_once` option for model testing, since Falcon 7B
  is incompatible with `load_params_once`, which causes OOM.
* Convert the following torch ops to ttnn ops:

  ```json
  [
    {
      "op_name": "aten.squeeze.default",
      "op_schema": "Tensor<[1, 7]> self = ?"
    },
    {
      "op_name": "aten.native_layer_norm.default",
      "op_schema": "Tensor<[1, 7, 4544]> input = ?, List[int] normalized_shape = [4544], Optional[Tensor]<[4544]> weight = ?, Optional[Tensor]<[4544]> bias = ?, float eps = 1e-05"
    },
    {
      "op_name": "aten.index.Tensor",
      "op_schema": "Tensor<[1, 7, 73, 64]> self = ?, List[Optional[Tensor]] indices = [None, None, _folded_lift_fresh_copy]"
    },
    {
      "op_name": "aten.mul.Scalar",
      "op_schema": "Tensor<[1, 71, 7, 64]> self = ?, number other = 0.3535533905932738"
    }
  ]
  ```

* Enable `batch_size` 32 for Falcon 7B E2E testing
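Conceptually, each conversion above is a lookup from an aten op name to a ttnn counterpart applied during graph lowering. A minimal pure-Python sketch of that mapping; note the ttnn target names here are illustrative assumptions, not the project's actual lowering tables:

```python
# Hypothetical aten -> ttnn op-name mapping. The real conversions live in
# torch_ttnn's lowering passes; the ttnn names below are assumptions
# chosen to illustrate the lookup, not verified against the converter.
ATEN_TO_TTNN = {
    "aten.squeeze.default": "ttnn.squeeze",
    "aten.native_layer_norm.default": "ttnn.layer_norm",
    "aten.index.Tensor": "ttnn.embedding",  # placeholder target, for illustration
    "aten.mul.Scalar": "ttnn.multiply",
}

def lower_op(op_name: str) -> str:
    """Return the ttnn op a given aten op lowers to, or raise if unsupported."""
    try:
        return ATEN_TO_TTNN[op_name]
    except KeyError:
        raise NotImplementedError(f"no ttnn lowering for {op_name}") from None
```

An op left out of the table stays as a torch op (a "Torch Ops Remain" entry in the results below), which is why driving this count to 0 is the goal of the PR.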

OOM issue:
As mentioned by @kevinwuTT in another issue, Falcon 7B is incompatible with load_params_once. After load_params_once is disabled, the OOM issue disappears for both batch sizes 1 and 32.
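One plausible mechanism for the incompatibility (our assumption, not a diagnosis confirmed in the thread): caching the weights on device across runs can leave a second copy of the ~7B parameters resident alongside the copy a compiled run materializes. A toy memory model makes the arithmetic concrete, with invented sizes:

```python
# Toy model of peak device memory. This is a sketch under an assumed
# mechanism (a cached weight copy coexisting with the run's own copy);
# the real load_params_once behavior lives in the test harness.
def peak_device_bytes(param_bytes: int, activation_bytes: int,
                      load_params_once: bool) -> int:
    """Return assumed peak device bytes for one inference run."""
    # With caching on, two weight copies coexist at peak; with it off,
    # only the run's own copy plus activations.
    weight_copies = 2 if load_params_once else 1
    return weight_copies * param_bytes + activation_bytes
```

With ~14 GB of bf16 weights and a couple of GB of activations, the cached variant would peak around 30 GB versus 16 GB without caching, enough to tip a device over its limit regardless of batch size.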

| Model | Status | Batch | Compiled First Run (ms) | Original Throughput (Inferences Per Second) | Compiled Throughput (Inferences Per Second) | Accuracy (%) | Torch Ops Before (Unique Ops) | Torch Ops Remain (Unique Ops) | To/From Device Ops |
|---|---|---|---|---|---|---|---|---|---|
| Falcon-7B | | 1 | 44009.8 | 0.384783 | 0.13878 | 98 | 2600 (27) | 0 (0) | 193 |
| Falcon-7B | | 32 | 50098.8 | 5.07062 | 4.27947 | 98 | 2696 (27) | 0 (0) | 193 |

@lanchongyizu force-pushed the falcon_7b branch 5 times, most recently from 6b12a16 to be97d01 on November 2, 2025 at 13:01
@marty1885 commented Nov 7, 2025

Ping @ayerofieiev-tt, could you get someone to look at this one?

@marty1885 marty1885 linked an issue Nov 18, 2025 that may be closed by this pull request
@jmalone-tt (Collaborator) left a comment

These changes look reasonable. Happy to merge if tests pass (kicked off a run here: https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/19867910005)

@jmalone-tt (Collaborator) left a comment

Looks like it still hits OOM in the latest run. Have you verified that it runs on your end?
https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/19867910005/job/57557971201

@lanchongyizu (Author) replied

> Looks like it still hits OOM in the latest run. Have you verified that it runs on your end? https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/19867910005/job/57557971201

Sure. As mentioned in #1265 (comment), Falcon 7B is incompatible with `load_params_once`, so please add `--disable_load_params_once` to the pytest command line, as shown below.

```shell
python3 -m pytest --github-report tests/models/falcon/test_falcon.py --disable_load_params_once --report_nth_iteration=$num_iterations --export_code=accuracy --splits 57 --group 10 -s
```

@jmalone-tt (Collaborator) commented

We just merged another PR that was hitting the same issue. If you look at tests/models/mamba/test_mamba, you should see a new pytest fixture, load_params_once. Please rebase your changes and add that fixture; if tests pass, we should be good to merge.
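A minimal sketch of what such a fixture override might look like; this is hypothetical, and the real fixture shape should be copied from tests/models/mamba/test_mamba when rebasing:

```python
import pytest

# Hypothetical per-model override of the load_params_once fixture.
# Mirror the actual fixture in tests/models/mamba/test_mamba; the name
# comes from the thread, but this return-a-flag shape is an assumption.
@pytest.fixture
def load_params_once():
    # Reload parameters on every run instead of caching them on device,
    # sidestepping the Falcon 7B OOM.
    return False
```

Defining the fixture in the Falcon test module would shadow any repo-wide default for just these tests, so other models keep the caching behavior.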



Development

Successfully merging this pull request may close these issues.

[Bounty $1500] Get Falcon running E2E

3 participants