Get Falcon 7B running E2E #1265
Conversation
Ping @ayerofieiev-tt, could you get someone to look at this one?
jmalone-tt left a comment:
These changes look reasonable. Happy to merge if tests pass (kicked off a run here: https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/19867910005)
jmalone-tt left a comment:
Looks like it still hits OOM in the latest run. Have you verified that it runs on your end?
https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/19867910005/job/57557971201
Sure. As mentioned in #1265 (comment), Falcon 7B is incompatible with load_params_once, so please add
We just merged another PR that was hitting the same issue. If you look at tests/models/mamba/test_mamba, you should see a new pytest fixture, load_params_once. Please rebase your changes and add that fixture; if tests pass, we should be good to merge.
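The load-parameters-once pattern described above can be sketched as a simple module-level cache. This is only a minimal illustration of the idea; the actual load_params_once fixture in tests/models/mamba/test_mamba may be implemented differently (e.g. registered via @pytest.fixture with session scope):

```python
# Minimal sketch of a "load parameters once" cache: model weights are
# loaded on first request and reused by later tests, instead of being
# reloaded (and duplicated in memory) for every test case. This is an
# illustration only, not the repo's actual fixture implementation.
_PARAM_CACHE = {}

def load_params_once(model_name, loader):
    """Return cached parameters for model_name, invoking loader only once."""
    if model_name not in _PARAM_CACHE:
        _PARAM_CACHE[model_name] = loader()
    return _PARAM_CACHE[model_name]
```

For a model like Falcon 7B that is incompatible with this caching, a test would skip the cache and load its parameters directly, which is what the disable option in this PR provides.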
* Add a disable_load_params_once option for model testing, since Falcon 7B
is incompatible with load_params_once, which causes OOM.
* Convert the following torch ops to ttnn ops:
{
  "op_name": "aten.squeeze.default",
  "op_schema": "Tensor<[1, 7]> self = ?"
},
{
  "op_name": "aten.native_layer_norm.default",
  "op_schema": "Tensor<[1, 7, 4544]> input = ?, List[int] normalized_shape = [4544], Optional[Tensor]<[4544]> weight = ?, Optional[Tensor]<[4544]> bias = ?, float eps = 1e-05"
},
{
  "op_name": "aten.index.Tensor",
  "op_schema": "Tensor<[1, 7, 73, 64]> self = ?, List[Optional[Tensor]] indices = [None, None, _folded_lift_fresh_copy]"
},
{
  "op_name": "aten.mul.Scalar",
  "op_schema": "Tensor<[1, 71, 7, 64]> self = ?, number other = 0.3535533905932738"
},
* Enable batch_size 32 for Falcon 7B E2E testing
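For context, the aten ops listed above correspond to ordinary torch calls like the following. This is only an illustration of what the traced graph contains, with shapes taken from the op schemas; it is not the conversion code itself:

```python
import torch

# aten.squeeze.default: drop size-1 dims from a [1, 7] tensor.
x = torch.randn(1, 7)
squeezed = x.squeeze()  # shape [7]

# aten.native_layer_norm.default: layer norm over the hidden dim (4544).
h = torch.randn(1, 7, 4544)
w = torch.ones(4544)
b = torch.zeros(4544)
normed = torch.nn.functional.layer_norm(h, [4544], w, b, eps=1e-05)

# aten.index.Tensor: advanced indexing along one dim with a tensor of
# indices (here a stand-in for the folded lift_fresh_copy constant).
rot = torch.randn(1, 7, 73, 64)
idx = torch.arange(7)
gathered = rot[:, :, idx]  # shape [1, 7, 7, 64]

# aten.mul.Scalar: scale a tensor by a constant scalar.
q = torch.randn(1, 71, 7, 64)
scaled = q * 0.3535533905932738
```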
This reverts commit e838c7f.
Ticket
#1045
Problem description
See the details in the above ticket.
What's changed
OOM issue:
As mentioned by @kevinwuTT in another issue, Falcon 7B is incompatible with load_params_once. After load_params_once is disabled, the OOM issue disappears for both batch size 1 and batch size 32.