🐛 Describe the bug
Picked up in #1367 and worked around via pytorch/pytorch#143236: it appears that the input to the torchchat AOTI runner is not 16-byte aligned.
While the PR from pytorch/pytorch eases this constraint, the misalignment may indicate potential perf losses, which are common with misaligned data.
Hat tip to @malfet for suggesting this line of investigation.
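
For anyone who wants to confirm the misalignment, here is a minimal sketch of a 16-byte alignment check. It is not taken from torchchat or the AOTI runner; `is_aligned` and `misalignment` are hypothetical helper names, and the demo uses a plain `ctypes` buffer rather than a real runner input. For a PyTorch tensor, the address to test would presumably be the value returned by `tensor.data_ptr()`.

```python
import ctypes

ALIGNMENT = 16  # bytes; the alignment the runner input appears to lack


def misalignment(address: int, alignment: int = ALIGNMENT) -> int:
    """Return how many bytes `address` sits past the previous aligned boundary."""
    # The bit trick below only works when alignment is a power of two.
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & (alignment - 1)


def is_aligned(address: int, alignment: int = ALIGNMENT) -> bool:
    """Return True if `address` is a multiple of `alignment`."""
    return misalignment(address, alignment) == 0


if __name__ == "__main__":
    # Stand-in for an input buffer (assumption: not the real runner input).
    buf = ctypes.create_string_buffer(64)
    base = ctypes.addressof(buf)

    # Round up to the next 16-byte boundary, then step 4 bytes past it
    # to produce one aligned and one deliberately misaligned address.
    aligned_addr = base + (-base % ALIGNMENT)
    skewed_addr = aligned_addr + 4

    print(hex(aligned_addr), is_aligned(aligned_addr))
    print(hex(skewed_addr), is_aligned(skewed_addr), misalignment(skewed_addr))
```

Logging the result of such a check on the input just before it is handed to the runner would show whether the buffer really starts off a 16-byte boundary, and by how many bytes.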
Versions