@vtangTT
Contributor

@vtangTT vtangTT commented Nov 21, 2025

Ticket

#5232
#5918

Problem description

Provide context for the problem.

What's changed

  • jit wheel now packages only the necessary mlir + jit libs, then patches their rpath so the dependent metal libs can be found
    • mlir + jit libs include:
      • libJITCPP.so - JIT src lib (this is just the JitCache)
      • _ttnn_jit.cpython-311-x86_64-linux-gnu.so - JIT bindings
      • _ttmlir.cpython-311-x86_64-linux-gnu.so - MLIR bindings
      • libTTMLIRRuntime.so - MLIR runtime bindings
        • the metal libs this lib links against are not found until the rpath is patched and the lib is installed in an expected location
    • rpath patches include:
      • "$ORIGIN/../../ttnn/build/lib" - points to site-packages/ttnn/build/lib, which is where the ttnn libs are installed by the wheel
      • "$ORIGIN/../../../../../../build/lib" - points to ${TT_METAL_HOME}/build/lib, which assumes you are running the metal dev venv (created using create_venv.sh)
  • cmake --build build now builds and installs the JIT wheel into the venv by default, instead of copying files into build/python_packages
    • also added a new target: cmake --build build -- ttnn-jit does the same
  • CI will run all tests with the JIT wheel (just like how ttrt does it)
    • removed ttnn-jit from call-build-wheels.yml
    • moved jit tests to new test script: ttnn_jit.sh
    • removed ttnn-jit.sh build script since it's no longer needed
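
To illustrate how the two rpath patches above are meant to resolve, here is a minimal sketch that expands $ORIGIN the way the dynamic loader does (to the directory containing the library) and normalizes the result. The install location, the venv name, and the ttnn_jit/runtime subdirectory are assumptions chosen only so that the library sits two levels below site-packages, as the first rpath entry implies; they are not taken from this PR. The real loader does not collapse ".." lexically, so results can differ when symlinks are involved.

```python
import posixpath


def resolve_rpath_entry(lib_path: str, entry: str) -> str:
    """Expand $ORIGIN to the library's directory and normalize the path."""
    origin = posixpath.dirname(lib_path)
    expanded = entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
    return posixpath.normpath(expanded)


# Hypothetical install location inside a metal dev venv (assumption, not from the PR).
LIB = (
    "/home/dev/tt-metal/python_env/lib/python3.10/site-packages/"
    "ttnn_jit/runtime/libTTMLIRRuntime.so"
)

# The two rpath entries listed in the PR description.
RPATH_ENTRIES = [
    "$ORIGIN/../../ttnn/build/lib",
    "$ORIGIN/../../../../../../build/lib",
]


def resolved_search_dirs(lib_path: str = LIB, entries=RPATH_ENTRIES):
    """Return the directories the patched rpath would search, in order."""
    return [resolve_rpath_entry(lib_path, entry) for entry in entries]


if __name__ == "__main__":
    for directory in resolved_search_dirs():
        print(directory)
```

Under these assumptions the first entry lands in site-packages/ttnn/build/lib and the second, six levels up from the library's directory, lands in the tt-metal checkout's build/lib.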

Important Note

  • tt-metal runs in a python3.10 env while tt-mlir runs in a python3.11 env :(
  • metal devs will have to rebuild everything (including the venv with llvm) to get a python3.10-compatible build
  • we are testing the 3.11 wheel in our CI, which leaves potential gaps for 3.10-specific problems
  • the solution is probably to push for a manylinux wheel and pypi
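
The version mismatch above shows up in the wheel's filename: a cp311 wheel does not install into a python3.10 env. As a rough sketch, the check below reads the Python tag from a wheel filename (the third field from the end in the standard wheel naming scheme) and compares it with an interpreter version. The wheel filename used here is hypothetical, and the check only covers CPython tags; it is not how pip decides compatibility in full.

```python
import sys


def wheel_python_tags(wheel_filename: str) -> list[str]:
    """Return the Python tag(s) from a wheel filename.

    Wheel names follow:
    {distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
    so the Python tag is always the third field from the end.
    """
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_filename}")
    parts = wheel_filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {wheel_filename}")
    # Compressed tag sets such as "cp310.cp311" are separated by dots.
    return parts[-3].split(".")


def matches_interpreter(wheel_filename: str, version_info=sys.version_info) -> bool:
    """True if the wheel carries a CPython tag for the given major.minor version."""
    wanted = f"cp{version_info[0]}{version_info[1]}"
    return wanted in wheel_python_tags(wheel_filename)


# Hypothetical filename for illustration only (assumption, not from the PR).
EXAMPLE_WHEEL = "ttnn_jit-0.1.0-cp311-cp311-linux_x86_64.whl"

if __name__ == "__main__":
    print(matches_interpreter(EXAMPLE_WHEEL, (3, 11)))
    print(matches_interpreter(EXAMPLE_WHEEL, (3, 10)))
```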

Checklist

@vtangTT vtangTT changed the title [ttnn.jit] wheel [ttnn.jit] new wheel Nov 21, 2025
@codecov-commenter

codecov-commenter commented Nov 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.35%. Comparing base (becbe0a) to head (c43fb96).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5983   +/-   ##
=======================================
  Coverage   69.34%   69.35%           
=======================================
  Files         334      334           
  Lines       50999    50999           
=======================================
+ Hits        35367    35371    +4     
+ Misses      15632    15628    -4     


@vtangTT vtangTT marked this pull request as ready for review November 21, 2025 19:01
@vtangTT vtangTT requested review from a team, arminaleTT and xanderchin as code owners November 21, 2025 19:01
Contributor

@nsumrakTT nsumrakTT left a comment

So we are building ttnn-jit wheel in every release build configuration. Then, on each test machine we download and install it.

  • Do we need ttnn-jit wheel for each test?
  • Do we need ttnn-jit for each configuration?
  • Do we need the ttnn-jit wheel for the ttnn-jit tests at all? The wheel is a nice feature for release to users, but for CI I don't see why it has to be built, uploaded and downloaded when the tests can run from source.
  • Is the wheel tested for release to the end user? Do we support manylinux, as pykernel and tt-mlir (and the ttnn wheel in metal) do?

@vtangTT
Contributor Author

vtangTT commented Nov 24, 2025

* Do we need ttnn-jit wheel for each test?

No, only for the ttnn-jit tests.

* Do we need ttnn-jit for each configuration?

No, only for the tracy build config, in which we run the jit tests.

To this end, I will try to isolate the wheel download/install step to only the jit tests, if possible.

* Do we need the ttnn-jit wheel for the ttnn-jit tests at all? The wheel is a nice feature for release to users, but for CI I don't see why it has to be built, uploaded and downloaded when the tests can run from source.

Yes, we want to run it on CI so devs are working as close to the production workflow as possible. I don't think this blows up build times either.

* Is the wheel tested for release to the end user? Do we support manylinux, as pykernel and tt-mlir (and the ttnn wheel in metal) do?

To test the wheel for release to users, we'd need a representative metal dev env in python3.10, which currently doesn't seem possible on CI. We could also maybe have a flow that tests the wheel release in metal in a python3.11 env.
No manylinux support yet, but that is the next step forward.

@nsumrakTT
Contributor

@vtangTT Thanks for the explanation.
Please confine both the build and test changes to shell scripts (not workflow files) so it is clear what the ttnn-jit component requires.

@vtangTT vtangTT force-pushed the vtangTT/jit-whl branch 2 times, most recently from 82c8899 to e5044f9 Compare November 24, 2025 18:12
@vtangTT vtangTT requested a review from nsumrakTT November 24, 2025 19:39
@vtangTT vtangTT requested a review from a team as a code owner November 25, 2025 05:37
@vtangTT vtangTT force-pushed the vtangTT/jit-whl branch 3 times, most recently from a36b8b8 to 0dbce0b Compare November 25, 2025 05:53
Contributor

@nsumrakTT nsumrakTT left a comment

Great, thank you!

BUILD_RPATH "$ORIGIN;$<TARGET_FILE_DIR:TTMLIRRuntime>"
BUILD_WITH_INSTALL_RPATH FALSE
)
endif()
Contributor

I see what you mean about changing the runtime rpath properties based on whether TTMLIR_ENABLE_TTNN_JIT is set, but this enables such a drastic cleanup of setup.py that I think it's still worth it.

@vtangTT vtangTT force-pushed the vtangTT/jit-whl branch 2 times, most recently from 16a2f65 to f7aa536 Compare November 25, 2025 23:14
{ "runs-on": "n150", "image": "speedy", "script": "ttnn_standalone.sh" },
{ "runs-on": "n150", "image": "tracy", "script": "pykernel.sh" },
{ "runs-on": "n150", "image": "tracy", "script": "ttnn_jit.sh" },
{ "runs-on": "n150", "image": "speedy", "script": "alchemist.sh" },
Contributor

@vtangTT can we change this to

{ "runs-on": "n150", "image": "tracy", "script": "pykernel.sh" },
{ "runs-on": ["n150", "llmbox"], "image": "tracy", "script": "ttnn_jit.sh" },

so that the jit tests stay enabled on t3k/llmbox?
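
For readers unfamiliar with the list form of "runs-on", a minimal sketch of one way such an entry could be expanded into one job per runner is given below. This assumes that a list means "run the same image and script on each listed runner"; how the actual workflow consumes the matrix is not shown in this thread, and expand_runners is a hypothetical helper.

```python
def expand_runners(entries):
    """Expand matrix entries so each job targets exactly one runner.

    "runs-on" may be a single runner name or a list of names (assumption:
    a list fans out to one job per runner with the same image and script).
    """
    jobs = []
    for entry in entries:
        runners = entry["runs-on"]
        if isinstance(runners, str):
            runners = [runners]
        for runner in runners:
            # Copy the entry, overriding "runs-on" with the single runner.
            jobs.append({**entry, "runs-on": runner})
    return jobs


# The two entries from the suggestion above.
MATRIX = [
    {"runs-on": "n150", "image": "tracy", "script": "pykernel.sh"},
    {"runs-on": ["n150", "llmbox"], "image": "tracy", "script": "ttnn_jit.sh"},
]

if __name__ == "__main__":
    for job in expand_runners(MATRIX):
        print(job["runs-on"], job["image"], job["script"])
```

With this reading, ttnn_jit.sh runs on both n150 and llmbox while pykernel.sh stays on n150 only.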

Contributor Author

Yep, I see the merge conflict.

Contributor Author

fixed.

7 participants