Conversation
marbre
left a comment
GitHub Actions logs will expire, and there are no issues filed providing further details. What is the plan to fix these test failures? The plan cannot be to just skip tests.
| "test_forward_nn_CTCLoss_cuda_float32", | ||
| ], | ||
| "export": [ | ||
| # TestExportOnFakeCudaCUDA - subprocess import fails: missing librocm_sysdeps_liblzma.so.5 |
This rather sounds like a bug that should be fixed.
| "test_compile_standalone_cos", | ||
| "test_compile_with_exporter", | ||
| "test_compile_with_exporter_weights", | ||
| #Also failed on https://github.com/ROCm/TheRock/actions/runs/24898379109 |
Logs will expire, so this needs an issue to track.
],
"inductor": [
    # inductor/test_aot_inductor_package: AOTI C++ package tests need
    # more complete CMake/runtime library-path handling in the wheel CI lane.
As the comment itself says this needs proper handling in the wheel, why are the tests skipped instead of aiming for a fix?
    # Passed in https://github.com/ROCm/TheRock/actions/runs/24898379109
    "test_return_aux_deprecation_warnings_cuda_float16",
If this is passing, why is it on the exclude list? As above, logs will expire and this needs a proper issue.
@marbre -- We are trying to get to a green state for full PyTorch UT testing and then tackle the failures one by one by root-causing them. With so many failures currently in full PyTorch testing, and with ROCm constantly moving, it has been hard to achieve stability. So yes, this is a hammer. But we need to get this in and then start working on unskipping tests; @rraminen has a task on her scrum board to root-cause each of these skips.
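One way to address the log-expiry concern raised above is to make the skip list self-documenting by pairing each skipped test with its tracking issue. The sketch below is hypothetical (the structure, `SKIP_LIST` name, helper function, and placeholder issue URLs are not the repository's actual convention); it only illustrates how the reason for a skip can survive after CI logs are gone.

```python
# Hypothetical sketch: keep a tracking-issue URL next to every skipped test,
# so the justification outlives expiring GitHub Actions logs.
# Names, structure, and URLs here are illustrative placeholders.
SKIP_LIST = {
    "export": [
        # (test name, tracking issue URL)
        ("test_compile_standalone_cos", "https://github.com/ROCm/TheRock/issues/0"),
        ("test_compile_with_exporter", "https://github.com/ROCm/TheRock/issues/0"),
    ],
}

def skipped_tests(suite):
    """Return just the test names for a suite, for the test runner to consume."""
    return [name for name, _issue in SKIP_LIST.get(suite, [])]
```

With this shape, a periodic CI job could also walk `SKIP_LIST` and fail if any referenced issue has been closed, forcing stale skips to be removed.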
Motivation