Open
Description
🐛 Describe the bug
Once test_tril hits UR_RESULT_ERROR_DEVICE_LOST, all subsequent test cases fail as well.
Cases:
op_regression,third_party.torch-xpu-ops.test.regressions.test_tril.TestSimpleBinary,test_tril
__________________________ TestSimpleBinary.test_tril __________________________
[gw7] linux -- Python 3.12.12 /__w/torch-xpu-ops/torch-xpu-ops/.venv/bin/python
Traceback (most recent call last):
  File "/__w/torch-xpu-ops/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/regressions/test_tril.py", line 13, in test_tril
    torch.xpu.synchronize()
  File "/__w/torch-xpu-ops/torch-xpu-ops/.venv/lib/python3.12/site-packages/torch/xpu/__init__.py", line 451, in synchronize
    return torch._C._xpu_synchronize(device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: level_zero backend failed with error: 20 (UR_RESULT_ERROR_DEVICE_LOST)
To execute this test, run the following from the base repo dir:
python test/regressions/test_tril.py TestSimpleBinary.test_tril
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
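To separate the original failure from the follow-on failures described above, each test can be run in its own process and the output checked for the device-lost error. The following is a minimal triage sketch, not part of the test suite: `classify` and `run_isolated` are hypothetical helper names, and the sketch assumes a fresh process gets a usable device again, which may not hold if the device needs a driver reset.

```python
import subprocess
import sys

# Error string taken from the traceback in this report.
DEVICE_LOST = "UR_RESULT_ERROR_DEVICE_LOST"


def classify(returncode, output):
    """Label one isolated run as 'passed', 'device_lost', or 'failed'."""
    if returncode == 0:
        return "passed"
    if DEVICE_LOST in output:
        return "device_lost"
    return "failed"


def run_isolated(cmd, timeout=600):
    """Run one test command in a fresh process and classify its result."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # tracebacks go to stderr, so merge them
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return classify(proc.returncode, proc.stdout)


if __name__ == "__main__":
    # Repro command from the log above; run from the base repo dir.
    cmd = [sys.executable, "test/regressions/test_tril.py", "TestSimpleBinary.test_tril"]
    print(run_isolated(cmd))
```

If test_tril reports `device_lost` in isolation while later cases pass in their own processes, the later failures are likely a cascade from the lost device rather than independent bugs.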
Versions
BMG B60