[CI][XPU] enable unit test for XPU device #2814
DiweiSun wants to merge 25 commits into pytorch:main
This reverts commit a736c41.
.github/workflows/pr-test-xpu.yml
Outdated
- name: Clean all stopped docker containers
  if: always()
  shell: bash
  run: |
    # Prune all stopped containers.
    # If other runner is pruning on this node, will skip.
    nprune=$(ps -ef | grep -c "docker container prune")
    if [[ $nprune -eq 1 ]]; then
      docker container prune -f
    fi

- name: Runner health check GPU count
  if: always()
  shell: bash
  run: |
    ngpu=$(timeout 30 clinfo -l | grep -c -E 'Device' || true)
    msg="Please file an issue on pytorch/ao reporting the faulty runner. Include a link to the runner logs so the runner can be identified"
    if [[ $ngpu -eq 0 ]]; then
      echo "Error: Failed to detect any GPUs on the runner"
      echo "$msg"
      exit 1
    fi

- name: Use following to pull public copy of the image
  id: print-ghcr-mirror
  shell: bash
  run: |
    echo "docker pull ${DOCKER_IMAGE}"
    docker pull ${DOCKER_IMAGE}
I think we also need all of the steps from https://github.com/pytorch/pytorch/blob/main/.github/workflows/_xpu-test.yml#L79-L114 in here.
Ported. Please help review.
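For reviewers, a minimal sketch of why the health check above ends its pipeline with `|| true`. GitHub Actions runs `shell: bash` steps with `-e` and `pipefail`, and `grep -c` prints `0` but exits with status 1 when nothing matches, so without the guard the step would abort before printing the "file an issue" message. The helper names `count_devices` and `check_gpu_count` are hypothetical and are not part of this PR; the input is assumed to be `clinfo -l`-style text on stdin.

```shell
# Abort on the first failing command, as GitHub Actions does for bash steps.
set -e

# count_devices (hypothetical helper): read `clinfo -l`-style text on stdin
# and print how many lines mention "Device". `grep -c` prints 0 but exits
# with status 1 on zero matches; `|| true` keeps `set -e` from aborting.
count_devices() {
  grep -c -E 'Device' || true
}

# check_gpu_count (hypothetical helper): return non-zero with a readable
# message when no device was counted, otherwise report the count.
check_gpu_count() {
  if [ "$1" -eq 0 ]; then
    echo "Error: Failed to detect any GPUs on the runner"
    return 1
  fi
  echo "Detected $1 GPU device(s)"
}
```

On a runner this could be wired up as `check_gpu_count "$(timeout 30 clinfo -l | count_devices)"`, assuming `clinfo` is installed. If `clinfo` is missing or times out, the count is still `0` and the friendly error is printed.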
    if-no-files-found: ignore
    path: ./**/core.[1-9]*

- name: Teardown XPU
We can reuse the action from pytorch directly.
Yes, this is ported directly from pytorch.
Co-authored-by: Wang, Chuanqi <chuanqi.wang@intel.com>
.github/workflows/pr-test-xpu.yml
Outdated
      - ciflow/xpu/*
  pull_request:
    branches:
      - main
Remove the `pull_request` trigger after review.
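Once the `pull_request` trigger is removed, the workflow would run only for refs matching `ciflow/xpu/*`. A rough sketch of which tag names that pattern accepts, assuming it sits under a `push: tags:` filter (the hunk above does not show the parent keys) and that tags look like `ciflow/xpu/<PR number>`. In GitHub's filter syntax `*` does not match `/`, whereas a shell `case` glob does, so the nested case is rejected explicitly. The function name `matches_xpu_tag` is hypothetical.

```shell
# matches_xpu_tag (hypothetical helper): succeed when the tag name in $1
# would be accepted by the filter pattern `ciflow/xpu/*`.
matches_xpu_tag() {
  case "$1" in
    # A further "/" after the prefix is rejected: the filter's `*` stops at "/".
    ciflow/xpu/*/*) return 1 ;;
    # Anything else directly under ciflow/xpu/ is accepted.
    ciflow/xpu/*) return 0 ;;
    # All other tag names are rejected.
    *) return 1 ;;
  esac
}
```

For example, `matches_xpu_tag ciflow/xpu/2814` succeeds, while `matches_xpu_tag ciflow/cuda/2814` does not.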
@pytorchbot label "ciflow/xpu"

Didn't find following labels among repository labels: ciflow/xpu

@pytorchbot label "ciflow/xpu"

Unknown label

@pytorchbot label "ciflow/xpu"

Warning: Unknown label
Please add the new label to .github/pytorch-probot.yml

Unknown label

@pytorchbot label "ciflow/xpu"

@pytorchbot label "ciflow/xpu"

@pytorchbot label "ciflow/xpu"

Unknown label

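The bot's warning points at `.github/pytorch-probot.yml`. A small sketch for checking locally whether a label is already listed there, assuming labels appear as plain, unquoted YAML list items such as `- ciflow/xpu` (in pytorch/pytorch they sit under a `ciflow_push_tags:` key; whether pytorch/ao uses the same layout is an assumption). The function name `has_ciflow_tag` is hypothetical, and the check is textual rather than a real YAML parse.

```shell
# has_ciflow_tag FILE LABEL (hypothetical helper): succeed when some line of
# FILE equals LABEL after stripping a leading "- " list marker and any
# surrounding whitespace. Quoted entries such as "- 'ciflow/xpu'" are not
# recognised by this sketch.
has_ciflow_tag() {
  sed -e 's/^[[:space:]]*-[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" |
    grep -qxF -- "$2"
}

# Example usage from the repository root:
#   has_ciflow_tag .github/pytorch-probot.yml ciflow/xpu && echo "label is registered"
```

If the check fails, adding the label to that file is what the bot's message asks for.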
This PR enables CI testing for torchao on the Intel XPU (GPU) platform, to ensure functional correctness, performance consistency, and long-term compatibility as both torchao and XPU support evolve.