Commit 9408a86

Update AMD workflows (#3179)
* Update AMD workflows
* Update MI200 test flow to use torch latest
* Update tolerances to values that pass (will fix before completing PR)
* Revert changes to atol
* Rename workflows
* Fix CI badges
1 parent a725c5d commit 9408a86

3 files changed, +72 -3 lines changed


.github/workflows/amd.yml → .github/workflows/amd-mi100.yml

+2 -2
@@ -1,4 +1,4 @@
-name: amd
+name: amd-mi100
 
 on:
   push:
@@ -19,7 +19,7 @@ concurrency:
 jobs:
   amd-tests:
     # The type of runner that the job will run on
-    runs-on: [self-hosted, amd]
+    runs-on: [self-hosted, amd, mi100]
 
     # Steps represent a sequence of tasks that will be executed as part of the job
     steps:
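
The `runs-on` change narrows which self-hosted runners can pick up the job: GitHub Actions only routes a job to a runner that carries every label listed. The following is a minimal sketch of that matching rule, not GitHub's implementation. The runner names and their label sets are hypothetical, chosen only to show why `[self-hosted, amd]` alone could land on either GPU type.

```python
def runner_matches(job_labels, runner_labels):
    """A job is eligible for a runner only if the runner has every requested label."""
    return set(job_labels) <= set(runner_labels)


# Hypothetical runner pool: names and label sets are illustrative assumptions.
RUNNERS = {
    "runner-a": ["self-hosted", "amd", "mi100"],
    "runner-b": ["self-hosted", "amd", "mi200"],
}


def eligible(job_labels):
    """Return the sorted names of runners that could accept the job."""
    return sorted(
        name for name, labels in RUNNERS.items()
        if runner_matches(job_labels, labels)
    )


# Old label set: either machine may be chosen.
print(eligible(["self-hosted", "amd"]))
# New label set: only the MI100 machine qualifies.
print(eligible(["self-hosted", "amd", "mi100"]))
```

Under these assumptions, adding the `mi100` or `mi200` label is what pins each workflow to its intended hardware.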

.github/workflows/amd-mi200.yml

+69
@@ -0,0 +1,69 @@
+name: amd-mi200
+
+on:
+  push:
+    branches:
+      - 'staging**'
+    paths-ignore:
+      - 'docs/**'
+  pull_request:
+    paths-ignore:
+      - 'docs/**'
+  schedule:
+    - cron: "0 0 * * *"
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  amd-tests:
+    # The type of runner that the job will run on
+    runs-on: [self-hosted, amd, mi200]
+
+    # Steps represent a sequence of tasks that will be executed as part of the job
+    steps:
+      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
+      - uses: actions/checkout@v2
+
+      - id: setup-venv
+        uses: ./.github/workflows/setup-venv
+
+      - name: Install pytorch
+        run: |
+          pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
+          python -c "import torch; print('torch:', torch.__version__, torch)"
+          python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
+
+      - name: Install transformers
+        run: |
+          git clone https://github.com/huggingface/transformers
+          cd transformers
+          # if needed switch to the last known good SHA until transformers@master is fixed
+          # git checkout 1cc453d33
+          git rev-parse --short HEAD
+          pip install .
+
+      - name: Install apex
+        run: |
+          pip install ninja
+          pip install -v --install-option="--cpp_ext" --install-option="--cuda_ext" 'git+https://github.com/ROCmSoftwarePlatform/apex.git'
+
+      # Runs a set of commands using the runners shell
+      - name: Install deepspeed
+        run: |
+          pip install .[dev,1bit,autotuning]
+          #python -c "from deepspeed.env_report import cli_main; cli_main()"
+          ds_report
+
+      - name: Python environment
+        run: |
+          pip list
+
+      # Runs a set of commands using the runners shell
+      - name: Unit tests
+        run: |
+          if [[ -d ./torch-extensions ]]; then rm -rf ./torch-extensions; fi
+          cd tests
+          TORCH_EXTENSIONS_DIR=./torch-extensions pytest -n 4 --verbose unit/
+          TORCH_EXTENSIONS_DIR=./torch-extensions pytest -m 'sequential' unit/
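
The "Install pytorch" step prints `CUDA available:` even though the runner is an AMD machine. ROCm builds of PyTorch expose HIP devices through the `torch.cuda` namespace, and `torch.version.hip` is set on such builds. The sketch below shows one way the check could be made more explicit. `describe_backend` and the stand-in objects are hypothetical helpers introduced here so the example runs without torch installed, and the version string is an illustrative value, not output from this workflow.

```python
from types import SimpleNamespace


def describe_backend(torch_module):
    """Summarise which GPU backend a torch build targets and whether a device is visible."""
    hip = getattr(torch_module.version, "hip", None)
    available = torch_module.cuda.is_available()
    backend = f"ROCm/HIP {hip}" if hip is not None else "non-ROCm build"
    return f"{backend}, device available: {available}"


# Stand-ins for the real module (assumption: illustrative values only).
fake_rocm_torch = SimpleNamespace(
    version=SimpleNamespace(hip="5.4.22803"),
    cuda=SimpleNamespace(is_available=lambda: True),
)
fake_other_torch = SimpleNamespace(
    version=SimpleNamespace(hip=None),
    cuda=SimpleNamespace(is_available=lambda: False),
)

print(describe_backend(fake_rocm_torch))
print(describe_backend(fake_other_torch))
```

With torch installed, the same function can be called as `describe_backend(torch)` after `import torch`.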

README.md

+1 -1
@@ -107,7 +107,7 @@ DeepSpeed has been integrated with several different popular open-source DL fram
 | Description | Status |
 | ----------- | ------ |
 | NVIDIA | [![nv-torch19-p40](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch19-p40.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch19-p40.yml) [![nv-torch19-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch19-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch19-v100.yml) [![nv-torch-latest-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-latest-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-latest-v100.yml) [![nv-inference](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-inference.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-inference.yml) [![nv-nightly](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-nightly.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-nightly.yml) |
-| AMD | [![amd](https://github.com/microsoft/DeepSpeed/actions/workflows/amd.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/amd.yml) |
+| AMD | [![amd-mi100](https://github.com/microsoft/DeepSpeed/actions/workflows/amd-mi100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/amd-mi100.yml) [![amd-mi200](https://github.com/microsoft/DeepSpeed/actions/workflows/amd-mi200.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/amd-mi200.yml) |
 | CPU | [![nv-torch-latest-cpu](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-latest-cpu.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-latest-cpu.yml) |
 | PyTorch Nightly | [![nv-torch-nightly-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-nightly-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-torch-nightly-v100.yml) |
 | Integrations | [![nv-transformers-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-transformers-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-transformers-v100.yml) [![nv-lightning-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-lightning-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-lightning-v100.yml) [![nv-accelerate-v100](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-accelerate-v100.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-accelerate-v100.yml)[![nv-megatron](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-megatron.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-megatron.yml)[![nv-mii](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-mii.yml/badge.svg?branch=master)](https://github.com/microsoft/DeepSpeed/actions/workflows/nv-mii.yml) |
