
Commit b4196b7

update benchmarks
Signed-off-by: Dmitry Shmulevich <dshmulevich@nvidia.com>
1 parent 15be68c commit b4196b7

11 files changed: +10 -96 lines changed

resources/benchmarks/README.md

Lines changed: 2 additions & 14 deletions
@@ -29,28 +29,16 @@ To run the benchmark test for Kueue:
 ./scripts/benchmarks/gang-scheduling/run-kueue.sh
 ```
 
-To run the benchmark test for Run:ai
-
-```bash
-./scripts/benchmarks/gang-scheduling/run-runai.sh
-```
-
 ## Scaling Benchmark Test
 
-The scaling benchmark workflow operates on 700 virtual GPU nodes with tho workflows. The first [workflow](scaling/workflows/run-test-multi.yaml) submits is a job with 700 replicas, the second [workflow](scaling/workflows/run-test-single.yaml) submits a batch of 700 single-node jobs.
+The scaling benchmark workflow operates on 700 virtual GPU nodes. The [workflow](scaling/workflows/run-test.yaml) submits a batch of 700 single-node jobs.
 
 ### Example
 
 To run the benchmark test for Volcano:
 
 ```bash
-./bin/knavigator -workflow 'resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-volcano.yaml,run-test-multi.yaml}'
-```
-
-To run the benchmark test for Run:ai
-
-```bash
-./bin/knavigator -workflow 'resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-runai.yaml,runai-test-single.yaml}'
+./scripts/benchmarks/scaling/run-volcano.sh
 ```
 
 ## Network Topology Benchmark Test

resources/benchmarks/gang-scheduling/workflows/config-yunikorn.yaml

Lines changed: 2 additions & 2 deletions
@@ -39,6 +39,6 @@ tasks:
 - name: sandbox
 submitacl: '*'
 resources:
-max:
-{memory: 36Gi, vcore: 8000m, nvidia.com/gpu: 256}
+guaranteed: {memory: 36Gi, vcore: 8000m, nvidia.com/gpu: 256}
+max: {memory: 36Gi, vcore: 8000m, nvidia.com/gpu: 256}
 timeout: 1m

resources/benchmarks/scaling/workflows/config-runai.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 name: config-runai
 description: register, deploy and configure run:ai custom resources
 tasks:
-- id: register-trainingworkload
+- id: register
 type: RegisterObj
 params:
 template: "resources/benchmarks/templates/runai/trainingworkload.yaml"

resources/benchmarks/scaling/workflows/run-test-multi.yaml

Lines changed: 0 additions & 25 deletions
This file was deleted.

resources/benchmarks/scaling/workflows/run-test-single.yaml renamed to resources/benchmarks/scaling/workflows/run-test.yaml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-name: test-scaling-single-node-jobs
+name: test-scaling
 description: deploy 700 single-replica jobs
 tasks:
 - id: job

resources/benchmarks/scaling/workflows/runai-test-multi.yaml

Lines changed: 0 additions & 25 deletions
This file was deleted.

resources/benchmarks/scaling/workflows/runai-test-single.yaml

Lines changed: 0 additions & 24 deletions
This file was deleted.

scripts/benchmarks/scaling/run-kai.sh

Lines changed: 1 addition & 1 deletion
@@ -18,4 +18,4 @@ set -e
 
 REPO_HOME=$(readlink -f $(dirname $(readlink -f "$0"))/../../../)
 
-$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-kai.yaml,run-test-single.yaml}"
+$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-kai.yaml,run-test.yaml}"

scripts/benchmarks/scaling/run-kueue.sh

Lines changed: 1 addition & 1 deletion
@@ -18,4 +18,4 @@ set -e
 
 REPO_HOME=$(readlink -f $(dirname $(readlink -f "$0"))/../../../)
 
-$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-kueue.yaml,run-test-single.yaml}"
+$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-kueue.yaml,run-test.yaml}"

scripts/benchmarks/scaling/run-volcano.sh

Lines changed: 1 addition & 1 deletion
@@ -18,4 +18,4 @@ set -e
 
 REPO_HOME=$(readlink -f $(dirname $(readlink -f "$0"))/../../../)
 
-$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-volcano.yaml,run-test-single.yaml}"
+$REPO_HOME/bin/knavigator -workflow "$REPO_HOME/resources/benchmarks/scaling/workflows/{config-nodes.yaml,config-volcano.yaml,run-test.yaml}"
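
Note on the scaling scripts: all three derive `REPO_HOME` from their own location by resolving `$0` and walking up three directories (`scripts/benchmarks/scaling/` to the repository root). The sketch below shows roughly the same resolution as a function with quoted expansions, so it should also work when the checkout path contains spaces. It assumes GNU `readlink -f` is available. The function name `repo_home_from_script` is hypothetical and not part of this commit.

```shell
# Hypothetical helper (not in the repository): print the repository root
# for a script that lives three directories below it, for example
# scripts/benchmarks/scaling/run-volcano.sh.
# The body runs in a subshell, so its variables do not leak to the caller.
repo_home_from_script() (
  # Resolve symlinks so the result is the script's real location.
  script_path=$(readlink -f "$1")
  script_dir=$(dirname "$script_path")
  # Walk up three levels and normalize the resulting path.
  readlink -f "$script_dir/../../.."
)
```

Inside one of the scripts this could be used as `REPO_HOME=$(repo_home_from_script "$0")`, followed by the existing `knavigator` invocation.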
