Commit 0e26c88

Merge pull request #523 from NVIDIA/am/doc-upd
Updated README
2 parents 7d696c1 + 8f5e97a commit 0e26c88

1 file changed: +45 -33 lines changed

README.md

Lines changed: 45 additions & 33 deletions
@@ -1,14 +1,38 @@
 # CloudAI Benchmark Framework
 
-## Project Description
 CloudAI benchmark framework aims to develop an industry standard benchmark focused on grading Data Center (DC) scale AI systems in the Cloud. The primary motivation is to provide automated benchmarking on various systems.
 
+## Get Started
+**Note**: instructions for installing a custom python version are available [here](#install-custom-python-version).
+
+**Note**: instructions for setting up access for `enroot` are available [here](#set-up-access-to-the-private-ngc-registry).
+
+1. Clone the CloudAI repository to your local machine:
+```bash
+git clone git@github.com:NVIDIA/cloudai.git
+cd cloudai
+```
+
+2. Create a virtual environment:
+```bash
+python -m venv venv
+source venv/bin/activate
+```
+
+3. Next, install the required packages:
+```bash
+pip install .
+```
+
+For development please use the following command:
+```bash
+pip install -e '.[dev]'
+```
+
 ## Key Concepts
-### Schemas
 CloudAI operates on four main schemas:
 
 - **System Schema**: Describes the system, including the scheduler type, node list, and global environment variables.
-- **Test Template Schema**: A template for tests that includes all required command-line arguments and environment variables. This schema allows users to separate test template implementations from systems.
 - **Test Schema**: An instance of a test template with custom arguments and environment variables.
 - **Test Scenario Schema**: A set of tests with dependencies and additional descriptions about the test scenario.
 
@@ -30,8 +54,8 @@ These schemas enable CloudAI to be flexible and compatible with different system
 |SlurmContainer|||||
 |MegatronRun (experimental)|||||
 
-
-## Set Up Access to the Private NGC Registry
+## Details
+### Set Up Access to the Private NGC Registry
 First, ensure you have access to the Docker repository. Follow the following steps:
 
 1. **Sign In**: Go to [NVIDIA NGC](https://ngc.nvidia.com/signin) and sign in with your credentials.
@@ -49,28 +73,18 @@ machine nvcr.io login $oauthtoken password <api-key>
 Replace `<api-key>` with your respective credentials. Keep `$oauthtoken` as is.
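
Because the login field must stay as the literal text `$oauthtoken`, shell quoting matters if this line is ever generated from a script. A minimal sketch, assuming a hypothetical helper named `ngc_credentials_line` (it is not part of CloudAI or enroot); the single-quoted format string keeps the shell from expanding `$oauthtoken`:

```bash
# Hypothetical helper (not part of CloudAI or enroot): print the nvcr.io
# credentials line for the API key passed as $1.
# The format string is single-quoted, so $oauthtoken stays literal
# instead of being expanded as a (probably unset) shell variable.
ngc_credentials_line() {
    printf 'machine nvcr.io login $oauthtoken password %s\n' "$1"
}

# Example with a placeholder key:
ngc_credentials_line 'my-api-key'
```

The output can be redirected into the credentials file named in the setup steps; the path is deliberately not repeated here.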
 
 
-## Get Started
-1. Clone the CloudAI repository to your local machine:
-```bash
-git clone git@github.com:NVIDIA/cloudai.git
-cd cloudai
-```
-
-2. Create a virtual environment:
-```bash
-python -m venv venv
-source venv/bin/activate
-```
-
-3. Next, install the required packages:
-```bash
-pip install .
-```
+### Install custom python version
+If your system python version is not supported, you can install a custom version using [uv](https://docs.astral.sh/uv/getting-started/installation/) tool:
+```bash
+curl -LsSf https://astral.sh/uv/install.sh | sh
+source $HOME/.local/bin/env
+uv venv -p 3.10
+source .venv/bin/activate
+# optionally you might need to install pip which is not installed by default:
+uv pip install -U pip
+```
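
Before installing a custom interpreter, it can help to check which version the system `python` reports. A rough sketch, assuming a minimum of 3.10 (inferred only from the `uv venv -p 3.10` line above, not from a documented requirement) and a hypothetical helper named `version_at_least`:

```bash
# Hypothetical helper: succeed when version $1 (e.g. 3.11.4) is at least
# the major.minor version $2 (e.g. 3.10). Only major and minor are compared.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    if [ "$have_major" -ne "$want_major" ]; then
        # Different major versions: the comparison is decided here.
        [ "$have_major" -gt "$want_major" ]
        return
    fi
    [ "$have_minor" -ge "$want_minor" ]
}

# Assumed minimum; adjust to whatever the project actually requires.
required=3.10
if command -v python >/dev/null 2>&1; then
    current=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if version_at_least "$current" "$required"; then
        echo "python $current is new enough"
    else
        echo "python $current is older than $required; consider the uv steps above"
    fi
fi
```

Comparing the numeric parts separately avoids the common mistake of treating 3.9 as newer than 3.10 in a plain string comparison.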
 
-For development please use the following command:
-```bash
-pip install -e '.[dev]'
-```
+## CloudAI Modes Usage Examples
 
 CloudAI supports five modes:
 - [install](#install) - Use the install mode to install all test templates in the specified installation path
@@ -79,9 +93,7 @@ CloudAI supports five modes:
 - [generate-report](#generate-report) - Use the generate-report mode to generate reports under the test directories alongside the raw data
 - [uninstall](#uninstall) - Use the uninstall mode to remove installed test templates
 
-### CloudAI Modes Usage Examples
-
-#### install
+### install
 
 To install test prerequisites, run CloudAI CLI in install mode.
 
@@ -91,23 +103,23 @@ cloudai install\
 --system-config conf/common/system/example_slurm_cluster.toml\
 --tests-dir conf/common/test
 ```
-#### dry-run
+### dry-run
 To simulate running experiments without execution, use the dry-run mode:
 ```bash
 cloudai dry-run\
 --system-config conf/common/system/example_slurm_cluster.toml\
 --tests-dir conf/common/test\
 --test-scenario conf/common/test_scenario/sleep.toml
 ```
-#### run
+### run
 To run experiments, execute CloudAI CLI in run mode:
 ```bash
 cloudai run\
 --system-config conf/common/system/example_slurm_cluster.toml\
 --tests-dir conf/common/test\
 --test-scenario conf/common/test_scenario/sleep.toml
 ```
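
Since `dry-run` and `run` take the same arguments, the two modes can be chained so that a scenario is executed only after its dry run succeeds. This is a sketch under assumptions: `dry_then_run` and the `CLOUDAI_BIN` override are hypothetical names introduced here, and it assumes `cloudai dry-run` exits non-zero on failure, which the README does not state.

```bash
# Hypothetical wrapper: dry-run a scenario, then run it only if the
# dry run exited successfully.
#   $1: system config   $2: tests directory   $3: test scenario
# CLOUDAI_BIN is an illustrative override for testing, not a CloudAI setting.
dry_then_run() {
    system_config=$1
    tests_dir=$2
    scenario=$3
    for mode in dry-run run; do
        "${CLOUDAI_BIN:-cloudai}" "$mode" \
            --system-config "$system_config" \
            --tests-dir "$tests_dir" \
            --test-scenario "$scenario" || return 1
    done
}

# Usage (paths taken from the examples above):
# dry_then_run conf/common/system/example_slurm_cluster.toml \
#     conf/common/test conf/common/test_scenario/sleep.toml
```

Only flags that appear in the examples above are used; anything else the CLI accepts would have to be added by hand.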
-#### generate-report
+### generate-report
 To generate reports, execute CloudAI CLI in generate-report mode:
 ```bash
 cloudai generate-report\
@@ -118,7 +130,7 @@ cloudai generate-report\
 ```
 In the generate-report mode, use the --result-dir argument to specify a subdirectory under the output directory.
 This subdirectory is usually named with a timestamp for unique identification.
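
If the timestamped names sort chronologically (an assumption; the exact naming format is not shown here), the most recent result subdirectory can be found with a small helper instead of being typed by hand. `latest_result_dir` is a hypothetical name, and whether `--result-dir` expects the full path or only the subdirectory name should be checked against the CLI help.

```bash
# Hypothetical helper: print the last subdirectory of $1 in glob (sorted)
# order, which is the newest run if names are sortable timestamps.
# Returns non-zero when $1 contains no subdirectories.
latest_result_dir() {
    latest=""
    for dir in "$1"/*/; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        latest=${dir%/}
    done
    [ -n "$latest" ] || return 1
    printf '%s\n' "$latest"
}

# Usage, with OUTPUT_DIR standing in for the configured output directory:
# result_dir=$(latest_result_dir "$OUTPUT_DIR")
```

The resulting value can then be passed to `--result-dir` together with the other arguments of the generate-report command.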
-#### uninstall
+### uninstall
 To uninstall test prerequisites, run CloudAI CLI in uninstall mode:
 ```bash
 cloudai uninstall\
