NVIDIA
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 9 additions & 0 deletions
diff --git a/USER_GUIDE.md b/USER_GUIDE.md
Lines changed: 41 additions & 30 deletions
diff --git a/conf/common/test_scenario/chakra_replay.toml b/conf/common/test_scenario/chakra_replay.toml
Lines changed: 4 additions & 5 deletions
@@ -84,3 +84,4 @@ jobs:
           cloudai --help
           cloudai --mode verify-systems --tests-dir conf/common/test --system-config conf/common/system
           cloudai --mode verify-tests --system-config conf/common/system/standalone_system.toml --tests-dir conf/common/test
+          cloudai --mode verify-test-scenarios --system-config conf/common/system/example_slurm_cluster.toml --tests-dir conf/common/test --test-scenario conf/common/test_scenario
@@ -127,6 +127,15 @@ cloudai\
 ```
 `--tests-dir` can be a file or a directory to verify all configs in the directory.
 
+Verify if test scenarios are valid:
+```bash
+cloudai --mode verify-test-scenarios\
+    --system-config conf/common/system/example_slurm_cluster.toml\
+    --tests-dir conf/common/test\
+    --test-scenario conf/common/test_scenario
+```
+`--test-scenario` can be a file or a directory to verify all configs in the directory.
+
 ## Contributing
 Feel free to contribute to the CloudAI project. Your contributions are highly appreciated.
 
 
@@ -123,20 +123,24 @@ Test Scenario uses Test description from the previous step. Below is the `myconf
 ```toml
 name = "nccl-test"
 
-[Tests.1]
-  name = "nccl_test_all_reduce_single_node"
-  time_limit = "00:20:00"
-
-[Tests.2]
-  name = "nccl_test_all_reduce_single_node"
-  time_limit = "00:20:00"
-  [Tests.2.dependencies]
-    start_post_comp = { name = "Tests.1", time = 0 }
+[[Tests]]
+id = "Tests.1"
+test_name = "nccl_test_all_reduce_single_node"
+time_limit = "00:20:00"
+
+[[Tests]]
+id = "Tests.2"
+test_name = "nccl_test_all_reduce_single_node"
+time_limit = "00:20:00"
+  [[Tests.dependencies]]
+  type = "start_post_comp"
+  id = "Tests.1"
+  time = 0
 ```
 
 Notes on the test scenario:
-1. `name` is a mandatory filed. Other fields describe arbitrary number of tests and their dependencies.
-1. The `name` of the tests should be found in the test schema files. Node lists and time limits are optional.
+1. `id` is a mandatory field and must be unique for each test.
+1. The `test_name` field specifies a test definition from one of the Test TOML files. Node lists and time limits are optional.
 1. If needed, `nodes` should be described as a list of node names as shown in a Slurm system. Alternatively, if groups are defined in the system schema, you can ask CloudAI to allocate a specific number of nodes from a specified partition and group. For example `nodes = ['PARTITION:GROUP:16']`: 16 nodes are allocated from a group `GROUP`, from a partition `PARTITION`.
 1. There are three types of dependencies: `start_post_comp`, `start_post_init` and `end_post_comp`.
     1. `start_post_comp` means that the current test should be started after a specific delay of the completion of the depending test.
@@ -243,27 +247,34 @@ cache_docker_images_locally = true
 
 ## Describing a Test Scenario in the Test Scenario Schema
 A test scenario is a set of tests with specific dependencies between them. A test scenario is described in a TOML schema file. This is an example of a test scenario file:
-```
+```toml
 name = "nccl-test"
 
-[Tests.1]
-  name = "nccl_test_all_reduce"
-  num_nodes = "2"
-  time_limit = "00:20:00"
-
-[Tests.2]
-  name = "nccl_test_all_gather"
-  num_nodes = "2"
-  time_limit = "00:20:00"
-  [Tests.2.dependencies]
-    start_post_comp = { name = "Tests.1", time = 0 }
-
-[Tests.3]
-  name = "nccl_test_reduce_scatter"
-  num_nodes = "2"
-  time_limit = "00:20:00"
-  [Tests.3.dependencies]
-    start_post_comp = { name = "Tests.2", time = 0 }
+[[Tests]]
+id = "Tests.1"
+test_name = "nccl_test_all_reduce"
+num_nodes = "2"
+time_limit = "00:20:00"
+
+[[Tests]]
+id = "Tests.2"
+test_name = "nccl_test_all_gather"
+num_nodes = "2"
+time_limit = "00:20:00"
+  [[Tests.dependencies]]
+  type = "start_post_comp"
+  id = "Tests.1"
+  time = 0
+
+[[Tests]]
+id = "Tests.3"
+test_name = "nccl_test_reduce_scatter"
+num_nodes = "2"
+time_limit = "00:20:00"
+  [[Tests.dependencies]]
+  type = "start_post_comp"
+  id = "Tests.2"
+  time = 0
 ```
 
 The `name` field is the test scenario name, which can be any unique identifier for the scenario. Each test has a section name, following the convention `Tests.1`, `Tests.2`, etc., with an increasing index. The `name` of a test should be specified in this section and must correspond to an entry in the test schema. If a test in a test scenario is not present in the test schema, CloudAI will not be able to identify it.
 
@@ -15,8 +15,7 @@
 # limitations under the License.
 
 name = "chakra_replay"
-
-[Tests]
-  [Tests.1]
-  name = "chakra_replay"
-  num_nodes = "2"
+[[Tests]]
+id = "Tests.1"
+test_name = "chakra_replay"
+num_nodes = "2"