Skip to content

Releases: NVIDIA/cloudai

v0.9.beta16

11 Oct 06:51
156ccf9

Choose a tag to compare

v0.9.beta16 Pre-release
Pre-release

Highlights

Use subcommands instead of --mode <value> by @amaslenn in #194

New help message looks like this:

> cloudai --help
usage: cloudai [-h] [--log-file LOG_FILE] [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
               {uninstall,install,dry-run,run,generate-report,verify-systems,verify-tests,verify-test-scenarios} ...

Cloud AI

optional arguments:
  -h, --help            show this help message and exit
  --log-file LOG_FILE   The name of the log file (default: debug.log).
  --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level (default: INFO).

modes:
  {uninstall,install,dry-run,run,generate-report,verify-systems,verify-tests,verify-test-scenarios}
    uninstall           Remove the installed dependencies.
    install             Prepare execution by setting up env and dependencies for the tests to run.
    dry-run             Perform a dry-run of the test scenarios without executing them.
    run                 Execute the test scenarios.
    generate-report     Generate a report based on the test results.
    verify-systems      Verify the system configurations.
    verify-tests        Verify the test configurations.
    verify-test-scenarios
                        Verify the test scenario configurations.
  1. Each command (a.k.a mode) has own help message.
  2. Each command also has a uniq set of required and optional arguments. While for many commands options are the same, others are quite different, for example:
    > cloudai run --help
    usage: cloudai run [-h] --system-config SYSTEM_CONFIG --tests-dir TESTS_DIR --test-scenario TEST_SCENARIO [--output-dir OUTPUT_DIR]
    
    optional arguments:
      -h, --help            show this help message and exit
      --system-config SYSTEM_CONFIG
                            Path to the system configuration file.
      --tests-dir TESTS_DIR
                            Path to the test configuration directory.
      --test-scenario TEST_SCENARIO
                            Path to the test scenario file.
      --output-dir OUTPUT_DIR
                            Path to the output directory.
    
    > cloudai verify-tests --help
    usage: cloudai verify-tests [-h] test_configs
    
    positional arguments:
      test_configs  Path to the test configuration file or directory.
    
    optional arguments:
      -h, --help    show this help message and exit

What's Changed

Full Changelog: v0.9.beta15...v0.9.beta16

v0.9.beta15

09 Oct 13:56
2b6181b

Choose a tag to compare

v0.9.beta15 Pre-release
Pre-release

What's Changed

  • Remove assigning null when the value is null (NeMo launcher) by @TaekyungHeo in #250

Full Changelog: v0.9.beta14...v0.9.beta15

v0.9.beta14

09 Oct 13:23
7455f42

Choose a tag to compare

v0.9.beta14 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta13...v0.9.beta14

v0.9.beta13

09 Oct 09:53
e54f4c1

Choose a tag to compare

v0.9.beta13 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta12...v0.9.beta13

v0.9.beta12

07 Oct 17:38
c40e92c

Choose a tag to compare

v0.9.beta12 Pre-release
Pre-release

Highlights

We are working on schema improvements to simplify configs management and make them verifiable. This will help ensure that configs are correct before expensive runs on real hardware. Today we are enabling it for Test Scenario configs. This is a continuation of #145.

  1. Tests becomes and array. This helps making case names more expressive:
    before:
    [Tests.1]
    # ...
    now:
    [[Tests]]
    id = "any-name.you_want" # before it was just "1"
  2. id field is mandatory and must be unique and is used to specify dependencies:
    [[Tests]]
    id = "Tests.1"
    # ...
    
    [[Tests]]
    id = "Tests.2"
    # ...
      [[Tests.dependencies]]
      id = "Tests.1"
      # ...
  3. name (under the list of tests) renamed to test_name to better reflect its meaning. It still references a test defined in a separate TOML file.
  4. Dependencies converted to a list to support multiple dependencies of the same type.
    before
    # ...
    
    [Tests.2]
    name = "ucc_test_alltoall"
      [Tests.2.dependencies]
      start_post_comp = { name = "Tests.1", time = 0 }  # only one dependency of this type is allowed
    now
    # ...
    
    [[Tests]]
    id = "Tests.3"
    test_name = "ucc_test_alltoall"
    # ...
      [[Tests.dependencies]]
      type = "start_post_comp"
      id = "Tests.1"
    
      [[Tests.dependencies]]
      type = "start_post_comp"
      id = "Tests.2"

What's Changed

Full Changelog: v0.9.beta11...v0.9.beta12

v0.9.beta11

07 Oct 15:57
d7f4235

Choose a tag to compare

v0.9.beta11 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta10...v0.9.beta11

v0.9.beta10

07 Oct 10:53
c3b41b5

Choose a tag to compare

v0.9.beta10 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta9...v0.9.beta10

v0.9.beta9

02 Oct 11:16
228bb45

Choose a tag to compare

v0.9.beta9 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta8...v0.9.beta9

v0.9.beta8

01 Oct 19:08
1254085

Choose a tag to compare

v0.9.beta8 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta7...v0.9.beta8

v0.9.beta7

01 Oct 16:22
83c6949

Choose a tag to compare

v0.9.beta7 Pre-release
Pre-release

What's Changed

Full Changelog: v0.9.beta6...v0.9.beta7