v0.9.beta2

Pre-release

@amaslenn amaslenn released this 25 Sep 16:36
· 2178 commits to main since this release
953f04b

Release notes

We are working on schema improvements to simplify config management and make configs verifiable. This will help ensure that configs are correct before expensive runs on real hardware. With this release, verification is enabled for Test configs. This is a continuation of #158.

  1. Test Template TOML files were replaced with Pydantic models. This enforces mandatory arguments and their types, and requires less code to maintain.
  2. The --test-templates-dir option was removed from all commands. All supported tests are now registered in code using Registry().add_test_definition(...) and Registry().add_test_template(...). Documentation was updated to reflect this change.
  3. Test TOML files now take advantage of the standard TOML format for all known arguments.
    Before:
    [cmd_args]
    "training" = "llama/llama2_70b"
    "training.trainer.max_steps" = "120"
    "training.model.global_batch_size" = "256"
    "training.model.pipeline_model_parallel_size" = "1"
    Now:
    [cmd_args]
      [cmd_args.training]
      values = "llama/llama2_70b"
        [cmd_args.training.trainer]
        max_steps = "120"
        [cmd_args.training.model]
        global_batch_size = "256"
        pipeline_model_parallel_size = "1"
  4. extra_cmd_args was converted from str to dict[str, str]:
    Before:
    extra_cmd_args = "--stepfactor 2"
    Now:
    [extra_cmd_args]
    "--stepfactor" = "2"
  5. Added a new mode to verify that Test TOMLs are valid: cloudai --mode verify-tests --system-config conf/common/system/standalone_system.toml --tests-dir conf/common/test/chakra_replay.toml
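
For anyone migrating existing Test TOMLs by hand, the format changes in items 3 and 4 can be sketched as two small Python helpers. This is an illustrative sketch only, not part of CloudAI: the function names `nest_cmd_args` and `split_extra_cmd_args` are hypothetical, the input is assumed to be the already-parsed `[cmd_args]` table and the old `extra_cmd_args` string, and mapping a valueless flag to an empty string is an assumption rather than confirmed CloudAI behavior.

```python
import shlex


def nest_cmd_args(flat: dict) -> dict:
    """Convert old quoted dotted keys into nested tables (hypothetical helper).

    A key that is also a prefix of other keys (e.g. "training") becomes a
    table, and its own value is stored under "values", as in the "Now" example.
    """
    # Every proper prefix of a dotted key names a table in the new layout.
    table_names = set()
    for key in flat:
        parts = key.split(".")
        for i in range(1, len(parts)):
            table_names.add(".".join(parts[:i]))

    nested: dict = {}
    for key, value in flat.items():
        parts = key.split(".")
        if key in table_names:
            parts.append("values")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


def split_extra_cmd_args(extra: str) -> dict:
    """Convert the old extra_cmd_args string into a flag-to-value mapping.

    Assumption: a flag followed by a token that does not start with "-"
    takes that token as its value; any other flag maps to "".
    """
    tokens = shlex.split(extra)
    result = {}
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            result[flag] = tokens[i + 1]
            i += 2
        else:
            result[flag] = ""
            i += 1
    return result


old_cmd_args = {
    "training": "llama/llama2_70b",
    "training.trainer.max_steps": "120",
    "training.model.global_batch_size": "256",
    "training.model.pipeline_model_parallel_size": "1",
}
new_cmd_args = nest_cmd_args(old_cmd_args)
new_extra = split_extra_cmd_args("--stepfactor 2")
```

Here `new_cmd_args["training"]` holds the `values` entry plus the `trainer` and `model` sub-tables, and `new_extra` corresponds to the `[extra_cmd_args]` table shown in item 4. Values that themselves start with "-" (such as negative numbers) would need special handling that this sketch does not attempt.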

Full Changelog: v0.9.beta1...v0.9.beta2