Releases: NVIDIA/cloudai
v1.0.rc0
What's Changed
- Refactor JobStatusRetrievalStrategy unit tests by @TaekyungHeo in #280
- Fix typo in README by @amaslenn in #281
- Fix bug by removing trailing newline from +cluster.nodelist in command args by @TaekyungHeo in #282
Full Changelog: v0.9.beta25...v1.0.rc0
v0.9.beta25
What's Changed
- List supported tests for each system in README by @amaslenn in #278
- Remove delay support in dependencies by @amaslenn in #279
Full Changelog: v0.9.beta24...v0.9.beta25
v0.9.beta24
What's Changed
- Reorganize ReportGenerationStrategy unit tests by @TaekyungHeo in #276
- Rename --output-dir to --results-dir for generate-report command by @TaekyungHeo in #277
Full Changelog: v0.9.beta23...v0.9.beta24
v0.9.beta23
What's Changed
- Update NCCL report strategy and tests to enforce correct data types by @TaekyungHeo in #273
- Fix README on verify-* modes by @amaslenn in #275
Full Changelog: v0.9.beta22...v0.9.beta23
v0.9.beta22
What's Changed
- Fix bugs in updating the output_path of JaxToolbox cmd_args by @TaekyungHeo in #270
Full Changelog: v0.9.beta21...v0.9.beta22
v0.9.beta21
v0.9.beta20
What's Changed
- Update NeMo launcher commit hash and image tag by @TaekyungHeo in #265
Full Changelog: v0.9.beta19...v0.9.beta20
v0.9.beta19
What's Changed
- Refactor SlurmCommandGenStrategy (_write_sbatch_script) by @TaekyungHeo in #253
- Refactor JaxToolboxSlurmCommandGenStrategy unit tests by @TaekyungHeo in #259
- Handle node allocation errors gracefully, log details, and exit on failure by @TaekyungHeo in #264
Full Changelog: v0.9.beta18...v0.9.beta19
v0.9.beta18
What's Changed
- Clean up docs to remove mentions of the --mode option by @amaslenn in #260
- Improve verify modes by @amaslenn in #262
Full Changelog: v0.9.beta17...v0.9.beta18
v0.9.beta17
What's Changed
- Refactor JaxToolboxSlurmCommandGenStrategy by @TaekyungHeo in #254
- Move JaxToolbox-related test definitions to CloudAI by @TaekyungHeo in #257
Full Changelog: v0.9.beta16...v0.9.beta17