v1.2.beta1
Pre-release
Highlights
Changes in mounts for Slurm runs
Documentation is available in the User Guide.
Default mount
The test output directory <output_path>/<scenario_name_with_timestamp>/<test_name>/<iteration> (for example, results/scenario_2024-06-18_17-40-13/Tests.1/0) is mounted as /cloudai_run_results.
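As a rough illustration of the path layout described above, the following Python sketch assembles the host directory and pairs it with the container path. The helper name `default_results_mount` is hypothetical and not part of the CloudAI API, and the `host:container` string form is borrowed from the custom mount syntax in these notes; CloudAI itself may resolve the host path differently (for instance, to an absolute path).

```python
from pathlib import PurePosixPath

# In-container location named in the release notes.
CONTAINER_RESULTS_DIR = "/cloudai_run_results"


def default_results_mount(output_path: str, scenario_dir: str, test_name: str, iteration: int) -> str:
    """Build a 'host_path:container_path' string for one test iteration.

    Hypothetical helper for illustration only; not CloudAI's implementation.
    """
    host_dir = PurePosixPath(output_path) / scenario_dir / test_name / str(iteration)
    return f"{host_dir}:{CONTAINER_RESULTS_DIR}"


print(default_results_mount("results", "scenario_2024-06-18_17-40-13", "Tests.1", 0))
# results/scenario_2024-06-18_17-40-13/Tests.1/0:/cloudai_run_results
```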
Custom mounts
Users can now specify custom mounts via Test configuration:
extra_container_mounts = [
"/path/to/mount1:/path/in/container1",
"/path/to/mount2:/path/in/container2"
]
Git repo mounts
An arbitrary number of Git repositories can be cloned as part of cloudai install and then mounted into containers.
[[git_repos]]
url = "https://github.com/NVIDIA/cloudai"
commit = "sha1"
mount_as = "/work"
[[git_repos]]
url = "https://github.com/NVIDIA/cloudai-new"
commit = "sha1"
mount_as = "/opt/new"
Configuration is done via Test TOML file.
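To make the relationship between `[[git_repos]]` entries and container mounts more concrete, here is a minimal Python sketch that turns each entry into a `host:container` mount string. Everything about where the clone lives on the host is an assumption: the `install_dir` parameter, the `<repo>__<commit>` directory naming, and the helper names are invented for illustration and may not match what cloudai install actually does.

```python
from pathlib import PurePosixPath


def repo_dir_name(url: str) -> str:
    """Return the last path component of a Git URL, without a '.git' suffix."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def git_repo_mounts(git_repos: list[dict], install_dir: str) -> list[str]:
    """Map git_repos entries to 'host_path:container_path' mount strings.

    Assumption: each repository is cloned under install_dir in a directory
    named '<repo>__<commit>'. This naming is hypothetical.
    """
    mounts = []
    for repo in git_repos:
        host_dir = PurePosixPath(install_dir) / f"{repo_dir_name(repo['url'])}__{repo['commit']}"
        mounts.append(f"{host_dir}:{repo['mount_as']}")
    return mounts


repos = [
    {"url": "https://github.com/NVIDIA/cloudai", "commit": "sha1", "mount_as": "/work"},
    {"url": "https://github.com/NVIDIA/cloudai-new", "commit": "sha1", "mount_as": "/opt/new"},
]
for mount in git_repo_mounts(repos, "/install"):
    print(mount)
# /install/cloudai__sha1:/work
# /install/cloudai-new__sha1:/opt/new
```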
Sbatch custom arguments
Users can now specify custom sbatch arguments via System configuration:
extra_sbatch_args = [
"--section=4",
"--other-arg val"
]
The snippet above adds the following sbatch directives alongside the others:
#SBATCH --section=4
#SBATCH --other-arg val
More info.
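The mapping from `extra_sbatch_args` to directives appears to be one line per argument, each prefixed with `#SBATCH`. A minimal Python sketch of that mapping follows; the function name `sbatch_directives` is hypothetical, and the sketch does not claim to reproduce how CloudAI generates its batch scripts.

```python
def sbatch_directives(extra_sbatch_args: list[str]) -> list[str]:
    """Prefix each extra argument with '#SBATCH ' to form a directive line.

    Hypothetical helper for illustration only.
    """
    return [f"#SBATCH {arg}" for arg in extra_sbatch_args]


for line in sbatch_directives(["--section=4", "--other-arg val"]):
    print(line)
# #SBATCH --section=4
# #SBATCH --other-arg val
```

Each argument is passed through unchanged, so an entry such as "--other-arg val" keeps its space-separated value on a single directive line.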
What's Changed
- Move conf/staging/nemo/ to CloudAI by @TaekyungHeo in #349
- Implement report generation for NeMoRun, summarizing train step timing by @TaekyungHeo in #354
- Always mount current output dir as /cloudai_run_results by @amaslenn in #355
- Support custom mounts for slurm container jobs by @amaslenn in #356
- Manage fields' serialization of SlurmSystem by @amaslenn in #341
- NemoRun DSE PoC by @srivatsankrishnan in #353
- Add support for extra sbatch args via system model by @amaslenn in #357
- Introduce configurable mounts for git repos by @amaslenn in #358
Full Changelog: v1.1.0...v1.2.beta1