Nemotron Kit configures W&B automatically: it detects your local credentials and settings and passes them to containers launched via nemo-run. This eliminates manual credential management across local, Docker, Slurm, and cloud executors.
Note: The artifact system currently requires W&B. Backend-agnostic artifact tracking is in development.
Add a [wandb] section to your env.toml:

```toml
[wandb]
project = "nemotron"
entity = "YOUR-TEAM"
```

| Field | Description |
|---|---|
| `project` | W&B project name (required to enable tracking) |
| `entity` | W&B team/entity name |

Authenticate locally before running jobs:

```bash
wandb login
```

Your API key is stored in ~/.netrc and automatically detected by the kit.
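If you want to confirm that a key is actually present in a netrc file, the standard `netrc` module can look it up. The sketch below is an illustration only: `read_netrc_api_key` is a hypothetical helper, and the host name `api.wandb.ai` is an assumption that holds for the default W&B cloud but not necessarily for self-hosted servers. The kit itself reads the key through `wandb.api.api_key`.

```python
from __future__ import annotations

import netrc
from pathlib import Path


def read_netrc_api_key(netrc_path: Path, host: str = "api.wandb.ai") -> str | None:
    """Return the password stored for `host` in a netrc file, or None if absent."""
    if not netrc_path.exists():
        return None
    try:
        auth = netrc.netrc(str(netrc_path)).authenticators(host)
    except netrc.NetrcParseError:
        return None
    # authenticators() returns a (login, account, password) tuple or None.
    return auth[2] if auth else None


if __name__ == "__main__":
    # Report only whether a key exists; never print the key itself.
    found = read_netrc_api_key(Path.home() / ".netrc") is not None
    print("W&B key found in ~/.netrc" if found else "No W&B key in ~/.netrc")
```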
When you run jobs via nemo-run, the kit automatically detects your W&B configuration and passes it to the container as environment variables:
| Variable | Source | Description |
|---|---|---|
| `WANDB_API_KEY` | `wandb.api.api_key` | API key from local wandb login |
| `WANDB_PROJECT` | env.toml `[wandb]` | Project name |
| `WANDB_ENTITY` | env.toml `[wandb]` | Team/entity name |

This works across all executor types:
- Local: environment variables are set directly
- Docker: passed via container env vars
- Slurm: included in job submission
- SkyPilot: set in the cloud instance environment
- Ray: passed via `runtime_env.env_vars`
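The list above names a different transport per executor. As a rough illustration of what those transports look like, the sketch below renders one environment dictionary in three forms: `-e` flags for the Docker CLI, an `--export` option for `sbatch`, and a Ray `runtime_env` dictionary. The three helper functions are hypothetical and written for this example; they are not how nemo-run or the kit build these values internally.

```python
def docker_env_flags(env: dict) -> list:
    """Render env vars as `docker run -e KEY=VALUE` arguments."""
    flags = []
    for key, value in sorted(env.items()):
        flags += ["-e", f"{key}={value}"]
    return flags


def sbatch_export_option(env: dict) -> str:
    """Render env vars as an sbatch --export option.

    Values containing commas would need extra quoting; this sketch ignores that.
    """
    pairs = ",".join(f"{key}={value}" for key, value in sorted(env.items()))
    return f"--export=ALL,{pairs}" if pairs else "--export=ALL"


def ray_runtime_env(env: dict) -> dict:
    """Wrap env vars in the structure Ray expects for runtime_env."""
    return {"env_vars": dict(env)}


merged_env = {"WANDB_PROJECT": "nemotron", "WANDB_ENTITY": "my-team"}
print(docker_env_flags(merged_env))
print(sbatch_export_option(merged_env))
print(ray_runtime_env(merged_env))
```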
The build_executor() function in nemotron.kit.run handles automatic detection:

```python
# Auto-detect W&B API key from local login
if "WANDB_API_KEY" not in merged_env:
    import wandb

    api_key = wandb.api.api_key
    if api_key:
        merged_env["WANDB_API_KEY"] = api_key

# Load project/entity from env.toml [wandb] section
wandb_config = load_wandb_config()
if wandb_config is not None:
    if wandb_config.project:
        merged_env["WANDB_PROJECT"] = wandb_config.project
    if wandb_config.entity:
        merged_env["WANDB_ENTITY"] = wandb_config.entity
```

Training scripts running inside containers can initialize W&B from environment variables:
```python
from nemotron.kit.train_script import init_wandb_from_env

# Reads WANDB_PROJECT and WANDB_ENTITY from environment
init_wandb_from_env()
```

For scripts that support optional W&B tracking:
```python
from nemotron.kit import init_wandb_if_configured
from nemotron.kit.wandb import WandbConfig

# Initialize only if WandbConfig is provided and has a project set
wandb_config = WandbConfig(project="nemotron", entity="my-team")
init_wandb_if_configured(wandb_config, job_type="training")
```

The WandbConfig dataclass provides typed configuration:
```python
from nemotron.kit.wandb import WandbConfig

config = WandbConfig(
    project="nemotron",          # Required to enable tracking
    entity="my-team",            # Team/entity name
    run_name="experiment-001",   # Optional run name
    tags=("pretrain", "nano3"),  # Tags for filtering
    notes="First pretrain run",  # Run description
)

# Check if tracking is enabled
if config.enabled:
    print(f"Logging to {config.entity}/{config.project}")
```

W&B artifacts provide full lineage tracking. See Artifact Lineage for details on:
- End-to-end lineage from raw data to final model
- Semantic URIs for artifact references
- Viewing lineage in the W&B UI
The kit automatically patches checkpoint saving to log artifacts to W&B:

```python
from nemotron.kit.wandb import patch_wandb_checkpoint_logging

# Patch Megatron-Bridge checkpoint saving
patch_wandb_checkpoint_logging()
```

This enables:
- Automatic artifact creation for each checkpoint
- Lineage links to training data artifacts
- Version tracking with step numbers
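Patching of this kind is usually done by wrapping the original save function so that a hook runs after every successful save. The sketch below shows that general technique only. `patch_save`, `fake_trainer`, and the `(path, step)` call signature are all hypothetical; the real patch targets Megatron-Bridge internals and would create and log a W&B artifact inside the hook rather than append to a list.

```python
import functools
from types import SimpleNamespace
from typing import Any, Callable


def patch_save(owner: Any, method_name: str, on_saved: Callable[[str, int], None]) -> None:
    """Wrap owner.<method_name> so on_saved(path, step) runs after each save.

    Assumes the wrapped callable takes (path, step) as its first two arguments.
    Applying the patch twice is a no-op.
    """
    original = getattr(owner, method_name)
    if getattr(original, "_is_patched", False):
        return

    @functools.wraps(original)
    def wrapper(path, step, *args, **kwargs):
        result = original(path, step, *args, **kwargs)
        on_saved(path, step)  # runs only if the save did not raise
        return result

    wrapper._is_patched = True
    setattr(owner, method_name, wrapper)


# Stand-in for a training framework exposing a save function.
fake_trainer = SimpleNamespace(save_checkpoint=lambda path, step: f"{path}@{step}")

logged = []
patch_save(fake_trainer, "save_checkpoint", lambda path, step: logged.append((path, step)))
fake_trainer.save_checkpoint("ckpt/iter_100", 100)
print(logged)  # [('ckpt/iter_100', 100)]
```

Running the hook after the original call means a failed save never produces an artifact entry.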
For reinforcement learning with NeMo-RL:

```python
from nemotron.kit.wandb import patch_nemo_rl_checkpoint_logging

# Patch NeMo-RL checkpoint saving
patch_nemo_rl_checkpoint_logging()
```

When using seeded random states (common in RL), W&B's default run ID generation can fail. The kit provides a patch:
```python
from nemotron.kit.wandb import patch_wandb_runid_for_seeded_random

# Fix "Invalid Client ID digest" errors
patch_wandb_runid_for_seeded_random()
```

Ensure you're logged in locally:
```bash
wandb login
```

Verify the project exists in your W&B workspace, or let W&B create it automatically on first run.
Check that your env.toml has a [wandb] section:

```toml
[wandb]
project = "nemotron"
entity = "YOUR-TEAM"
```

For Ray data prep jobs, credentials are passed via runtime_env.env_vars. Ensure your local wandb login is active before submitting the job.
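When a job starts without tracking, a quick check of the environment you are about to submit can narrow down which setting is missing. The sketch below is a hypothetical pre-submission check, not part of the kit; `missing_wandb_settings` and the choice of required variables are assumptions for illustration.

```python
from typing import Mapping, Sequence


def missing_wandb_settings(
    env: Mapping[str, str],
    required: Sequence[str] = ("WANDB_API_KEY", "WANDB_PROJECT"),
) -> list:
    """Return the names of required W&B variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


# Example: a Ray runtime_env that carries the project but no API key.
runtime_env = {"env_vars": {"WANDB_PROJECT": "nemotron"}}
print(missing_wandb_settings(runtime_env["env_vars"]))  # ['WANDB_API_KEY']
```

An empty list means both variables are present; a missing `WANDB_API_KEY` usually points back to the local `wandb login` step.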
| Export | Description |
|---|---|
| `WandbConfig` | Configuration dataclass |
| `init_wandb_if_configured()` | Conditional W&B initialization |
| `patch_wandb_checkpoint_logging()` | Enable Megatron-Bridge checkpoint artifacts |
| `patch_nemo_rl_checkpoint_logging()` | Enable NeMo-RL checkpoint artifacts |
| `patch_wandb_runid_for_seeded_random()` | Fix seeded random ID generation |

| Export | Description |
|---|---|
| `load_wandb_config()` | Load WandbConfig from env.toml |
| `build_executor()` | Build executor with auto W&B env vars |

- OmegaConf Configuration — Artifact interpolations and unified logging patches
- Artifact Lineage — Full lineage tracking and W&B UI
- Nemotron Kit — Core framework overview
- Execution through NeMo-Run — Execution profiles and env.toml