NotNANtoN commented Dec 12, 2025

Summary

Adds optional validation loss tracking during training using a separate validation episode split.

Features

  • Validation split: validation_fraction config option to split episodes into train/val sets (see the split sketch after this list)
  • Validation loss: computed via select_action inference so the metrics (L1/L2) are model-agnostic; logged to wandb under the val/ prefix
  • Early stopping: stops training when validation loss or eval success rate stops improving
  • Checkpoint cleanup: keep_last_n_checkpoints automatically removes older checkpoints
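For reference, a minimal sketch of the episode-level split implied by validation_fraction; the function name, signature, and seeding are illustrative, not the PR's actual code:

```python
import math
import random

def split_episodes(episode_indices: list[int], validation_fraction: float, seed: int = 42):
    """Illustrative episode-level train/val split controlled by validation_fraction."""
    if not 0.0 <= validation_fraction < 1.0:
        raise ValueError("validation_fraction must be in [0.0, 1.0)")
    shuffled = list(episode_indices)
    random.Random(seed).shuffle(shuffled)
    n_val = math.floor(len(shuffled) * validation_fraction)
    val_episodes = sorted(shuffled[:n_val])
    train_episodes = sorted(shuffled[n_val:])
    return train_episodes, val_episodes

# e.g. 10 episodes with validation_fraction=0.1 -> 9 train episodes, 1 val episode
train_eps, val_eps = split_episodes(list(range(10)), validation_fraction=0.1)
```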

Design Decisions

  • Uses select_action for validation instead of modifying individual policies, which keeps validation policy-agnostic (see the sketch below)
  • The validation dataset is created without augmentations, so evaluation is clean
  • All features are opt-in with sensible defaults, so there are no breaking changes
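A rough sketch of what an inference-based validation pass could look like under these decisions. Only select_action, the L1/L2 metrics, and the val/ prefix come from this PR; the batch keys, the policy.reset() call, and the exact metric names are assumptions for illustration:

```python
import torch

@torch.no_grad()
def compute_validation_loss(policy, val_dataloader, device):
    """Illustrative inference-based validation: compare select_action output
    to ground-truth actions with L1/L2 metrics."""
    policy.eval()
    l1_sum, l2_sum, n = 0.0, 0.0, 0
    for batch in val_dataloader:
        batch = {k: v.to(device) for k, v in batch.items() if isinstance(v, torch.Tensor)}
        policy.reset()  # clear any internal action queue before each batch (assumed API)
        pred_action = policy.select_action(batch)  # inference path, policy-agnostic
        gt_action = batch["action"]                # assumed ground-truth key
        if gt_action.ndim == 3:                    # (batch, chunk, dim) -> first action step
            gt_action = gt_action[:, 0]
        l1_sum += torch.nn.functional.l1_loss(pred_action, gt_action, reduction="sum").item()
        l2_sum += torch.nn.functional.mse_loss(pred_action, gt_action, reduction="sum").item()
        n += gt_action.numel()
    policy.train()
    return {"val/l1_loss": l1_sum / n, "val/l2_loss": l2_sum / n}
```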

Config Options

validation_fraction: float = 0.0          # 0.0-1.0; e.g. 0.1 holds out 10% of episodes for validation
early_stopping.enable: bool = False
early_stopping.patience_steps: int = 10000
early_stopping.monitor: str = "val_loss"  # or "eval_success"
keep_last_n_checkpoints: int = 0          # 0 = keep all
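To make the early-stopping and cleanup semantics concrete, here is a hedged sketch under the defaults above; the EarlyStopper class, the step-named checkpoint directory layout, and the function names are assumptions, not the PR's actual implementation:

```python
import shutil
from pathlib import Path

class EarlyStopper:
    """Illustrative patience tracker: val_loss (lower is better) or eval_success (higher is better)."""
    def __init__(self, monitor: str = "val_loss", patience_steps: int = 10_000):
        self.monitor = monitor
        self.patience_steps = patience_steps
        self.best = None
        self.best_step = 0

    def should_stop(self, value: float, step: int) -> bool:
        improved = (
            self.best is None
            or (self.monitor == "val_loss" and value < self.best)
            or (self.monitor == "eval_success" and value > self.best)
        )
        if improved:
            self.best, self.best_step = value, step
        return step - self.best_step >= self.patience_steps

def cleanup_checkpoints(checkpoint_dir: Path, keep_last_n: int) -> None:
    """Illustrative cleanup: keep only the newest N step-named checkpoint dirs (0 keeps all)."""
    if keep_last_n <= 0:
        return
    step_dirs = sorted(
        (d for d in checkpoint_dir.iterdir() if d.is_dir() and d.name.isdigit()),
        key=lambda d: int(d.name),
    )
    for old in step_dirs[:-keep_last_n]:
        shutil.rmtree(old)
```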

Testing

  • Tested with ACT policy
  • Tested with SmolVLA policy
