This is a template project for research experiments with NeMo RL.
Important
This is a template! To start a new research project, copy this directory to a new location:
cp -r research/template_project research/my_new_project
Then add your code and tests! Note that this project includes nemo-rl as a core dependency.
The single_update.py script demonstrates a minimal train-and-generate loop:
- Sets up a Ray compute cluster
- Initializes the vLLM generation engine
- Initializes the LM policy with an extension worker class that supports custom functions
- Executes custom functions provided by the extension worker class
- Repeats the loop (10 iterations by default):
  - Trains the policy on a small batch using NLL loss
  - Refits the generation engine with the updated policy weights
  - Generates outputs with the new policy
This shows the basic cycle of training a language model and using it for generation.
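The control flow of this cycle can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyPolicy`, `ToyGenerator`, and `run_loop` are hypothetical stand-ins rather than NeMo RL or vLLM classes, and the "training" is a toy scalar update instead of an NLL step on a language model.

```python
from dataclasses import dataclass


@dataclass
class ToyPolicy:
    """Hypothetical stand-in for the trainable policy (not a NeMo RL class)."""

    weight: float = 0.0

    def train(self, batch: list[float], lr: float = 0.5) -> float:
        # Toy update: move the weight toward the batch mean and return the
        # squared error measured before the update as the "loss".
        target = sum(batch) / len(batch)
        loss = (self.weight - target) ** 2
        self.weight += lr * (target - self.weight)
        return loss


@dataclass
class ToyGenerator:
    """Hypothetical stand-in for the generation engine (not vLLM)."""

    weight: float = 0.0

    def refit(self, policy: ToyPolicy) -> None:
        # Copy the updated policy weights into the generation engine.
        self.weight = policy.weight

    def generate(self, prompt: float) -> float:
        # Toy "generation": scale the prompt by the current weight.
        return self.weight * prompt


def run_loop(num_steps: int = 10) -> list[float]:
    policy, generator = ToyPolicy(), ToyGenerator()
    batch = [1.0, 2.0, 3.0]
    losses = []
    for _ in range(num_steps):
        losses.append(policy.train(batch))  # train on a small batch
        generator.refit(policy)             # sync weights to the generator
        generator.generate(prompt=1.0)      # generate with the new policy
    return losses


if __name__ == "__main__":
    print(run_loop())
```

The point of the sketch is the ordering: generation always happens after a refit, so outputs reflect the weights produced by the most recent training step.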
To run the single_update.py script:
uv run single_update.py
To add custom behavior to the policy worker, you can use an extension worker class that subclasses the default worker implementation. See the example in template_project/worker_extension.py.
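The subclassing pattern can be sketched as follows. `DefaultWorker`, `WorkerExtension`, `param_norm`, and `fully_qualified_name` are hypothetical names chosen for illustration; the real base class and its methods live in NeMo RL and are not reproduced here.

```python
class DefaultWorker:
    """Hypothetical stand-in for the default policy worker class."""

    def __init__(self, params: list[float]) -> None:
        self.params = params

    def train_step(self) -> str:
        return "trained"


class WorkerExtension(DefaultWorker):
    """Adds a custom function while inheriting the default behavior."""

    def param_norm(self) -> float:
        # Custom function: L2 norm of the worker's parameters.
        return sum(p * p for p in self.params) ** 0.5


def fully_qualified_name(cls: type) -> str:
    # Dotted "module.ClassName" path for a class.
    return f"{cls.__module__}.{cls.__qualname__}"


worker = WorkerExtension([3.0, 4.0])
print(worker.train_step())    # inherited behavior is unchanged
print(worker.param_norm())    # custom function added by the extension
print(fully_qualified_name(WorkerExtension))
```

The dotted name printed last has the same `module.ClassName` shape as the string key used when the extension class is registered.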
After defining your extension class, you need to register it in the actor environment registry so that the runtime can resolve the correct Python environment for the worker. See the example in single_update.py.
from nemo_rl.distributed.ray_actor_environment_registry import ACTOR_ENVIRONMENT_REGISTRY
from nemo_rl.distributed.virtual_cluster import PY_EXECUTABLES
# register the worker extension class to the actor environment registry
ACTOR_ENVIRONMENT_REGISTRY[
    "template_project.worker_extension.DTensorPolicyWorkerV2Extension"
] = PY_EXECUTABLES.AUTOMODEL
This project includes a comprehensive test suite following NeMo RL's testing patterns.
Unit tests validate individual components and functions.
# Run all unit tests
uv run --group test pytest tests/unit/
Functional tests run end-to-end scenarios with minimal configurations. These tests require GPU access.
Important
Functional tests require at least 1 GPU to run.
# Run the single_update functional test (runs for 1 step)
uv run bash tests/functional/single_update.sh
Test suites are longer-running, comprehensive tests that validate behavior over multiple steps.
Important
Test suites require 8 GPUs and may take several minutes to complete.
# Run the single_update test suite locally (runs for 10 steps on 1 node with 8 GPUs)
bash tests/test_suites/llm/single_update_1n8g.sh
# Launch on SLURM with code snapshots
# For full documentation on tools/launch, see:
# https://github.com/NVIDIA-NeMo/RL/blob/main/tests/test_suites/README.md#launching-with-code-snapshots
bash ../../tools/launch tests/test_suites/llm/single_update_1n8g.sh
# Dry run to estimate GPU hours needed
DRYRUN=1 bash ../../tools/launch tests/test_suites/llm/single_update_1n8g.sh
Tip
The tools/launch script creates code snapshots and launches SLURM jobs for reproducible experiments. It automatically extracts the configuration from your test suite script and submits the appropriate number of jobs.
The test suite structure mirrors nemo-rl's test organization:
- tests/unit/ - Fast, isolated unit tests
- tests/functional/ - End-to-end tests with minimal configurations
- tests/test_suites/llm/ - Comprehensive multi-step validation tests
- configs/recipes/llm/ - Configuration files for test suites (using defaults to inherit from base configs)
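To illustrate what inheriting from a base config can mean, here is a minimal sketch that assumes inheritance behaves like a recursive override, where a recipe only lists the values it changes. The `merge_configs` helper and all config keys are hypothetical; the actual config loader used by NeMo RL may resolve `defaults` differently.

```python
from copy import deepcopy
from typing import Any


def merge_configs(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` take precedence."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the base value outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base config and recipe (keys are illustrative only).
base = {"policy": {"model_name": "base-model", "train_batch_size": 32}, "max_steps": 100}
recipe = {"policy": {"train_batch_size": 8}, "max_steps": 10}

print(merge_configs(base, recipe))
```

Under this assumption, the recipe stays short because unchanged settings such as `model_name` come from the base config.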
If you update the dependencies of this research project, run the following command to update the global uv.lock file and freeze the working set of dependencies:
uv lock
This command will:
- Resolve all dependencies
- Update uv.lock with the latest compatible versions
- Ensure dependency consistency across environments
Note
This project uses Python 3.13.13 as specified in .python-version.
This Python version should always be kept in sync with the .python-version file at the root of the nemo-rl repository to ensure compatibility.
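One way to check that the two files agree is a small script like the sketch below. The helper names are hypothetical, and the `../../.python-version` path assumes the project sits two directories below the repository root (as in research/my_new_project); adjust it if your layout differs.

```python
from pathlib import Path


def read_python_version(path: Path) -> str:
    # Read the pinned version, ignoring surrounding whitespace and newlines.
    return path.read_text(encoding="utf-8").strip()


def versions_in_sync(project_file: Path, root_file: Path) -> bool:
    return read_python_version(project_file) == read_python_version(root_file)


if __name__ == "__main__":
    project = Path(".python-version")
    root = Path("../../.python-version")  # assumed location of the repo root file
    if project.exists() and root.exists():
        print("in sync" if versions_in_sync(project, root) else "out of sync")
    else:
        print("could not find both .python-version files")
```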
If you use this research project or have questions, please contact:
Author: AUTHOR NAMES HERE
Email: AUTHOR EMAILS HERE
Organization: ORGANIZATION HERE (optional)
If you use this research project, please cite it using the following BibTeX entry:
@misc{template-project,
title = {Template Project: A Starting Point},
author = {AUTHOR NAMES HERE},
howpublished = {\url{https://github.com/NVIDIA-NeMo/RL/tree/main/research/template_project}},
year = {2025},
note = {Research project based on NeMo RL},
}