Releases · NVIDIA-NeMo/Gym

15 Apr 22:52

chtruong814

v0.2.1

27e9211

v0.2.1 Latest


PyPI fixes for the 0.2.1 patch release by @cmunley1 and @kajalj22 (PR #1081)

Contributors

cmunley1 and kajalj22


11 Mar 15:03

bxyu-nvidia

v0.2.0

3e587db

v0.2.0

Release Summary

NeMo Gym v0.2.0 ships alongside the NVIDIA Nemotron 3 Super model release, open sourcing the RL environments and corresponding datasets used during training. Highlights:

17 new training environments across coding, math, science, reasoning, agentic tasks, and safety
Integrations with Future House Aviary, Open-Thought Reasoning Gym, and Prime Intellect Verifiers let you use environments from these libraries directly within NeMo Gym
End-to-end rollout collection with a locally managed vLLM server
Install directly from PyPI with pip install nemo-gym

First-Time Contributors

We welcomed 15 new contributors in this release! Here are a few highlights:

@sidnarayanan added the Aviary integration to enable training on any Aviary environment, a library of interactive RL environments spanning math, science, biology, and more
@3mei added the text-to-SQL environment to generate SQL queries from natural language across multiple SQL dialects
@Kelvin0110 added the NewtonBench environment to discover scientific laws through interactive experimentation

Thank you to all the new contributors for helping make NeMo Gym better!

Major Features & Improvements

New Environments

Added 17 new resources servers spanning:
- Coding: Text to SQL (#648), SWE RL Gen (#561), SWE RL LLM Judge (#561)
- Math: Lean4 Mathematical Proofs (#563)
- Science: Aviary (#55), NewtonBench (#650)
- Reasoning: MultiChallenge (#654), ARC-AGI (#105), Reasoning Gym (#113)
- Agent tasks: xLAM Function Calling (#262), Tavily Search (#825), Single Step Tool Use with Argument Comparison (#825), Terminus Judge (#594), NeMo Skills Tools (#571)
- Safety: Jailbreak Detection (#825), Over Refusal Detection (#825)
- RLHF: Generative Reward Model Compare (#674)

Added 5 new agent servers: Aviary agent (#55), proof refinement agent (#563), SWE agents (#343), tool simulation agent (#826), and verifiers agent (#573)

Environment Library Integrations

Combine environments from other libraries with NeMo Gym environments:

Future House Aviary (#55, #590)
Open-Thought Reasoning Gym (#113)
Prime Intellect Verifiers (#573)

Model Serving

Local vLLM model server with end-to-end rollout collection without an external API (#558, #762)
vLLM 0.16+ support for the reasoning field in responses (#816)
VLLMModel chat template kwargs support (#538, #636)
Per-task chat template and extra body args, enabling per-task control of reasoning mode and thinking budget (#672)

Rollout Collection & Profiling

New ng_reward_profile command to compute per-task pass rates and aggregate metrics (#83, #621)
CPU profiling for rollout performance analysis (#763)
Option for seeding on num_repeats for rollouts (#740)
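
As a rough illustration of what "per-task pass rates and aggregate metrics" can mean, the sketch below groups rollout rewards by task and computes the fraction that pass. The record layout (`task_id`, `reward`), the pass threshold, and the function name are assumptions made for this example. It is not the ng_reward_profile implementation, and it does not reproduce its input or output format.

```python
from collections import defaultdict
from statistics import mean


def profile_rewards(rollouts, pass_threshold=1.0):
    """Group rewards by task and compute per-task and aggregate pass rates.

    `rollouts` is an iterable of dicts with the hypothetical keys
    "task_id" and "reward"; a rollout passes when reward >= pass_threshold.
    """
    rewards_by_task = defaultdict(list)
    for rollout in rollouts:
        rewards_by_task[rollout["task_id"]].append(float(rollout["reward"]))

    per_task = {
        task_id: {
            "num_rollouts": len(rewards),
            "mean_reward": mean(rewards),
            "pass_rate": sum(r >= pass_threshold for r in rewards) / len(rewards),
        }
        for task_id, rewards in rewards_by_task.items()
    }
    pass_rates = [metrics["pass_rate"] for metrics in per_task.values()]
    aggregate = {
        "num_tasks": len(per_task),
        # Macro average: every task counts equally, regardless of repeats.
        "mean_pass_rate": mean(pass_rates) if pass_rates else 0.0,
    }
    return per_task, aggregate


if __name__ == "__main__":
    example = [
        {"task_id": "a", "reward": 1.0},
        {"task_id": "a", "reward": 0.0},
        {"task_id": "b", "reward": 1.0},
    ]
    per_task, aggregate = profile_rewards(example)
    print(per_task["a"]["pass_rate"])   # 0.5
    print(aggregate["mean_pass_rate"])  # 0.75
```

The aggregate here is a macro average over tasks. Weighting by rollout count instead would be an equally reasonable choice, depending on how repeats are distributed.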

Infrastructure & Developer Experience

PyPI compatibility: install via pip install nemo-gym (#649)
Dry run mode: ng_run +dryrun=true to validate configs and install environments without starting servers (#743)
ng_status command to list running servers and their health (#290)
Server stdout/stderr redirection with server name prefixes (#703)
FastAPI worker support for higher throughput across multiple workers (#566)
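
A small sketch of how one might confirm the PyPI install and assemble the dry-run invocation from a script. The distribution name `nemo-gym` and the `+dryrun=true` override come from the notes above. The helper names are hypothetical, and a real `ng_run` call will typically also need whatever configuration arguments your setup requires, which are not shown here.

```python
from importlib import metadata


def installed_version(distribution="nemo-gym"):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def dry_run_command(extra_overrides=()):
    """Build the argv list for a dry run (hypothetical helper).

    `extra_overrides` is a sequence of additional command-line arguments,
    passed through unchanged after the dry-run flag.
    """
    return ["ng_run", "+dryrun=true", *extra_overrides]


if __name__ == "__main__":
    version = installed_version()
    print(f"nemo-gym version: {version or 'not installed'}")
    print("dry-run command:", " ".join(dry_run_command()))
```

The script only prints the command rather than executing it, so it is safe to run in an environment where NeMo Gym is not installed.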

Model Recipes

Nemotron 3 Nano training recipe (#699)
Nemotron 3 Super training recipe (#863)

Deprecation Notices

Deprecated ng_viewer due to a Gradio security vulnerability. We plan to revisit rollout viewing with a more robust solution in a future release.

Bug Fixes

Fixed 0.1.1 environments to work correctly with RL training pipelines (#768)
Fixed crash when server receives malformed JSON during rollout collection (#770)
Fixed dry run mode failing (#746)
Fixed nested responses_create_params overrides not merging correctly from CLI (#827)
Fixed ng_prepare_data failing when multiple environments define overlapping metrics (#738)
Fixed reward profiling failing when model response doesn't include usage stats (#824)
Fixed NeMo-Skills Python tool to use HTTP calls instead of subprocess execution (#606)
Bumped Pillow and other packages to address security vulnerabilities (#667, #739)
ng_dump_config now redacts API key values from output (#567)
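
To make the nested-override fix (#827) more concrete, here is a minimal sketch of the general technique: a recursive merge that updates only the keys an override names and leaves sibling keys intact. This is an illustration of the idea, not NeMo Gym's actual merging code. The helper names and the example parameter values are assumptions.

```python
import copy


def deep_merge(base, override):
    """Return a new dict with `override` merged into `base` recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def override_from_dotted(path, value):
    """Turn "a.b.c" and a value into {"a": {"b": {"c": value}}}."""
    nested = value
    for part in reversed(path.split(".")):
        nested = {part: nested}
    return nested


if __name__ == "__main__":
    # Illustrative defaults; the values are not taken from the release notes.
    defaults = {
        "responses_create_params": {"temperature": 1.0, "max_output_tokens": 1024}
    }
    override = override_from_dotted("responses_create_params.temperature", 0.2)
    print(deep_merge(defaults, override))
    # {'responses_create_params': {'temperature': 0.2, 'max_output_tokens': 1024}}
```

A shallow `dict.update` on the same inputs would replace the whole `responses_create_params` mapping and drop `max_output_tokens`, which is the kind of failure a nested merge avoids.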

Documentation

New training tutorials: Unsloth training with NeMo Gym, multi-environment training
New environment tutorials: creating a training environment, custom data preparation, integrating external environment libraries, environment best practices
Model recipes: reproduce the training for Nemotron 3 Nano and Nemotron 3 Super
Concepts & architecture overhaul: rewrote concepts docs, added architecture diagrams, added agent server and resources server docs
Training approaches: added a docs page covering SFT, RL (GRPO), and RLVR
Ecosystem page: revamped with training framework integrations and environment library integrations
Infrastructure: added SWE RL infrastructure case study, deployment topology docs
Quality pass: redirect sweep, style guide sweep, consistent naming, FAQ additions, broken link fixes

Looking Ahead

VLM support: add support for vision-language models (VLMs) and environments with images, e.g. browser environments and computer use agent (CUA) environments
Benchmark environments: add popular OSS environments such as OSWorld, Tau Bench, BrowseComp
Integrate existing agents: integrate popular existing agents, e.g. coding harnesses, as well as agents developed via popular agent frameworks, e.g. LangGraph
Environment tutorials: incorporate more complex agentic loops during training such as multi-turn conversation and user modeling

Release Assets

GitHub Release: https://github.com/NVIDIA-NeMo/Gym/releases/tag/v0.2.0
Container: nvcr.io/nvidia/nemo-rl:v0.5.0.nemotron_3_super

What's Changed

Bump to v0.2.0 by @bxyu-nvidia in #510
reasoning-gym resource server by @cmunley1 in #113
docs: redirect setup by @lbliii in #513
docs: Miscellaneous GRPO tutorial fixes by @bxyu-nvidia in #512
docs settings update by @lbliii in #525
Debug server package versions by @fsiino-nvidia in #406
List running server health and status by @fsiino-nvidia in #290
VLLMModel supports chat template kwargs by @pjin-nvidia in #538
Salesforce xlam-function-calling-60k resources server by @cmunley1 in #262
python flag for colab venv installation by @cmunley1 in #526
add unsloth and trl to docs by @cmunley1 in #536
docs: remove trl docs by @cmunley1 in #543
Remove PlainTextResponse response_class by @fsiino-nvidia in https://github.com/NVIDIA-N...

Contributors

Kipok, ananthsub, and 23 other contributors


15 Dec 00:32

bxyu-nvidia

v0.1.1

414127f

v0.1.1

What's Changed

Bump package info for v0.2.0 by @bxyu-nvidia in #337
fix: Update incorrect path in docs: library_judge_math -> math_with_j… by @shashank3959 in #355
Update secret detector to work with forks by @chtruong814 in #358
Removed reference to gitlab master by @hwolff99 in #377
Mark experimental tutorials by @bxyu-nvidia in #386
docs: experimental label by @lbliii in #391
Fixed typos by @hwolff99 in #400
Readme dataset discoverability cont by @fsiino-nvidia in #344
Add absolute ip for multi node by @sdevare-nv in #286
docs: removed "How to Navigate" section from concepts by @ahmadki in #414
docs: Fixed image embedding in core abstractions page by @ahmadki in #410
docs: Fixed Licensing information in structured outputs by @ahmadki in #412
docs: Added hyperlinks to github repo in docs by @ahmadki in #413
docs: Add software / hardware requirements to README and docs. by @ffrujeri in #401
docs: Cleaned the "Quick Start" section in the README by @ahmadki in #411
Display system and version info by @fsiino-nvidia in #347
docs: Improve language around resources servers. by @ffrujeri in #408
docs: Add Create Resource Server Tutorial by @ffrujeri in #407
miniswe w/ offline uv by @sdevare-nv in #357
update vllm model comments by @cmunley1 in #423
docs: linked several terms to their definition in glossary by @ahmadki in #424
docs: Explain why GPT-4 is used and clarify support for other models by @ahmadki in #425
Removed internal section by @hwolff99 in #430
docs: various improvements and fixes by @ahmadki in #415
docs: Relate sections Get Started and Rollout Collection by @fsiino-nvidia in #426
Guide user on next steps after finishing get started by @cwing-nvidia in #435
Add placeholder author by @jkyi-nvidia in #440
Clarify training environment framing and align docs messaging by @cwing-nvidia in #438
docs: Added CLI documentation by @ahmadki in #444
Change NeMo Gym from framework to library by @cwing-nvidia in #456
Add Data Designer and links to ecosystem page by @cwing-nvidia in #462
docs: Moved configuration system under about by @ahmadki in #420
Add benefits to About page aligned with README by @cwing-nvidia in #452
Explain where the name Gym comes from; Gym Key Terminology doc is missing some of the old material by @bxyu-nvidia in #470
add calendar env for multi-turn IF by @sanjaykariyappa in #297
docs(readme): fix Example Resource Servers table - correct Multi Step… by @lbliii in #464
Remove penguin references by @ahmadki in #469
docs: Training framework integration by @bxyu-nvidia in #439
Bug: inconsistent documentation around servers running by @bxyu-nvidia in #472
docs: Improve server reference info by @bxyu-nvidia in #474
pyproject typos and grammar fixes by @ahmadki in #473
Miscellaneous infra improvements/fixes by @pjin-nvidia in #317
Expose server host and port in dataset viewer CLI by @ahmadki in #476
Rename examples simple_weather and stateful_counter by @fsiino-nvidia in #479
More single tool call filename updates by @fsiino-nvidia in #480
docs: Fix wrong count vs actual by @fsiino-nvidia in #482
Fix duplicate reference sections by @bxyu-nvidia in #483
docs: home pg, quickstart move, gh icon by @lbliii in #463
More single tool call filename updates cont by @fsiino-nvidia in #484
Fix NeMo Gym Pyproject links by @bxyu-nvidia in #486
docs: move FAQ by @lbliii in #489
docs: contribute section by @lbliii in #490
Misc rollout fixes by @pjin-nvidia in #447
improve framing of training framework integration guide for contributing by @cwing-nvidia in #493
Docs: Contribution Home & Dev Setup by @cwing-nvidia in #494
Add environment contribution docs by @cwing-nvidia in #498
FAQ cleanup by @cwing-nvidia in #499
Simplify contributing.md by @cwing-nvidia in #500
Reorder README structure by @cwing-nvidia in #501
docs: End-to-end GRPO Training with NeMo RL tutorial [master branch] by @bxyu-nvidia in #481
Update dataset configs with HuggingFace links by @bxyu-nvidia in #508
Change to v0.1.1 release version by @bxyu-nvidia in #509

New Contributors

@shashank3959 made their first contribution in #355
@hwolff99 made their first contribution in #377
@ahmadki made their first contribution in #414
@ffrujeri made their first contribution in #401
@sanjaykariyappa made their first contribution in #297

Full Changelog: v0.1.0...v0.1.1

Contributors

ffrujeri, sanjaykariyappa, and 12 other contributors


15 Nov 01:06

bxyu-nvidia

v0.1.0

3ac9f35

v0.1.0

What's Changed

Add copy-pr-bot by @chtruong814 in #1
Add initial repo template by @chtruong814 in #2
Update GitHub with Gitlab main by @bxyu-nvidia in #3
Alias as Penguin by @bxyu-nvidia in #4
Add Copyright docs README FAQ by @bxyu-nvidia in #7
Dapo17k by @bxyu-nvidia in #6
Fix docs build failures by @bxyu-nvidia in #8
Fix docs by @bxyu-nvidia in #10
Improve Github SSH Key setup docs by @bxyu-nvidia in #12
Comp-Coding Verifier by @kbhardwaj-nvidia in #5
Dataset viewer simple aggregations by @fsiino-nvidia in #9
VLLMModel docs in main Readme by @bxyu-nvidia in #13
Fix agent name in docs by @bxyu-nvidia in #15
VLLMModel propagates token IDs by @bxyu-nvidia in #11
VLLMModel tokenize params cleanup by @bxyu-nvidia in #21
Update Comp-Coding README.md by @kbhardwaj-nvidia in #26
Docs improvements - remove Why NeMo Gym section and add CI/CD tests info by @bxyu-nvidia in #27
update server logging format to be more consistent by @cmunley1 in #22
update readmes from ng_collect_traj to ng_collect_rollouts by @cmunley1 in #25
Simple agent stop criteria requires no tool calls AND output message item to be present by @bxyu-nvidia in #19
Server spinup polling by @bxyu-nvidia in #31
Rename top-level config key 'openai_model' => 'policy_model' by @pjin-nvidia in #33
Simple agent allows non-json tool responses by @bxyu-nvidia in #35
Multi-verifier docs by @bxyu-nvidia in #36
Servers have easy hooks into individual instances via session by @bxyu-nvidia in #24
Add Math Stack Overflow dataset by @damon-mosk-aoyama-nvidia in #42
Add Workbench validation dataset by @bxyu-nvidia in #46
Docs update by @bxyu-nvidia in #47
Implements LLM-as-Judge for Response Equivalence by @soares-f in #16
Configure global httpx client by @pjin-nvidia in #50
Fix OpenAI ResponseReasoningItem.status property by @bxyu-nvidia in #54
VLLMModel data parallel; explicit RunHelper shutdown handle by @bxyu-nvidia in #52
removed simple_agent_stateful, uses fastapi to keep track of session by @RahulSChand in #44
Migrate text_based_game: sudoku and game agent features by @RahulSChand in #30
Revert "Migrate text_based_game: sudoku and game agent features" by @bxyu-nvidia in #65
Add data aggregations to data preparation by @fsiino-nvidia in #49
Instantiate one httpx async client per unique connection / base url by @bxyu-nvidia in #75
Swap async http backend from httpx to aiohttp; various server infra improvements by @bxyu-nvidia in #77
Remove unnecessary GHA CI and add uv config to enable dependency scanning by @chtruong814 in #66
VLLMModel fix whitespace stripping and unwarranted spaces by @bxyu-nvidia in #70
Fix aggregation rounding in ng_prepare_data by @fsiino-nvidia in #76
Add profiling; improve rollout collection usability and efficiency; add uvicorn logging filtering by @bxyu-nvidia in #79
Delete .github/ISSUE_TEMPLATE directory by @pablo-garay in #87
Add support for num_repeats by @MahanFathi in #99
Comp coding fixes; lots of misc infra items by @bxyu-nvidia in #90
chore: Update cherry-pick workflow to use v0.63.0 by @pablo-garay in #108
Make Workbench stateful and sign commits by @abhibha-nvidia in #110
Clean deprecated Comp coding by @bxyu-nvidia in #106
Bxyu/misc infra 20251001 by @bxyu-nvidia in #116
Resource Server Organization by @fsiino-nvidia in #80
Add metrics conflict error FAQ to Readme by @fsiino-nvidia in #93
Azure OpenAI model support by @bxyu-nvidia in #112
Use python env for precommit hook; alter files trigger by @fsiino-nvidia in #125
Update issue templates by @bxyu-nvidia in #152
Add back Nemo Framework templates by @bxyu-nvidia in #153
Fix Workbench invalid function name by @bxyu-nvidia in #167
VLLMModel enable reasoning parsing by @bxyu-nvidia in #129
Add Attributions for Third Party Softwares by @banghuaz-nvidia in #154
Fix infinite OpenAI endpoint query; misc improvements by @bxyu-nvidia in #171
docs: Add Tutorial 00 - Key Terminology by @cwing-nvidia in #180
docs: Add tutorial README with learning path structure by @cwing-nvidia in #177
Redirect main Gym readme to Tutorials by @bxyu-nvidia in #201
docs: Add Tutorial 01 - Understanding Core Concepts by @cwing-nvidia in #181
docs: Add Tutorial 09 - Configuration Management by @cwing-nvidia in #183
Add CODE_OF_CONDUCT.md for community guidelines by @cwing-nvidia in #148
Add SECURITY.md with NVIDIA security policy by @cwing-nvidia in #149
Make metrics conflict criteria less strict by @fsiino-nvidia in #150
Move tutorials to docs by @bxyu-nvidia in #205
docs: Replace README with improved version by @cwing-nvidia in #192
Large docs improvement PR from @cwing-nvidia by @bxyu-nvidia in #208
Add back How-To's and FAQs by @bxyu-nvidia in #209
Docs fixes by @bxyu-nvidia in #210
Improve CONTRIBUTING.md by @cwing-nvidia in #151
feat (OpenQA): Add OpenQA support with per-record regex and rescue features by @psgundecha-nv in #155
feat(mcqa): Add custom answer extraction via template_metadata to support STEM MCQA dataset by @psgundecha-nv in #128
Add README to docs folder by @bxyu-nvidia in #216
Ray comp coding infra by @sdevare-nv in #195
Misc docs fixes by @bxyu-nvidia in #218
CLI help and command help; misc improvements by @bxyu-nvidia in #229
Misc infra 20251024 by @bxyu-nvidia in #234
Fix ray version mismatch by @sdevare-nv in #231
Misc fixes 20251027 by @bxyu-nvidia in #243
Validate server port selection by @fsiino-nvidia in #233
bxyu/misc-infra-20251027-001 by @bxyu-nvidia in #247
Fix input assistant messages by @bxyu-nvidia in #248
Misc infra 20251028 002 by @bxyu-nvidia in #253
Structured Outputs JSON Environment by @jkyi-nvidia in #251
Bump OpenAI version to 2.6.1; improve dependency constraint resolution by @bxyu-nvidia in #255
Update missing header and attributions by @banghuaz-nvidia in #237
Misc infra 20251031 by @bxyu-nvidia in #263
Update math dataset examples and metrics by @damon-mosk-aoyama-nvidia in #265
Misc infra 20251101 by @bxyu-nvidia in #267
Almost-server detection and reporting by @fsiino-nvidia in #249
Miniswe env by @sdevare-nv in #241
Differentiate Example-only and Training Resource Servers...