Pull requests: NVIDIA-NeMo/Automodel

Pull requests list

fix: Change attn_impl to eager for gemma4 TP, PP configs
#2199 opened May 8, 2026 by athitten (Contributor)
3 tasks
fix(pp): preserve VLM forward when class opts in via _pp_keep_self_forward [community-request, waiting-on-customer]
#2192 opened May 8, 2026 by khazic (Contributor)
3 of 4 tasks
feat(deepseek-v4): add Multi-Token Prediction (MTP) training support [community-request]
#2191 opened May 8, 2026 by khazic (Contributor)
3 of 4 tasks
feat(ux): add quickpath invocation
#2189 opened May 8, 2026 by akoumpa (Contributor), Draft
3 tasks
fix(nemotron-v3): support THD with input_embeds instead of input_ids
#2185 opened May 7, 2026 by pzelasko (Contributor)
3 tasks done
ci(diffusion): remove local_dir and post process directly on cache
#2182 opened May 7, 2026 by thomasdhc (Contributor)
3 tasks
fix(vlm): fail loudly in PP chunker when pixel_values cannot be aligned [community-request, waiting-on-maintainers]
#2181 opened May 7, 2026 by khazic (Contributor)
2 tasks done
fix(vlm): ceil-divide PP chunker so trailing samples are not dropped [community-request, waiting-on-maintainers]
#2180 opened May 7, 2026 by khazic (Contributor)
2 of 3 tasks
fix(vlm): chunk video inputs for pipeline parallelism [community-request, waiting-on-maintainers]
#2177 opened May 7, 2026 by khazic (Contributor)
2 of 3 tasks
fix(tests): clean up sys.modules pollution in training fixtures [community-request, waiting-on-maintainers]
#2168 opened May 7, 2026 by rob-luke
3 tasks done
feat(nemotron-v3): add Multi-Token Prediction (MTP) support
#2161 opened May 6, 2026 by adil-a (Collaborator)
6 tasks done
feat: Add TE+CP support for gemma4 26b MoE
#2155 opened May 6, 2026 by athitten (Contributor), Draft
3 tasks
fix(gpt_oss): free quantized expert tensors per-layer to reduce peak memory [community-request, waiting-on-customer]
#2149 opened May 6, 2026 by stanley1208 (Contributor)
ci: Update transformers to latest version 5.8.0
#2148 opened May 6, 2026 by svcnvidia-nemo-ci (Contributor)
fix(infra): keep model.to(device) on unsharded post-shard load
#2146 opened May 6, 2026 by HuiyingLi (Contributor)
3 tasks
flux2 init draft
#2145 opened May 6, 2026 by linnanwang (Contributor), Draft
3 tasks
docs: release notes [docs-only]
#2141 opened May 5, 2026 by akoumpa (Contributor)
3 tasks
docs: add bump-dependency skill for shepherding dependency PRs to green [docs-only, documentation]
#2130 opened May 5, 2026 by ko3n1g (Contributor)
ci: Major refactor of release-workflows
#2127 opened May 5, 2026 by ko3n1g (Contributor)
2 of 3 tasks
refactor: Remove separate moe_mesh references [community-request, waiting-on-customer]
#2123 opened May 4, 2026 by edjson (Contributor)
2 of 3 tasks
ci: align CUDA 13.2 / cu130 toolchain for TE 2.14.1 bump
#2121 opened May 4, 2026 by thomasdhc (Contributor)
3 tasks