Pull requests: NVIDIA-NeMo/Automodel
fix: Change attn_impl to eager for gemma4 TP, PP configs
#2199
opened May 8, 2026 by
athitten
Contributor
3 tasks
docs(fern): scaffold Fern docs site mirroring published v0.4.0 sidebar
#2196
opened May 8, 2026 by
lbliii
7 tasks
fix(pp): preserve VLM forward when class opts in via _pp_keep_self_forward
community-request
waiting-on-customer
Waiting on the original author to respond
#2192
opened May 8, 2026 by
khazic
Contributor
3 of 4 tasks
feat(deepseek-v4): add Multi-Token Prediction (MTP) training support
community-request
#2191
opened May 8, 2026 by
khazic
Contributor
3 of 4 tasks
fix(nemotron-v3): support THD with input_embeds instead of input_ids
#2185
opened May 7, 2026 by
pzelasko
Contributor
3 tasks done
ci(diffusion): remove local_dir and post process directly on cache
#2182
opened May 7, 2026 by
thomasdhc
Contributor
3 tasks
fix(vlm): fail loudly in PP chunker when pixel_values cannot be aligned
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2181
opened May 7, 2026 by
khazic
Contributor
2 tasks done
fix(vlm): ceil-divide PP chunker so trailing samples are not dropped
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2180
opened May 7, 2026 by
khazic
Contributor
2 of 3 tasks
fix(vlm): chunk video inputs for pipeline parallelism
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2177
opened May 7, 2026 by
khazic
Contributor
2 of 3 tasks
fix(tests): clean up sys.modules pollution in training fixtures
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2168
opened May 7, 2026 by
rob-luke
3 tasks done
feat(nemotron-v3): add Multi-Token Prediction (MTP) support
#2161
opened May 6, 2026 by
adil-a
Collaborator
6 tasks done
fix(gpt_oss): free quantized expert tensors per-layer to reduce peak memory
community-request
waiting-on-customer
Waiting on the original author to respond
#2149
opened May 6, 2026 by
stanley1208
Contributor
ci: Update transformers to latest version 5.8.0
#2148
opened May 6, 2026 by
svcnvidia-nemo-ci
Contributor
fix(qwen3_5): preserve packed-sample boundaries in GatedDeltaNet
#2147
opened May 6, 2026 by
HuiyingLi
Contributor
fix(infra): keep model.to(device) on unsharded post-shard load
#2146
opened May 6, 2026 by
HuiyingLi
Contributor
3 tasks
docs: release notes
docs-only
With great power comes great responsibility.
#2141
opened May 5, 2026 by
akoumpa
Contributor
3 tasks
docs: add bump-dependency skill for shepherding dependency PRs to green
docs-only
With great power comes great responsibility.
documentation
Improvements or additions to documentation
#2130
opened May 5, 2026 by
ko3n1g
Contributor
ci: Major refactor of release-workflows
#2127
opened May 5, 2026 by
ko3n1g
Contributor
2 of 3 tasks
refactor: Remove separate moe_mesh references
community-request
waiting-on-customer
Waiting on the original author to respond
#2123
opened May 4, 2026 by
edjson
Contributor
2 of 3 tasks
ci: align CUDA 13.2 / cu130 toolchain for TE 2.14.1 bump
#2121
opened May 4, 2026 by
thomasdhc
Contributor
3 tasks