Pull requests: patrick-toulme/axlearn

Pull requests list

neuron changes for 1B,3B,8B models
#45 opened Jan 2, 2025 by aws-mengchiy
skip previous trained batches
#43 opened Dec 20, 2024 by aws-zhenguo
skip previous trained batches
#42 opened Dec 20, 2024 by aws-zhenguo
imported os and added ckpt scripts
#41 opened Dec 19, 2024 by dgourab-aws
Use default remat policy
#36 opened Dec 13, 2024 by apoorvtintin
resume training with next batch of data
#32 opened Dec 11, 2024 by aws-zhenguo
logit_bias support for NEW_UNSHARDED_ATTN_KERNEL
#24 opened Dec 4, 2024 by HahTK
Jit cache
#23 opened Nov 26, 2024 by amithrm
Multi graph gradient accumulation
#5 opened Apr 2, 2024 by apoorvtintin
gradient accumulation using optax multisteps*
#2 opened Mar 25, 2024 by apoorvtintin