[megatron] feat: LoRA adapter only refit (TensorLoRARequest) #4632
Conversation
Code Review
This pull request introduces a significant feature for LoRA adapter-only weight updates in the Megatron backend, which avoids the overhead of merging adapters into the base model for each weight synchronization. The changes are well-structured and include updates to documentation, configuration handling for LoRA, new PEFT utility functions for vLLM compatibility, and refactored weight export logic. The implementation appears solid and aligns with the stated goals. I have a couple of suggestions to enhance code maintainability and clarity regarding duplicated logic and a confusing condition.
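For context on the overhead the review refers to, here is a minimal sketch (not code from this PR) of the standard LoRA composition, W' = W + (alpha / r) · B · A, which a merged refit would have to materialize and transfer on every weight sync, versus the two small factor matrices an adapter-only refit ships; all shapes and names below are illustrative:

```python
import torch

# Standard LoRA composition: W' = W + (alpha / r) * (B @ A).
out_dim, in_dim, r, alpha = 4096, 4096, 8, 16
W = torch.randn(out_dim, in_dim)   # frozen base weight
A = torch.randn(r, in_dim)         # trained adapter factor
B = torch.randn(out_dim, r)        # trained adapter factor

# Merged refit: compute and transfer the full (out_dim, in_dim) matrix per sync.
W_merged = W + (alpha / r) * (B @ A)

# Adapter-only refit: transfer just A and B.
adapter_payload = {"lora_A": A, "lora_B": B}

print(W_merged.numel())                                  # 16_777_216 elements
print(sum(t.numel() for t in adapter_payload.values()))  # 65_536 elements
```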
Code Review
This pull request introduces a valuable feature for LoRA adapter-only refitting in the Megatron backend, which should provide significant performance benefits. The changes across documentation, configuration, and worker implementations appear to correctly support both merging LoRA adapters and loading them separately. My primary concern, detailed in a specific comment, is the repeated implementation of LoRA configuration logic across several files. Addressing this will improve the long-term maintainability of the codebase.
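To make the maintainability point concrete, the duplicated per-worker LoRA configuration could be funneled through a single shared helper; everything below (names, fields, the flag) is a hypothetical illustration of that suggestion, not code from the diff:

```python
# Hypothetical helper centralizing LoRA settings that would otherwise be
# re-derived in each worker file; all names and fields here are illustrative.
def build_lora_kwargs(cfg) -> dict:
    return {
        "r": cfg.lora_rank,
        "lora_alpha": cfg.lora_alpha,
        "target_modules": list(cfg.target_modules),
        # Hypothetical switch between merging adapters into the base model
        # and shipping them separately at refit time.
        "merge_adapters": not cfg.adapter_only_refit,
    }
```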
What does this PR do?
Waiting for:
Checklist Before Starting
- Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`, like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR involves a breaking change, prepend `[BREAKING]` to the beginning of the title, e.g., `[BREAKING][fsdp, megatron] feat: dynamic batching`

Test
API and Usage Example
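As an illustration of how a `TensorLoRARequest`-style adapter-only refit could look, here is a minimal sketch assuming (per the PR title) a request object that carries the PEFT config and the LoRA weights as in-memory tensors; the class body, field names, and state-dict keys below are illustrative assumptions, not this PR's actual API:

```python
from dataclasses import dataclass, field

import torch

# Hypothetical mirror of a TensorLoRARequest: in verl it would extend vLLM's
# LoRARequest so the rollout engine can load LoRA weights from in-memory
# tensors instead of a checkpoint path on disk.
@dataclass
class TensorLoRARequest:
    lora_name: str
    lora_int_id: int
    peft_config: dict = field(default_factory=dict)   # e.g. {"r": 8, "lora_alpha": 16}
    lora_tensors: dict = field(default_factory=dict)  # param name -> torch.Tensor

# During refit, only the adapter tensors are gathered from the trainer's
# state dict -- the frozen base weights never move.
trainer_state = {
    "layers.0.attn.q_proj.weight": torch.randn(64, 64),        # base, skipped
    "layers.0.attn.q_proj.lora_A.weight": torch.randn(8, 64),  # adapter, sent
    "layers.0.attn.q_proj.lora_B.weight": torch.randn(64, 8),  # adapter, sent
}
request = TensorLoRARequest(
    lora_name="actor_adapter",
    lora_int_id=1,
    peft_config={"r": 8, "lora_alpha": 16},
    lora_tensors={k: v for k, v in trainer_state.items() if ".lora_" in k},
)
print(sorted(request.lora_tensors))
```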
Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review; otherwise, the reviewer might deprioritize this PR.
- Apply pre-commit checks: `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- Once your PR is ready for CI, send a message in the `ci-request` channel in the `verl` Slack workspace. (If not accessible, please try the Feishu group (飞书群).)