Commit 4d9bd5c
[megatron] feat: Share actor and ref in LoRA
For `compute_ref_log_prob`, the actor and the reference model can share
one set of weights: temporarily disable the LoRA layers for the forward
pass, since the base weights are frozen and only the LoRA layers are trained.
FSDP LoRA already supports this.
Signed-off-by: Hollow Man <[email protected]>
1 parent a090cd8 commit 4d9bd5c
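The idea in the commit message can be illustrated with a minimal, framework-free sketch. This is not the code from this commit, nor the Megatron or verl API: `ToyLoRALinear`, `disable_adapters`, and `forward_all` are hypothetical names, and scalars stand in for weight matrices. It only shows that running the same model with the adapters switched off reproduces the frozen base (reference) output, so no second copy of the model is needed.

```python
from contextlib import contextmanager


class ToyLoRALinear:
    """Hypothetical scalar 'layer': y = w*x, plus scale*b*a*x when the adapter is on."""

    def __init__(self, base_weight, lora_a, lora_b, scale=1.0):
        self.base_weight = base_weight  # frozen base weight (never trained)
        self.lora_a = lora_a            # trainable LoRA factor A
        self.lora_b = lora_b            # trainable LoRA factor B
        self.scale = scale
        self.adapter_enabled = True

    def forward(self, x):
        out = self.base_weight * x
        if self.adapter_enabled:
            out += self.scale * self.lora_b * self.lora_a * x
        return out


@contextmanager
def disable_adapters(layers):
    """Switch off every adapter inside the block, then restore the prior state."""
    previous = [layer.adapter_enabled for layer in layers]
    for layer in layers:
        layer.adapter_enabled = False
    try:
        yield
    finally:
        # Restore even if the forward pass raised.
        for layer, state in zip(layers, previous):
            layer.adapter_enabled = state


def forward_all(layers, x):
    """Run x through the layers in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


layers = [ToyLoRALinear(2.0, 0.5, 1.0), ToyLoRALinear(3.0, 1.0, 1.0)]

actor_out = forward_all(layers, 1.0)        # base + LoRA: the actor policy
with disable_adapters(layers):
    ref_out = forward_all(layers, 1.0)      # base only: the reference policy
actor_out_after = forward_all(layers, 1.0)  # adapters are active again
```

Restoring the previous state in a `finally` block means a failed reference forward pass cannot leave the actor running without its adapters.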
File tree
5 files changed, +41 -46 lines changed
- recipe
  - fully_async_policy
  - one_step_off_policy
  - transfer_queue
- verl
  - trainer/ppo
  - workers
Diff hunks by file, in page order (line ranges include unchanged context lines):

| File | Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|---|
| 1 | 1 | 80-86 | 80-89 | 4 | 1 |
| 2 | 1 | 37-47 | 37-43 | 1 | 5 |
| 2 | 2 | 54-62 | 50-56 | 1 | 3 |
| 2 | 3 | 120-126 | 114-123 | 4 | 1 |
| 3 | 1 | 48-58 | 48-54 | 1 | 5 |
| 3 | 2 | 64-96 | 60-75 | 4 | 21 |
| 3 | 3 | 401-407 | 380-389 | 4 | 1 |
| 4 | 1 | 341-350 | 341-350 | 4 | 4 |
| 5 | 1 | 32-37 | 32-39 | 2 | 0 |
| 5 | 2 | 816-821 | 818-827 | 4 | 0 |
| 5 | 3 | 842-851 | 848-860 | 6 | 3 |
| 5 | 4 | 854-862 | 863-875 | 6 | 2 |