Commit 302fd39
load/store outer optimizer state dict (#277)
Summary:
We currently don't restore the outer optimizer state, which can lead to bumps in the loss because a new replica starts with a high learning rate. Save and load the outer optimizer state as part of the DiLoCo-specific state dict.
Pull Request resolved: #277
Reviewed By: d4l3k
Differential Revision: D83512078
fbshipit-source-id: 07c3ca7f4830f2115c3a4586d93c6d0883a38660
1 parent 6393e6d, commit 302fd39
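The effect described in the summary can be illustrated with a small, self-contained sketch. This is not the torchft implementation: `ToyOuterSGD`, `save_fragment`, `load_fragment`, and the state-dict keys `original_parameters` and `outer_optimizer` are hypothetical names chosen for illustration, and the learning-rate decay is simulated by hand. The sketch only shows why a recovering replica that restores parameters without the outer optimizer state can take a different (larger) outer step than its healthy peers.

```python
import copy


class ToyOuterSGD:
    """Hypothetical stand-in for an outer optimizer: SGD with momentum on a list of floats."""

    def __init__(self, lr, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.buf = None  # momentum buffer, created on the first step

    def step(self, params, pseudo_grads):
        if self.buf is None:
            self.buf = [0.0] * len(params)
        self.buf = [self.momentum * b + g for b, g in zip(self.buf, pseudo_grads)]
        return [p - self.lr * b for p, b in zip(params, self.buf)]

    def state_dict(self):
        return {"lr": self.lr, "momentum": self.momentum, "buf": copy.deepcopy(self.buf)}

    def load_state_dict(self, state):
        self.lr = state["lr"]
        self.momentum = state["momentum"]
        self.buf = copy.deepcopy(state["buf"])


def save_fragment(params, outer_opt):
    # Assumed layout: parameters and outer optimizer state travel in one dict.
    return {"original_parameters": list(params), "outer_optimizer": outer_opt.state_dict()}


def load_fragment(state, outer_opt):
    # Restore the optimizer state first, then hand back a copy of the parameters.
    outer_opt.load_state_dict(state["outer_optimizer"])
    return list(state["original_parameters"])


# Healthy replica: one outer step taken, then the learning rate is lowered by hand
# to imitate a schedule that has already decayed it.
healthy_opt = ToyOuterSGD(lr=0.7)
params = healthy_opt.step([1.0, -2.0], [0.5, -0.5])
healthy_opt.lr = 0.1
checkpoint = save_fragment(params, healthy_opt)

# Recovering replica that restores only the parameters: fresh optimizer, initial lr.
stale_opt = ToyOuterSGD(lr=0.7)
stale_params = list(checkpoint["original_parameters"])

# Recovering replica that also restores the outer optimizer state.
restored_opt = ToyOuterSGD(lr=0.7)
restored_params = load_fragment(checkpoint, restored_opt)

# All three replicas now apply the same pseudo-gradient.
grads = [0.2, 0.2]
next_healthy = healthy_opt.step(params, grads)
next_restored = restored_opt.step(restored_params, grads)
next_stale = stale_opt.step(stale_params, grads)

print("restored matches healthy:", next_restored == next_healthy)
print("stale matches healthy:", next_stale == next_healthy)
```

In this toy setup, the replica that loads the optimizer state stays in lockstep with the healthy one, while the replica that skips it steps with the initial learning rate and an empty momentum buffer.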
File tree
- torchft
- _test

4 files changed, +33 -30 lines changed
| File (name not captured) | Hunk | Lines added | Lines removed |
|---|---|---|---|
| 1 | @@ -227,10 +227,6 @@ | 0 | 4 |
| 1 | @@ -244,10 +240,6 @@ | 0 | 4 |
| 2 | @@ -221,12 +221,18 @@ | 8 | 2 |
| 3 | @@ -259,16 +259,21 @@ | 10 | 5 |
| 4 | @@ -140,29 +140,29 @@ | 15 | 15 |