
fix one big and 2 small bugs in training #24

Open
bkal01 wants to merge 2 commits into main from bhavesh/fix-lr-scheduler

Conversation

Collaborator

bkal01 commented Mar 11, 2026

Steps were being computed incorrectly, so the LR scheduler was misconfigured: the entire llava-pretrain run would have stayed in warmup.
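A minimal sketch of the step-counting bug class described above, with made-up numbers (not the actual llava-pretrain config): with gradient accumulation, the optimizer and LR scheduler step once per accumulation window, so sizing warmup against micro-batches makes it far too long.

```python
import math

# Hypothetical training config, for illustration only.
num_samples = 10_000
per_device_batch_size = 8
grad_accum_steps = 4
num_epochs = 1
warmup_ratio = 0.03

# Micro-batches (forward/backward passes) per epoch.
micro_batches_per_epoch = math.ceil(num_samples / per_device_batch_size)

# The scheduler steps once per optimizer step, i.e. once every
# grad_accum_steps micro-batches -- this is the count that must be
# used to configure total steps and warmup.
optimizer_steps_per_epoch = math.ceil(micro_batches_per_epoch / grad_accum_steps)
total_optimizer_steps = optimizer_steps_per_epoch * num_epochs

# Sizing warmup against micro-batches instead would make it
# grad_accum_steps times too long and could swallow the whole run.
warmup_steps = int(warmup_ratio * total_optimizer_steps)
```

With these numbers, 1250 micro-batches collapse to 313 optimizer steps, and warmup is 9 steps rather than 37.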

Fixed some logging/saving logic to align with actual optimizer steps.
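A sketch of what "aligned with optimizer steps" means, assuming a standard gradient-accumulation loop; the names (`log_every`, the loop bounds) are illustrative, not from the repo.

```python
grad_accum_steps = 4
log_every = 10          # measured in optimizer steps, not micro-batches
optimizer_step = 0
logged_at = []

for micro_step in range(100):   # stand-in for the dataloader loop
    # loss.backward() would go here
    if (micro_step + 1) % grad_accum_steps == 0:
        # optimizer.step(); scheduler.step(); optimizer.zero_grad()
        optimizer_step += 1
        # Log/save inside this branch so the counters that gate
        # logging and checkpointing match the scheduler's step count.
        if optimizer_step % log_every == 0:
            logged_at.append(optimizer_step)
```

Gating on `micro_step` instead would log and checkpoint `grad_accum_steps` times more often than intended.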

Added a seed for reproducibility.
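A minimal seeding sketch. Only the stdlib RNG is shown so the snippet stays self-contained; in actual PyTorch training code one would also seed NumPy and torch (`torch.manual_seed`, `torch.cuda.manual_seed_all`).

```python
import random

def set_seed(seed: int) -> None:
    # Illustrative helper: re-seeding the RNG makes subsequent
    # draws deterministic, which is the point of the fix.
    random.seed(seed)

set_seed(42)
first = [random.random() for _ in range(3)]
set_seed(42)
second = [random.random() for _ in range(3)]
# The two sequences of draws are identical after re-seeding.
```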

Fixed the LR scheduler configuration when resuming training.
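A sketch of the resume issue, using a generic linear-warmup/cosine-decay schedule (not necessarily the one in this repo): on resume, the scheduler must be evaluated at the checkpoint's optimizer step, not at step 0, or training silently restarts warmup.

```python
import math

def lr_at(step: int, warmup: int, total: int, base_lr: float) -> float:
    # Linear warmup followed by cosine decay to zero.
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

warmup, total, base_lr = 100, 1000, 1e-3
resumed_step = 150   # hypothetical checkpointed optimizer step

# Buggy resume: scheduler rebuilt at step 0 -> tiny warmup LR.
fresh = lr_at(0, warmup, total, base_lr)
# Correct resume: scheduler evaluated at the checkpointed step.
resumed = lr_at(resumed_step, warmup, total, base_lr)
```

In PyTorch the same effect is achieved by restoring `scheduler.state_dict()` or constructing the scheduler with the checkpointed `last_epoch`.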


Swapped the order of torch.compile and gradient checkpointing so that torch.compile captures the final computational graph.
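A setup-order sketch of this fix, assuming a Hugging Face-style model exposing `gradient_checkpointing_enable()`; `build_model` is a hypothetical constructor, and the exact API depends on the model class.

```python
import torch

model = build_model()  # hypothetical model constructor

# Wrong order (the bug): compiling first traces a graph without
# checkpointing, then enabling it mutates modules behind the
# compiled graph's back:
#   model = torch.compile(model)
#   model.gradient_checkpointing_enable()

# Fixed order: apply gradient checkpointing first, then compile,
# so torch.compile sees the final computational graph.
model.gradient_checkpointing_enable()  # HF-style API; adjust per model
model = torch.compile(model)
```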
bkal01 force-pushed the bhavesh/fix-lr-scheduler branch from cd4b3ea to d29da76 on March 11, 2026 at 16:18
bkal01 marked this pull request as ready for review on March 11, 2026 at 16:19
bkal01 force-pushed the bhavesh/fix-lr-scheduler branch from bdc66c5 to 8fd683c on March 11, 2026 at 19:53