Actions: pytorch/torchft

Unit Tests

357 workflow runs

make torchft work for llama3_8b 8x
Unit Tests #307: Pull request #104 opened by d4l3k
February 8, 2025 00:35 7m 3s d4l3k/fast_checkpoint
Refactors process_group_tests.py
Unit Tests #306: Pull request #103 synchronize by allenwang28
February 7, 2025 22:04 8m 57s allenwang28:pg_test_refactor
Refactors process_group_tests.py
Unit Tests #305: Pull request #103 synchronize by allenwang28
February 7, 2025 22:02 8m 55s allenwang28:pg_test_refactor
Refactors process_group_tests.py
Unit Tests #304: Pull request #103 synchronize by allenwang28
February 7, 2025 21:58 9m 4s allenwang28:pg_test_refactor
Refactors process_group_tests.py
Unit Tests #303: Pull request #103 synchronize by allenwang28
February 7, 2025 20:18 9m 22s allenwang28:pg_test_refactor
Refactors process_group_tests.py
Unit Tests #302: Pull request #103 synchronize by allenwang28
February 7, 2025 20:13 9m 11s allenwang28:pg_test_refactor
Refactors process_group_tests.py
Unit Tests #301: Pull request #103 opened by allenwang28
February 7, 2025 20:11 6m 58s allenwang28:pg_test_refactor
MonitoredQueue: fail fast when subprocess exits (#99)
Unit Tests #300: Commit 9533676 pushed by d4l3k
February 7, 2025 18:05 9m 11s main
MonitoredQueue: fail fast when subprocess exits
Unit Tests #299: Pull request #99 synchronize by d4l3k
February 7, 2025 01:00 9m 5s d4l3k/monitor_queue
Adds reduce_scatter into torchft
Unit Tests #298: Pull request #102 synchronize by allenwang28
February 6, 2025 22:46 9m 3s allenwang28:collectives
Adds reduce_scatter into torchft
Unit Tests #297: Pull request #102 opened by allenwang28
February 6, 2025 21:55 8m 15s allenwang28:collectives
Use streaming transfers
Unit Tests #296: Pull request #101 opened by d4l3k
February 5, 2025 23:14 6m 58s d4l3k/streaming_transfers
update quorum_ticks to use interval (#100)
Unit Tests #295: Commit 4d4d260 pushed by d4l3k
February 5, 2025 21:44 8m 50s main
update lighthouse quorum_ticks to use interval
Unit Tests #294: Pull request #100 opened by H-Huang
February 5, 2025 21:16 9m 26s H-Huang:diloco
MonitoredQueue: fail fast when subprocess exits
Unit Tests #293: Pull request #99 synchronize by d4l3k
February 5, 2025 20:05 9m 17s d4l3k/monitor_queue
README: fix logo font by using path instead of text (#98)
Unit Tests #292: Commit 0c4ccf9 pushed by d4l3k
February 5, 2025 19:47 8m 44s main
MonitoredQueue: fail fast when subprocess exits
Unit Tests #291: Pull request #99 opened by d4l3k
February 5, 2025 19:47 7m 8s d4l3k/monitor_queue
[easy] README: fix logo font by using path instead of text
Unit Tests #290: Pull request #98 opened by d4l3k
February 5, 2025 18:55 9m 26s d4l3k/fix_logo
Refactor local_sgd integration tests (#96)
Unit Tests #289: Commit 927f8b4 pushed by d4l3k
February 4, 2025 23:02 9m 39s main
Participants APIs should check if quorum is started (#95)
Unit Tests #288: Commit 118d1a2 pushed by d4l3k
February 4, 2025 23:01 9m 26s main
Refactor local_sgd integration tests
Unit Tests #287: Pull request #96 synchronize by H-Huang
February 4, 2025 20:34 8m 56s H-Huang:diloco
[WIP] FSDP example
Unit Tests #286: Pull request #77 synchronize by mreso
February 4, 2025 20:16 9m 22s mreso:fsdp_example
Refactor local_sgd integration tests
Unit Tests #285: Pull request #96 opened by H-Huang
February 4, 2025 19:58 8m 56s H-Huang:diloco
Participants APIs should check if quorum is started
Unit Tests #284: Pull request #95 opened by fegin
February 4, 2025 19:35 8m 59s chienchin/participant
manager: expose participating_rank (#94)
Unit Tests #283: Commit 87290f5 pushed by d4l3k
February 3, 2025 19:53 9m 11s main