Actions: pytorch/torchft

Docs

385 workflow runs

use torchx for manual many replica (20+) tests (#75)
Docs #160: Commit 39a40b2 pushed by d4l3k
January 18, 2025 05:26 4m 33s main
use torchx for manual many replica (20+) tests
Docs #159: Pull request #75 synchronize by d4l3k
January 18, 2025 00:40 3m 23s d4l3k/torchx
process_group: wait for future_thread join before creating new one
Docs #158: Pull request #68 synchronize by dwancn
January 17, 2025 03:18 4m 18s dwancn:fix_pg_config
use torchx for manual many replica (20+) tests
Docs #157: Pull request #75 opened by d4l3k
January 16, 2025 22:51 3m 12s d4l3k/torchx
overhaul timeouts for Lighthouse, Manager, checkpoint server (#73)
Docs #156: Commit 3ee2360 pushed by d4l3k
January 16, 2025 19:05 3m 48s main
overhaul timeouts for Lighthouse, Manager, checkpoint server
Docs #155: Pull request #73 synchronize by d4l3k
January 16, 2025 18:54 3m 25s d4l3k/timeout_overhaul
Fix typo and use sampler in train_ddp.py (#74)
Docs #154: Commit 03160ee pushed by mreso
January 16, 2025 18:27 3m 18s main
Dont return quorum if requester isnt involved (#72)
Docs #153: Commit c58ed4c pushed by d4l3k
January 16, 2025 17:48 3m 40s main
Fix typo and use sampler in train_ddp.py
Docs #151: Pull request #74 opened by mreso
January 16, 2025 00:45 4m 5s mreso:fix/typos
overhaul timeouts for Lighthouse, Manager, checkpoint server
Docs #150: Pull request #73 synchronize by d4l3k
January 15, 2025 23:31 3m 15s d4l3k/timeout_overhaul
overhaul timeouts for Lighthouse, Manager, checkpoint server
Docs #149: Pull request #73 opened by d4l3k
January 15, 2025 19:06 3m 19s d4l3k/timeout_overhaul
lighthouse/quorum: avoid split brain and add shrink_only support (#71)
Docs #146: Commit 79572e6 pushed by d4l3k
January 15, 2025 00:01 3m 37s main
process_group: wait for future_thread join before creating new one
Docs #145: Pull request #68 synchronize by dwancn
January 14, 2025 08:10 3m 29s dwancn:fix_pg_config
lighthouse/quorum: avoid split brain and add shrink_only support
Docs #144: Pull request #71 opened by d4l3k
January 14, 2025 01:43 3m 24s d4l3k/shrink_only
lighthouse, manager: remove room support (#70)
Docs #143: Commit 97ad397 pushed by d4l3k
January 13, 2025 22:17 3m 38s main
lighthouse, manager: remove room support
Docs #142: Pull request #70 opened by d4l3k
January 13, 2025 21:51 3m 17s d4l3k/remove_room
feat: fix security warnings in torchft (#69)
Docs #141: Commit e0f76e1 pushed by d4l3k
January 13, 2025 21:49 3m 32s main
feat: fix security warnings in torchft
Docs #140: Pull request #69 opened by c-p-i-o
January 13, 2025 21:05 4m 27s cpio/fix_vuln
[lighthouse] detect unhealthy participants via heartbeats (#64)
Docs #138: Commit 2f97660 pushed by d4l3k
January 11, 2025 01:19 3m 37s main
[lighthouse] detect unhealthy participants via heartbeats
Docs #137: Pull request #64 synchronize by d4l3k
January 11, 2025 01:04 3m 40s d4l3k/quorum_heartbeats
[manager] fix address when binding to 0 (#67)
Docs #136: Commit 6b3665a pushed by d4l3k
January 10, 2025 21:16 3m 29s main