Commit d4f9a2c
Fix unit-test hang: use spawn (not fork) for multiprocess jobs
A child forked from the long-lived pytest process inherits locks held by
background threads (OpenMP / torch intra-op pools) but not the threads that
would release them, so the child deadlocks (e.g. in dist.init_process_group)
and the job hangs. Revert spawn_multiprocess_job to the spawn start method;
the world_size reduction remains the speedup.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent a694e89 commit d4f9a2c
1 file changed
Lines changed: 2 additions & 11 deletions
Diff (line content not preserved): one hunk, @@ -53,18 +53,9 @@. Original lines 56-62 are replaced by new line 56, and original lines 64-67 by new line 58; the remaining lines in the hunk are unchanged context.