Commit 8216ac4

(torchx/local_scheduler) Use os.kill instead of os.killpg when sending SIGTERM to the replica pid. Add runner.wait() to torchx.runner.test.api_test#test_empty_session_id so that the test waits for the replicas to finish running.

Differential Revision: D74197282
Pull Request resolved: #1062
1 parent 83a2765 commit 8216ac4

2 files changed: +2 -1 lines changed

torchx/runner/test/api_test.py (+1)

@@ -153,6 +153,7 @@ def test_empty_session_id(self, _: MagicMock) -> None:
         )
 
         app_handle = runner.run(app, "local", self.cfg)
+        runner.wait(app_handle, wait_interval=0.1)
 
         scheduler, session_name, app_id = parse_app_handle(app_handle)
         self.assertEqual(scheduler, "local")
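
The added runner.wait(app_handle, wait_interval=0.1) call makes the test block until the replicas finish before it goes on to its assertions. As a rough illustration of that wait-with-interval pattern, here is a minimal sketch that polls a plain subprocess. It is not the torchx implementation: wait_for_exit and its timeout parameter are hypothetical names introduced only for this example, and the child process is a stand-in for a replica.

```python
import subprocess
import sys
import time
from typing import Optional


def wait_for_exit(
    proc: subprocess.Popen,
    wait_interval: float = 0.1,
    timeout: float = 30.0,
) -> Optional[int]:
    """Poll proc every wait_interval seconds until it exits.

    Hypothetical helper for illustration only.
    Returns the exit code, or None if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        code = proc.poll()  # None while the process is still running
        if code is not None:
            return code
        time.sleep(wait_interval)
    # One last check so an exit right at the deadline is still reported.
    return proc.poll()


def run_and_wait() -> Optional[int]:
    # Stand-in for a replica: a short-lived child interpreter.
    proc = subprocess.Popen([sys.executable, "-c", "pass"])
    return wait_for_exit(proc, wait_interval=0.1)
```

A short interval such as 0.1 seconds keeps a unit test fast, at the cost of a few extra wake-ups while the child is still running.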

torchx/schedulers/local_scheduler.py (+1 -1)

@@ -311,7 +311,7 @@ def terminate(self) -> None:
         """
         # safe to call terminate on a process that already died
         try:
-            os.killpg(self.proc.pid, signal.SIGTERM)
+            os.kill(self.proc.pid, signal.SIGTERM)
         except ProcessLookupError as e:
             log.debug(f"Process {self.proc.pid} already got terminated")
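
For context on the one-line change above: on POSIX, os.kill(pid, sig) signals a single process, whereas os.killpg(pgid, sig) signals every process in the group whose id is pgid. The sketch below, which assumes a POSIX system, shows the os.kill pattern together with the ProcessLookupError handling from the hunk. The names terminate_pid and demo are hypothetical, and the child process only stands in for a replica; none of this is taken from torchx.

```python
import os
import signal
import subprocess
import sys


def terminate_pid(pid: int) -> bool:
    """Send SIGTERM to exactly one process (hypothetical helper).

    Returns False if no process with that pid exists any more.
    """
    try:
        os.kill(pid, signal.SIGTERM)
        return True
    except ProcessLookupError:
        # Safe to call on a process that has already exited and been reaped.
        return False


def demo() -> tuple:
    # Stand-in for a replica: a child that would otherwise sleep for 30 s.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    # A child started without start_new_session stays in its parent's
    # process group, so its pid is not itself a process group id.
    same_group = os.getpgid(proc.pid) == os.getpgid(0)
    first = terminate_pid(proc.pid)      # process is alive: signal delivered
    returncode = proc.wait(timeout=10)   # reap the child
    second = terminate_pid(proc.pid)     # already gone: handled quietly
    return same_group, first, returncode, second
```

On POSIX, a Popen return code of -N means the child was terminated by signal N, so a child stopped by SIGTERM reports -signal.SIGTERM.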
