Commit 99046ec

Mock out idle sleeps in orchestration tests (~9x faster)
The ax/orchestration test suite spent ~400s of wall time almost entirely sleeping, not computing. Two sources of pure idle time are removed via a shared _mock_orchestrator_poll_sleep() helper called from both test classes' setUp:

1. The orchestrator polling loop (ax.orchestration.orchestrator.sleep). Many tests leave init_seconds_between_polls / min_seconds_before_poll at their non-zero defaults. Loop termination depends on total_seconds_elapsed (which accumulates the configured interval regardless of actual sleeping), so removing the wait is behavior-preserving.

2. The exponential backoff between DB-save retries in retry_on_exception (initial_wait_seconds=5 -> 5s + 10s = 15s for test_suppress_all_storage_errors). Only sleep is mocked, via a wraps-ed copy of the time module, so the retry count and all other time.* functions are untouched and trial-TTL tests that rely on real time.sleep still work.

Runtime: 401.6s -> 42.9s (single-threaded), all 164 tests still pass.
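The claim that removing the polling wait is behavior-preserving can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not the Ax orchestrator: `poll_until_timeout` and its parameters are invented for illustration, and it assumes only what the commit message states, namely that elapsed time is accumulated from the configured interval rather than read from a clock.

```python
import time
from unittest.mock import patch


def poll_until_timeout(seconds_between_polls: float, timeout_seconds: float) -> int:
    """Hypothetical polling loop; returns the number of polls performed."""
    total_seconds_elapsed = 0.0
    polls = 0
    while total_seconds_elapsed < timeout_seconds:
        time.sleep(seconds_between_polls)  # the only wall-clock cost
        # Elapsed time is accumulated from the configured interval,
        # not measured from a clock, so it does not depend on sleep().
        total_seconds_elapsed += seconds_between_polls
        polls += 1
    return polls


# With sleep patched out, the loop performs the same number of polls
# it would have performed with real sleeping, but returns immediately.
with patch("time.sleep") as fake_sleep:
    polls = poll_until_timeout(seconds_between_polls=1.0, timeout_seconds=5.0)
```

If the loop instead compared against a real clock reading, patching out the sleep would turn it into a busy loop and the argument above would not hold.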
1 parent b3ddd71 commit 99046ec

1 file changed

Lines changed: 31 additions & 0 deletions

ax/orchestration/tests/test_orchestrator.py

@@ -234,6 +234,36 @@ def setUp(self) -> None:
             ]
         )
         self.orchestrator_options_kwargs = {}
+        self._mock_orchestrator_poll_sleep()
+
+    def _mock_orchestrator_poll_sleep(self) -> None:
+        """Patch out wall-clock sleeps that only slow tests down.
+
+        Two sources of pure idle time are removed:
+
+        1. The orchestrator's polling loop (``ax.orchestration.orchestrator.sleep``).
+           Many tests leave ``init_seconds_between_polls`` / ``min_seconds_before_poll``
+           at their (non-zero) defaults, so the loop spends most of its wall-clock time
+           sleeping. Loop termination depends on ``total_seconds_elapsed`` (which
+           accumulates the configured interval regardless of actual sleeping), so
+           removing the wait is behavior-preserving.
+        2. The exponential backoff between DB-save retries in
+           ``retry_on_exception`` (``initial_wait_seconds=5`` → 5s + 10s = 15s for
+           ``test_suppress_all_storage_errors``). Only the ``sleep`` is mocked; the
+           retry count and all other ``time`` functions are untouched, so retry
+           assertions still hold. A ``wraps``-ed copy of the ``time`` module is used so
+           that tests relying on real elapsed time (e.g. trial-TTL expiry via
+           ``time.sleep``) are unaffected.
+        """
+        poll_patcher = patch("ax.orchestration.orchestrator.sleep")
+        self.addCleanup(poll_patcher.stop)
+        poll_patcher.start()
+
+        fake_time = Mock(wraps=time)
+        fake_time.sleep = Mock()
+        retry_patcher = patch("ax.utils.common.executils.time", fake_time)
+        self.addCleanup(retry_patcher.stop)
+        retry_patcher.start()
 
     @property
     def runner_registry(self) -> dict[type[Runner], int]:
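
The ``Mock(wraps=time)`` pattern used in the helper can be tried in isolation. The sketch below is not Ax code: ``retry_module`` and ``wait_then_timestamp`` are hypothetical stand-ins for a module that holds its own reference to ``time``. It shows that replacing that one reference silences ``sleep`` while every other ``time`` function still delegates to the real module, and that code outside the patched module keeps the real ``time``.

```python
import time
import types
from unittest.mock import Mock, patch

# Hypothetical stand-in for a module that did ``import time`` at top level.
retry_module = types.ModuleType("retry_module")
retry_module.time = time


def wait_then_timestamp(seconds: float) -> float:
    """Sleeps via the module's own ``time`` reference, then reads a clock."""
    retry_module.time.sleep(seconds)
    return retry_module.time.monotonic()


fake_time = Mock(wraps=time)  # unknown attributes delegate to the real module
fake_time.sleep = Mock()      # an explicit attribute replaces the wrapped sleep

with patch.object(retry_module, "time", fake_time):
    start = time.monotonic()           # the caller still uses the real module
    stamp = wait_then_timestamp(60.0)  # returns without waiting 60 seconds
    waited = time.monotonic() - start
```

Because only the name inside the patched module is replaced, a test that calls ``time.sleep`` directly is unaffected by the patch.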
@@ -3120,6 +3150,7 @@ def setUp(self) -> None:
         self.orchestrator_options_kwargs: dict[str, str | None] = {
             "mt_experiment_trial_type": "type1"
         }
+        self._mock_orchestrator_poll_sleep()
 
     def test_init_with_no_impl_with_runner(self) -> None:
         self.branin_experiment_no_impl_runner_or_metrics.update_runner(
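
The 15s figure quoted for ``test_suppress_all_storage_errors`` can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions, not the ``retry_on_exception`` implementation: ``backoff_waits`` is a hypothetical helper, and both the doubling factor and the count of three attempts are inferred from the "5s + 10s" in the commit message.

```python
def backoff_waits(
    initial_wait_seconds: float, attempts: int, factor: float = 2.0
) -> list[float]:
    """Waits between consecutive attempts, assuming each wait doubles."""
    # There is one wait between each pair of attempts: attempts - 1 in total.
    return [initial_wait_seconds * factor**i for i in range(attempts - 1)]


# Assumed: three attempts, so two waits separate them.
waits = backoff_waits(initial_wait_seconds=5, attempts=3)
total_wait = sum(waits)  # 5.0 + 10.0 = 15.0 seconds of pure sleeping
```

Mocking only the sleep leaves the number of attempts unchanged, which is why assertions on the retry count are unaffected.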
