Add utility for graceful Launchpad termination with external schedulers #352

natinew77-creator · 2025-12-08T00:25:07Z

Summary

Fixes #311

When using Acme distributed experiments with external schedulers like Ray Tune's ASHA scheduler, the scheduler may terminate trials early. However, the Launchpad processes spawned by the experiment are not automatically terminated, leaving orphan processes running.

Problem

As described in #311, when Ray Tune's ASHA scheduler terminates a trial, the mp.Process running the Launchpad program is killed, but the child processes spawned by Launchpad continue running as orphans. This happens because the termination signal is not forwarded to the Launchpad processes.

Solution

Added two new utilities to acme/utils/lp_utils.py:

1. `LaunchpadProgramStopper` (Context Manager)

A context manager that registers signal handlers for `SIGTERM` and `SIGINT`. When either signal is received, it calls `lp.stop()` to gracefully terminate all Launchpad processes.
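The pattern described above can be sketched roughly as follows. This is not the code from this PR: `SignalStopper` and its `stop_fn` parameter are hypothetical names used for illustration, and the stop callback is injected so the sketch runs without Launchpad installed. In the real utility the callback would presumably be `lp.stop`.

```python
import signal
from typing import Callable, Sequence


class SignalStopper:
    """Hypothetical stand-in for LaunchpadProgramStopper (illustration only)."""

    def __init__(
        self,
        stop_fn: Callable[[], None],
        signals: Sequence[int] = (signal.SIGTERM, signal.SIGINT),
    ):
        self._stop_fn = stop_fn    # assumed to be lp.stop in the real utility
        self._signals = signals
        self._previous = {}        # signal number -> handler installed before us

    def _handle(self, signum, frame):
        del signum, frame          # unused
        self._stop_fn()

    def __enter__(self):
        # signal.signal() must be called from the main thread.
        for sig in self._signals:
            self._previous[sig] = signal.signal(sig, self._handle)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        for sig, old in self._previous.items():
            # A previous handler of None means it was not installed from
            # Python; this sketch falls back to the default disposition.
            signal.signal(sig, old if old is not None else signal.SIG_DFL)
        self._previous.clear()
        return False               # never swallow exceptions
```

Restoring the previous handlers on exit keeps the change scoped to the `with` block, so code that runs afterwards sees the same signal behavior as before.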

2. `launch_with_termination_handler()` (Convenience Function)

A wrapper around `lp.launch()` that automatically uses the `LaunchpadProgramStopper` context manager.
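A minimal sketch of such a wrapper, under stated assumptions: `stop_on_signals` and `launch_with_handler` are hypothetical names, and both the launch and stop callables are passed in so the example does not depend on Launchpad. The actual signature of `launch_with_termination_handler()` may differ.

```python
import contextlib
import signal
from typing import Any, Callable


@contextlib.contextmanager
def stop_on_signals(stop_fn: Callable[[], None],
                    signals=(signal.SIGTERM, signal.SIGINT)):
    """Calls stop_fn when one of `signals` arrives inside the block."""

    def handler(signum, frame):
        del signum, frame  # unused
        stop_fn()

    # Install the handler and remember what was there before.
    previous = {sig: signal.signal(sig, handler) for sig in signals}
    try:
        yield
    finally:
        for sig, old in previous.items():
            # None means the old handler was not set from Python.
            signal.signal(sig, old if old is not None else signal.SIG_DFL)


def launch_with_handler(launch_fn: Callable[..., Any],
                        stop_fn: Callable[[], None],
                        *args: Any, **kwargs: Any) -> Any:
    """Runs launch_fn(*args, **kwargs) with the stop handler installed."""
    with stop_on_signals(stop_fn):
        return launch_fn(*args, **kwargs)
```

With Launchpad, the call would presumably look like `launch_with_handler(lp.launch, lp.stop, program)`. Note that the handler is only active while `launch_fn` is executing, so this shape assumes the launch call blocks for the lifetime of the program.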

Usage

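from acme.jax import experiments
from acme.utils import lp_utils
from ray import tune
from ray.tune.schedulers import ASHAScheduler

def train_function(config):
    # build_experiment_config is the user's own helper that turns a Ray Tune
    # config into an Acme experiment config.
    experiment = build_experiment_config(config)
    program = experiments.make_distributed_experiment(
        experiment=experiment, num_actors=1)
    # Use the new utility instead of lp.launch()
    lp_utils.launch_with_termination_handler(program)

tuner = tune.Tuner(
    train_function,
    tune_config=tune.TuneConfig(scheduler=ASHAScheduler(...)),
)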

Or using the context manager directly:

import launchpad as lp
from acme.utils import lp_utils

with lp_utils.LaunchpadProgramStopper():
    lp.launch(program, lp.LaunchType.LOCAL_MULTI_PROCESSING)

Testing

Verified that the syntax is valid with `python3 -m py_compile`.

Follows the existing signal handling patterns used in `acme/utils/signals.py`.

When using Acme distributed experiments with external schedulers like Ray Tune's ASHA scheduler, the scheduler may terminate trials early. However, the Launchpad processes spawned by the experiment are not automatically terminated, leaving orphan processes running.

This commit adds:

1. LaunchpadProgramStopper: A context manager that registers signal handlers for SIGTERM and SIGINT. When these signals are received, it calls lp.stop() to gracefully terminate all Launchpad processes.
2. launch_with_termination_handler(): A convenience function that wraps lp.launch() with the LaunchpadProgramStopper context manager.

Example usage with Ray Tune:

def train_function(config):
    experiment = build_experiment_config(config)
    program = experiments.make_distributed_experiment(
        experiment=experiment, num_actors=1)
    launch_with_termination_handler(program)

tuner = tune.Tuner(
    train_function,
    tune_config=tune.TuneConfig(scheduler=ASHAScheduler(...)),
)

Fixes google-deepmind#311
