Add Batch Sampler: coordinate parallel workers into jointly-selected batches #376
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,145 @@ | ||
| --- | ||
| author: Mark Shipman | ||
| title: Remote GP q-EI Sampler | ||
| description: Batch Bayesian optimisation via q-Expected Improvement, delegating GP fitting and candidate scoring to a remote HTTP service. | ||
| tags: [sampler, bayesian-optimization, batch] | ||
| optuna_versions: [4.8.0] | ||
| license: MIT License | ||
| --- | ||
|
|
||
| ## Abstract | ||
|
|
||
| `qEISampler` is a batch Bayesian optimisation sampler for Optuna. | ||
| Instead of fitting a Gaussian Process locally, it sends the current observations to a user-supplied HTTP endpoint that returns a batch of `q` candidates maximising the q-Expected Improvement (q-EI) acquisition function. | ||
|
|
||
| This design is useful when: | ||
|
|
||
| - GP fitting is too slow to run inside the Optuna worker process (large datasets, expensive kernels). | ||
| - You want to centralise the surrogate model on a GPU server or managed service (e.g. Modal, AWS Lambda, Cloud Run). | ||
| - You need to reuse the same GP service across multiple concurrent Optuna studies. | ||
|
|
||
| During the startup phase (fewer than `n_startup_trials` complete trials) the sampler falls back to random search automatically. | ||
|
|
||
| ## APIs | ||
|
|
||
| ### `DimSpec(name, type, low, high, log=False, step=None)` | ||
|
|
||
| Dataclass describing one dimension of the search space. | ||
|
|
||
| | Field | Type | Description | | ||
| | ------ | ---------------- | --------------------------------------------------------- | | ||
| | `name` | `str` | Parameter name (must match `trial.suggest_*` calls). | | ||
| | `type` | `"float"\|"int"` | Distribution family. | | ||
| | `low` | `float` | Lower bound (inclusive). | | ||
| | `high` | `float` | Upper bound (inclusive). | | ||
| | `log` | `bool` | Use log-uniform spacing. Default `False`. | | ||
| | `step` | `float\|None` | Grid step. Defaults to 1 for `int` dims; for `float` dims it is passed to `FloatDistribution` as given. | | ||
|
|
||
| ### `qEISampler(search_space, api_url, ...)` | ||
|
|
||
| | Argument | Type | Default | Description | | ||
| | ------------------ | --------------- | -------------- | ------------------------------------------------------------------------- | | ||
| | `search_space` | `list[DimSpec]` | — | **Required.** Dimensions of the optimisation problem. | | ||
| | `api_url` | `str` | — | **Required.** URL of the GP suggestion endpoint (see API contract below). | | ||
| | `n_startup_trials` | `int` | `8` | Random trials before GP is used. | | ||
| | `q` | `int` | `4` | Batch size — number of candidates requested per API call. | | ||
| | `n_candidates` | `int` | `512` | Quasi-random candidates evaluated by the acquisition function. | | ||
| | `train_steps` | `int` | `60` | GP hyperparameter optimisation steps on the server side. | | ||
| | `lr` | `float` | `0.1` | Learning rate for GP hyperparameter optimisation. | | ||
| | `xi` | `float` | `0.01` | Exploration bonus added to the best observed value before computing EI. | | ||
| | `mode` | `str` | `"production"` | `"debug"` prints per-batch EI scores to stdout. | | ||
| | `seed` | `int\|None` | `None` | Seed for the fallback random sampler. | | ||
| | `timeout` | `float` | `120.0` | HTTP request timeout in seconds. | | ||
|
|
||
| ## Backend API contract | ||
|
|
||
| The sampler POSTs JSON to `api_url` and expects a JSON response. | ||
|
|
||
| **Request body** | ||
|
|
||
| ```json | ||
| { | ||
| "X": [[x1_dim1, x1_dim2, ...], [x2_dim1, ...], ...], | ||
| "y": [-val1, -val2, ...], | ||
| "search_space": [{"name": "x", "type": "float", "low": -5, "high": 5, "log": false, "step": null}], | ||
| "q": 4, | ||
| "n_candidates": 512, | ||
| "train_steps": 60, | ||
| "lr": 0.1, | ||
| "xi": 0.01, | ||
| "mode": "production" | ||
| } | ||
| ``` | ||
|
|
||
| Notes: | ||
|
|
||
| - `X` rows correspond to completed trials; columns correspond to `search_space` dims in order. | ||
| - `y` values are **negated** trial objectives (the server maximises q-EI; Optuna minimises). | ||
|
|
||
| **Response body** | ||
|
|
||
| ```json | ||
| { | ||
| "candidates": [ | ||
| {"x": [v1_dim1, v1_dim2, ...]}, | ||
| {"x": [v2_dim1, v2_dim2, ...]}, | ||
| ... | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| The server must return exactly `q` candidates. | ||
| Each `x` array must have the same length as `search_space`. | ||
|
|
||
| Optional debug fields (`ei_all`, `ei_scores`) are consumed when `mode="debug"`. | ||
|
|
||
| ## Installation | ||
|
|
||
| ```shell | ||
| pip install optuna optunahub | ||
| ``` | ||
|
|
||
| No additional Python dependencies are required — the sampler uses only the standard library and Optuna. | ||
|
|
||
| ### Backend endpoint | ||
|
|
||
| You must supply an `api_url` that implements the contract above. Two options: | ||
|
|
||
| **Use the hosted endpoint** (no setup required): | ||
|
|
||
| ``` | ||
| https://markshipman4273--bo-gp-service-gp-suggest.modal.run | ||
| ``` | ||
|
|
||
| This is a publicly available Modal deployment. Pass it directly as `api_url`. | ||
|
|
||
| **Deploy your own** using the open-source backend at | ||
| [sign-of-fourier/quantecarlo](https://github.com/sign-of-fourier/quantecarlo), | ||
| or implement the request/response contract on any HTTP server. | ||
|
Collaborator
Could you please remove this information from the documentation? As we do not officially support calling third-party APIs in this context, I think it may be better not to include this example here. Thank you very much for your understanding.
Contributor
Author
Yes, I will remove it. I'll change the language to present q-EI as an example of why one would use a remote source for this computation and describe what it would do, but I'll leave out any actual link.
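For local testing without any third-party service, the request/response contract from the README can be satisfied by a small stand-in server. The following is a minimal sketch under stated assumptions: `suggest`, `SuggestHandler`, and `serve` are hypothetical names, and the candidates are drawn uniformly at random (ignoring `log` and `step`), so this is a placeholder for the contract only, not a q-EI implementation.

```python
import json
import random
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def suggest(payload: dict, rng: Optional[random.Random] = None) -> dict:
    """Return `q` uniformly random candidates inside the declared bounds.

    Placeholder only: a real service would fit a surrogate model to
    payload["X"] / payload["y"] and score candidates with it.
    """
    rng = rng or random.Random()
    candidates = []
    for _ in range(payload["q"]):
        x = []
        for dim in payload["search_space"]:
            value = rng.uniform(dim["low"], dim["high"])
            if dim["type"] == "int":
                value = round(value)  # stays within integer bounds
            x.append(float(value))
        candidates.append({"x": x})
    return {"candidates": candidates}


class SuggestHandler(BaseHTTPRequestHandler):
    """Answers POST requests with the JSON produced by `suggest`."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length).decode("utf-8"))
        body = json.dumps(suggest(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block and serve the stand-in endpoint on localhost."""
    HTTPServer(("127.0.0.1", port), SuggestHandler).serve_forever()
```

Calling `serve()` in a separate process and passing `http://127.0.0.1:8000` as `api_url` should be enough to exercise the sampler's batching and fallback paths end to end.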
||
|
|
||
| ## Example | ||
|
|
||
| ```python | ||
| import optuna | ||
| import optunahub | ||
|
|
||
|
|
||
| module = optunahub.load_module(package="samplers/q_ei_sampler") | ||
| DimSpec = module.DimSpec | ||
| qEISampler = module.qEISampler | ||
|
|
||
| # Substitute the URL of your own GP service. | ||
| sampler = qEISampler( | ||
| search_space=[ | ||
| DimSpec("lr", "float", 1e-4, 1e-1, log=True), | ||
| DimSpec("n_hidden", "int", 16, 256), | ||
| ], | ||
| api_url="https://your-gp-service/suggest", | ||
| q=4, | ||
| n_startup_trials=8, | ||
| ) | ||
|
|
||
| study = optuna.create_study(direction="minimize", sampler=sampler) | ||
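| # The objective must suggest every parameter in search_space; otherwise no completed trial is usable for the remote call. | ||
| study.optimize( | ||
| lambda trial: trial.suggest_float("lr", 1e-4, 1e-1, log=True) ** 2 | ||
| + (trial.suggest_int("n_hidden", 16, 256) / 256) ** 2, | ||
| n_trials=40, | ||
| ) | ||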
| print("Best value:", study.best_value) | ||
| ``` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| from ._bo_sampler import DimSpec | ||
| from ._bo_sampler import qEISampler | ||
|
|
||
|
|
||
| __all__ = ["DimSpec", "qEISampler"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,198 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from collections import deque | ||
| from dataclasses import asdict | ||
| from dataclasses import dataclass | ||
| import json | ||
| import threading | ||
| from typing import Any | ||
| from typing import TYPE_CHECKING | ||
| import urllib.request | ||
| import warnings | ||
|
|
||
| from optuna.distributions import FloatDistribution | ||
| from optuna.distributions import IntDistribution | ||
| from optuna.samplers import BaseSampler | ||
| from optuna.samplers import RandomSampler | ||
| from optuna.trial import TrialState | ||
|
|
||
|
|
||
| if TYPE_CHECKING: | ||
| from optuna.distributions import BaseDistribution | ||
| from optuna.study import Study | ||
| from optuna.trial import FrozenTrial | ||
|
|
||
|
|
||
| @dataclass | ||
| class DimSpec: | ||
| """Describes one dimension of the search space.""" | ||
|
|
||
| name: str | ||
| type: str # "float" | "int" | ||
| low: float | ||
| high: float | ||
| log: bool = False | ||
| step: float | None = None # grid step for int dims (default 1) | ||
|
|
||
|
|
||
| class qEISampler(BaseSampler): | ||
| """Optuna sampler that delegates GP fitting and q-EI scoring to a remote HTTP service. | ||
|
|
||
| Fills a local deque with q suggestions on the first ask after the cache empties, | ||
| then hands them out one at a time. Falls back to random sampling during startup | ||
| and if the API call fails. | ||
|
|
||
| Thread-safety: a single threading.Lock ensures only one API call fires per batch | ||
| even when study.optimize(n_jobs=q) drives concurrent sample_relative calls. | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| search_space: list[DimSpec], | ||
| api_url: str, | ||
| n_startup_trials: int = 8, | ||
| q: int = 4, | ||
| n_candidates: int = 512, | ||
| train_steps: int = 60, | ||
| lr: float = 0.1, | ||
| xi: float = 0.01, | ||
| mode: str = "production", | ||
| seed: int | None = None, | ||
| timeout: float = 120.0, | ||
| ) -> None: | ||
| if not api_url: | ||
| raise ValueError( | ||
| "api_url must be set to the URL of your GP suggestion service. " | ||
| "See the README for the expected request/response contract." | ||
| ) | ||
| self._api_url = api_url | ||
| self._search_space = search_space | ||
| self._n_startup_trials = n_startup_trials | ||
| self._q = q | ||
| self._n_candidates = n_candidates | ||
| self._train_steps = train_steps | ||
| self._lr = lr | ||
| self._xi = xi | ||
| self._mode = mode | ||
| self._timeout = timeout | ||
| self._independent_sampler = RandomSampler(seed=seed) | ||
| self._pending: deque[dict[str, Any]] = deque() | ||
| self._lock = threading.Lock() | ||
|
|
||
| # ------------------------------------------------------------------ | ||
| # BaseSampler interface | ||
| # ------------------------------------------------------------------ | ||
|
|
||
| def infer_relative_search_space( | ||
| self, | ||
| study: Study, | ||
| trial: FrozenTrial, | ||
| ) -> dict[str, BaseDistribution]: | ||
| result: dict[str, BaseDistribution] = {} | ||
| for dim in self._search_space: | ||
| if dim.type == "float": | ||
| result[dim.name] = FloatDistribution(dim.low, dim.high, log=dim.log, step=dim.step) | ||
| elif dim.type == "int": | ||
| result[dim.name] = IntDistribution( | ||
| int(dim.low), | ||
| int(dim.high), | ||
| log=dim.log, | ||
| step=int(dim.step) if dim.step is not None else 1, | ||
| ) | ||
| return result | ||
|
|
||
| def sample_relative( | ||
| self, | ||
| study: Study, | ||
| trial: FrozenTrial, | ||
| search_space: dict[str, BaseDistribution], | ||
| ) -> dict[str, Any]: | ||
| with self._lock: | ||
| if self._pending: | ||
| return self._pending.popleft() | ||
|
|
||
| complete_trials = study.get_trials(deepcopy=False, states=(TrialState.COMPLETE,)) | ||
| if len(complete_trials) < self._n_startup_trials: | ||
| return {} | ||
|
|
||
| param_names = [dim.name for dim in self._search_space] | ||
| usable = [ | ||
| t | ||
| for t in complete_trials | ||
| if all(n in t.params for n in param_names) and t.value is not None | ||
| ] | ||
| if len(usable) < self._n_startup_trials: | ||
| return {} | ||
|
|
||
| X = [[float(t.params[n]) for n in param_names] for t in usable] | ||
| # Negate values: q-EI maximises, Optuna minimises. | ||
|
Collaborator
As you showed in the example, Optuna can define the optimization direction explicitly. For example, the user can make Optuna maximize the objective by writing `study = optuna.create_study(direction="maximize", sampler=sampler)`. In addition, I believe the optimization direction of the remote optimizer depends on its implementer or user. Therefore, it is not necessarily true that the remote optimizer always maximizes the objective.
Contributor
Author
Got it. Don't negate it, and the user will align. If neither `direction` nor `directions` is given, the default is minimize. So the negation can simply be left out here, because this part doesn't depend on a particular backend. It will follow normal, expected Optuna behavior.
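One way to read the outcome of this thread is sketched below: objective values are sent as observed, and the study direction travels alongside them so the backend can decide how to interpret `y`. This is only an illustration. `build_payload` is a hypothetical helper, and the `"direction"` field is an assumption that is not part of the README contract; in the sampler itself the value could be taken from `study.direction.name.lower()`.

```python
from typing import Any


def build_payload(
    trials: list[tuple[dict[str, float], float]],
    param_names: list[str],
    direction: str,
) -> dict[str, Any]:
    """Assemble X and y from (params, value) pairs without negating y."""
    X = [[float(params[n]) for n in param_names] for params, _ in trials]
    # Values are passed through unchanged; no sign flip is applied.
    y = [float(value) for _, value in trials]
    # Hypothetical extra field: lets the remote optimizer align itself.
    return {"X": X, "y": y, "direction": direction}
```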
||
| y = [-float(t.value) for t in usable] # type: ignore[arg-type] | ||
|
|
||
| payload = { | ||
| "X": X, | ||
| "y": y, | ||
| "search_space": [asdict(dim) for dim in self._search_space], | ||
| "q": self._q, | ||
| "n_candidates": self._n_candidates, | ||
| "train_steps": self._train_steps, | ||
| "lr": self._lr, | ||
| "xi": self._xi, | ||
| "mode": self._mode, | ||
| } | ||
|
|
||
| try: | ||
| data = self._post(payload) | ||
| if self._mode == "debug" and data.get("ei_all") is not None: | ||
| ei_all = data["ei_all"] | ||
| display = [round(v, 6) if v is not None else "NaN" for v in ei_all] | ||
| print(f"\n[debug] ei_all ({len(ei_all)} batches): {display}") | ||
| valid = [v for v in ei_all if v is not None] | ||
| if valid: | ||
| print( | ||
| f"[debug] max ei: {max(valid):.6f} " | ||
| f"winning batch ei_score: {data.get('ei_scores')}" | ||
| ) | ||
| except Exception as exc: | ||
| warnings.warn( | ||
| f"qEISampler: API call failed ({exc}), falling back to random.", | ||
| stacklevel=2, | ||
| ) | ||
| return {} | ||
|
|
||
| for candidate in data["candidates"]: | ||
| params: dict[str, Any] = {} | ||
| for i, dim in enumerate(self._search_space): | ||
| val: Any = float(candidate["x"][i]) | ||
| if dim.type == "int": | ||
| val = int(round(float(val))) | ||
| params[dim.name] = val | ||
| self._pending.append(params) | ||
|
|
||
| return self._pending.popleft() if self._pending else {} | ||
|
|
||
| def sample_independent( | ||
| self, | ||
| study: Study, | ||
| trial: FrozenTrial, | ||
| param_name: str, | ||
| param_distribution: BaseDistribution, | ||
| ) -> Any: | ||
| return self._independent_sampler.sample_independent( | ||
| study, trial, param_name, param_distribution | ||
| ) | ||
|
|
||
| # ------------------------------------------------------------------ | ||
| # Internal | ||
| # ------------------------------------------------------------------ | ||
|
|
||
| def _post(self, payload: dict[str, Any]) -> dict[str, Any]: | ||
| body = json.dumps(payload).encode("utf-8") | ||
| req = urllib.request.Request( | ||
| self._api_url, | ||
| data=body, | ||
| headers={"Content-Type": "application/json"}, | ||
| method="POST", | ||
| ) | ||
| with urllib.request.urlopen(req, timeout=self._timeout) as resp: | ||
| result: dict[str, Any] = json.loads(resp.read().decode("utf-8")) | ||
| return result | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import optuna | ||
| import optunahub | ||
|
|
||
|
|
||
| module = optunahub.load_module(package="samplers/q_ei_sampler") | ||
| DimSpec = module.DimSpec | ||
| qEISampler = module.qEISampler | ||
|
|
||
| sampler = qEISampler( | ||
| search_space=[DimSpec("x", "float", -5.0, 5.0)], | ||
| api_url="https://your-gp-service/suggest", # substitute your own endpoint | ||
| ) | ||
|
|
||
| study = optuna.create_study(direction="minimize", sampler=sampler) | ||
| study.optimize(lambda trial: trial.suggest_float("x", -5, 5) ** 2, n_trials=20) | ||
| print("Best value:", study.best_value) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| optuna>=3.0 |
I believe this sampler could also be useful for optimization methods other than Bayesian optimization, even though it certainly works well in the Bayesian optimization setting.
If this understanding is correct, would it make sense to make the explanations and naming a bit more general so that the sampler can cover a broader range of optimizers?

Makes sense. Something like "Remote Sampler", but then for the example, describe q-EI as an example.
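The optimizer-agnostic part of the sampler is the batch handout described in its docstring: one locked fetch fills a queue, and concurrent workers drain it one suggestion at a time. A minimal sketch of that pattern on its own, assuming a hypothetical `BatchCache` class and a caller-supplied `fetch_batch` callable standing in for whatever remote optimizer is used:

```python
import threading
from collections import deque
from typing import Any, Callable


class BatchCache:
    """Hands out jointly selected suggestions one at a time."""

    def __init__(self, fetch_batch: Callable[[], list[dict[str, Any]]]) -> None:
        self._fetch_batch = fetch_batch
        self._pending: deque = deque()
        self._lock = threading.Lock()

    def next(self) -> dict[str, Any]:
        # The lock is held across the fetch, so only one request is in
        # flight per batch even when several workers ask at once.
        with self._lock:
            if not self._pending:
                self._pending.extend(self._fetch_batch())
            # An empty dict mirrors the sampler's "fall back" return value.
            return self._pending.popleft() if self._pending else {}
```

With a batch size of 4 and 8 concurrent callers, `fetch_batch` is invoked twice; nothing in the class depends on q-EI or on a Gaussian Process.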