Contributor

nabenabe0928 commented May 23, 2025

Contributor Agreements

Please read the contributor agreements and if you agree, please click the checkbox below.

  • I agree to the contributor agreements.

Tip

Please follow the Quick TODO list to smoothly merge your PR.

Motivation

Description of the changes

TODO List towards PR Merge

Please remove this section if this PR is not an addition of a new package.
Otherwise, please check the following TODO list:

  • Copy ./template/ to create your package
  • Replace <COPYRIGHT HOLDER> in LICENSE of your package with your name
  • Fill out README.md in your package
  • Add import statements of your function or class names to be used in __init__.py
  • (Optional) Add from __future__ import annotations at the head of any Python files that include typing to support older Python versions
  • Apply the formatter based on the tips in README.md
  • Check whether your module works as intended based on the tips in README.md

@HideakiImamura
Member

@kAIto47802 Could you review this PR?

@@ -0,0 +1,56 @@
---
author: Shuhei Watanabe
title: A Sampler Using Parameter-Wise Bisection, aka Binary, Search
Collaborator

Suggested change
- title: A Sampler Using Parameter-Wise Bisection, aka Binary, Search
+ title: A Sampler Using Parameter-Wise Bisection, aka Binary Search


def infer_relative_search_space(
    self, study: optuna.Study, trial: optuna.trial.FrozenTrial
) -> dict[str, optuna.distributions.BaseDistribution]:
Collaborator

The optuna.distributions prefix here is inconsistent with the rest of the file. We can remove it because BaseDistribution is already imported.

Suggested change
- ) -> dict[str, optuna.distributions.BaseDistribution]:
+ ) -> dict[str, BaseDistribution]:

param_name: str,
param_distribution: BaseDistribution,
) -> Any:
if isinstance(param_distribution, optuna.distributions.CategoricalDistribution):
Collaborator

Same as above. Note that we have to add from optuna.distributions import CategoricalDistribution at the beginning of this file.

Suggested change
- if isinstance(param_distribution, optuna.distributions.CategoricalDistribution):
+ if isinstance(param_distribution, CategoricalDistribution):

Comment on lines +117 to +120
low = param_distribution.low
# The last element is padded to code the binary search routine cleaner.
high = param_distribution.high + step
assert step is not None
Collaborator

How about moving this assertion to the top of these lines? step is already used in the addition on the line before the assertion, and that operation doesn’t allow None.

Suggested change
- low = param_distribution.low
- # The last element is padded to code the binary search routine cleaner.
- high = param_distribution.high + step
- assert step is not None
+ assert step is not None
+ low = param_distribution.low
+ # The last element is padded to code the binary search routine cleaner.
+ high = param_distribution.high + step
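
To illustrate why the order matters, here is a minimal standalone sketch. The function name padded_high is hypothetical and not part of the PR. With the assertion first, a missing step fails with a clear AssertionError; with the assertion placed after the addition, Python would raise a TypeError from high + step before the assertion is ever reached.

```python
from typing import Optional


def padded_high(high: float, step: Optional[float]) -> float:
    # Hypothetical helper: validate step before it is used in arithmetic.
    assert step is not None, "step must be set for a discrete distribution"
    # Pad one extra step past the upper bound, as the code under review does.
    return high + step
```

For example, padded_high(10, 2) returns 12, while padded_high(10, None) stops at the assertion instead of failing inside the addition.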

assert mid_index != len(possible_param_values) - 1, "The last element is for convenience."
return possible_param_values[mid_index].item()

def _get_possible_param_values(
Collaborator

I don’t think we need to list all possible parameter values for discrete distributions.
Doing so would make each sampling O(n_steps), which defeats the benefit of using binary search. Note that n_steps can be large, e.g., in trial.suggest_int("x", 0, 1 << 30, step=2).

We can avoid this by using the index of the discrete search space, as I’ll suggest in the alternative code below:

Comment on lines +131 to +135
possible_param_values = self._get_possible_param_values(dist)
indices = np.arange(len(possible_param_values))
left_index = indices[np.isclose(possible_param_values, left)][0]
right_index = indices[np.isclose(possible_param_values, right)][0]
return right_index - left_index <= 1
Collaborator

Suggested change
- possible_param_values = self._get_possible_param_values(dist)
- indices = np.arange(len(possible_param_values))
- left_index = indices[np.isclose(possible_param_values, left)][0]
- right_index = indices[np.isclose(possible_param_values, right)][0]
- return right_index - left_index <= 1
+ left_index = int(np.round((left - dist.low) / dist.step))
+ right_index = int(np.round((right - dist.low) / dist.step))
+ return right_index - left_index <= 1

Comment on lines +113 to +123
def _get_possible_param_values(
    self, param_distribution: FloatDistribution | IntDistribution
) -> np.ndarray:
    step = param_distribution.step
    low = param_distribution.low
    # The last element is padded to code the binary search routine cleaner.
    high = param_distribution.high + step
    assert step is not None
    n_steps = int(np.round((high - low) / step)) + 1
    return np.linspace(low, high, n_steps)

Collaborator

Suggested change
- def _get_possible_param_values(
-     self, param_distribution: FloatDistribution | IntDistribution
- ) -> np.ndarray:
-     step = param_distribution.step
-     low = param_distribution.low
-     # The last element is padded to code the binary search routine cleaner.
-     high = param_distribution.high + step
-     assert step is not None
-     n_steps = int(np.round((high - low) / step)) + 1
-     return np.linspace(low, high, n_steps)

Comment on lines +105 to +111
possible_param_values = self._get_possible_param_values(param_distribution)
indices = np.arange(len(possible_param_values))
left_index = indices[np.isclose(possible_param_values, left)][0]
right_index = indices[np.isclose(possible_param_values, right)][0]
mid_index = (right_index + left_index) // 2
assert mid_index != len(possible_param_values) - 1, "The last element is for convenience."
return possible_param_values[mid_index].item()
Collaborator

Suggested change
- possible_param_values = self._get_possible_param_values(param_distribution)
- indices = np.arange(len(possible_param_values))
- left_index = indices[np.isclose(possible_param_values, left)][0]
- right_index = indices[np.isclose(possible_param_values, right)][0]
- mid_index = (right_index + left_index) // 2
- assert mid_index != len(possible_param_values) - 1, "The last element is for convenience."
- return possible_param_values[mid_index].item()
+ left_index = int(np.round((left - param_distribution.low) / step))
+ right_index = int(np.round((right - param_distribution.low) / step))
+ mid_index = (left_index + right_index) // 2
+ return param_distribution.low + mid_index * step
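
As a rough, dependency-free illustration of the index-based approach in these suggestions, the sketch below maps grid values to integer indices by arithmetic instead of building the full array. The names index_of, grid_midpoint and is_converged are hypothetical, and the built-in round stands in for np.round; this is a sketch under those assumptions, not the code proposed for the PR.

```python
def index_of(value: float, low: float, step: float) -> int:
    # Map a value on the grid low, low + step, ... to its integer index.
    # No array of all grid points is built, so this is O(1).
    return int(round((value - low) / step))


def grid_midpoint(left: float, right: float, low: float, step: float) -> float:
    # Midpoint of [left, right] in index space, converted back to a grid value.
    left_index = index_of(left, low, step)
    right_index = index_of(right, low, step)
    mid_index = (left_index + right_index) // 2
    return low + mid_index * step


def is_converged(left: float, right: float, low: float, step: float) -> bool:
    # The search is done once the two bounds are adjacent grid points.
    return index_of(right, low, step) - index_of(left, low, step) <= 1
```

With low=0 and step=2, grid_midpoint(0, (1 << 30) + 2, 0, 2) evaluates a handful of arithmetic operations, whereas enumerating the same grid would allocate hundreds of millions of elements per sample.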

@kAIto47802
Collaborator

kAIto47802 commented Oct 24, 2025

Also, I still do not understand the motivation for adding this sampler.
Currently, this sampler searches for the best variable value satisfying xxx_is_too_high, which is obvious and does not require any search algorithm, since it simply sets bounds for each variable directly.

The name BisectSampler gives me the impression that it performs a binary search on the user-provided objective function to find the best variable value satisfying the given bounds, assuming that the objective function is monotonic. However, the current implementation doesn’t actually work that way.
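
A minimal sketch of the behaviour this comment describes, assuming an integer search range and a monotonic predicate (False for small values, True from some threshold on). The names bisect_monotonic and is_too_high are hypothetical here and are not taken from the PR's API; the padded right bound mirrors the padding used in the code under review.

```python
from typing import Callable


def bisect_monotonic(is_too_high: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest x in [low, high] for which is_too_high(x) is False.

    Assumes is_too_high is monotonic and that is_too_high(low) is False.
    """
    # right is a padded sentinel one past high, treated as "too high".
    left, right = low, high + 1
    while right - left > 1:
        mid = (left + right) // 2
        if is_too_high(mid):
            right = mid  # everything from mid upward is too high
        else:
            left = mid  # mid is still acceptable
    return left
```

For instance, bisect_monotonic(lambda x: x * x > 50, 0, 100) narrows the range by halving and needs only about log2(101) evaluations of the predicate rather than one per candidate.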

BisectSampler = optunahub.load_module("samplers/bisect").BisectSampler


def objective(trial: optuna.Trial, score_func: Callable[[optuna.Trial], float]) -> float:
Collaborator

Why does it include the score_func argument, which is never used and requires partial application before being passed to study.optimize?

Comment on lines +23 to +24
PREFIX_LEFT = "bisect:left_"
PREFIX_RIGHT = "bisect:right_"
Collaborator

Suggested change
- PREFIX_LEFT = "bisect:left_"
- PREFIX_RIGHT = "bisect:right_"
+ _PREFIX_LEFT = "bisect:left_"
+ _PREFIX_RIGHT = "bisect:right_"

Comment on lines +121 to +122
n_steps = int(np.round((high - low) / step)) + 1
return np.linspace(low, high, n_steps)
Collaborator

Also, the calculation of the possible parameter values is incorrect.

