Add TuRBOSampler
#319
base: main
Conversation
Performance

The performance was measured by benchmark comparison. Results for function_id = 1:

(performance plot omitted)

function_id = 22:

(performance plot omitted)

Although the best values are comparable, runtime is substantially reduced, especially at larger iteration counts, as expected for this approach.

For ease of review: given that this sampler mainly extends GPSampler, focusing on the diff starting from c574e68 should be sufficient to verify the logic. Thank you in advance!
gen740 left a comment:
Thank you for the PR! I've left a few comments.
package/samplers/turbo/sampler.py
Outdated
    self._init_length = 0.8
    self._max_length = 1.6
    self._min_length = 0.5**7
How about setting these values via constructor arguments?
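As a rough sketch of that suggestion (the default values mirror the quoted diff, but the argument names and the overall signature are assumptions, not the PR's final API):

```python
class TuRBOSampler:
    """Sketch: only the trust-region length arguments are shown."""

    def __init__(
        self,
        init_length: float = 0.8,
        max_length: float = 1.6,
        min_length: float = 0.5**7,
    ) -> None:
        # Exposing these as keyword arguments lets users tune the
        # trust-region schedule without editing the sampler source.
        self._init_length = init_length
        self._max_length = max_length
        self._min_length = min_length
```

A caller could then override a single value, e.g. `TuRBOSampler(min_length=0.5**5)`, while keeping the other defaults.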
package/samplers/turbo/sampler.py
Outdated
    def reset_trust_region(self, delete_trust_region_id: int) -> None:
        self._trial_ids_for_trust_region[delete_trust_region_id] = []
        self._length[delete_trust_region_id] = self._init_length
        self._n_consecutive_success[delete_trust_region_id] = 0
        self._n_consecutive_failure[delete_trust_region_id] = 0
        self._best_value_in_current_trust_region[delete_trust_region_id] = None
I think this function is internal, so its name should start with an underscore.
Alternatively, it could be removed, since it is used only once.
package/samplers/turbo/sampler.py
Outdated
    def _get_best_params_for_multi_objective(
        self,
        normalized_params: np.ndarray,
        standardized_score_vals: np.ndarray,
    ) -> np.ndarray:
        pareto_params = normalized_params[
            _is_pareto_front(-standardized_score_vals, assume_unique_lexsorted=False)
        ]
        n_pareto_sols = len(pareto_params)
        # TODO(nabenabe): Verify the validity of this choice.
        size = min(self._n_local_search // 2, n_pareto_sols)
        chosen_indices = self._rng.rng.choice(n_pareto_sols, size=size, replace=False)
        return pareto_params[chosen_indices]
This function is not used; how about removing it?
    def sample_relative(
        self, study: Study, trial: FrozenTrial, search_space: dict[str, BaseDistribution]
    ) -> dict[str, Any]:
        if search_space == {}:
Suggested change:

    if study._is_multi_objective():
        raise ValueError("TurboSampler does not support multi-objective optimization.")
    if search_space == {}:
How about checking whether this is multi-objective or not, as in the following code?
optunahub-registry/package/samplers/carbo/sampler.py
Lines 158 to 159 in 6818488
    if study._is_multi_objective():
        raise ValueError("CARBOSampler does not support multi-objective optimization.")
package/samplers/turbo/sampler.py
Outdated
    for id in range(self._n_trust_region):
        if len(self._trial_ids_for_trust_region[id]) < self._n_startup_trials:
            self._trial_ids_for_trust_region[id].append(trial._trial_id)
            return {}
Suggested change:

    if any(len(ids) < self._n_startup_trials for ids in self._trial_ids_for_trust_region):
        return {}
These lines can be simplified.
Thank you for the suggestion. While your approach simplifies the code significantly, in this case I need to append the trial_id to self._trial_ids_for_trust_region[id]. As you pointed out, the original code was somewhat hard to follow, so I have refactored it to make it clearer. I would appreciate any further suggestions or feedback you may have.
package/samplers/turbo/sampler.py
Outdated
    if len(trials) < self._n_startup_trials:
        self._trial_ids_for_trust_region[id].append(trial._trial_id)
        return {}
These lines may be redundant.
package/samplers/turbo/sampler.py
Outdated
    _sign = np.array(
        [-1.0 if d == StudyDirection.MINIMIZE else 1.0 for d in study.directions]
    )
    standardized_score_vals, _, _ = _standardize_values(
        _sign * np.array([trial.values for trial in trials])
    )
TuRBO is single-objective only, so we can simplify this part:
sign = -1.0 if study.direction == StudyDirection.MINIMIZE else 1.0
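A minimal sketch of the single-objective version (the Enum is a local stand-in for optuna's StudyDirection, and the internal _standardize_values helper is replaced here by a plain mean/std normalization for illustration):

```python
import enum


class StudyDirection(enum.Enum):  # stand-in for optuna.study.StudyDirection
    MINIMIZE = "minimize"
    MAXIMIZE = "maximize"


def standardized_scores(values: list[float], direction: StudyDirection) -> list[float]:
    # Flip minimization to maximization so larger scores are always better,
    # then standardize to zero mean and unit variance.
    sign = -1.0 if direction == StudyDirection.MINIMIZE else 1.0
    signed = [sign * v for v in values]
    mean = sum(signed) / len(signed)
    std = (sum((v - mean) ** 2 for v in signed) / len(signed)) ** 0.5
    return [(v - mean) / (std if std > 0 else 1.0) for v in signed]
```

With a single objective, the per-direction sign array collapses to one scalar, as the comment above suggests.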
package/samplers/turbo/sampler.py
Outdated
    if direction == StudyDirection.MINIMIZE:
        if values[0] < best_value:
            self._n_consecutive_success[trust_region_id] += 1
            self._n_consecutive_failure[trust_region_id] = 0
            self._best_value_in_current_trust_region[trust_region_id] = values[0]
        else:
            self._n_consecutive_success[trust_region_id] = 0
            self._n_consecutive_failure[trust_region_id] += 1
    else:
        if values[0] > best_value:
            self._n_consecutive_success[trust_region_id] += 1
            self._n_consecutive_failure[trust_region_id] = 0
            self._best_value_in_current_trust_region[trust_region_id] = values[0]
        else:
            self._n_consecutive_success[trust_region_id] = 0
            self._n_consecutive_failure[trust_region_id] += 1
This branch can be simplified by checking whether (direction == StudyDirection.MINIMIZE) != (values[0] < best_value).
Although this is not a major issue, using (direction == StudyDirection.MINIMIZE) != (values[0] < best_value) allows the case where direction == StudyDirection.MAXIMIZE and values[0] == best_value to count as an improvement, which is not ideal.
To make the logic clearer, I instead compute whether the value has improved first:
is_better = (
values[0] < best_value
if direction == StudyDirection.MINIMIZE
else values[0] > best_value
)
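Put together, the symmetric update could look like this sketch (a plain dict stands in for the sampler's per-region attributes, and the Enum is a stand-in for optuna.study.StudyDirection):

```python
import enum


class StudyDirection(enum.Enum):  # stand-in for optuna.study.StudyDirection
    MINIMIZE = "minimize"
    MAXIMIZE = "maximize"


def update_streaks(state: dict, direction: StudyDirection, value: float) -> None:
    """Update the success/failure streaks of one trust region in place."""
    best_value = state["best_value"]
    is_better = (
        value < best_value
        if direction == StudyDirection.MINIMIZE
        else value > best_value
    )
    if is_better:
        state["n_consecutive_success"] += 1
        state["n_consecutive_failure"] = 0
        state["best_value"] = value
    else:
        state["n_consecutive_success"] = 0
        state["n_consecutive_failure"] += 1
```

Because both comparisons are strict, a value exactly equal to the incumbent counts as a failure in either direction, which avoids the equality edge case raised in the review.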
package/samplers/turbo/sampler.py
Outdated
    if self._length[trust_region_id] < self._min_length:
        self.reset_trust_region(trust_region_id)
How about checking this condition first?
Thank you for the suggestions on this large PR. I've incorporated some changes based on your feedback. PTAL!
Contributor Agreements
Please read the contributor agreements and if you agree, please click the checkbox below.
Tip
Please follow the Quick TODO list to smoothly merge your PR.
Motivation
Add a trust-region Bayesian optimizer (TuRBO) to OptunaHub.
Description of the changes
Implements TuRBOSampler based on GPSampler. Note that categorical parameters are currently unsupported, and multi-objective optimization is not available. Please refer to the paper "Scalable Global Optimization via Local Bayesian Optimization" for more information.
TODO List towards PR Merge
Please remove this section if this PR is not an addition of a new package.
Otherwise, please check the following TODO list:
- Copy ./template/ to create your package
- Replace <COPYRIGHT HOLDER> in LICENSE of your package with your name
- Fill out README.md in your package
- Add import statements of your package in __init__.py
- Add from __future__ import annotations at the head of any Python files that include typing to support older Python versions
- Apply the formatter based on the tips in README.md
- Check that your package works as intended based on the tips in README.md