Commit 0585c35

Merge pull request optuna#5687 from nabenabe0928/doc/add-more-information-of-hack-in-wfg
Add more information about the hack in `WFG`
2 parents: 71914f3 + 79baaec

optuna/_hypervolume/wfg.py

Lines changed: 32 additions & 3 deletions
@@ -36,9 +36,34 @@ def _compute_exclusive_hv(
     if limited_sols.shape[0] == 0:
         return inclusive_hv

-    # NOTE(nabenabe): For hypervolume calculation, duplicated Pareto solutions can be ignored. As
-    # limited_sols[:, 0] is sorted, all the Pareto solutions necessary for hypervolume calculation
-    # will be eliminated with assume_unique_lexsorted=True.
+    # NOTE(nabenabe): As the following line is a hack for speedup, I will describe several
+    # important points to note. Even if we do not run _is_pareto_front below, or use
+    # assume_unique_lexsorted=False instead, the result of this function does not change; the
+    # function simply becomes slower.
+    #
+    # For simplicity, I call an array ``quasi-lexsorted`` if it is sorted by the first objective.
+    #
+    # Reason why it will be faster with _is_pareto_front:
+    # The hypervolume of a given solution set and a reference point does not change even when we
+    # remove non-Pareto solutions from the solution set. However, the calculation becomes slower
+    # if the solution set contains many non-Pareto solutions. By removing some obvious non-Pareto
+    # solutions, the calculation becomes faster.
+    #
+    # Reason why assume_unique_lexsorted must be True for _is_pareto_front:
+    # assume_unique_lexsorted=True actually checks weak dominance, and solutions will be weakly
+    # dominated if there are duplications, so this option lets us remove duplicated solutions.
+    # In other words, assume_unique_lexsorted=False may slow this function down significantly
+    # when limited_sols has many duplicated Pareto solutions, because without duplication removal
+    # this function becomes an exponential algorithm.
+    #
+    # NOTE(nabenabe): limited_sols can be non-unique and/or non-lexsorted, so I will describe why
+    # this is still fine.
+    #
+    # Reason why we can specify assume_unique_lexsorted=True even when limited_sols is not:
+    # Every ``False`` in on_front is correct (although the same is not guaranteed for ``True``)
+    # even if limited_sols is not unique or not lexsorted, as long as limited_sols is
+    # quasi-lexsorted, which is guaranteed here. As mentioned earlier, if every ``False`` in
+    # on_front is correct, the result of this function does not change.
     on_front = _is_pareto_front(limited_sols, assume_unique_lexsorted=True)
     return inclusive_hv - _compute_hv(limited_sols[on_front], reference_point)

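To make the claims in the comment block above concrete, here is a minimal standalone sketch for the 2-objective minimization case; it is not Optuna's implementation, and the helper names pareto_mask_quasi_lexsorted and hypervolume_2d are hypothetical. The filter only compares each row against earlier rows of a quasi-lexsorted array, so every False it marks is correct while a True entry may still be weakly dominated, and the final assertion checks that dropping the duplicated and dominated rows leaves the hypervolume unchanged.

import numpy as np


def pareto_mask_quasi_lexsorted(sols: np.ndarray) -> np.ndarray:
    # sols is assumed quasi-lexsorted (sorted by the first objective, minimization).
    # A row is marked False if some earlier row weakly dominates it, which also removes
    # duplicates; a row only dominated by a later row may stay True, but that is harmless.
    n = sols.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(1, n):
        mask[i] = not np.any(np.all(sols[:i] <= sols[i], axis=1))
    return mask


def hypervolume_2d(sols: np.ndarray, ref: np.ndarray) -> float:
    # Naive 2-objective hypervolume: sweep points by the first objective and accumulate
    # rectangles bounded by the best second objective seen so far.
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in sols[np.argsort(sols[:, 0])]:
        if f1 < ref[0] and f2 < best_f2:
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


sols = np.array([[1.0, 5.0], [1.0, 5.0], [2.0, 3.0], [2.5, 4.0], [4.0, 1.0]])  # quasi-lexsorted
ref = np.array([6.0, 6.0])
mask = pareto_mask_quasi_lexsorted(sols)  # drops the duplicate and the dominated (2.5, 4.0)
assert np.isclose(hypervolume_2d(sols, ref), hypervolume_2d(sols[mask], ref))  # both 17.0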
@@ -92,6 +117,10 @@ def compute_hypervolume(
         on_front = _is_pareto_front(unique_lexsorted_loss_vals, assume_unique_lexsorted=True)
         sorted_pareto_sols = unique_lexsorted_loss_vals[on_front]
     else:
+        # NOTE(nabenabe): The result of this function does not change whether we use
+        # np.argsort(loss_vals[:, 0]) or np.unique(loss_vals, axis=0) here, but many
+        # duplications in loss_vals significantly slow down the function.
+        # TODO(nabenabe): Make an option to use np.unique.
         sorted_pareto_sols = loss_vals[loss_vals[:, 0].argsort()]

     if reference_point.shape[0] == 2:

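As a small illustration of the note above (the variable names below are mine, not Optuna's), both preprocessing choices hand the rest of the computation an array sorted by the first objective; np.unique additionally lexsorts the rows and drops duplicates, which is what the TODO is about.

import numpy as np

loss_vals = np.array([[0.2, 0.9], [0.2, 0.9], [0.5, 0.4], [0.1, 1.0]])

# Current behavior: quasi-lexsorted, duplicates kept.
quasi_lexsorted = loss_vals[loss_vals[:, 0].argsort()]
# rows: (0.1, 1.0), (0.2, 0.9), (0.2, 0.9), (0.5, 0.4)

# Possible option from the TODO: unique rows in lexicographic order, duplicates removed.
unique_lexsorted = np.unique(loss_vals, axis=0)
# rows: (0.1, 1.0), (0.2, 0.9), (0.5, 0.4)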