If you want to conduct black-box optimization in SMAC (https://arxiv.org/abs/2109.09831) and have prior knowledge about which regions of the search space are more likely to contain the optimum, you may include this knowledge when designing the configuration space. More specifically, you place a prior distribution over the location of the optimum on each parameter, using either a (log)-normal or a (log)-Beta distribution. SMAC then takes these priors into account throughout the optimization by using PiBO (https://openreview.net/forum?id=MMAeCXIa89).
Consider the case of optimizing the accuracy of an MLP with three hyperparameters: learning rate in [1e-5, 1e-1], dropout in [0, 0.99], and activation in {Tanh, ReLU}. From prior experience, you believe the optimal learning rate to be around 1e-3, a good dropout to be around 0.25, and the optimal activation function to be ReLU about 80% of the time. This can be represented as follows:
>>> import numpy as np
>>> import ConfigSpace.hyperparameters as CSH
>>> from ConfigSpace.configuration_space import ConfigurationSpace
>>> # convert 10 log to natural log for learning rate, mean 1e-3
>>> logmean = np.log(1e-3)
>>> # two standard deviations on either side of the mean to cover the search space
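The prior parameters hinted at in the comments above can be worked out by hand. The following is a minimal standard-library sketch, not the ConfigSpace API. It assumes that "two standard deviations on either side" means the search-space bounds sit at the log-space mean plus or minus two standard deviations. The Beta shape parameters `alpha` and `beta` are hypothetical values chosen only because their mode is 0.25, and the linear rescaling of the Beta distribution onto the dropout bounds is likewise an assumption.

```python
import math

# Bounds and prior beliefs taken from the MLP example in the text.
LR_LOWER, LR_UPPER = 1e-5, 1e-1
LR_BELIEF = 1e-3
DROPOUT_LOWER, DROPOUT_UPPER = 0.0, 0.99

# Mean of the normal prior, expressed in natural-log space.
logmean = math.log(LR_BELIEF)

# Assumption: the bounds lie at logmean +/- 2 * logstd, so the standard
# deviation is a quarter of the log-range (one decade, i.e. ln 10).
logstd = (math.log(LR_UPPER) - math.log(LR_LOWER)) / 4.0

# Sanity check: map the +/- 2 sigma points back to the original scale.
covered_lower = math.exp(logmean - 2.0 * logstd)
covered_upper = math.exp(logmean + 2.0 * logstd)

# Hypothetical Beta shape parameters (not from the source). For alpha, beta > 1
# the mode of Beta(alpha, beta) on [0, 1] is (alpha - 1) / (alpha + beta - 2).
alpha, beta = 2.0, 4.0
beta_mode_unit = (alpha - 1.0) / (alpha + beta - 2.0)

# Assumption: the Beta prior is rescaled linearly onto the dropout bounds.
dropout_mode = DROPOUT_LOWER + beta_mode_unit * (DROPOUT_UPPER - DROPOUT_LOWER)

print(f"logmean = {logmean:.4f}, logstd = {logstd:.4f}")
print(f"covered learning-rate range: [{covered_lower:.1e}, {covered_upper:.1e}]")
print(f"dropout prior mode: {dropout_mode:.4f}")
```

Under these assumptions the standard deviation comes out as exactly one decade in log space, which is why a single `np.log(10.0)` would suffice for the learning-rate prior.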
To check that your prior makes sense for a given hyperparameter, you can evaluate its density with the pdf method. For the learning rate, you will see that the prior density peaks at 10^-3 and decays as you move further away from it:
.. code-block:: python

    test_points = np.logspace(-5, -1, 5)

    print(test_points)
    # [1.e-05 1.e-04 1.e-03 1.e-02 1.e-01]
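As a library-independent illustration of the peak-and-decay shape described above, the sketch below evaluates a normal density over log(x) at the same five test points. It is not the ConfigSpace pdf: the helper `normal_density_in_log_space` is hypothetical, the density is not renormalized for truncation at the bounds, and the standard deviation of one decade (ln 10) is an assumption.

```python
import math


def normal_density_in_log_space(x, logmean, logstd):
    """Normal density evaluated at log(x), without truncation to any bounds."""
    z = (math.log(x) - logmean) / logstd
    return math.exp(-0.5 * z * z) / (logstd * math.sqrt(2.0 * math.pi))


# Prior centred on a learning rate of 1e-3.
logmean = math.log(1e-3)
# Assumption: one standard deviation corresponds to one decade.
logstd = math.log(10.0)

# Same points as np.logspace(-5, -1, 5), built without NumPy.
test_points = [10.0 ** exponent for exponent in range(-5, 0)]
densities = [
    normal_density_in_log_space(x, logmean, logstd) for x in test_points
]

for x, density in zip(test_points, densities):
    print(f"{x:.0e}: {density:.4f}")

# Index of the test point with the highest density.
peak_index = densities.index(max(densities))
```

The density is highest at the middle test point (1e-3) and falls off symmetrically in log space, so 1e-4 and 1e-2 receive the same value. Note that this is a density over log(x); a density over x itself would not peak at the same location.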
The pdf function accepts an (N, ) numpy array as input.
.. code-block:: python