|
169 | 169 | "\n", |
170 | 170 | "The next cell runs Tune for this purpose. The comments explain what each argument does. We'll do four tries, one for each combination of the two possible values for the two hidden layers.\n", |
171 | 171 | "\n", |
172 | | - "> **Note:** `tune.run` will handle Ray initialization for us, if it isn't already initialized. To force Tune to throw an error instead, pass the argument `ray_auto_init=False`." |
| 172 | + "> **Note:** `tune.run` will handle Ray initialization for us, if it isn't already initialized. To force Tune to throw an error instead, pass the argument `ray_auto_init=False`.\n", |
| 173 | + "\n", |
| 174 | + "The next cell will take 5-6 minutes to run." |
173 | 175 | ] |
174 | 176 | }, |
175 | 177 | { |
|
187 | 189 | " config={\n", |
188 | 190 | " \"env\": \"CartPole-v1\", # Tune can associate this string with the environment.\n", |
189 | 191 | " \"num_gpus\": 0, # If you have GPUs, go for it!\n", |
190 | | - " \"num_workers\": 6, # Number of Ray workers to use (arbitrary choice).\n", |
| 192 | + " \"num_workers\": 3, # Number of Ray workers to use; Use one LESS than \n", |
| 193 | + " # the number of cores you wan to use (or omit this argument)!\n", |
191 | 194 | " \"model\": { # The NN model we'll optimize.\n", |
192 | 195 | " 'fcnet_hiddens': [ # \"Fully-connected network with N hidden layers\".\n", |
193 | 196 | "            tune.grid_search([20, 40]),  # Try these two values for layer one.\n", |
|
277 | 280 | "cell_type": "markdown", |
278 | 281 | "metadata": {}, |
279 | 282 | "source": [ |
280 | | - "We see from this table that the `[20,20]` hyperparameter set took the *most* training iterations, which is understandable as it is the least powerful network configuration. The corresponding number of timesteps was the longest. In contrast, `[40,40]` was the fastest to train with almost the same `episode_reward_mean` value.\n", |
| 283 | + "We see from this table that the `[20,20]` hyperparameter set took the *most* training iterations, which is understandable as it is the least powerful network configuration. The corresponding number of timesteps was the longest. In contrast, `[40,20]` and `[40,40]` are the fastest to train with almost the same `episode_reward_mean` value.\n", |
281 | 284 | "\n", |
282 | 285 | "Since all four combinations perform equally well, perhaps it's best to choose the largest network as it trains the fastest. If we need to train the neural network frequently, then fast training times might be most important. This also suggests that we should be sure the trial sizes we used are really best. In a real-world application, you would want to spend more time on HPO, trying a larger set of possible values." |
283 | 286 | ] |
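For reference, here is a minimal, self-contained sketch of what the complete `tune.run` call might look like after this change. The trainable name (`"PPO"`) and the stopping criterion are assumptions for illustration; only the config fragments shown in the diff above come from the notebook.

```python
from ray import tune

# Sketch of the HPO call discussed above. "PPO" and the stop criterion are
# assumptions; the notebook's actual cell may differ.
analysis = tune.run(
    "PPO",                                    # Assumed RLlib trainable for CartPole.
    config={
        "env": "CartPole-v1",                 # Tune associates this string with the environment.
        "num_gpus": 0,                        # If you have GPUs, go for it!
        "num_workers": 3,                     # One less than the cores you want to use, or omit.
        "model": {
            "fcnet_hiddens": [
                tune.grid_search([20, 40]),   # Two candidate sizes for hidden layer one.
                tune.grid_search([20, 40]),   # Two candidate sizes for hidden layer two.
            ],
        },
    },
    stop={"episode_reward_mean": 400},        # Assumed stopping criterion for illustration.
)
```

With two candidate sizes per layer, the grid search yields the four trials described in the markdown cell.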
|
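To reproduce the per-trial comparison discussed in the results commentary, the object returned by `tune.run` can be converted into a dataframe. This is a hedged sketch, assuming `analysis` is the return value from the call above; column names can vary slightly across Ray versions.

```python
# Inspect one row per trial; keep only the columns used in the discussion above.
df = analysis.dataframe()
wanted = ["training_iteration", "timesteps_total", "episode_reward_mean"]
cols = [c for c in wanted if c in df.columns]          # Guard against naming differences.
hidden_cols = [c for c in df.columns if "fcnet_hiddens" in c]
print(df[cols + hidden_cols])
```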