Commit ad06a85

Merge branch 'main' into refactor-tracker-interface
2 parents 9ee671d + 1bd7fe5 commit ad06a85

File tree

2 files changed: +184 -0 lines changed

docs/how_to_guide.rst

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Training
    how_to_guide/07_gpu_training.ipynb
    how_to_guide/07_save_and_load.ipynb
    how_to_guide/07_resume_training.ipynb
+   how_to_guide/21_hyperparameter_tuning.ipynb
    how_to_guide/22_experiment_tracking.ipynb

docs/how_to_guide/21_hyperparameter_tuning.ipynb

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7fb27b941602401d91542211134fc71a",
   "metadata": {},
   "source": [
    "# How to tune hyperparameters with Optuna"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acae54e37e7d407bbb7b55eff062a284",
   "metadata": {},
   "source": [
    "This guide shows a minimal `optuna` ([documentation](https://optuna.org/)) loop for hyperparameter\n",
    "tuning in `sbi`. Optuna is a lightweight hyperparameter optimization library. You define\n",
    "an objective function that trains a model (e.g., NPE) and returns a validation metric,\n",
    "and Optuna runs multiple trials to explore the search space and track the best\n",
    "configuration. As the validation metric, we recommend using the negative log probability of\n",
    "a held-out validation set `(theta, x)` under the current posterior estimate (see\n",
    "Lueckmann et al. 2021 for details).\n",
    "\n",
    "Note that Optuna is not a dependency of `sbi`; you need to install it yourself in your\n",
    "environment.\n",
    "\n",
    "Here, we use a toy simulator and run `NPE` with an embedding network built using the `posterior_nn` helper. We tune just two hyperparameters: the embedding dimension and the number of flow transforms in an `nsf` density estimator."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a63283cbaf04dbcab1f6479b197f3a8",
   "metadata": {},
   "source": [
    "## Set up a tiny simulation task"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3iwctp8e9hj",
   "metadata": {},
   "outputs": [],
   "source": [
    "import optuna\n",
    "import torch\n",
    "\n",
    "from sbi.inference import NPE\n",
    "from sbi.neural_nets import posterior_nn\n",
    "from sbi.neural_nets.embedding_nets import FCEmbedding\n",
    "from sbi.utils import BoxUniform\n",
    "\n",
    "torch.manual_seed(0)\n",
    "\n",
    "\n",
    "def simulator(theta):\n",
    "    return theta + 0.1 * torch.randn_like(theta)\n",
    "\n",
    "\n",
    "prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))\n",
    "\n",
    "theta = prior.sample((6000,))\n",
    "x = simulator(theta)\n",
    "# Use a separate validation data set for Optuna.\n",
    "theta_train, x_train = theta[:5000], x[:5000]\n",
    "theta_val, x_val = theta[5000:], x[5000:]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "panj815v3nd",
   "metadata": {},
   "source": [
    "## Define the Optuna objective\n",
    "\n",
    "Optuna expects the objective function to return a scalar value that it will optimize. When creating a study, you specify the optimization direction: `direction=\"minimize\"` to find the configuration with the lowest objective value, or `direction=\"maximize\"` for the highest. Here, we minimize the negative log probability (NLL) on a held-out validation set, so lower is better."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gcmp410rk97",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective(trial):\n",
    "    # Optuna will track these parameters internally.\n",
    "    embedding_dim = trial.suggest_categorical(\"embedding_dim\", [16, 32, 64])\n",
    "    num_transforms = trial.suggest_int(\"num_transforms\", 2, 6)\n",
    "\n",
    "    embedding_net = FCEmbedding(input_dim=x_train.shape[1], output_dim=embedding_dim)\n",
    "    density_estimator = posterior_nn(\n",
    "        model=\"nsf\",\n",
    "        embedding_net=embedding_net,\n",
    "        num_transforms=num_transforms,\n",
    "    )\n",
    "\n",
    "    inference = NPE(prior=prior, density_estimator=density_estimator)\n",
    "    inference.append_simulations(theta_train, x_train)\n",
    "    estimator = inference.train(\n",
    "        training_batch_size=128,\n",
    "        show_train_summary=False,\n",
    "    )\n",
    "    posterior = inference.build_posterior(estimator)\n",
    "\n",
    "    with torch.no_grad():\n",
    "        nll = -posterior.log_prob_batched(theta_val.unsqueeze(0), x=x_val).mean().item()\n",
    "    # Return the metric to be optimized by Optuna.\n",
    "    return nll"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aad395b1",
   "metadata": {},
   "source": [
    "## Run the study and retrain\n",
    "\n",
    "Optuna defaults to the TPE (Tree-structured Parzen Estimator) sampler, which is a good starting point for many experiments. TPE is a Bayesian optimization method that\n",
    "models good vs. bad trials with nonparametric densities and samples new points\n",
    "that are likely to improve the objective. You can swap in other samplers (random\n",
    "search, Gaussian Process-based, etc.) by passing a different sampler instance to `create_study`.\n",
    "\n",
    "The TPE sampler uses `n_startup_trials` random trials to seed the model. With\n",
    "`n_trials=25` and `n_startup_trials=10`, the first 10 trials are random and the\n",
    "remaining 15 are guided by the acquisition function. If you want to make sure the\n",
    "default configuration is evaluated, _enqueue_ it before optimization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "qp1lf4lzzie",
   "metadata": {},
   "outputs": [],
   "source": [
    "sampler = optuna.samplers.TPESampler(n_startup_trials=10)\n",
    "study = optuna.create_study(direction=\"minimize\", sampler=sampler)\n",
    "# Optional: ensure the default config is evaluated.\n",
    "study.enqueue_trial({\"embedding_dim\": 32, \"num_transforms\": 4})\n",
    "# This will run the above NPE training up to 25 times.\n",
    "study.optimize(objective, n_trials=25)\n",
    "\n",
    "# Retrain on all simulations with the best hyperparameters.\n",
    "best_params = study.best_params\n",
    "embedding_net = FCEmbedding(\n",
    "    input_dim=x_train.shape[1],\n",
    "    output_dim=best_params[\"embedding_dim\"],\n",
    ")\n",
    "density_estimator = posterior_nn(\n",
    "    model=\"nsf\",\n",
    "    embedding_net=embedding_net,\n",
    "    num_transforms=best_params[\"num_transforms\"],\n",
    ")\n",
    "\n",
    "inference = NPE(prior=prior, density_estimator=density_estimator)\n",
    "inference.append_simulations(theta, x)\n",
    "final_estimator = inference.train(training_batch_size=128)\n",
    "posterior = inference.build_posterior(final_estimator)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
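
As a follow-up to the sampler discussion in the notebook above, here is a minimal sketch (not part of the committed notebook) of swapping in a different Optuna sampler and inspecting a finished study. It assumes the notebook's `objective` and `study` are in scope; `RandomSampler`, `best_value`, `best_params`, and `trials_dataframe` are standard Optuna APIs, and the selected columns assume the two parameters tuned above.

    import optuna

    # Swap in a different sampler: plain random search instead of TPE.
    random_study = optuna.create_study(
        direction="minimize",
        sampler=optuna.samplers.RandomSampler(seed=0),
    )
    random_study.optimize(objective, n_trials=25)

    # Inspect the finished TPE study from the notebook.
    print(study.best_value)   # lowest validation NLL found across trials
    print(study.best_params)  # e.g. {"embedding_dim": ..., "num_transforms": ...}
    df = study.trials_dataframe()  # one row per trial (requires pandas)
    print(df[["number", "value", "params_embedding_dim", "params_num_transforms"]])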
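
Similarly, a short sketch of using the retrained posterior from the notebook's final cell, assuming its `simulator` and `posterior` are in scope; the parameter `theta_o` and observation `x_o` below are made up for illustration.

    import torch

    # Hypothetical observation: simulate one data point from a known parameter.
    theta_o = torch.tensor([[0.5, -1.0]])
    x_o = simulator(theta_o)

    # Condition the tuned posterior on x_o, draw samples, and evaluate them.
    samples = posterior.sample((1000,), x=x_o)
    log_probs = posterior.log_prob(samples, x=x_o)
    print(samples.mean(dim=0), samples.std(dim=0))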
