1 change: 1 addition & 0 deletions docs/how_to_guide.rst
@@ -46,6 +46,7 @@ Training
how_to_guide/07_gpu_training.ipynb
how_to_guide/07_save_and_load.ipynb
how_to_guide/07_resume_training.ipynb
how_to_guide/21_hyperparameter_tuning.ipynb


Sampling
158 changes: 158 additions & 0 deletions docs/how_to_guide/21_hyperparameter_tuning.ipynb
@@ -0,0 +1,158 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to tune hyperparameters with Optuna"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This guide shows a minimal [`optuna`](https://optuna.org/) loop for hyperparameter\n",
"tuning in `sbi`. Optuna is a lightweight hyperparameter optimization library. You define\n",
"an objective function that trains a model (e.g., NPE) and returns a validation metric,\n",
"and Optuna runs multiple trials to explore the search space and track the best\n",
"configuration. As validation metric, we recommend using the negative log probability of\n",
"a held-out validation set `(theta, x)` under the current posterior estimate (see\n",
"Lueckmann et al. 2021 for details). \n",
"\n",
"Note that Optuna is not a dependency of `sbi`, you need to install it yourself in your\n",
"environment. \n",
"\n",
"Here, we use a toy simulator and do `NPE` with an embedding network built using the `posterior_nn` helper. We tune just two hyperparameters: the embedding dimension and the number of flow transforms in an `nsf` density estimator."
]
},
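{
"cell_type": "markdown",
"metadata": {},
"source": [
"Concretely, for validation pairs $(\\theta_i, x_i)$, $i = 1, \\dots, N$, the objective\n",
"below computes the average negative log probability under the trained posterior\n",
"estimate $q_\\phi$,\n",
"\n",
"$$\\mathrm{NLL} = -\\frac{1}{N} \\sum_{i=1}^{N} \\log q_\\phi(\\theta_i \\mid x_i),$$\n",
"\n",
"which we minimize over hyperparameter configurations."
]
},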
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup a tiny simulation task\n",
"\n",
"```python\n",
"import optuna\n",
"import torch\n",
"from sbi.inference import NPE\n",
"from sbi.neural_nets import posterior_nn\n",
"from sbi.neural_nets.embedding_nets import FCEmbedding\n",
"from sbi.utils import BoxUniform\n",
"\n",
"torch.manual_seed(0)\n",
"\n",
"def simulator(theta):\n",
" return theta + 0.1 * torch.randn_like(theta)\n",
"\n",
"prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))\n",
"\n",
"theta = prior.sample((6000,))\n",
"x = simulator(theta)\n",
"# Use a separate validation data set for optuna\n",
"theta_train, x_train = theta[:5000], x[:5000]\n",
"theta_val, x_val = theta[5000:], x[5000:]\n",
"```"
]
},
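{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick shape check of the split (both `theta` and `x` are 2-dimensional in this toy\n",
"task):\n",
"\n",
"```python\n",
"print(theta_train.shape, x_train.shape)  # torch.Size([5000, 2]) torch.Size([5000, 2])\n",
"print(theta_val.shape, x_val.shape)  # torch.Size([1000, 2]) torch.Size([1000, 2])\n",
"```"
]
},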
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define the Optuna objective\n",
"\n",
"```python\n",
"def objective(trial):\n",
" embedding_dim = trial.suggest_categorical(\"embedding_dim\", [16, 32, 64])\n",
" num_transforms = trial.suggest_int(\"num_transforms\", 2, 6)\n",
"\n",
" embedding_net = FCEmbedding(input_dim=x_train.shape[1], output_dim=embedding_dim)\n",
" density_estimator = posterior_nn(\n",
" model=\"nsf\",\n",
" embedding_net=embedding_net,\n",
" num_transforms=num_transforms,\n",
" )\n",
"\n",
" inference = NPE(prior=prior, density_estimator=density_estimator)\n",
" inference.append_simulations(theta_train, x_train)\n",
" estimator = inference.train(\n",
" max_num_epochs=50,\n",
" training_batch_size=128,\n",
" show_train_summary=False,\n",
" )\n",
" posterior = inference.build_posterior(estimator)\n",
"\n",
" with torch.no_grad():\n",
" nll = -posterior.log_prob_batched(\n",
" theta_val.unsqueeze(0), x=x_val\n",
" ).mean().item()\n",
" return nll\n",
"```"
]
},
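{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before launching a full study, it can help to smoke-test the objective once with a\n",
"fixed configuration. A minimal sketch using Optuna's `FixedTrial`, which replays a\n",
"given parameter dictionary through the `trial.suggest_*` calls:\n",
"\n",
"```python\n",
"# Run the objective once with fixed hyperparameters (no study needed).\n",
"fixed_trial = optuna.trial.FixedTrial({\"embedding_dim\": 32, \"num_transforms\": 4})\n",
"print(\"Validation NLL at this config:\", objective(fixed_trial))\n",
"```"
]
},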
{
"cell_type": "markdown",
"id": "aad395b1",
"metadata": {},
"source": [
"## Run the study and retrain\n",
"\n",
"Optuna defaults to the TPE sampler, which is a good starting point for many experiments.\n",
"TPE (Tree-structured Parzen Estimator) is a Bayesian optimization method that\n",
"models good vs. bad trials with nonparametric densities and samples new points\n",
"that are likely to improve the objective. You can swap in other samplers (random\n",
"search, GP-based, etc.) by passing a different sampler instance to `create_study`.\n",
"\n",
"The TPE sampler uses `n_startup_trials` random trials to seed the model. With\n",
"`n_trials=25` and `n_startup_trials=10`, the first 10 trials are random and the\n",
"remaining 15 are guided by the acquisition function. If you want to ensure to start at\n",
"the default configuration, _enqueue_ it before optimization.\n",
"\n",
"```python\n",
"sampler = optuna.samplers.TPESampler(n_startup_trials=10)\n",
"study = optuna.create_study(direction=\"minimize\", sampler=sampler)\n",
"# Optional: ensure the default config is evaluated\n",
"study.enqueue_trial({\"embedding_dim\": 32, \"num_transforms\": 4})\n",
"# This will run the above NPE training up to 25 times\n",
"study.optimize(objective, n_trials=25)\n",
"\n",
"best_params = study.best_params\n",
"embedding_net = FCEmbedding(\n",
" input_dim=x_train.shape[1],\n",
" output_dim=best_params[\"embedding_dim\"],\n",
")\n",
"density_estimator = posterior_nn(\n",
" model=\"nsf\",\n",
" embedding_net=embedding_net,\n",
" num_transforms=best_params[\"num_transforms\"],\n",
")\n",
"\n",
"inference = NPE(prior=prior, density_estimator=density_estimator)\n",
"inference.append_simulations(theta, x)\n",
"final_estimator = inference.train(training_batch_size=128)\n",
"posterior = inference.build_posterior(final_estimator)\n",
"```"
]
}
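,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inspect the study\n",
"\n",
"After optimization, the study object holds the full trial history. A short sketch of\n",
"common inspection calls (`trials_dataframe` requires `pandas`):\n",
"\n",
"```python\n",
"print(\"Best validation NLL:\", study.best_value)\n",
"print(\"Best params:\", study.best_params)\n",
"\n",
"# Per-trial records, e.g., for plotting or export; parameter columns are\n",
"# prefixed with `params_`.\n",
"df = study.trials_dataframe()\n",
"print(df[[\"number\", \"value\", \"params_embedding_dim\", \"params_num_transforms\"]])\n",
"```"
]
}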
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}