add how-to-guide for hyper param optim with optuna.

janfb · janfb · commit ffe8ddba66ca · 2026-01-22T17:01:27.000+01:00
diff --git a/docs/how_to_guide/21_hyperparameter_tuning.ipynb b/docs/how_to_guide/21_hyperparameter_tuning.ipynb
@@ -0,0 +1,157 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# How to tune hyperparameters with Optuna"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This guide shows a minimal [`optuna`](https://optuna.org/) loop for hyperparameter tuning in `sbi`. It uses a toy simulator, `NPE`, an embedding network, and the `posterior_nn` helper. We tune just two hyperparameters: the embedding dimension and the number of flow transforms in an `nsf` density estimator."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Optuna is not a dependency of `sbi`, so you need to install it yourself in your\n",
+    "environment.\n",
+    "\n",
+    "Optuna is a lightweight hyperparameter optimization library. You define an objective\n",
+    "function that trains a model (e.g., NPE) and returns a validation metric, and Optuna runs multiple\n",
+    "trials to explore the search space and track the best configuration. As the validation\n",
+    "metric, we recommend using the negative log probability of a held-out validation set\n",
+    "`(theta, x)` under the current posterior estimate (see Lueckmann et al. 2021 for\n",
+    "details)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Set up a tiny simulation task\n",
+    "\n",
+    "```python\n",
+    "import optuna\n",
+    "import torch\n",
+    "from sbi.inference import NPE\n",
+    "from sbi.neural_nets import posterior_nn\n",
+    "from sbi.neural_nets.embedding_nets import FCEmbedding\n",
+    "from sbi.utils import BoxUniform\n",
+    "\n",
+    "torch.manual_seed(0)\n",
+    "\n",
+    "def simulator(theta):\n",
+    "    return theta + 0.1 * torch.randn_like(theta)\n",
+    "\n",
+    "prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))\n",
+    "\n",
+    "theta = prior.sample((6000,))\n",
+    "x = simulator(theta)\n",
+    "# Hold out the last 1000 simulations as a validation set for Optuna\n",
+    "theta_train, x_train = theta[:5000], x[:5000]\n",
+    "theta_val, x_val = theta[5000:], x[5000:]\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Define the Optuna objective\n",
+    "\n",
+    "```python\n",
+    "def objective(trial):\n",
+    "    embedding_dim = trial.suggest_categorical(\"embedding_dim\", [16, 32, 64])\n",
+    "    num_transforms = trial.suggest_int(\"num_transforms\", 2, 6)\n",
+    "\n",
+    "    embedding_net = FCEmbedding(input_dim=x_train.shape[1], output_dim=embedding_dim)\n",
+    "    density_estimator = posterior_nn(\n",
+    "        model=\"nsf\",\n",
+    "        embedding_net=embedding_net,\n",
+    "        num_transforms=num_transforms,\n",
+    "    )\n",
+    "\n",
+    "    inference = NPE(prior=prior, density_estimator=density_estimator)\n",
+    "    inference.append_simulations(theta_train, x_train)\n",
+    "    estimator = inference.train(\n",
+    "        max_num_epochs=50,\n",
+    "        training_batch_size=128,\n",
+    "        show_train_summary=False,\n",
+    "    )\n",
+    "    posterior = inference.build_posterior(estimator)\n",
+    "\n",
+    "    with torch.no_grad():\n",
+    "        nll = -posterior.log_prob_batched(\n",
+    "            theta_val.unsqueeze(0), x=x_val\n",
+    "        ).mean().item()\n",
+    "    return nll\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Run the study and retrain\n",
+    "\n",
+    "```python\n",
+    "study = optuna.create_study(direction=\"minimize\")\n",
+    "# Each trial runs one full NPE training, so this trains 25 models\n",
+    "study.optimize(objective, n_trials=25)\n",
+    "\n",
+    "best_params = study.best_params\n",
+    "embedding_net = FCEmbedding(\n",
+    "    input_dim=x_train.shape[1],\n",
+    "    output_dim=best_params[\"embedding_dim\"],\n",
+    ")\n",
+    "density_estimator = posterior_nn(\n",
+    "    model=\"nsf\",\n",
+    "    embedding_net=embedding_net,\n",
+    "    num_transforms=best_params[\"num_transforms\"],\n",
+    ")\n",
+    "\n",
+    "inference = NPE(prior=prior, density_estimator=density_estimator)\n",
+    "inference.append_simulations(theta, x)\n",
+    "final_estimator = inference.train(training_batch_size=128)\n",
+    "posterior = inference.build_posterior(final_estimator)\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Notes\n",
+    "\n",
+    "- The toy simulator keeps the example short. Replace it with your simulator and prior.\n",
+    "- You can expand the search space with additional `posterior_nn` arguments (e.g., `hidden_features`)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}