feat: add blob-detection example notebook

ttngu207 · ttngu207 · commit 26cc216a5b83 · 2025-11-21T15:49:47.000Z
diff --git a/short_tutorials/blob-detection.ipynb b/short_tutorials/blob-detection.ipynb
@@ -0,0 +1,345 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Blob Detection Workflow\n",
+    "\n",
+    "This example shows a compact image-analysis pipeline that detects bright blobs in two sample images using DataJoint. It demonstrates:\n",
+    "\n",
+    "- Seeding a small `Image` manual table with two entries of standard images from `skimage.data`.\n",
+    "- Defining multiple parameter sets for blob detection in a lookup table `BlobParamSet`.\n",
+    "- Defining a computed master table `Detection` together with its nested part table `Detection.Blob`.\n",
+    "- Populating the master, whose `make()` method inserts the master row and all of its part rows inside the same transaction.\n",
+    "- Visualizing the results by drawing detection circles on the images.\n",
+    "- Visually selecting the optimal parameter set for each image and saving the selection in a manual table `SelectDetection`.\n",
+    "\n",
+    "Along the way we illustrate why master-part relationships are ideal for computational workflows: the master stores aggregate results and the parts hold per-feature detail, all created atomically.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "Load the required images and display them for reference.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%xmode minimal"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "import matplotlib.pyplot as plt\n",
+    "from skimage import data\n",
+    "from skimage.feature import blob_doh\n",
+    "from skimage.color import rgb2gray\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
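+    "import datajoint as dj\n",
+    "db_prefix = globals().get('db_prefix', '')  # assumption: no schema prefix unless one is already defined\n",
+    "schema = dj.Schema(db_prefix + 'blob_detection')"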
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@schema\n",
+    "class Image(dj.Manual):\n",
+    "    definition = \"\"\"\n",
+    "    image_id : int\n",
+    "    ---\n",
+    "    image_name : varchar(30)\n",
+    "    image : longblob\n",
+    "    \"\"\"\n",
+    "\n",
+    "Image.insert(\n",
+    "    (\n",
+    "        (1, \"hubble deep field\", rgb2gray(data.hubble_deep_field())),\n",
+    "        (2, \"human mitosis\", data.human_mitosis()/255.0)\n",
+    "    ), skip_duplicates=True\n",
+    ");"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n",
+    "for ax, image, title in zip(axs, *Image.fetch(\"image\", \"image_name\")):\n",
+    "    ax.imshow(image, cmap=\"gray_r\")\n",
+    "    ax.axis('off')\n",
+    "    ax.axis('equal')\n",
+    "    ax.set_title(title)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@schema\n",
+    "class BlobParamSet(dj.Lookup):\n",
+    "    definition = \"\"\"\n",
+    "    blob_paramset : int\n",
+    "    ---\n",
+    "    min_sigma : float\n",
+    "    max_sigma : float\n",
+    "    threshold : float\n",
+    "    \"\"\"\n",
+    "    contents = [\n",
+    "        (1, 2.0, 6.0, 0.001),\n",
+    "        (2, 3.0, 8.0, 0.002),\n",
+    "        (3, 4.0, 20.0, 0.01),\n",
+    "    ]\n",
+    "\n",
+    "\n",
+    "@schema\n",
+    "class Detection(dj.Computed):\n",
+    "    definition = \"\"\"\n",
+    "    -> Image\n",
+    "    -> BlobParamSet\n",
+    "    ---\n",
+    "    nblobs : int\n",
+    "    \"\"\"\n",
+    "\n",
+    "    class Blob(dj.Part):\n",
+    "        definition = \"\"\"\n",
+    "        -> master\n",
+    "        blob_id : int\n",
+    "        ---\n",
+    "        x : float\n",
+    "        y : float\n",
+    "        r : float\n",
+    "        \"\"\"\n",
+    "\n",
+    "    def make(self, key):\n",
+    "        # fetch inputs\n",
+    "        img = (Image & key).fetch1(\"image\")\n",
+    "        params = (BlobParamSet & key).fetch1()\n",
+    "\n",
+    "        # compute results\n",
+    "        blobs = blob_doh(\n",
+    "            img,\n",
+    "            min_sigma=params['min_sigma'],\n",
+    "            max_sigma=params['max_sigma'],\n",
+    "            threshold=params['threshold'])\n",
+    "\n",
+    "        # insert master and parts (blob_doh rows are (row, col, sigma), so x stores the row and y the column)\n",
+    "        self.insert1(dict(key, nblobs=len(blobs)))\n",
+    "        self.Blob.insert(\n",
+    "            (dict(key, blob_id=i, x=x, y=y, r=r)\n",
+    "             for i, (x, y, r) in enumerate(blobs)))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dj.Diagram(schema)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "Detection.populate(display_progress=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "Detection()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Parameter sets\n",
+    "\n",
+    "Compare the parameter sets defined in `BlobParamSet` by drawing the detected blobs for every image and parameter set.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, axes = plt.subplots(2, 3, figsize=(10, 6))\n",
+    "for ax, key in zip(axes.ravel(), Detection.fetch(\"KEY\", order_by=\"image_id, blob_paramset\")):\n",
+    "    img = (Image & key).fetch1(\"image\")\n",
+    "    ax.imshow(img, cmap=\"gray_r\")\n",
+    "    ax.axis('off')\n",
+    "    ax.axis('equal')\n",
+    "    ax.set_title(str(key), fontsize=10)\n",
+    "    for x, y, r in zip(*(Detection.Blob & key).fetch(\"y\", \"x\", \"r\")):\n",
+    "        c = plt.Circle((x, y), r*1.2, color='r', alpha=0.5, fill=False)\n",
+    "        ax.add_patch(c)\n",
+    "plt.suptitle(\"Detected blobs - all paramsets\")\n",
+    "plt.tight_layout()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@schema\n",
+    "class SelectDetection(dj.Manual):\n",
+    "    definition = \"\"\"\n",
+    "    -> Image\n",
+    "    ---\n",
+    "    -> Detection\n",
+    "    \"\"\"\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "SelectDetection.insert1(dict(image_id=1, blob_paramset=3))\n",
+    "SelectDetection.insert1(dict(image_id=2, blob_paramset=1))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dj.Diagram(schema)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, axes = plt.subplots(1, 2, figsize=(8, 4))\n",
+    "for ax, key in zip(axes.ravel(), SelectDetection.fetch(as_dict=True, order_by=\"image_id\")):\n",
+    "    img = (Image & key).fetch1(\"image\")\n",
+    "    ax.imshow(img, cmap=\"gray_r\")\n",
+    "    ax.axis('off')\n",
+    "    ax.axis('equal')\n",
+    "    ax.set_title(str(key), fontsize=10)\n",
+    "    for x, y, r in zip(*(Detection.Blob & key).fetch(\"y\", \"x\", \"r\")):\n",
+    "        c = plt.Circle((x, y), r*1.2, color='r', alpha=0.5, fill=False)\n",
+    "        ax.add_patch(c)\n",
+    "plt.suptitle(\"Selected detections\", fontsize=16)\n",
+    "plt.tight_layout()\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Detection master and part tables\n",
+    "\n",
+    "`Detection` is a computed table. When `populate()` runs, its `make()` method:\n",
+    "\n",
+    "1. Fetches the image and parameter set.\n",
+    "2. Runs `skimage.feature.blob_doh` to compute blobs.\n",
+    "3. Inserts one master row with the blob count.\n",
+    "4. Inserts one `Detection.Blob` part row per blob (containing coordinates and radius).\n",
+    "\n",
+    "Because `populate()` runs each `make()` call inside a transaction, a failed insert rolls back the whole call, so master and parts stay synchronized.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Results\n",
+    "\n",
+    "Populating the detection table yields the master summary in `Detection` and the per-blob annotations in `Detection.Blob`, both displayed in the cells above.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Takeaways\n",
+    "\n",
+    "- Master-part tables capture the structure “one job → many detailed results”.\n",
+    "- Downstream analyses depend only on the master (`-> Detection`) yet can access part details when needed.\n",
+    "- Populating the master guarantees atomic creation of all associated parts, preserving workflow integrity.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "schema.drop()  # drop the schema so the tutorial can be regenerated from scratch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "base",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}