Skip to content

Commit 4b24a9c

Browse files
committed
docs(notebook): add 'Shape primer' section with conceptual diagrams
Inserts a new section 4 (renumbering subsequent sections 5-10) between geometric-inspection and the on-protein envelope plots. Two new diagrams (matplotlib, no PDB data needed): 1. Three-panel cross-section showing what Cube, Sphere, and Shell look like around a stylized blobby ligand — annotated with the exact sizing formula each shape uses. 2. Single-panel zoomed cross-section of an elongated ligand illustrating *why* the Shell's inner radius doesn't miss surface points. Surface points along the short axis sit at distance R_min from COM (covered by the shell band); the inner zone (< R_min - buffer) is guaranteed to be inside bulk along every direction and is skipped on purpose. Smoke-tested both new code cells in a clean matplotlib Agg session.
1 parent 86d8f57 commit 4b24a9c

1 file changed

Lines changed: 248 additions & 7 deletions

File tree

notebooks/maws/space_analysis.ipynb

Lines changed: 248 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -407,12 +407,253 @@
407407
"plt.show()"
408408
]
409409
},
410+
{
411+
"cell_type": "markdown",
412+
"id": "7fb27b941602401d91542211134fc71a",
413+
"metadata": {},
414+
"source": [
415+
"## 4. Shape primer — what each envelope looks like\n",
416+
"\n",
417+
"Before we fit envelopes to a real protein in the next section, here are\n",
418+
"the three shapes drawn in 2D with a stylized blobby ligand. These plots\n",
419+
"are pure geometry (no PDB data) so they render instantly.\n",
420+
"\n",
421+
"For each shape the green outline is the envelope MAWS samples from; the\n",
422+
"gray blob is the ligand; the magenta star is the mass-weighted COM that\n",
423+
"`compute_envelope_dims` centers everything on.\n"
424+
]
425+
},
426+
{
427+
"cell_type": "code",
428+
"execution_count": null,
429+
"id": "acae54e37e7d407bbb7b55eff062a284",
430+
"metadata": {},
431+
"outputs": [],
432+
"source": [
433+
"import matplotlib.patches as mpatches\n",
434+
"\n",
435+
"# Stylized blob: irregular octagon to suggest a non-spherical ligand.\n",
436+
"blob_xy = np.array(\n",
437+
" [\n",
438+
" (-3.5, -1.0),\n",
439+
" (-2.0, -2.5),\n",
440+
" (0.5, -3.0),\n",
441+
" (3.0, -2.0),\n",
442+
" (4.0, 0.0),\n",
443+
" (3.0, 2.0),\n",
444+
" (0.5, 2.5),\n",
445+
" (-2.5, 1.5),\n",
446+
" ]\n",
447+
")\n",
448+
"blob_r_max = max((x * x + y * y) ** 0.5 for x, y in blob_xy)\n",
449+
"blob_r_min = min((x * x + y * y) ** 0.5 for x, y in blob_xy)\n",
450+
"reach = 2.0 # smaller than 10 Å for a clearer figure\n",
451+
"\n",
452+
"fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
453+
"\n",
454+
"\n",
455+
"def draw_blob(ax):\n",
456+
" ax.add_patch(\n",
457+
" mpatches.Polygon(\n",
458+
" blob_xy, closed=True, facecolor=\"lightgray\", edgecolor=\"dimgray\"\n",
459+
" )\n",
460+
" )\n",
461+
" ax.plot(0, 0, marker=\"*\", color=\"magenta\", markersize=15, zorder=3, label=\"COM\")\n",
462+
" # R_max guide\n",
463+
" ax.plot([0, blob_r_max], [0, 0], color=\"black\", lw=0.8, ls=\":\")\n",
464+
" ax.text(blob_r_max / 2, 0.3, \"R_max\", ha=\"center\", fontsize=9)\n",
465+
"\n",
466+
"\n",
467+
"# --- Cube ---\n",
468+
"ax = axes[0]\n",
469+
"w = 2 * (blob_r_max + reach)\n",
470+
"ax.add_patch(\n",
471+
" mpatches.Rectangle((-w / 2, -w / 2), w, w, fill=False, edgecolor=\"green\", lw=2)\n",
472+
")\n",
473+
"draw_blob(ax)\n",
474+
"ax.set_title(f\"Cube — width = 2·(R_max + reach) = {w:.1f}\", fontsize=11)\n",
475+
"\n",
476+
"# --- Sphere ---\n",
477+
"ax = axes[1]\n",
478+
"r = blob_r_max + reach\n",
479+
"ax.add_patch(mpatches.Circle((0, 0), r, fill=False, edgecolor=\"green\", lw=2))\n",
480+
"draw_blob(ax)\n",
481+
"ax.set_title(f\"Sphere — radius = R_max + reach = {r:.1f}\", fontsize=11)\n",
482+
"\n",
483+
"# --- Shell ---\n",
484+
"ax = axes[2]\n",
485+
"r_outer = blob_r_max + reach\n",
486+
"buffer = 1.0 # smaller than the real 5 Å for visual clarity\n",
487+
"r_inner = max(0.0, blob_r_min - buffer)\n",
488+
"ax.add_patch(\n",
489+
" mpatches.Circle((0, 0), r_outer, fill=False, edgecolor=\"green\", lw=2, label=\"outer\")\n",
490+
")\n",
491+
"ax.add_patch(\n",
492+
" mpatches.Circle(\n",
493+
" (0, 0), r_inner, fill=False, edgecolor=\"orange\", lw=2, ls=\"--\", label=\"inner\"\n",
494+
" )\n",
495+
")\n",
496+
"draw_blob(ax)\n",
497+
"ax.set_title(\n",
498+
" f\"Shell — inner = R_min − buffer = {r_inner:.1f},\\n\"\n",
499+
" f\" outer = R_max + reach = {r_outer:.1f}\",\n",
500+
" fontsize=11,\n",
501+
")\n",
502+
"\n",
503+
"for ax in axes:\n",
504+
" ax.set_xlim(-8, 8)\n",
505+
" ax.set_ylim(-8, 8)\n",
506+
" ax.set_aspect(\"equal\")\n",
507+
" ax.grid(True, alpha=0.3)\n",
508+
" ax.axhline(0, color=\"lightgray\", lw=0.5)\n",
509+
" ax.axvline(0, color=\"lightgray\", lw=0.5)\n",
510+
"\n",
511+
"plt.tight_layout()\n",
512+
"plt.show()"
513+
]
514+
},
515+
{
516+
"cell_type": "markdown",
517+
"id": "9a63283cbaf04dbcab1f6479b197f3a8",
518+
"metadata": {},
519+
"source": [
520+
"### Why does Shell have an inner radius — doesn't it miss surface points?\n",
521+
"\n",
522+
"Common worry: *\"if the inner radius is `R_min − 5`, won't we miss surface\n",
523+
"samples close to the protein?\"* — no, because of how `R_min` is defined.\n",
524+
"\n",
525+
"`R_min` is the distance from the COM to the **closest** atom. For most\n",
526+
"proteins the closest atom is on or near the surface along the protein's\n",
527+
"**thinnest** direction. So `R_min` already corresponds to \"surface\n",
528+
"distance, short axis.\" Anything closer to COM than `R_min` is — by\n",
529+
"definition — inside the convex shell of atoms, i.e. inside the bulk\n",
530+
"regardless of which direction you look. The `−5 Å` buffer just gives a\n",
531+
"small safety margin so we don't accidentally clip the short-axis surface.\n",
532+
"\n",
533+
"Below: an elongated ligand (long axis horizontal). The shell band (light\n",
534+
"green) covers the surface along *every* direction. Inside the inner\n",
535+
"circle (gray) is guaranteed-bulk territory we skip on purpose.\n"
536+
]
537+
},
538+
{
539+
"cell_type": "code",
540+
"execution_count": null,
541+
"id": "8dd0d8092fe74a7c96281538738b07e2",
542+
"metadata": {},
543+
"outputs": [],
544+
"source": [
545+
"from matplotlib.patches import Circle, Ellipse\n",
546+
"\n",
547+
"# Elongated ligand: long axis along x.\n",
548+
"R_long = 6.0 # tip-of-blob distance from COM (R_max)\n",
549+
"R_short = 2.0 # short-axis blob distance from COM (R_min)\n",
550+
"buffer = 1.0 # smaller than real 5 Å for visual clarity\n",
551+
"reach = 2.0\n",
552+
"\n",
553+
"inner = max(0.0, R_short - buffer)\n",
554+
"outer = R_long + reach\n",
555+
"\n",
556+
"fig, ax = plt.subplots(figsize=(10, 6))\n",
557+
"\n",
558+
"# Outside (not sampled) — light blue background\n",
559+
"ax.add_patch(Circle((0, 0), 12, facecolor=\"#e8f4ff\", edgecolor=\"none\", zorder=0))\n",
560+
"# Shell band (sampled, SAS decides) — light green\n",
561+
"ax.add_patch(\n",
562+
" Circle((0, 0), outer, facecolor=\"#d8f5d8\", edgecolor=\"green\", lw=2, zorder=1)\n",
563+
")\n",
564+
"# Inner zone (guaranteed bulk, not sampled) — light gray\n",
565+
"ax.add_patch(\n",
566+
" Circle(\n",
567+
" (0, 0), inner, facecolor=\"#cccccc\", edgecolor=\"orange\", lw=2, ls=\"--\", zorder=2\n",
568+
" )\n",
569+
")\n",
570+
"# Ligand blob: elongated ellipse along x\n",
571+
"ax.add_patch(\n",
572+
" Ellipse(\n",
573+
" (0, 0),\n",
574+
" 2 * R_long,\n",
575+
" 2 * R_short,\n",
576+
" facecolor=\"dimgray\",\n",
577+
" edgecolor=\"black\",\n",
578+
" lw=1.5,\n",
579+
" alpha=0.85,\n",
580+
" zorder=3,\n",
581+
" )\n",
582+
")\n",
583+
"# COM\n",
584+
"ax.plot(0, 0, marker=\"*\", color=\"magenta\", markersize=18, zorder=4)\n",
585+
"\n",
586+
"# Direction arrows showing where the surface is\n",
587+
"# Short-axis surface point (top)\n",
588+
"ax.plot(0, R_short, marker=\"o\", color=\"red\", markersize=8, zorder=5)\n",
589+
"ax.annotate(\n",
590+
" \"surface point on short axis\\n(distance = R_min)\",\n",
591+
" xy=(0, R_short),\n",
592+
" xytext=(3, 4.5),\n",
593+
" arrowprops={\"arrowstyle\": \"->\", \"color\": \"red\", \"lw\": 1},\n",
594+
" fontsize=9,\n",
595+
" color=\"red\",\n",
596+
")\n",
597+
"# Long-axis surface point (right)\n",
598+
"ax.plot(R_long, 0, marker=\"o\", color=\"red\", markersize=8, zorder=5)\n",
599+
"ax.annotate(\n",
600+
" \"surface point on long axis\\n(distance = R_max)\",\n",
601+
" xy=(R_long, 0),\n",
602+
" xytext=(R_long + 1.5, -3),\n",
603+
" arrowprops={\"arrowstyle\": \"->\", \"color\": \"red\", \"lw\": 1},\n",
604+
" fontsize=9,\n",
605+
" color=\"red\",\n",
606+
")\n",
607+
"\n",
608+
"# Annotations for the three regions\n",
609+
"ax.text(\n",
610+
" 0,\n",
611+
" -1.2,\n",
612+
" \"guaranteed bulk\\n(< R_min − buffer)\\n← skipped →\",\n",
613+
" ha=\"center\",\n",
614+
" va=\"top\",\n",
615+
" fontsize=9,\n",
616+
" color=\"black\",\n",
617+
" bbox={\"facecolor\": \"white\", \"edgecolor\": \"orange\", \"boxstyle\": \"round\"},\n",
618+
")\n",
619+
"ax.text(\n",
620+
" 0,\n",
621+
" outer + 0.3,\n",
622+
" \"shell band [inner, outer] — sampled; SAS rejects bulk-side, accepts surface-side\",\n",
623+
" ha=\"center\",\n",
624+
" fontsize=9,\n",
625+
" color=\"green\",\n",
626+
" bbox={\"facecolor\": \"white\", \"edgecolor\": \"green\", \"boxstyle\": \"round\"},\n",
627+
")\n",
628+
"ax.text(\n",
629+
" 0,\n",
630+
" 11.2,\n",
631+
" \"outside the envelope (> R_max + reach) — not sampled\",\n",
632+
" ha=\"center\",\n",
633+
" fontsize=9,\n",
634+
" color=\"steelblue\",\n",
635+
" bbox={\"facecolor\": \"white\", \"edgecolor\": \"steelblue\", \"boxstyle\": \"round\"},\n",
636+
")\n",
637+
"\n",
638+
"ax.set_xlim(-12, 12)\n",
639+
"ax.set_ylim(-8, 12)\n",
640+
"ax.set_aspect(\"equal\")\n",
641+
"ax.set_title(\n",
642+
" \"Why Shell with `inner = R_min − buffer` doesn't miss surface points\", fontsize=11\n",
643+
")\n",
644+
"ax.grid(True, alpha=0.3)\n",
645+
"ax.axhline(0, color=\"lightgray\", lw=0.5)\n",
646+
"ax.axvline(0, color=\"lightgray\", lw=0.5)\n",
647+
"plt.tight_layout()\n",
648+
"plt.show()"
649+
]
650+
},
410651
{
411652
"cell_type": "markdown",
412653
"id": "b7cfe3a9",
413654
"metadata": {},
414655
"source": [
415-
"## 4. Auto-sized envelopes per shape\n",
656+
"## 5. Auto-sized envelopes per shape\n",
416657
"\n",
417658
"`compute_envelope_dims` for each shape with reach=10 Å; envelope outline overlaid.\n"
418659
]
@@ -506,7 +747,7 @@
506747
"id": "647ebffb",
507748
"metadata": {},
508749
"source": [
509-
"## 5. Surface rejection in action\n",
750+
"## 6. Surface rejection in action\n",
510751
"\n",
511752
"Build an `Excluder` (probe=1.4 Å) and draw 2000 raw envelope samples per shape.\n",
512753
"Color accepted (green) vs rejected (red).\n"
@@ -577,7 +818,7 @@
577818
"id": "54b5237a",
578819
"metadata": {},
579820
"source": [
580-
"## 6. Per-shape sampling efficiency\n",
821+
"## 7. Per-shape sampling efficiency\n",
581822
"\n",
582823
"Bar chart: rejection rate by shape (lower = better) and µs per accepted sample.\n"
583824
]
@@ -644,7 +885,7 @@
644885
"id": "9551bc67",
645886
"metadata": {},
646887
"source": [
647-
"## 7. Probe + reach sweeps\n",
888+
"## 8. Probe + reach sweeps\n",
648889
"\n",
649890
"How does the accessible region change with `probe` ∈ {0.5, 1.4, 2.5} and\n",
650891
"`reach` ∈ {5, 10, 20}? Top row: vary probe; bottom row: vary reach.\n"
@@ -706,7 +947,7 @@
706947
"id": "a66fdb18",
707948
"metadata": {},
708949
"source": [
709-
"## 8. Failure mode: fully buried envelope\n",
950+
"## 9. Failure mode: fully buried envelope\n",
710951
"\n",
711952
"If the envelope is too small / too deep, no clear point exists. The sampler\n",
712953
"should raise `SamplingError` rather than infinite-loop.\n"
@@ -752,7 +993,7 @@
752993
"id": "53e36416",
753994
"metadata": {},
754995
"source": [
755-
"## 9. Old (20 Å cube) vs new behavior\n",
996+
"## 10. Old (20 Å cube) vs new behavior\n",
756997
"\n",
757998
"Plot the previous 20 Å cube around the protein next to the new auto-sized\n",
758999
"shell + surface rejection. The old default mostly samples the protein\n",
@@ -848,4 +1089,4 @@
8481089
},
8491090
"nbformat": 4,
8501091
"nbformat_minor": 5
851-
}
1092+
}

0 commit comments

Comments
 (0)