borglab · dellaert · Jul 15, 2025 · Jul 8, 2025 · Jul 9, 2025 · Jul 9, 2025
diff --git a/python/gtsam/examples/README.md b/python/gtsam/examples/README.md
@@ -49,7 +49,7 @@
 | SimpleRotation                                        | :heavy_check_mark: |
 | SolverComparer                                        |        |
 | StereoVOExample                                       | :heavy_check_mark: |
-| StereoVOExample_large                                 |        |
+| StereoVOExample_large                                 | :heavy_check_mark: |
 | TimeTBB                                               |        |
 | UGM_chain                                             | discrete functionality not yet exposed |
 | UGM_small                                             | discrete functionality not yet exposed |

diff --git a/python/gtsam/examples/StereoVOExample.ipynb b/python/gtsam/examples/StereoVOExample.ipynb
@@ -7,23 +7,31 @@
       "source": [
         "# Stereo Visual Odometry with GTSAM\n",
         "\n",
+        "<a href=\"https://colab.research.google.com/github/borglab/gtsam/blob/develop/python/gtsam/examples/StereoVOExample.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
+        "\n",
         "This notebook demonstrates a simple 3D stereo visual odometry problem using the GTSAM Python wrapper. The scenario is as follows:\n",
         "\n",
         "1.  A robot starts at the origin (Pose 1: `X(1)`).\n",
-        "2.  The robot moves approximately 1 meter forward (Pose 2: `X(2)`).\n",
+        "2.  The robot moves approximately one meter forward (Pose 2: `X(2)`).\n",
         "3.  From both poses, the robot observes three landmarks (`L(1)`, `L(2)`, `L(3)`) with a stereo camera.\n",
         "\n",
         "We will:\n",
-        "*   Define camera calibration and noise models.\n",
-        "*   Create a factor graph representing the problem.\n",
-        "*   Provide initial estimates for poses and landmark positions.\n",
-        "*   Optimize the graph using Levenberg-Marquardt to find the most probable configuration.\n"
+        "- Define camera calibration and noise models.\n",
+        "- Create a factor graph representing the problem.\n",
+        "- Provide initial estimates for poses and landmark positions.\n",
+        "- Optimize the graph using Levenberg-Marquardt to find the most probable configuration.\n",
+        "\n",
+        "**Note:** This example is also available in C++. If you're interested in the C++ implementation, refer to `StereoVOExample.cpp` in the [GTSAM examples directory](https://github.com/borglab/gtsam/tree/develop/examples)."
       ]
     },
     {
       "cell_type": "markdown",
       "id": "e33e553a",
-      "metadata": {},
+      "metadata": {
+        "tags": [
+          "remove-cell"
+        ]
+      },
       "source": [
         "GTSAM Copyright 2010-2025, Georgia Tech Research Corporation,\n",
         "Atlanta, Georgia 30332-0415\n",
@@ -35,25 +43,49 @@
       ]
     },
     {
-      "cell_type": "markdown",
-      "id": "6af0f339",
-      "metadata": {},
+      "cell_type": "code",
+      "execution_count": 1,
+      "id": "43bbbfe4",
+      "metadata": {
+        "tags": [
+          "remove-cell"
+        ]
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Not running in Google Colab. Please install necessary libraries as needed.\n"
+          ]
+        }
+      ],
       "source": [
-        "<a href=\"https://colab.research.google.com/github/gtsam-project/gtsam-examples/blob/main/python/StereoVOExample_py.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+        "try:\n",
+        "    import google.colab\n",
+        "    print(\"Running in Google Colab, installing necessary libraries...\")\n",
+        "    %pip install --quiet gtsam-develop\n",
+        "    print(\"Done!\")\n",
+        "except ImportError:\n",
+        "    print(\"Not running in Google Colab. Please install necessary libraries as needed.\")\n",
+        "    pass"
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 19,
-      "id": "ac25f0ee",
-      "metadata": {},
+      "execution_count": 2,
+      "id": "778ceb66",
+      "metadata": {
+        "tags": [
+          "hide-cell"
+        ]
+      },
       "outputs": [],
       "source": [
         "import gtsam\n",
-        "import numpy as np\n",
         "\n",
         "# For shorthand for common GTSAM types (like X for poses, L for landmarks)\n",
-        "from gtsam.symbol_shorthand import X, L\n"
+        "from gtsam.symbol_shorthand import X, L"
       ]
     },
     {
@@ -63,14 +95,12 @@
       "source": [
         "## 1. Setup Factor Graph and Priors\n",
         "\n",
-        "We start by creating an empty `NonlinearFactorGraph`.\n",
-        "\n",
-        "The first pose `X(1)` is assumed to be at the origin. We'll add a hard constraint (prior) for this using `NonlinearEqualityPose3`. In GTSAM, factor keys are typically integers. We use `X(1)` as a symbolic key for the first pose.\n"
+        "We start by creating an empty `NonlinearFactorGraph`. The first pose `X(1)` is assumed to be at the world origin. To reflect this, we add a hard constraint using `NonlinearEqualityPose3`, which fixes `X(1)` to the identity pose. This ensures that the optimization has a known reference point for the rest of the variables. In GTSAM, factor keys are typically integers, and `X(1)` is a symbolic shorthand for such a key.\n"
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 20,
+      "execution_count": 3,
       "id": "342a585f",
       "metadata": {},
       "outputs": [
@@ -89,9 +119,7 @@
         "# Define the first pose at the origin\n",
         "first_pose = gtsam.Pose3() # Default constructor is identity\n",
         "\n",
-        "# Add a prior on the first pose (X1). This constraint implies X1 is fixed at the origin.\n",
-        "# NonlinearEqualityPose3 is a factor that enforces X(1) == first_pose.\n",
-        "# It effectively acts as a prior with infinite precision (zero noise).\n",
+        "# Add a prior constraint on X(1)\n",
         "graph.add(gtsam.NonlinearEqualityPose3(X(1), first_pose))\n",
         "\n",
         "print(\"Graph size after adding prior:\", graph.size())\n"
@@ -104,26 +132,42 @@
       "source": [
         "## 2. Define Camera Calibration and Measurement Noise\n",
         "\n",
-        "- **Noise Model:** We define an isotropic noise model for the stereo measurements. `Isotropic.Sigma(3, 1.0)` means a 3-dimensional noise (for $u_L, u_R, v$) with a standard deviation of 1 pixel for each dimension.\n",
+        "To accurately model the stereo observations, we must define both the measurement noise model and the intrinsic calibration of the stereo camera system.\n",
+        "- **Measurement Noise:** We assume an isotropic Gaussian noise model for the stereo measurements. Since each observation consists of three pixel values $(u_L, u_R, v)$, we use `Isotropic.Sigma(3, 1.0)` to define a noise model with unit standard deviation in each of the three dimensions.\n",
         "- **Camera Calibration ([`Cal3_S2Stereo`](/gtsam/geometry/doc/Cal3_S2Stereo.ipynb)):** We define a stereo calibration model with the following properties:\n",
-        "  - $f_x=f_y=1000$\n",
-        "  - $s=0$\n",
-        "  - $(u,v)=(320,\\,240)$\n",
-        "  - $b=0.2$"
+        "  - Focal lengths: $f_x=f_y=1000$\n",
+        "  - Skew: $s=0$\n",
+        "  - Principal point: $(u,v)=(320,\\,240)$\n",
+        "  - Baseline: $b=0.2$\n",
+        "  \n",
+        "These parameters represent a synthetic stereo camera setup for this minimal example."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 21,
+      "execution_count": 4,
       "id": "e04b26c7",
       "metadata": {},
-      "outputs": [],
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "K: 1000    0  320\n",
+            "   0 1000  240\n",
+            "   0    0    1\n",
+            "Baseline: 0.2\n",
+            "\n"
+          ]
+        }
+      ],
       "source": [
-        "# Define the stereo measurement noise model (isotropic, 1 pixel standard deviation)\n",
-        "measurement_noise = gtsam.noiseModel.Isotropic.Sigma(3, 1.0)\n",
-        "\n",
         "# Create stereo camera calibration object\n",
-        "K = gtsam.Cal3_S2Stereo(1000, 1000, 0, 320, 240, 0.2)"
+        "K = gtsam.Cal3_S2Stereo(1000, 1000, 0, 320, 240, 0.2)\n",
+        "print(K)\n",
+        "\n",
+        "# Define the stereo measurement noise model (isotropic, 1 pixel standard deviation)\n",
+        "measurement_noise = gtsam.noiseModel.Isotropic.Sigma(3, 1.0)"
       ]
     },
     {
@@ -136,19 +180,17 @@
         "We now add stereo factors to the graph. Each factor connects a camera pose to a landmark.\n",
         "The `GenericStereoFactor3D` takes:\n",
         "*   `measured`: The `StereoPoint2` measurement $(u_L, u_R, v)$.\n",
-        "*   `model`: The noise model for the measurement.\n",
+        "*   `model`: The noise model for the pixel measurement.\n",
         "*   `poseKey`: Key for the camera pose.\n",
         "*   `landmarkKey`: Key for the landmark.\n",
         "*   `K`: The stereo camera calibration.\n",
         "\n",
-        "Landmark keys in the C++ example are 3, 4, 5. We'll use `L(1)`, `L(2)`, `L(3)` to map to these.\n",
-        "\n",
-        "**Measurements from Pose 1 (`X(1)`)**\n"
+        "We'll use `L(1)`, `L(2)`, `L(3)` to represent our three landmarks."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 22,
+      "execution_count": 5,
       "id": "e9778bec",
       "metadata": {},
       "outputs": [
@@ -162,28 +204,22 @@
       ],
       "source": [
         "# Factors from X(1) to landmarks\n",
-        "# StereoPoint2(uL, uR, v)\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(520, 480, 440), measurement_noise, X(1), L(1), K))\n",
+        "    gtsam.StereoPoint2(520, 480, 440), measurement_noise, X(1), L(1), K\n",
+        "))\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(120, 80, 440), measurement_noise, X(1), L(2), K))\n",
+        "    gtsam.StereoPoint2(120, 80, 440), measurement_noise, X(1), L(2), K\n",
+        "))\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(320, 280, 140), measurement_noise, X(1), L(3), K))\n",
+        "    gtsam.StereoPoint2(320, 280, 140), measurement_noise, X(1), L(3), K\n",
+        "))\n",
         "\n",
         "print(\"Graph size after adding X(1) factors:\", graph.size())\n"
       ]
     },
-    {
-      "cell_type": "markdown",
-      "id": "d0160cf1",
-      "metadata": {},
-      "source": [
-        "**Measurements from Pose 2 (`X(2)`)**\n"
-      ]
-    },
     {
       "cell_type": "code",
-      "execution_count": 23,
+      "execution_count": 6,
       "id": "495fc44f",
       "metadata": {},
       "outputs": [
@@ -198,11 +234,14 @@
       "source": [
         "# Factors from X(2) to landmarks\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(570, 520, 490), measurement_noise, X(2), L(1), K))\n",
+        "    gtsam.StereoPoint2(570, 520, 490), measurement_noise, X(2), L(1), K\n",
+        "))\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(70, 20, 490), measurement_noise, X(2), L(2), K))\n",
+        "    gtsam.StereoPoint2(70, 20, 490), measurement_noise, X(2), L(2), K\n",
+        "))\n",
         "graph.add(gtsam.GenericStereoFactor3D(\n",
-        "    gtsam.StereoPoint2(320, 270, 115), measurement_noise, X(2), L(3), K))\n",
+        "    gtsam.StereoPoint2(320, 270, 115), measurement_noise, X(2), L(3), K\n",
+        "))\n",
         "\n",
         "print(\"Graph size after adding all factors:\", graph.size())"
       ]
@@ -214,17 +253,15 @@
       "source": [
         "## 4. Create Initial Estimates\n",
         "\n",
-        "Nonlinear optimization requires initial estimates for all variables (poses and landmarks).\n",
-        "We create a `Values` object to store these.\n",
-        "\n",
-        "*   `X(1)`: At origin (matches the prior).\n",
-        "*   `X(2)`: Robot moves forward, slightly off-axis: `Pose3(Rot3(), Point3(0.1, -0.1, 1.1))`.\n",
-        "*   Landmarks (`L(1)`, `L(2)`, `L(3)` or `3,4,5`): Placed at estimated 3D positions.\n"
+        "Nonlinear optimization requires initial estimates for all variables (poses and landmarks). We create a `Values` object named `initial_estimate` to store these. To see the effects of optimization in the later sections, we introduce a small error when inserting the initial estimate of pose `X(2)`. In summary, this is what we insert into `initial_estimate`:\n",
+        "*   Pose `X(1)`: Identity pose at the origin.\n",
+        "*   Pose `X(2)`: Camera pose after the robot moves forward by one meter, slightly off-axis and with a small overshoot.\n",
+        "*   Landmarks (`L(1)`, `L(2)`, `L(3)`): Placed at estimated 3D positions.\n"
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 24,
+      "execution_count": 7,
       "id": "b4be6ab2",
       "metadata": {},
       "outputs": [],
@@ -258,7 +295,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 25,
+      "execution_count": 8,
       "id": "d0c3a87b",
       "metadata": {},
       "outputs": [],
@@ -277,18 +314,21 @@
       "source": [
         "## 6. Analyze Results\n",
         "\n",
-        "We can extract the optimized poses and landmark positions from the `result` Values object.\n",
+        "After optimization, we extract the refined poses and landmark positions from the `result` returned by the optimizer.\n",
+        "\n",
+        "Recall that in this example, the robot begins at the origin, and we expect the second pose `X(2)` to be apprixmately one meter forward along the Z-axis, so `X(2)` should ideally be near `Pose3(Rot3(), Point3(0, 0, 1.0))`.\n",
         "\n",
-        "For example, the true second pose (if the robot moved exactly 1m forward along Z from origin) would be `Pose3(Rot3(), Point3(0, 0, 1.0))`.\n",
-        "The true landmark positions can be triangulated from the perfect measurements and true poses.\n",
-        "The example's measurements and initial estimates are designed to be somewhat noisy/imperfect, so the optimizer refines them.\n",
+        "GTSAM minimizes the total reprojection error (i.e. the discrepancy between observed and predicted stereo image coordinates) across all factors. We can evaluate this using `graph.error()` to compute the total error before and after optimization. A significant drop in this error indicates that the optimizer successfully refined the estimates.\n",
         "\n",
-        "Let's check the error before and after optimization.\n"
+        "To better understand the outcome of this example, we print the error, both prior- and post-optimization, along with the optimized poses and landmark positions. This helps verify:\n",
+        "- That the optimizer is successful in refining the estimates.\n",
+        "- That the second pose `X(2)` aligns with the expected forward motion.\n",
+        "- That the landmark locations are consistent and reasonable."
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 26,
+      "execution_count": 9,
       "id": "093fe3dc",
       "metadata": {},
       "outputs": [