Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion python/gtsam/examples/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@
| SimpleRotation | :heavy_check_mark: |
| SolverComparer | |
| StereoVOExample | :heavy_check_mark: |
| StereoVOExample_large | |
| StereoVOExample_large | :heavy_check_mark: |
| TimeTBB | |
| UGM_chain | discrete functionality not yet exposed |
| UGM_small | discrete functionality not yet exposed |
Expand Down
176 changes: 108 additions & 68 deletions python/gtsam/examples/StereoVOExample.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -7,23 +7,31 @@
"source": [
"# Stereo Visual Odometry with GTSAM\n",
"\n",
"<a href=\"https://colab.research.google.com/github/borglab/gtsam/blob/develop/python/gtsam/examples/StereoVOExample.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This notebook demonstrates a simple 3D stereo visual odometry problem using the GTSAM Python wrapper. The scenario is as follows:\n",
"\n",
"1. A robot starts at the origin (Pose 1: `X(1)`).\n",
"2. The robot moves approximately 1 meter forward (Pose 2: `X(2)`).\n",
"2. The robot moves approximately one meter forward (Pose 2: `X(2)`).\n",
"3. From both poses, the robot observes three landmarks (`L(1)`, `L(2)`, `L(3)`) with a stereo camera.\n",
"\n",
"We will:\n",
"* Define camera calibration and noise models.\n",
"* Create a factor graph representing the problem.\n",
"* Provide initial estimates for poses and landmark positions.\n",
"* Optimize the graph using Levenberg-Marquardt to find the most probable configuration.\n"
"- Define camera calibration and noise models.\n",
"- Create a factor graph representing the problem.\n",
"- Provide initial estimates for poses and landmark positions.\n",
"- Optimize the graph using Levenberg-Marquardt to find the most probable configuration.\n",
"\n",
"**Note:** This example is also available in C++. If you're interested in the C++ implementation, refer to `StereoVOExample.cpp` in the [GTSAM examples directory](https://github.com/borglab/gtsam/tree/develop/examples)."
]
},
{
"cell_type": "markdown",
"id": "e33e553a",
"metadata": {},
"metadata": {
"tags": [
"remove-cell"
]
},
"source": [
"GTSAM Copyright 2010-2025, Georgia Tech Research Corporation,\n",
"Atlanta, Georgia 30332-0415\n",
Expand All @@ -35,25 +43,49 @@
]
},
{
"cell_type": "markdown",
"id": "6af0f339",
"metadata": {},
"cell_type": "code",
"execution_count": 1,
"id": "43bbbfe4",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Not running in Google Colab. Please install necessary libraries as needed.\n"
]
}
],
"source": [
"<a href=\"https://colab.research.google.com/github/gtsam-project/gtsam-examples/blob/main/python/StereoVOExample_py.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
"try:\n",
" import google.colab\n",
" print(\"Running in Google Colab, installing necessary libraries...\")\n",
" %pip install --quiet gtsam-develop\n",
" print(\"Done!\")\n",
"except ImportError:\n",
" print(\"Not running in Google Colab. Please install necessary libraries as needed.\")\n",
" pass"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "ac25f0ee",
"metadata": {},
"execution_count": 2,
"id": "778ceb66",
"metadata": {
"tags": [
"hide-cell"
]
},
"outputs": [],
"source": [
"import gtsam\n",
"import numpy as np\n",
"\n",
"# For shorthand for common GTSAM types (like X for poses, L for landmarks)\n",
"from gtsam.symbol_shorthand import X, L\n"
"from gtsam.symbol_shorthand import X, L"
]
},
{
Expand All @@ -63,14 +95,12 @@
"source": [
"## 1. Setup Factor Graph and Priors\n",
"\n",
"We start by creating an empty `NonlinearFactorGraph`.\n",
"\n",
"The first pose `X(1)` is assumed to be at the origin. We'll add a hard constraint (prior) for this using `NonlinearEqualityPose3`. In GTSAM, factor keys are typically integers. We use `X(1)` as a symbolic key for the first pose.\n"
"We start by creating an empty `NonlinearFactorGraph`. The first pose `X(1)` is assumed to be at the world origin. To reflect this, we add a hard constraint using `NonlinearEqualityPose3`, which fixes `X(1)` to the identity pose. This ensures that the optimization has a known reference point for the rest of the variables. In GTSAM, factor keys are typically integers, and `X(1)` is a symbolic shorthand for such a key.\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 3,
"id": "342a585f",
"metadata": {},
"outputs": [
Expand All @@ -89,9 +119,7 @@
"# Define the first pose at the origin\n",
"first_pose = gtsam.Pose3() # Default constructor is identity\n",
"\n",
"# Add a prior on the first pose (X1). This constraint implies X1 is fixed at the origin.\n",
"# NonlinearEqualityPose3 is a factor that enforces X(1) == first_pose.\n",
"# It effectively acts as a prior with infinite precision (zero noise).\n",
"# Add a prior constraint on X(1)\n",
"graph.add(gtsam.NonlinearEqualityPose3(X(1), first_pose))\n",
"\n",
"print(\"Graph size after adding prior:\", graph.size())\n"
Expand All @@ -104,26 +132,42 @@
"source": [
"## 2. Define Camera Calibration and Measurement Noise\n",
"\n",
"- **Noise Model:** We define an isotropic noise model for the stereo measurements. `Isotropic.Sigma(3, 1.0)` means a 3-dimensional noise (for $u_L, u_R, v$) with a standard deviation of 1 pixel for each dimension.\n",
"To accurately model the stereo observations, we must define both the measurement noise model and the intrinsic calibration of the stereo camera system.\n",
"- **Measurement Noise:** We assume an isotropic Gaussian noise model for the stereo measurements. Since each observation consists of three pixel values $(u_L, u_R, v)$, we use `Isotropic.Sigma(3, 1.0)` to define a noise model with unit standard deviation in each of the three dimensions.\n",
"- **Camera Calibration ([`Cal3_S2Stereo`](/gtsam/geometry/doc/Cal3_S2Stereo.ipynb)):** We define a stereo calibration model with the following properties:\n",
" - $f_x=f_y=1000$\n",
" - $s=0$\n",
" - $(u,v)=(320,\\,240)$\n",
" - $b=0.2$"
" - Focal lengths: $f_x=f_y=1000$\n",
" - Skew: $s=0$\n",
" - Principal point: $(u,v)=(320,\\,240)$\n",
" - Baseline: $b=0.2$\n",
" \n",
"These parameters represent a synthetic stereo camera setup for this minimal example."
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 4,
"id": "e04b26c7",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"K: 1000 0 320\n",
" 0 1000 240\n",
" 0 0 1\n",
"Baseline: 0.2\n",
"\n"
]
}
],
"source": [
"# Define the stereo measurement noise model (isotropic, 1 pixel standard deviation)\n",
"measurement_noise = gtsam.noiseModel.Isotropic.Sigma(3, 1.0)\n",
"\n",
"# Create stereo camera calibration object\n",
"K = gtsam.Cal3_S2Stereo(1000, 1000, 0, 320, 240, 0.2)"
"K = gtsam.Cal3_S2Stereo(1000, 1000, 0, 320, 240, 0.2)\n",
"print(K)\n",
"\n",
"# Define the stereo measurement noise model (isotropic, 1 pixel standard deviation)\n",
"measurement_noise = gtsam.noiseModel.Isotropic.Sigma(3, 1.0)"
]
},
{
Expand All @@ -136,19 +180,17 @@
"We now add stereo factors to the graph. Each factor connects a camera pose to a landmark.\n",
"The `GenericStereoFactor3D` takes:\n",
"* `measured`: The `StereoPoint2` measurement $(u_L, u_R, v)$.\n",
"* `model`: The noise model for the measurement.\n",
"* `model`: The noise model for the pixel measurement.\n",
"* `poseKey`: Key for the camera pose.\n",
"* `landmarkKey`: Key for the landmark.\n",
"* `K`: The stereo camera calibration.\n",
"\n",
"Landmark keys in the C++ example are 3, 4, 5. We'll use `L(1)`, `L(2)`, `L(3)` to map to these.\n",
"\n",
"**Measurements from Pose 1 (`X(1)`)**\n"
"We'll use `L(1)`, `L(2)`, `L(3)` to represent our three landmarks."
]
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 5,
"id": "e9778bec",
"metadata": {},
"outputs": [
Expand All @@ -162,28 +204,22 @@
],
"source": [
"# Factors from X(1) to landmarks\n",
"# StereoPoint2(uL, uR, v)\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(520, 480, 440), measurement_noise, X(1), L(1), K))\n",
" gtsam.StereoPoint2(520, 480, 440), measurement_noise, X(1), L(1), K\n",
"))\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(120, 80, 440), measurement_noise, X(1), L(2), K))\n",
" gtsam.StereoPoint2(120, 80, 440), measurement_noise, X(1), L(2), K\n",
"))\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(320, 280, 140), measurement_noise, X(1), L(3), K))\n",
" gtsam.StereoPoint2(320, 280, 140), measurement_noise, X(1), L(3), K\n",
"))\n",
"\n",
"print(\"Graph size after adding X(1) factors:\", graph.size())\n"
]
},
{
"cell_type": "markdown",
"id": "d0160cf1",
"metadata": {},
"source": [
"**Measurements from Pose 2 (`X(2)`)**\n"
]
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 6,
"id": "495fc44f",
"metadata": {},
"outputs": [
Expand All @@ -198,11 +234,14 @@
"source": [
"# Factors from X(2) to landmarks\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(570, 520, 490), measurement_noise, X(2), L(1), K))\n",
" gtsam.StereoPoint2(570, 520, 490), measurement_noise, X(2), L(1), K\n",
"))\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(70, 20, 490), measurement_noise, X(2), L(2), K))\n",
" gtsam.StereoPoint2(70, 20, 490), measurement_noise, X(2), L(2), K\n",
"))\n",
"graph.add(gtsam.GenericStereoFactor3D(\n",
" gtsam.StereoPoint2(320, 270, 115), measurement_noise, X(2), L(3), K))\n",
" gtsam.StereoPoint2(320, 270, 115), measurement_noise, X(2), L(3), K\n",
"))\n",
"\n",
"print(\"Graph size after adding all factors:\", graph.size())"
]
Expand All @@ -214,17 +253,15 @@
"source": [
"## 4. Create Initial Estimates\n",
"\n",
"Nonlinear optimization requires initial estimates for all variables (poses and landmarks).\n",
"We create a `Values` object to store these.\n",
"\n",
"* `X(1)`: At origin (matches the prior).\n",
"* `X(2)`: Robot moves forward, slightly off-axis: `Pose3(Rot3(), Point3(0.1, -0.1, 1.1))`.\n",
"* Landmarks (`L(1)`, `L(2)`, `L(3)` or `3,4,5`): Placed at estimated 3D positions.\n"
"Nonlinear optimization requires initial estimates for all variables (poses and landmarks). We create a `Values` object named `initial_estimate` to store these. To see the effects of optimization in the later sections, we introduce a small error when inserting the initial estimate of pose `X(2)`. In summary, this is what we insert into `initial_estimate`:\n",
"* Pose `X(1)`: Identity pose at the origin.\n",
"* Pose `X(2)`: Camera pose after the robot moves forward by one meter, slightly off-axis and with a small overshoot.\n",
"* Landmarks (`L(1)`, `L(2)`, `L(3)`): Placed at estimated 3D positions.\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 7,
"id": "b4be6ab2",
"metadata": {},
"outputs": [],
Expand Down Expand Up @@ -258,7 +295,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 8,
"id": "d0c3a87b",
"metadata": {},
"outputs": [],
Expand All @@ -277,18 +314,21 @@
"source": [
"## 6. Analyze Results\n",
"\n",
"We can extract the optimized poses and landmark positions from the `result` Values object.\n",
"After optimization, we extract the refined poses and landmark positions from the `result` returned by the optimizer.\n",
"\n",
"Recall that in this example, the robot begins at the origin, and we expect the second pose `X(2)` to be apprixmately one meter forward along the Z-axis, so `X(2)` should ideally be near `Pose3(Rot3(), Point3(0, 0, 1.0))`.\n",
"\n",
"For example, the true second pose (if the robot moved exactly 1m forward along Z from origin) would be `Pose3(Rot3(), Point3(0, 0, 1.0))`.\n",
"The true landmark positions can be triangulated from the perfect measurements and true poses.\n",
"The example's measurements and initial estimates are designed to be somewhat noisy/imperfect, so the optimizer refines them.\n",
"GTSAM minimizes the total reprojection error (i.e. the discrepancy between observed and predicted stereo image coordinates) across all factors. We can evaluate this using `graph.error()` to compute the total error before and after optimization. A significant drop in this error indicates that the optimizer successfully refined the estimates.\n",
"\n",
"Let's check the error before and after optimization.\n"
"To better understand the outcome of this example, we print the error, both prior- and post-optimization, along with the optimized poses and landmark positions. This helps verify:\n",
"- That the optimizer is successful in refining the estimates.\n",
"- That the second pose `X(2)` aligns with the expected forward motion.\n",
"- That the landmark locations are consistent and reasonable."
]
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 9,
"id": "093fe3dc",
"metadata": {},
"outputs": [
Expand Down
Loading
Loading