Fix Mplot3d warning formatting #168

Open · wants to merge 2 commits into base: master
33 changes: 17 additions & 16 deletions README.md
@@ -9,7 +9,7 @@ Author: **[Luigi Freda](https://www.luigifreda.com)**
- A **[volumetric reconstruction pipeline](#volumetric-reconstruction)** that processes depth and color images using volumetric integration to produce dense reconstructions. It supports **TSDF** with voxel hashing and incremental **Gaussian Splatting**.
- Integration of **[depth prediction models](#depth-prediction)** within the SLAM pipeline. These include DepthPro, DepthAnythingV2, RAFT-Stereo, CREStereo, etc.
- Additional tools for VO (Visual Odometry) and SLAM, with built-in support for both **g2o** and **GTSAM**, along with custom Python bindings for features not included in the original libraries.
- Built-in support for over 10 [dataset types](#datasets).
- Built-in support for over [10 dataset types](#datasets).

pySLAM serves as a flexible baseline framework to experiment with VO/SLAM techniques, *[local features](#supported-local-features)*, *[descriptor aggregators](#supported-global-descriptors-and-local-descriptor-aggregation-methods)*, *[global descriptors](#supported-global-descriptors-and-local-descriptor-aggregation-methods)*, *[volumetric integration](#volumetric-reconstruction-pipeline)* and *[depth prediction](#depth-prediction)*. It allows you to explore, prototype, and develop VO/SLAM pipelines. pySLAM is a research framework and a work in progress. It is not optimized for real-time performance.

@@ -122,7 +122,7 @@ Other test/example scripts are provided in the `test` folder.

### System overview

[Here](./docs/system_overview.md) you can find a couple of diagram sketches that provide an overview of the main SLAM **workflow**, system **components**, and **classes** relationships/dependencies.
[Here](./docs/system_overview.md) you can find a system overview and diagrams that outline the main SLAM **workflow**, its **components**, and **classes** relationships/dependencies.


---
@@ -806,22 +806,23 @@ If you like pySLAM and would like to contribute to the code base, you can report

Many improvements and additional features are currently under development:

- [x] loop closing
- [x] relocalization
- [x] stereo and RGBD support
- [x] map saving/loading
- [x] modern DL matching algorithms
- [ ] object detection
- [ ] semantic segmentation
- [x] Loop closing
- [x] Relocalization
- [x] Stereo and RGBD support
- [x] Map saving/loading
- [x] Modern DL matching algorithms
- [ ] Object detection
- [ ] Semantic segmentation
- [x] 3D dense reconstruction
- [x] unified install procedure (single branch) for all OSs
- [x] trajectory saving
- [x] depth prediction integration, more models: VGGT, MoGE [WIP]
- [x] Unified install procedure (single branch) for all OSs
- [x] Trajectory saving
- [x] Depth prediction integration, more models: VGGT, MoGE [WIP]
- [x] ROS support [WIP]
- [x] gaussian splatting integration
- [x] documentation [WIP]
- [x] gtsam integration [WIP]
- [x] Gaussian splatting integration
- [x] Documentation [WIP]
- [x] GTSAM integration [WIP]
- [ ] IMU integration
- [ ] LIDAR integration
- [x] XSt3r-based methods integration [WIP]
- [x] evaluation scripts
- [x] Evaluation scripts
- [ ] More camera models
12 changes: 6 additions & 6 deletions config.yaml
@@ -1,12 +1,12 @@
DATASET:
# select your dataset (uncomment only one of the following lines)
#type: EUROC_DATASET
type: KITTI_DATASET
#type: KITTI_DATASET
#type: TUM_DATASET
#type: ICL_NUIM_DATASET
#type: REPLICA_DATASET
#type: TARTANAIR_DATASET
#type: VIDEO_DATASET
type: VIDEO_DATASET
#type: ROS1BAG_DATASET
#type: ROS2BAG_DATASET
#type: FOLDER_DATASET
@@ -33,17 +33,17 @@ TUM_DATASET:
sensor_type: rgbd # Here, 'sensor_type' can be 'mono' or 'rgbd'
base_path: /home/luigi/Work/datasets/rgbd_datasets/tum
#
#name: rgbd_dataset_freiburg3_long_office_household
#settings: settings/TUM3.yaml # do not forget to correctly set the corresponding camera settings file
name: rgbd_dataset_freiburg3_long_office_household
settings: settings/TUM3.yaml # do not forget to correctly set the corresponding camera settings file
#
# name: rgbd_dataset_freiburg1_xyz
# settings: settings/TUM1.yaml # do not forget to correctly set the corresponding camera settings file
#
#name: rgbd_dataset_freiburg2_desk
#settings: settings/TUM2.yaml # do not forget to correctly set the corresponding camera settings file
#
name: rgbd_dataset_freiburg1_desk
settings: settings/TUM1.yaml # do not forget to correctly set the corresponding camera settings file
# name: rgbd_dataset_freiburg1_desk
# settings: settings/TUM1.yaml # do not forget to correctly set the corresponding camera settings file
#
# name: rgbd_dataset_freiburg1_room # do not use this for mono, there are some in-place rotations during exploratory phases
# settings: settings/TUM1.yaml # do not forget to set the corresponding camera settings file
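As a quick illustration of how this YAML might be consumed, the helper below reads the `DATASET` section and the per-dataset block it points to. pySLAM has its own configuration classes, so this is only a hypothetical sketch, not the project's loader.

```python
# Hypothetical sketch: read config.yaml with PyYAML and pick the selected dataset.
# pySLAM's real configuration machinery is richer than this.
import yaml

def load_dataset_config(path="config.yaml"):
    """Return the selected dataset type and its per-dataset block (e.g. TUM_DATASET)."""
    with open(path, "r") as f:
        cfg = yaml.safe_load(f)
    selected_type = cfg["DATASET"]["type"]      # e.g. "VIDEO_DATASET"
    per_dataset = cfg.get(selected_type, {})    # per-dataset blocks live at the top level
    return selected_type, per_dataset

if __name__ == "__main__":
    dtype, params = load_dataset_config()
    print(dtype, params.get("settings"))
```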
2 changes: 1 addition & 1 deletion dense/volumetric_integrator_factory.py
@@ -22,7 +22,7 @@ class VolumetricIntegratorType(SerializableEnum):
# "ASH: A Modern Framework for Parallel Spatial Hashing in 3D Perception"
GAUSSIAN_SPLATTING = 1 # Incremental Gaussian Splatting by leveraging MonoGS backend: pySLAM keyframes are passed as posed input frames to MonoGS backend.
# You need CUDA to run Gaussian Splatting.
# As for MonoGS backend, see the following paper: "Gaussian Splatting SLAM".
# As for MonoGS backend, see the paper: "Gaussian Splatting SLAM".

@staticmethod
def from_string(name: str):
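For context, a minimal sketch of the enum/`from_string` pattern shown in this hunk might look like the following. The value `TSDF = 0` and the lookup logic are assumptions; the actual `SerializableEnum` implementation in pySLAM may differ.

```python
# Minimal sketch of the pattern in the hunk above; treat details as assumptions.
from enum import Enum

class VolumetricIntegratorType(Enum):
    TSDF = 0                 # TSDF volumetric integration (assumed value)
    GAUSSIAN_SPLATTING = 1   # incremental Gaussian Splatting (requires CUDA)

    @staticmethod
    def from_string(name: str) -> "VolumetricIntegratorType":
        try:
            return VolumetricIntegratorType[name.upper()]
        except KeyError:
            raise ValueError(f"Unknown volumetric integrator type: {name}")

print(VolumetricIntegratorType.from_string("tsdf"))  # VolumetricIntegratorType.TSDF
```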
8 changes: 8 additions & 0 deletions docs/README.md
@@ -0,0 +1,8 @@
# Documentation

- A document presenting pySLAM is available [here](./tex/document.pdf).
- [System Overview](./system_overview.md)
- [Python virtual environments](./PYTHON-VIRTUAL-ENVS.md)
- [Python conda environments](./CONDA.md)
- [macOS install procedure](./MAC.md)
- [Troubleshooting](./TROUBLESHOOTING.md)
5 changes: 4 additions & 1 deletion docs/evaluations/evaluations.md
@@ -1,7 +1,10 @@
# Evaluations

The following evaluation reports show the results obtained by comparing the `baseline` preset, `ORB2` local features + `DBOW3`, against the preset `ROOT_SIFT` + `DBOW3_INDEPENDENT`, and `SUPERPOINT` + `DBOW3_INDEPENDENT`. The results have been obtained by running `./main_slam_evaluation.py` with the json configuration files contained in the subfolders `EVAL_TUM`, `EUROC` and `KITTI`, respectively.
The following evaluation reports compare the metrics obtained with the `baseline` preset, `ORB2` local features + `DBOW3`, against the presets `ROOT_SIFT` + `DBOW3_INDEPENDENT`, and `SUPERPOINT` + `DBOW3_INDEPENDENT`. The results have been obtained by running `./main_slam_evaluation.py` with the json configuration files contained in the subfolders `EVAL_TUM`, `EUROC` and `KITTI`, respectively.

The data have been obtained by running the script [main_slam_evaluation.py](../../main_slam_evaluation.py) with the json configuration files in the folder [../../evaluation/configs/](../../evaluation/configs/).

**Reports**:
- [TUM](./EVAL_TUM/report.pdf)
- [EUROC](./EVAL_EUROC/report.pdf)
- [KITTI](./EVAL_KITTI/report.pdf)
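A possible way to batch the evaluation over all configuration files is sketched below. The `--config` flag is an assumption, since the script's actual command-line interface is not shown here; check its help output first.

```python
# Hedged sketch: run main_slam_evaluation.py once per JSON config file.
# The "--config" flag is an assumption about the script's CLI.
import pathlib
import subprocess

config_dir = pathlib.Path("evaluation/configs")
for cfg in sorted(config_dir.glob("*.json")):
    print(f"Running evaluation with {cfg} ...")
    subprocess.run(["python3", "main_slam_evaluation.py", "--config", str(cfg)], check=True)
```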
62 changes: 55 additions & 7 deletions docs/system_overview.md
@@ -3,8 +3,7 @@
<!-- TOC -->

- [System Overview](#system-overview)
- [SLAM Workflow](#slam-workflow)
- [SLAM Components](#slam-components)
- [SLAM Workflow and Components](#slam-workflow-and-components)
- [Main System Components](#main-system-components)
- [Feature Tracker](#feature-tracker)
- [Feature Matcher](#feature-matcher)
@@ -14,25 +13,37 @@

<!-- /TOC -->

This document presents some diagram sketches that provide an overview of the main workflow, system components, and class relationships/dependencies. To make the diagrams more readable, some minor components and arrows have been omitted.
This document presents a system overview and diagrams that outline the main workflow, its components, and class relationships/dependencies. To make the diagrams more readable, some minor components and arrows have been omitted. A **PDF version** of this presentation (with further details) is available [here](./tex/document.pdf).

---

## SLAM Workflow
## SLAM Workflow and Components

<p align="center">
<img src="./images/slam_workflow.png" alt="SLAM Workflow" />
</p>

---
## SLAM Components
This figure illustrates the SLAM workflow, which is composed of **five main parallel processing modules**:
- *Tracking*: estimates the camera pose for each incoming frame by extracting and matching local features to the local map, followed by minimizing the reprojection error through motion-only Bundle Adjustment (BA). It includes components such as pose prediction (or relocalization), feature tracking, local map tracking, and keyframe decision-making.
- *Local Mapping*: updates and refines the local map by processing new keyframes. This involves culling redundant map points, creating new points via temporal triangulation, fusing nearby map points, performing Local BA, and pruning redundant local keyframes.
- *Loop Closing*: detects and validates loop closures to correct drift accumulated over time. Upon loop detection, it performs loop group consistency checks and geometric verification, applies corrections, and then launches Pose Graph Optimization (PGO) followed by a full Global Bundle Adjustment (GBA). Loop detection itself is delegated to a parallel process, the *Loop Detector*, which operates independently for better responsiveness and concurrency.
- *Global Bundle Adjustment*: triggered by the Loop Closing module after PGO, this step globally optimizes the trajectory and the sparse structure of the map to ensure consistency across the entire sequence.
- *Volumetric Integration*: uses the keyframes, with their estimated poses and back-projected point clouds, to reconstruct a dense 3D map of the environment. This module optionally integrates predicted depth maps and maintains a volumetric representation such as a TSDF or Gaussian Splatting-based volume.

The first four modules follow the established PTAM and ORB-SLAM paradigm. Here, the Tracking module serves as the front-end, while the remaining modules operate as part of the back-end.

In parallel, the system constructs two types of maps:
- The sparse map ${\cal M}_s = ({\cal K}, {\cal P})$, composed of a set of keyframes ${\cal K}$ and 3D points ${\cal P}$ derived from matched features.
- The volumetric/dense map ${\cal M}_v$, constructed by the Volumetric Integration module, which fuses back-projected point clouds from the keyframes ${\cal K}$ into a dense 3D model.

To ensure consistency between the sparse and volumetric representations, the volumetric map is updated or re-integrated whenever global pose adjustments occur (e.g., after loop closures).
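As a rough sketch (not pySLAM's actual classes), the sparse map notation above can be made concrete as follows; all names here are hypothetical stand-ins.

```python
# Hypothetical sketch to make the map notation concrete; pySLAM's actual
# KeyFrame/MapPoint/Map classes are much richer than this.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Keyframe:
    id: int
    pose_w_c: np.ndarray            # 4x4 camera-to-world transform

@dataclass
class MapPoint:
    id: int
    position_w: np.ndarray          # 3D point in world coordinates
    observations: list = field(default_factory=list)  # ids of observing keyframes

@dataclass
class SparseMap:                    # M_s = (K, P)
    keyframes: list                 # K
    points: list                    # P
```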

<p align="center">
<img src="./images/slam_components.png" alt="SLAM Components" />
</p>


**Note**: In some case, **Processes** were used instead of **Threads** because in Python 3.8 (used by pySLAM) the Global Interpreter Lock (GIL) allows only one thread can execute at a time within a single process. Multiprocessing avoids this limitation and enables better parallelism, though it involves data duplication via pickling. See this related nice [post](https://www.theserverside.com/blog/Coffee-Talk-Java-News-Stories-and-Opinions/Is-Pythons-GIL-the-software-worlds-biggest-blunder).
This other figure details the internal components and interactions of the above modules. In certain cases, **processes** are employed instead of **threads**. This is due to Python's Global Interpreter Lock (GIL), which prevents concurrent execution of multiple threads in a single process. The use of multiprocessing circumvents this limitation, enabling true parallelism at the cost of some inter-process communication overhead (e.g., via pickling). For an insightful discussion, see this related [post](https://www.theserverside.com/blog/Coffee-Talk-Java-News-Stories-and-Opinions/Is-Pythons-GIL-the-software-worlds-biggest-blunder).
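The snippet below is a generic illustration of this point (it is not pySLAM code): the same CPU-bound task barely speeds up with threads because of the GIL, while processes run it in parallel at the cost of pickling the inputs.

```python
# Generic GIL illustration: CPU-bound work in threads vs. processes.
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n: int) -> int:
    return sum(i * i for i in range(n))

def timed(executor_cls, workers=4, n=2_000_000):
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        list(ex.map(cpu_bound, [n] * workers))
    return time.perf_counter() - start

if __name__ == "__main__":            # required on platforms that spawn processes
    print(f"threads:   {timed(ThreadPoolExecutor):.2f}s")
    print(f"processes: {timed(ProcessPoolExecutor):.2f}s")
```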


---
Expand All @@ -45,6 +56,21 @@ This document presents some diagram sketches that provide an overview of the mai
<img src="./images/feature_tracker.png" alt="Feature Tracker" />
</p>

The *Feature Tracker* consists of the following key sub-components:

- **Feature Detector**: identifies salient and repeatable keypoints in the image, such as corners or blobs, which are likely to be robust under viewpoint and illumination changes.

- **Feature Extractor**: computes a distinctive descriptor for each detected keypoint, encoding its local appearance to enable robust matching across frames. Examples include ORB, SIFT, or SuperPoint descriptors.

- **Feature Matcher**: establishes correspondences between features in successive frames (or stereo pairs) by comparing their descriptors. Matching can be performed using brute-force, k-NN with ratio test, or learned matching strategies. Refer to Section [Feature Matcher](#feature-matcher) below for further details.

The section [Supported local features](../README.md#supported-local-features) reports the list of supported local feature extractors and detectors.

The diagram above presents the architecture of the *Feature Tracker* system. It is structured around a `feature_tracker_factory`, which instantiates specific tracker types such as `LK`, `DES_BF`, `DES_FLANN`, `XFEAT`, `LIGHTGLUE`, and `LOFTR`. Each tracker type creates a corresponding implementation (e.g., `LKFeatureTracker`, `DescriptorFeatureTracker`, etc.), all of which inherit from a common `FeatureTracker` interface.

The `FeatureTracker` class is composed of several key sub-components, including a `FeatureManager`, `FeatureDetector`, `FeatureDescriptor`, `PyramidAdaptor`, `BlockAdaptor`, and `FeatureMatcher`. The `FeatureManager` itself also encapsulates instances of the detector, descriptor, and adaptors, highlighting the modular and reusable design of the tracking pipeline.
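As a rough sketch of this factory/interface layout, the pattern looks roughly like the following. The real pySLAM constructors take many more parameters, so both the method signature and the factory table below are assumptions for illustration only.

```python
# Simplified sketch of the factory/interface layout described above; illustrative only.
from abc import ABC, abstractmethod

class FeatureTracker(ABC):
    @abstractmethod
    def track(self, image_ref, image_cur, kps_ref, des_ref):
        """Return matched keypoints/descriptors between the two images."""

class LKFeatureTracker(FeatureTracker):
    def track(self, image_ref, image_cur, kps_ref, des_ref):
        ...  # optical-flow (Lucas-Kanade) tracking

class DescriptorFeatureTracker(FeatureTracker):
    def track(self, image_ref, image_cur, kps_ref, des_ref):
        ...  # detect + describe + match descriptors

def feature_tracker_factory(tracker_type: str) -> FeatureTracker:
    trackers = {"LK": LKFeatureTracker,
                "DES_BF": DescriptorFeatureTracker,
                "DES_FLANN": DescriptorFeatureTracker}
    return trackers[tracker_type]()
```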



### Feature Matcher

@@ -53,19 +79,36 @@ This document presents some diagram sketches that provide an overview of the mai
</p>


This diagram illustrates the architecture of the *Feature Matcher* module. At its core is the `feature_matcher_factory`, which instantiates matchers based on a specified `matcher_type`, such as `BF`, `FLANN`, `XFEAT`, `LIGHTGLUE`, and `LOFTR`. Each of these creates a corresponding matcher implementation (e.g., `BfFeatureMatcher`, `FlannFeatureMatcher`, etc.), all inheriting from a common `FeatureMatcher` interface.

The `FeatureMatcher` class encapsulates several configuration parameters and components, including the matcher engine (`cv2.BFMatcher`, `FlannBasedMatcher`, `xfeat.XFeat`, etc.), as well as the `matcher_type`, `detector_type`, `descriptor_type`, `norm_type`, and `ratio_test` fields. This modular structure supports extensibility and facilitates switching between traditional and learning-based feature matching backends.

The section [Supported matchers](../README.md#supported-matchers) reports a list of supported feature matchers.
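For a concrete, self-contained flavor of one of these backends, the following snippet performs brute-force matching with a k-NN ratio test in OpenCV. It only illustrates the idea behind the `BF` + `ratio_test` configuration and is not pySLAM's `BfFeatureMatcher` implementation.

```python
# Brute-force matching with Lowe's ratio test using OpenCV (illustrative only).
import cv2

def match_orb_ratio_test(img1, img2, ratio=0.75):
    orb = cv2.ORB_create(2000)
    kps1, des1 = orb.detectAndCompute(img1, None)
    kps2, des2 = orb.detectAndCompute(img2, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)          # Hamming norm for binary descriptors
    knn = bf.knnMatch(des1, des2, k=2)
    good = [p[0] for p in knn
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return kps1, kps2, good

# usage (two grayscale frames loaded with cv2.imread(..., cv2.IMREAD_GRAYSCALE)):
# kps1, kps2, matches = match_orb_ratio_test(frame1, frame2)
```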

### Loop Detector

<p align="center">
<img src="./images/loop_detector.png" alt="Loop Detector" />
</p>

This diagram shows the architecture of the *Loop Detector* component. A central `loop_detector_factory` instantiates loop detectors based on the selected `global_descriptor_type`, which may include traditional descriptors (e.g., `DBOW2`, `VLAD`, `IBOW`) or deep learning-based embeddings (e.g., `NetVLAD`, `CosPlace`, `EigenPlaces`).

Each descriptor type creates a corresponding loop detector implementation (e.g., `LoopDetectorDBoW2`, `LoopDetectorNetVLAD`), all of which inherit from a base class hierarchy. Traditional methods inherit directly from `LoopDetectorBase`, while deep learning-based approaches inherit from `LoopDetectorVprBase`, which itself extends `LoopDetectorBase`. This design supports modular integration of diverse place recognition techniques within a unified loop closure framework.

The section [Supported loop closing methods](../README.md#supported-global-descriptors-and-local-descriptor-aggregation-methods) reports a list of supported loop closure methods with the adopted global descriptors and local descriptor aggregation methods.
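A minimal sketch of this hierarchy might look like the following; class names follow the diagram, but the bodies are placeholders and the factory table is an assumption.

```python
# Sketch of the loop-detector class hierarchy described above (illustrative only).
class LoopDetectorBase:
    def compute_global_descriptor(self, keyframe):
        raise NotImplementedError
    def detect(self, keyframe):
        raise NotImplementedError       # return candidate loop keyframes

class LoopDetectorDBoW2(LoopDetectorBase):       # traditional BoW descriptor
    ...

class LoopDetectorVprBase(LoopDetectorBase):     # shared logic for deep VPR models
    ...

class LoopDetectorNetVLAD(LoopDetectorVprBase):  # learned global descriptor
    ...

def loop_detector_factory(global_descriptor_type: str) -> LoopDetectorBase:
    detectors = {"DBOW2": LoopDetectorDBoW2, "NETVLAD": LoopDetectorNetVLAD}
    return detectors[global_descriptor_type.upper()]()
```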

### Depth Estimator

<p align="center">
<img src="./images/depth_estimator.png" alt="Depth Estimator" />
</p>

The diagram above illustrates the architecture of the *Depth Estimator* module. A central `depth_estimator_factory` creates instances of various depth estimation backends based on the selected `depth_estimator_type`, including both traditional and learning-based methods such as `DEPTH_SGBM`, `DEPTH_RAFT_STEREO`, `DEPTH_ANYTHING_V2`, `DEPTH_MAST3R`, and `DEPTH_MVDUST3R`.

Each estimator type instantiates a corresponding implementation (e.g., `DepthEstimatorSgbm`, `DepthEstimatorCrestereo`, etc.), all inheriting from a common `DepthEstimator` interface. This base class encapsulates shared dependencies such as the `camera`, `device`, and `model` components, allowing for modular integration of heterogeneous depth estimation techniques across stereo, monocular, and multi-view pipelines.

Section [Supported depth prediction models](../README.md#supported-depth-prediction-models) provides a list of supported depth estimation/prediction models.
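As an illustration of the classical `DEPTH_SGBM` route, the following self-contained snippet computes a depth map from a rectified stereo pair with OpenCV's SGBM, using the relation depth = fx * baseline / disparity. It is not pySLAM's `DepthEstimatorSgbm`, just a sketch of the underlying idea.

```python
# Classical stereo depth with OpenCV SGBM (illustrative, not pySLAM code).
import cv2
import numpy as np

def sgbm_depth(left_gray, right_gray, fx, baseline_m, num_disp=128, block=5):
    sgbm = cv2.StereoSGBM_create(
        minDisparity=0, numDisparities=num_disp, blockSize=block,
        P1=8 * block * block, P2=32 * block * block)
    disparity = sgbm.compute(left_gray, right_gray).astype(np.float32) / 16.0
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = fx * baseline_m / disparity[valid]   # depth = fx * B / d
    return depth
```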


### Volumetric Integrator

@@ -74,3 +117,8 @@ This document presents some diagram sketches that provide an overview of the mai
</p>


This diagram illustrates the structure of the *Volumetric Integrator* module. At its core, the `volumetric_integrator_factory` generates specific volumetric integrator instances based on the selected `volumetric_integrator_type`, such as `TSDF` and `GAUSSIAN_SPLATTING`.

Each type instantiates a dedicated implementation (e.g., `VolumetricIntegratorTSDF`, `VolumetricIntegratorGaussianSplatting`), which inherits from a common `VolumetricIntegratorBase`. This base class encapsulates key components including the `camera`, a `keyframe_queue`, and the `volume`, enabling flexible integration of various 3D reconstruction methods within a unified pipeline.

Section [Supported volumetric mapping methods](../README.md#supported-volumetric-mapping-methods) provides a list of supported volume integration methods.
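A minimal sketch of the keyframe-queue pattern described above could look like the following. It is illustrative only: the `volume.integrate(...)` interface is a hypothetical placeholder, and the actual `VolumetricIntegratorBase` runs in a separate process and also handles re-integration after global pose updates.

```python
# Sketch of the keyframe-queue worker pattern (illustrative only).
import queue
import threading

class VolumetricIntegratorSketch:
    def __init__(self, camera, volume):
        self.camera = camera
        self.volume = volume                    # e.g. a TSDF or splatting volume
        self.keyframe_queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add_keyframe(self, keyframe):
        self.keyframe_queue.put(keyframe)       # posed color + depth

    def _run(self):
        while True:
            kf = self.keyframe_queue.get()
            if kf is None:                      # sentinel to stop the worker
                break
            self.volume.integrate(kf)           # fuse the back-projected cloud (hypothetical API)
```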
47 changes: 39 additions & 8 deletions docs/tex/bibliography.bib
@@ -1,3 +1,42 @@
@article{kerbl20233d,
title={3D Gaussian Splatting for Real-Time Radiance Field Rendering},
author={Kerbl, Bernhard and Kopanas, Georgios and Leimk{\"u}hler, Thomas and Drettakis, George},
journal={ACM Trans. Graph.},
volume={42},
number={4},
pages={139--1},
year={2023}
}

@article{matsuki2023gaussian,
title={Gaussian Splatting SLAM},
author={Matsuki, Hidenobu and Murai, Riku and Kelly, Paul H. J. and Davison, Andrew J.},
journal={arXiv preprint arXiv:2312.06741},
year={2023}
}

@article{dong2022ash,
title={ASH: A modern framework for parallel spatial hashing in 3D perception},
author={Dong, Wei and Lao, Yixing and Kaess, Michael and Koltun, Vladlen},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={45},
number={5},
pages={5417--5435},
year={2022},
publisher={IEEE}
}

@INPROCEEDINGS{PTAM,
author={Klein, Georg and Murray, David},
booktitle={2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality},
title={Parallel Tracking and Mapping for Small AR Workspaces},
year={2007},
pages={225--234},
doi={10.1109/ISMAR.2007.4538852}}

@article{rosten2006machine,
title={Machine learning for high-speed corner detection},
author={Rosten, Edward and Drummond, Tom},
@@ -316,14 +355,6 @@ @article{zollhofer2018state
publisher={Wiley Online Library}
}

@article{kerbl2023monogs,
title={MonoGS: Monocular 3D Gaussian Splatting},
author={Kerbl, Bernhard and others},
journal={arXiv preprint arXiv:2312.06741},
year={2023}
}


@misc{ORB_SLAM2,
author = {Raul Mur-Artal and Juan D. Tardos},
title = {ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras},
Binary file modified docs/tex/document.pdf
Binary file not shown.