Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age

Author: Cesar Cadena et al.

Year: 2016

Notes:

Introduction:

  • Do robots need SLAM? Is SLAM solved?
  • (1986-2004) = classical age: introduction of the basic probabilistic formulations
  • (2004-2015) = algorithmic-analysis age: study of observability, convergence, and consistency; role of sparsity
  • recent odometry algorithms have very low drift (<0.5% of the trajectory length), so do we still need SLAM?
  • loop closure enables the robot to understand the topology of the environment
  • vision-based SLAM with slowly moving robots can be considered a mature research field
  • (2015-) = robust-perception age: self-tuning capabilities, high-level understanding, resource awareness

Anatomy:

  • Posterior = belief over X given the measurements
  • when there is no prior p(x), maximum a posteriori (MAP) estimation reduces to maximum likelihood (ML) estimation
  • unlike Kalman filtering, the MAP formulation doesn't require a distinction between motion and measurement models, but Kalman filtering and MAP return the same estimate in the linear Gaussian case
  • connectivity of the factor graph influences the sparsity of the SLAM problem
  • the mismatch between filtering and MAP estimation gets smaller when the linearization point is good enough
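
The points above about MAP estimation, factor-graph connectivity, and sparsity can be illustrated on a toy problem. This is a minimal sketch, not taken from the paper: it assumes 1D poses and linear Gaussian factors, so the MAP estimate is a weighted least-squares solution. The function name `solve_pose_graph_1d` and the numbers are hypothetical. The information matrix has nonzero off-diagonal entries only where a factor connects two poses.

```python
import numpy as np

def solve_pose_graph_1d(num_poses, factors, prior_sigma=1e-3):
    """MAP estimate of 1D poses under linear Gaussian factors (toy example).

    factors: list of (i, j, offset, sigma), meaning x_j - x_i ~ N(offset, sigma^2).
    A tight prior on x_0 = 0 fixes the otherwise free global offset.
    """
    rows, rhs = [], []
    # Prior factor anchoring the first pose at the origin.
    prior_row = np.zeros(num_poses)
    prior_row[0] = 1.0 / prior_sigma
    rows.append(prior_row)
    rhs.append(0.0)
    # Each relative measurement adds one whitened row to the linear system.
    for i, j, offset, sigma in factors:
        row = np.zeros(num_poses)
        row[i] = -1.0 / sigma
        row[j] = 1.0 / sigma
        rows.append(row)
        rhs.append(offset / sigma)
    A = np.vstack(rows)
    b = np.array(rhs)
    # Normal equations: the information matrix A^T A mirrors graph connectivity.
    information = A.T @ A
    estimate = np.linalg.solve(information, A.T @ b)
    return estimate, information

# Odometry that overestimates each 1.0 m step by 10%, plus one accurate loop closure.
odometry = [(0, 1, 1.1, 0.1), (1, 2, 1.1, 0.1), (2, 3, 1.1, 0.1)]
loop_closure = [(0, 3, 3.0, 0.01)]
open_loop, _ = solve_pose_graph_1d(4, odometry)
closed_loop, information = solve_pose_graph_1d(4, odometry + loop_closure)
```

With odometry alone the last pose drifts to about 3.3; adding the loop closure pulls it back close to 3.0 and introduces a nonzero entry between poses 0 and 3 in the information matrix, while the entry between poses 0 and 2 stays zero.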

Long term autonomy robustness:

  • addressing failure modes for long term exploration
  • bag-of-words models address the cost of brute-force matching in long-term data association
  • make the system failure-aware? -> recovery mechanisms with tighter communication between front end and back end
  • for metric relocalization: place recognition with one sensor (camera) but registration with another (lidar)
  • automatic parameter tuning
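
The bag-of-words idea mentioned above can be sketched as follows. This is a toy illustration under stated assumptions, not the pipeline of any specific library: the vocabulary is given (in practice it would be learned by clustering descriptors), descriptors are 2D points rather than real image features, and `bow_histogram` and `best_match` are hypothetical names. Each place is summarized by one normalized word histogram, so matching a query costs one dot product per stored place instead of descriptor-to-descriptor comparison.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize each descriptor to its nearest visual word and count occurrences."""
    distances = np.linalg.norm(
        descriptors[:, None, :] - vocabulary[None, :, :], axis=2
    )
    words = distances.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def best_match(query_hist, database_hists):
    """Return the index of the most similar stored place (cosine similarity)."""
    scores = database_hists @ query_hist
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(0)
vocabulary = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])

def observe(word_ids):
    # Toy descriptors: visual words perturbed by small noise.
    return vocabulary[word_ids] + rng.normal(scale=0.5, size=(len(word_ids), 2))

database = np.stack([
    bow_histogram(observe([0, 0, 0, 1]), vocabulary),  # place 0
    bow_histogram(observe([1, 1, 1, 2]), vocabulary),  # place 1
    bow_histogram(observe([2, 2, 2, 0]), vocabulary),  # place 2
])
query = bow_histogram(observe([1, 1, 2, 1]), vocabulary)  # revisit of place 1
match_index, match_scores = best_match(query, database)
```

A real system would add an inverted index and geometric verification of the candidate match before accepting a loop closure.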

Long term autonomy scalability:

  • design SLAM systems with bounded memory complexity
  • sparsification (YEAH!) vs out-of-core algorithms (parallel SLAM)
  • continuous-time trajectory estimation
  • open problems: map representation (compressed known map), what can be forgotten?, severe computational constraints

Metric map models:

  • in 2D either occupancy grid or landmark based
  • TSDF (truncated signed distance function): space is discretized into a set of voxels $v$, each storing $d(v)$, the signed distance to the nearest surface, and a weight $w(v)$
  • direct methods require a GPU for real-time operation
  • how should we choose optimal representation?
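
The TSDF representation above is usually updated with a weighted running average of $d(v)$ and $w(v)$. The sketch below is a simplified illustration, assuming a single ray of voxels, a unit weight per measurement, and distances normalized by the truncation band; `integrate_depth_along_ray` and its parameters are hypothetical names, not from the paper.

```python
import numpy as np

def integrate_depth_along_ray(tsdf, weights, voxel_depths, measured_depth,
                              truncation, max_weight=100.0):
    """Fuse one depth measurement into the voxels along a single ray."""
    # Signed distance: positive in front of the surface, negative behind it.
    sdf = measured_depth - voxel_depths
    # Voxels far behind the surface are occluded and left untouched.
    valid = sdf >= -truncation
    # Truncate and normalize to [-1, 1].
    d_new = np.clip(sdf / truncation, -1.0, 1.0)
    w_new = 1.0
    # Weighted running average of the stored distance.
    fused = (weights * tsdf + w_new * d_new) / (weights + w_new)
    tsdf = np.where(valid, fused, tsdf)
    weights = np.where(valid, np.minimum(weights + w_new, max_weight), weights)
    return tsdf, weights

voxel_depths = np.linspace(0.0, 2.0, 21)   # voxel centers every 0.1 m along the ray
tsdf = np.ones_like(voxel_depths)          # initialized as free space
weights = np.zeros_like(voxel_depths)      # no observations yet
for depth in (1.02, 0.98, 1.00):           # noisy measurements of a surface at 1.0 m
    tsdf, weights = integrate_depth_along_ray(
        tsdf, weights, voxel_depths, depth, truncation=0.3
    )
```

After fusing the three measurements, the stored distance crosses zero at the voxel at 1.0 m, which is where a surface extractor would place the surface; voxels well behind it keep zero weight.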

Semantic Map models:

  • move from path planning to task planning
  • topological mapping: building a graph with distinguishable places
  • SLAM helps semantics: use of MonoSLAM to boost object detection in videos, 3D maps for classification
  • semantics helps SLAM: use prior knowledge of object shape to improve the map
  • joint SLAM and semantic inference
  • semantic-based reasoning, actionability, levels of semantics... many concepts come with it

New theoretical tools for SLAM:

  • factor graph optimization provides an elegant framework that is more amenable to analysis
  • Carlone et al. show that rotation estimation in 2D can be computed in closed form
  • Carlone and Dellaert show that in most cases the ML estimate is unique and pose graph optimization can be solved globally via convex semidefinite programming
  • computing a suitable initialization is also a good start
  • the theoretical work was done on pose graphs; can it be extended to other types of factor graphs?

Active SLAM:

  • problem of controlling the robot's motion in order to minimize the uncertainty of its map representation and localization
  • 3 steps:
    • selecting vantage points
    • computing the utility of each action by reasoning about the evolution of the posterior over the robot pose and the map
    • deciding whether the task is over
  • predicting the effects of future actions is expensive => when do we switch between active SLAM and passive SLAM?
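
The three steps above can be sketched for a linear Gaussian toy problem. This is an illustrative assumption, not the paper's method: each candidate vantage point is described by a measurement Jacobian `H`, a noise covariance `R`, and a travel cost; utility is the expected entropy reduction of the posterior minus that cost; and the task is treated as over when no action clears a threshold. All names and numbers are hypothetical.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy of a Gaussian with covariance cov (in nats)."""
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet)

def predicted_posterior_cov(prior_cov, H, R):
    """Posterior covariance after a linear measurement z = H x + noise(R)."""
    information = np.linalg.inv(prior_cov) + H.T @ np.linalg.inv(R) @ H
    return np.linalg.inv(information)

def choose_action(prior_cov, candidates, min_utility=0.05):
    """Pick the candidate with the best utility, or None if the task looks finished."""
    prior_entropy = gaussian_entropy(prior_cov)
    best_name, best_utility = None, -np.inf
    for name, (H, R, cost) in candidates.items():
        posterior = predicted_posterior_cov(prior_cov, H, R)
        utility = prior_entropy - gaussian_entropy(posterior) - cost
        if utility > best_utility:
            best_name, best_utility = name, utility
    if best_utility < min_utility:
        return None, best_utility   # nothing worth doing: stop exploring
    return best_name, best_utility

# Step 1: candidate vantage points (observe the x or the y component of the state).
candidates = {
    "observe_x": (np.array([[1.0, 0.0]]), np.array([[0.1]]), 0.1),
    "observe_y": (np.array([[0.0, 1.0]]), np.array([[0.1]]), 0.1),
}
# Steps 2 and 3: x is poorly known, y is already well estimated.
chosen, utility = choose_action(np.diag([4.0, 0.01]), candidates)
finished, _ = choose_action(np.diag([0.01, 0.01]), candidates)
```

Here the planner chooses to observe the uncertain component, and returns `None` once both components are already well estimated, which is one simple way to decide when to fall back to passive SLAM.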

New frontiers:

  • new sensors: depth cameras, light field cameras (?)
  • event-based cameras: don't send frames, only pixel-level changes; 1 MHz, 140 dB of dynamic range => SLAM for high-speed motion and high-dynamic-range scenes
  • BUT the high update rate makes SLAM intractable because of the amount of data; spatial resolution is also low
  • deep learning: depth estimation, 6-DOF localization, inter-frame pose estimation with DNNs
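
The event-camera bullet above (pixel changes instead of frames) can be illustrated with an idealized event-generation model. This is a sketch under stated assumptions, not a description of any real sensor's interface: a pixel fires when its log intensity changes by at least a contrast threshold since its last event, and `generate_events` and the threshold value are hypothetical.

```python
import numpy as np

def generate_events(log_reference, log_new, timestamp, threshold=0.2):
    """Emit (t, x, y, polarity) for pixels whose log intensity changed enough."""
    diff = log_new - log_reference
    ys, xs = np.nonzero(np.abs(diff) >= threshold)
    events = [
        (timestamp, int(x), int(y), 1 if diff[y, x] > 0 else -1)
        for y, x in zip(ys, xs)
    ]
    # Only pixels that fired update their reference level.
    updated_reference = log_reference.copy()
    updated_reference[ys, xs] = log_new[ys, xs]
    return events, updated_reference

intensity_before = np.full((4, 4), 100.0)
intensity_after = intensity_before.copy()
intensity_after[1, 2] = 150.0   # one pixel gets brighter
intensity_after[3, 0] = 60.0    # one pixel gets darker

events, reference = generate_events(
    np.log(intensity_before), np.log(intensity_after), timestamp=0.001
)
```

Only the two changed pixels produce output, one with positive and one with negative polarity, while the fourteen static pixels produce nothing.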