Author: Zhu
Year: 2022
- Neural implicit representation produced over smoothed methods and are not scalable
- Uses hierarchical scene representation and pre trained geometric priors for indoor reconstruction
- Uses RGBD input
- IMAP = first SLAM with neural implicit representation uses a single MLP to represent an entire scene -> cannot handle large scenes
- The core of nice SLAM is hierarchical grid based neural encoding -> allows local updates
- Stores latent code of local geometry and optimizes directly on them
- Hierarchical scene rps: Scene geometry represented by four feature grid(coarse, mid, level, color) => 4 rendering network
- a rendering network takes as input the viewing rays of the pixels queried and the feature grid, returns occupancy values
- From left to right: uses camera pose and feature grid to generate a rendered RGBD image
- From right to left: image + depth to get the camera pose
- Alternative optimization to train the network
- Feature grids = voxel grids
- For each pixel, trilinear interpolation is performed to get the associated feature
- Each rendering mlp decode grid features to occupancy values
- Then uses a differentiable renderer that takes color and occupancy as input and returns the depth and color for each pixel
- Mapping optim: Geometric and photometric loss are both L1 loss