diploma-thesis/ch_exploration.tex at master · MartinBurian/diploma-thesis · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
\documentclass[buriama8_dp.tex]{subfiles}
\begin{document}

\chapter{Maps for Exploration}
\label{chap:expl}

The robot is to explore the environment in front of it, and to record its findings in a map. The map needs to be represented in such way that it can be used both in further exploration, to prevent the arm from hitting obstacles, and to further operate the robot, e.g. decide if it is possible to move forward. We also need to find a framework in which we will formulate and solve the problem of guiding the robot through the whole environment. The decision needs to be made on how to drive the arm.

\section{Map representation}

We need to represent a metric, three-dimensional map. Such maps can be represented either continuously, maintaining precise positions of points and objects represented as geometric primitives, or discretized, storing information about discrete parts of the environment.

Of continuous maps, the only representation applicable to our task are point clouds. In point clouds, the map is represented by a set of points. Those can then be for example avoided by the robot because they represent obstacles. Our sensing instrument detects contact not in single point, but on the whole tool. It is then impossible to reconstruct the precise point at which the contact occured. Furthermore, continuous coverage of the environment would increase the complexity of the coverage algorithm. We do not consider any continuous map representation as a candidate for our solution.

On the other hand, discrete representations are easy to implement, and can be used without loss of resolution if the parameters are tuned correctly. Planning in discretized environments is easier as well. In discrete maps, the whole volume of the environment is discretized and the discrete elements, representing a portion of the environment, represent some information about it in the map.

The basic representation of a 3D environment in a grid is an elevation map. The map is represented in a 2D grid, with its \m x and \m y coordinates corresponding to a horizontal plane in the environment, where each cell \((x_i, y_i)\) contains the height \(z_i\) of the top of the obstacle. This representation is very memory-efficient, using only \(N \times M\) memory locations, where \m N and \m M are the dimensions of hirizontal grid slice. The huge drawback is that no vertical structure can be represented, including overhangs, ceilings etc.

Elevation maps have been improved upon by \cite{multilevel} in Multi-level surface map (MLSM). MLSM is able to represent multiple vertically stacked objects in a single cell. The map cell contains, for each object, a probability distribution of the estimated top of the object (as the main motivation of MLSM is for traversal), plus a \uvz{depth} that represents the vertical size of the object, which is the distance from the mean top value to the lowest point in the object. MLSM maintains the low storage complexity by storing sparse information about the vertical structure.

A full discrete representation of a 3D environment is possible in an occupancy grid. The environment is discretized into a 3D grid of voxels, typically cubes, of which the size of their sides corresponds to the resolution of the map. In each voxel, arbitrary data describing the corresponding part of the environment can be stored. Typically, this can be belief in occupancy of the cell when dealing with sensors of probabilistic nature. Dense grids are very memory hungry, needing full \(N^3\) of memory space.

The dense occupancy grid can be stored more efficiently when represented in an octree. Octrees are fractal structures where each cubic cell can be subdivided into eight cubes with sides half as long. In the octree, cells whose subcells are for example all occupied need not be subdivided and can be represented by only one number. An octree can represent the same information as a dense grid, but at lower memory cost. An implementation of octrees for occupancy grids, called Octomap \cite{octomap}, is used in our software framework. Octomap is highly memory efficient, can be serialized and compressed to small data bundles and is designed to aggregate data from LIDARs and other range sensors in a probabilistic framework.

We will be dealing with relatively small environments, as the workspace of the arm is limited. Memory efficiency is not crucial. We do not need any probabilistic framework for map building, as our sensor is highly deterministic and we only need to represent free, full and unexplored environment. We can thus safely use the basic occupancy grid with benefits of simple implementation and without paying the dire memory cost of representing larger environments.

\section{Exploration algorithm}

We represent the environment in a 3D grid. The grid needs to be traversed by the arm with the sensing tool; every cell must be visited to determine if it is occupied or free (except the cells that are found to be unreachable because of obstacles).

We can formalize our grid as a graph where each cell is represented by a vertex. The robot then moves between the cells. To have each move cover exactly one cell, we allow transitions only between neighboring grid cells, and only from a cell to its 6-neighborhood.

To traverse the whole environment optimally, we would like to visit each cell in the grid exactly once. In other words, we would like to follow a Hamiltonian path through the grid. Planning such path through a general graph is NP-complete. In case of the free grid, however, the path is trivial: a zig-zag pattern filling the environment layer by layer\footnote{We will return to the zig-zag pattern in Section~\ref{subsec:boustrop}}. When obstacles are present, the trivial solution is no longer possible.

As an approximation of a Hamiltonian path on the graph, we tried to plan the shortest path visiting all the vertices. We described the problem in an integer linear program that tries to compute the shortest sequence of actions that covers the whole space, avoiding known obstacles. The solution of the ILP proved to be intractable for instances larger than \(4 \times 4 \times 4\) cells, as it contains quadratic number of variables compared to the number of cells in the environment.

RRT, random sampling motion planning algorithm mentioned above, has been modified in \cite{rita} to directly find a path that acquires sensor data from the whole environment. The new algorithm, RITA, samples the control space just like plain RRT. RITA however optimizes both environment coverage by sensors and path length. RITA shows promising results, but with sensors that cover large portions of space, like laser range finders. In our case, the path would need to be very dense as we have to use only contact sensor.

As finding an optimal path is highly complex, we considered solving the task with Reinforcement Learning (\emph{RL}). In RL, the agent (the algorithm controlling the arm in our case) learns which actions to take in given state by interactive trial and error. We formalize a set of state \(S\), and actions \(A\). In our case, the state could be the current position of the robot and the map of the environment, and actions \uvz{move left} and \uvz{move forward}, or directly joint velocities for the arm joints. We then define reward function \(f: S \mapsto \R\) which specifies a reward the agent receives in given state. The agent learns a function \(q: S \mapsto A\) that, for a state \(s \in S\), returns action \(a\) that maximizes the reward the robot gets in the future. The problem in this case is the representation of the function, as the state space is extremely large (\(\{0,1\}^{N \times N \times N}\)). The recently very popular deep neural networks have been used to represent a complex, high-dimensional function in RL \cite{deeprl}. This makes using RL in this case tractable, but highly complex. As the agent needs to interact with the arm to learn how to use it to explore space, it would be necessary either to allow the agent to operate the arm for a prolonged period of time while taking measures against damaging the arm in the process, or to build an extensive simulation of the whole system where the agent could try policies safely and gradually learn to operate the arm to fulfill its goals.

The problem of traversing the whole environment is not uncommon and arises in various applications. The problem is referred to as Coverage Path Planning (\emph{CPP}) problem, and is aimed to be solved by clever algorithms that solve the highly complex planning problem in reasonable time by hiding some of the complexity in heuristic solutions. The algorithms can be generalized to a 3D world. The CPP is elaborated upon in Chapter~\ref{chap:cpp}.

Considering all the methods we presented above, we decided to solve the problem by Coverage Path Planning. It provides tractable, relatively simple solution to the problem of exploring the environment.

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "buriama8_dp"
%%% End: