* [Q-learning algorithm resulted chart for the environment-1](#q-learning-algorithm-resulted-chart-for-the-environment-1)
* [Final Q-table with values from the final shortest route for environment-1](#final-q-table-with-values-from-the-final-shortest-route-for-environment-1)
* [Q-learning algorithm resulted chart for the environment-2](#q-learning-algorithm-resulted-chart-for-the-environment-2)
* [Final Q-table with values from the final shortest route for environment-2](#final-q-table-with-values-from-the-final-shortest-route-for-environment-1)
### <a id="q-learning-algorithm-resulted-chart-for-the-environment-1">Q-learning algorithm resulted chart for the environment-1</a>
Plots the number of steps and the total cost for each training episode
<br/>
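Curves like these come from logging the step count and the accumulated cost of every episode while the agent trains. Below is a minimal sketch of such logging inside a tabular Q-learning loop; it assumes a toy 4x4 grid world with unit step cost, not this project's actual environments, and all names and hyperparameters are illustrative:

```python
import random

random.seed(0)

# Hypothetical 4x4 grid world: start at (0, 0), goal at (3, 3).
# Each move costs 1; reaching the goal yields a reward of +100.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
SIZE, GOAL = 4, (3, 3)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative hyperparameters

# Q-table: one value per (state, action) pair, initialized to zero.
Q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in range(4)}

def step(state, action):
    """Apply an action, clamping the agent inside the grid."""
    dr, dc = ACTIONS[action]
    r = min(max(state[0] + dr, 0), SIZE - 1)
    c = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (100.0 if nxt == GOAL else -1.0)

steps_per_episode, cost_per_episode = [], []
for episode in range(200):
    state, steps, cost = (0, 0), 0, 0.0
    while state != GOAL:
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            action = random.randrange(4)
        else:
            action = max(range(4), key=lambda a: Q[(state, a)])
        nxt, reward = step(state, action)
        best_next = max(Q[(nxt, a)] for a in range(4))
        # Q-learning update rule.
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state, steps, cost = nxt, steps + 1, cost + 1.0  # unit cost per move
    steps_per_episode.append(steps)
    cost_per_episode.append(cost)
```

Plotting `steps_per_episode` and `cost_per_episode` against the episode index produces the two charts described above: both curves drop as the Q-table fills with knowledge.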
### <a id="final-q-table-with-values-from-the-final-shortest-route-for-environment-1">Final Q-table with values from the final shortest route for environment-1</a>

<br/>Looking at the values in the table, we can see the next action chosen by the agent (mobile robot) in each state. Once the Q-table is filled with knowledge, the sequence of final actions to reach the goal is: *down-right-down-down-down-right-down-right-down-right-down-down-right-right-up-up.*
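That action sequence is read off the table by repeatedly taking the highest-valued action in the current state until the goal is reached. A minimal sketch of this greedy rollout, assuming a hand-crafted Q-table on a toy 3x3 grid (the values, states, and action names are illustrative, not the project's actual data):

```python
# Illustrative greedy rollout over a learned Q-table on a 3x3 grid.
ACTIONS = {"right": (0, 1), "down": (1, 0), "left": (0, -1), "up": (-1, 0)}
GOAL = (2, 2)

# Hand-crafted Q-values whose greedy policy walks right, then down, to the goal.
Q = {}
for r in range(3):
    for c in range(3):
        for name in ACTIONS:
            Q[((r, c), name)] = 0.0
        if c < 2:
            Q[((r, c), "right")] = 10.0 - r - c
        if r < 2:
            Q[((r, c), "down")] = 9.0 - r - c

def greedy_route(start):
    """Follow the highest-valued action in each state until the goal."""
    state, route = start, []
    while state != GOAL:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
        route.append(action)
        dr, dc = ACTIONS[action]
        state = (state[0] + dr, state[1] + dc)
    return route

route = greedy_route((0, 0))  # ["right", "right", "down", "down"]
```

The same lookup, applied to the real Q-table above, yields the 16-action sequence listed in the text.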
<br/>During the experiment with the Q-learning algorithm, the shortest route found to reach the goal in environment-1 consists of 16 steps, and the longest route found consists of 185 steps.
### <a id="q-learning-algorithm-resulted-chart-for-the-environment-2">Q-learning algorithm resulted chart for the environment-2</a>
Plots the number of steps and the total cost for each training episode

<br/>
### <a id="final-q-table-with-values-from-the-final-shortest-route-for-environment-1">Final Q-table with values from the final shortest route for environment-2</a>