Chapter 3: Finite Markov Decision Processes
At each cell, four actions are possible: north, south, east, and west, which deterministically cause the agent to move one cell in the respective direction on the grid.
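The movement rule above can be sketched in a few lines of Python. This is only an illustration, not the notes' own code: the 5x5 grid size, the (row, col) coordinate convention, and the choice to leave the agent in place when a move would take it off the grid are all assumptions, and the names `GRID_SIZE`, `ACTIONS`, and `move` are hypothetical.

```python
# Sketch of deterministic gridworld movement.
# Assumptions (not stated in the notes): a 5x5 grid, (row, col) coordinates
# with row 0 at the top, and off-grid moves leave the agent where it is.
GRID_SIZE = 5

# Each action maps to a (row change, column change) pair.
ACTIONS = {
    "north": (-1, 0),
    "south": (1, 0),
    "east": (0, 1),
    "west": (0, -1),
}


def move(state, action):
    """Return the cell reached by taking `action` in `state`."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    # Accept the move only if the target cell is inside the grid.
    if 0 <= new_row < GRID_SIZE and 0 <= new_col < GRID_SIZE:
        return (new_row, new_col)
    return state
```

For example, `move((2, 2), "north")` gives `(1, 2)`, while `move((0, 0), "west")` stays at `(0, 0)` under the off-grid assumption.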
Solving the gridworld example with the optimal action-value function.
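One way to compute the optimal action-value function is to iterate the Bellman optimality equation for q* until it stops changing. The sketch below is not the linked code. It assumes the textbook gridworld setup, none of which is stated in these notes: a 5x5 grid, reward -1 for bumping into the edge, special cells A and B that send the agent to A' and B' with rewards +10 and +5, and a discount of 0.9. All names (`step`, `optimal_q`, `GAMMA`, and so on) are hypothetical.

```python
# Q-value iteration on a gridworld (a sketch under the assumptions above).
GRID_SIZE = 5
GAMMA = 0.9  # assumed discount factor
ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # north, south, east, west
A, A_PRIME = (0, 1), (4, 1)  # assumed special cell and its destination
B, B_PRIME = (0, 3), (2, 3)


def step(state, action):
    """Deterministic dynamics: return (next_state, reward)."""
    if state == A:
        return A_PRIME, 10.0
    if state == B:
        return B_PRIME, 5.0
    row, col = state[0] + action[0], state[1] + action[1]
    if 0 <= row < GRID_SIZE and 0 <= col < GRID_SIZE:
        return (row, col), 0.0
    return state, -1.0  # off-grid: stay in place, pay a penalty


def optimal_q(tol=1e-6):
    """Apply q(s,a) <- r + gamma * max_b q(s',b) until changes fall below tol."""
    states = [(r, c) for r in range(GRID_SIZE) for c in range(GRID_SIZE)]
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    while True:
        delta = 0.0
        for s in states:
            for a in ACTIONS:
                s_next, reward = step(s, a)
                target = reward + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
                delta = max(delta, abs(target - q[(s, a)]))
                q[(s, a)] = target
        if delta < tol:
            return q


q_star = optimal_q()
# Greedy action in each cell: the one with the largest optimal action value.
policy = {s: max(ACTIONS, key=lambda a: q_star[(s, a)])
          for s in {s for (s, _) in q_star}}
v_a = max(q_star[(A, a)] for a in ACTIONS)
```

Because the dynamics are deterministic, each update needs only one successor state. Under these assumptions `v_a` should come out near 24.4, since the best loop from A returns to A every five steps and 10 / (1 - 0.9^5) ≈ 24.42.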