Open
Description
The notebook doesn't currently reproduce Figure 6.2, which uses batch updating (replaying all episodes in an experience buffer until convergence). As far as I know, RL.jl doesn't currently support this out of the box since episode info isn't saved, but looking at RLTrajectories.jl, that package should make this task a ton easier.
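
For reference, here is a rough sketch of what batch updating looks like for TD(0) on the 5-state random walk. It is plain Julia and deliberately does not use any RL.jl or RLTrajectories.jl API. The names `random_walk_episode` and `batch_td0!`, the step size, the tolerance, and the episode count are all assumptions made for illustration, not anything taken from the notebook.

```julia
using Random

# Generate one episode of the 5-state random walk (states 1..5, start in the middle).
# Each step is stored as (s, r, s_next); s_next == 0 marks termination.
function random_walk_episode(rng; n_states=5)
    steps = Tuple{Int,Float64,Int}[]
    s = (n_states + 1) ÷ 2
    while true
        s_next = s + (rand(rng, Bool) ? 1 : -1)
        if s_next == 0                      # left terminal, reward 0
            push!(steps, (s, 0.0, 0))
            break
        elseif s_next == n_states + 1       # right terminal, reward 1
            push!(steps, (s, 1.0, 0))
            break
        end
        push!(steps, (s, 0.0, s_next))
        s = s_next
    end
    return steps
end

# Batch TD(0): sweep over every stored step, accumulate the increments,
# apply them once per sweep, and repeat until the increments are negligible.
function batch_td0!(V, episodes; alpha=0.001, gamma=1.0, tol=1e-6, max_sweeps=100_000)
    for _ in 1:max_sweeps
        delta = zeros(length(V))
        for episode in episodes
            for (s, r, s_next) in episode
                v_next = s_next == 0 ? 0.0 : V[s_next]
                delta[s] += alpha * (r + gamma * v_next - V[s])
            end
        end
        V .+= delta
        maximum(abs, delta) < tol && break
    end
    return V
end

rng = MersenneTwister(0)
episodes = [random_walk_episode(rng) for _ in 1:100]   # the saved "experience buffer"
V = batch_td0!(fill(0.5, 5), episodes)

true_values = (1:5) ./ 6
rms = sqrt(sum(abs2, V .- true_values) / length(V))
println("batch TD(0) values: ", round.(V; digits=3), "  RMS error: ", round(rms; digits=3))
```

The only thing this needs from the library side is access to complete stored episodes, which is the part that isn't available right now. To get the actual curve in the figure, the batch update would presumably be rerun after each new episode is added to the buffer and the RMS error averaged over many independent runs.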