Hi everyone,

I read the tutorial on recurrent Q-learning; is recurrent PPO also supported? Are there any special caveats I should be aware of when doing recurrent PPO, in contrast to recurrent Q-learning?

I'm currently trying it out and running into an error. If it turns out that recurrent PPO should be supported, I'll open an issue with more details.

Thanks for your input!
Replies: 1 comment

Recurrent PPO (or any other on-policy loss) is supported. The key points to consider are similar to those in the recurrent Q-learning tutorial. Beyond that, the loss should work out of the box on data sampled from the replay buffer with RNN-based models. Feel free to open an issue or share your code.
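
For concreteness, here is a minimal, library-agnostic sketch of the part of recurrent PPO that differs from recurrent Q-learning: the update re-unrolls the RNN over whole stored sub-sequences (rather than shuffled single transitions) before computing the clipped objective. The module and helper names (`RecurrentActorCritic`, `ppo_loss`) and all hyperparameters are illustrative assumptions, not code from this thread or from the tutorial.

```python
# Minimal recurrent-PPO sketch (illustrative; not tied to any specific
# library): the RNN is unrolled over whole stored sequences at update time.
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.pi = nn.Linear(hidden, n_actions)  # policy logits head
        self.v = nn.Linear(hidden, 1)           # value head

    def forward(self, obs, hc=None):
        # obs: [batch, time, obs_dim]; hc: optional (h, c) recurrent state.
        feat, hc = self.lstm(obs, hc)
        return self.pi(feat), self.v(feat).squeeze(-1), hc

def ppo_loss(model, obs, actions, old_logp, adv, returns, clip=0.2):
    # Re-run the LSTM over each stored sub-sequence from a zero initial
    # state. Unlike feed-forward PPO, minibatches must be contiguous
    # sub-sequences, not shuffled single transitions.
    logits, values, _ = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    logp = dist.log_prob(actions)
    ratio = (logp - old_logp).exp()
    clipped = ratio.clamp(1.0 - clip, 1.0 + clip)
    pg = -torch.min(ratio * adv, clipped * adv).mean()  # clipped surrogate
    vf = 0.5 * (values - returns).pow(2).mean()         # value regression
    return pg + vf - 0.01 * dist.entropy().mean()       # entropy bonus

# Smoke test on random data: 4 trajectories of 16 steps each.
B, T, obs_dim, n_actions = 4, 16, 8, 3
model = RecurrentActorCritic(obs_dim, n_actions)
obs = torch.randn(B, T, obs_dim)
actions = torch.randint(0, n_actions, (B, T))
old_logp, adv, returns = torch.randn(B, T), torch.randn(B, T), torch.randn(B, T)
ppo_loss(model, obs, actions, old_logp, adv, returns).backward()
```

In a sketch like this, the main caveat relative to recurrent Q-learning is the minibatching: the batches fed to the PPO update must respect trajectory boundaries so the recurrent state stays meaningful, with hidden states reset (or restored) wherever an episode starts.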