Hi, I'm interested in your research and am planning to do similar work.
In this implementation, you collect the initial observations with a random policy.
Could I use an autopilot during the initial collection step instead?
It might make learning more stable, especially in the early training steps.
Thank you for the great paper and implementation.