Hi there,
Apologies if this question seems obvious, but I wonder whether (and why) the two functions `split_and_pad_trajectories()` and `unpad_trajectories()` are necessary if `rollout_storage` already stores the `Memory` class's hidden states.
My understanding is that because the RNN processes observations sequentially, step by step, you'd want to split and pad the observations at every `done` so that you don't pass a newly reset environment's observation together with the hidden states from the last time step. But I wonder whether this issue is already avoided, since the `RolloutStorage` class stores the previous timestep's hidden states along with the current timestep's observation, i.e.:
| hidden states | observation | actions | ... |
|---|---|---|---|
| ... | ... | ... | ... |
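
To make the question concrete, here is a minimal sketch of what I understand "split and pad at every `done`" to mean. It is plain Python on a single environment's rollout, not the library's actual implementation: `split_and_pad` and `unpad` are hypothetical helpers, and the observations are plain floats for simplicity.

```python
def split_and_pad(observations, dones, pad_value=0.0):
    """Split one environment's rollout at done flags, then pad to equal length.

    observations: list of per-step observations (floats here, as an assumption)
    dones: list of bools; dones[t] is True if the episode ended at step t
    Returns (padded, mask): padded trajectories and a matching validity mask.
    """
    trajectories, current = [], []
    for obs, done in zip(observations, dones):
        current.append(obs)
        if done:
            # Episode boundary: the next observation starts a new trajectory,
            # so it is never processed with this episode's hidden state.
            trajectories.append(current)
            current = []
    if current:
        trajectories.append(current)  # unfinished episode at end of rollout
    if not trajectories:
        return [], []

    max_len = max(len(t) for t in trajectories)
    padded = [t + [pad_value] * (max_len - len(t)) for t in trajectories]
    mask = [[True] * len(t) + [False] * (max_len - len(t)) for t in trajectories]
    return padded, mask


def unpad(padded, mask):
    """Drop padded entries and flatten back to the original time order."""
    return [x for row, m_row in zip(padded, mask) for x, m in zip(row, m_row) if m]


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
dones = [False, True, False, False, False]
padded, mask = split_and_pad(obs, dones)
restored = unpad(padded, mask)
```

In this sketch each row of `padded` is one episode segment, so an RNN run over a row would only need the stored hidden state at the start of that segment. That is the part I am unsure about, given that the storage seems to keep a hidden state for every step.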