Purpose of `split_and_pad_trajectories()` and `unpad_trajectories()` for recurrent neural network #41

Hi there,

Apologies if this question seems obvious, but I wonder whether, and why, the two functions `split_and_pad_trajectories()` and `unpad_trajectories()` are necessary if `rollout_storage` already stores the `Memory` class's hidden states.

My understanding is that because RNNs process observations sequentially, step by step, you'd want to split and pad the observations at every `done` so that you don't pass in a newly reset environment's observation together with the hidden states from the previous time step. But I wonder if this issue is already avoided, since the `RolloutStorage` class stores the previous timestep's hidden states along with the current timestep's observation, i.e.

| hidden states | observation | actions | ... |
| --- | --- | --- | --- |
| $$\text{hiddenStates}_{t-1}$$ | $$\text{obs}_t$$ | $$a_t$$ | ... |
| ... | ... | ... | ... |
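
To make the question concrete, here is a minimal pure-Python sketch of what I assume splitting and padding at every `done` amounts to for a single environment. The names `split_and_pad` and `unpad`, the list-based layout, and the convention that `dones[t]` marks the last step of an episode are all my own assumptions for illustration, not the library's actual implementation.

```python
def split_and_pad(obs, dones, pad_value=0.0):
    """Split one environment's rollout at every done and pad to equal length.

    Hypothetical helper: obs is a list of per-step observations,
    dones[t] is True if the episode ended at step t (assumed convention).
    Returns (padded, masks); masks mark real steps with True.
    """
    trajectories, current = [], []
    for o, d in zip(obs, dones):
        current.append(o)
        if d:  # episode ends here, so the next step starts a new trajectory
            trajectories.append(current)
            current = []
    if current:  # rollout ended mid-episode: keep the unfinished trajectory
        trajectories.append(current)
    if not trajectories:
        return [], []
    max_len = max(len(t) for t in trajectories)
    padded = [t + [pad_value] * (max_len - len(t)) for t in trajectories]
    masks = [[True] * len(t) + [False] * (max_len - len(t)) for t in trajectories]
    return padded, masks


def unpad(padded, masks):
    """Drop padded steps and flatten back to the original step order."""
    return [x for row, m_row in zip(padded, masks) for x, m in zip(row, m_row) if m]


# Five steps; the episode ends at step index 1, so there are two trajectories.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
dones = [False, True, False, False, False]
padded, masks = split_and_pad(obs, dones)
restored = unpad(padded, masks)
```

With this layout, each padded row could be fed to the RNN as its own sequence starting from that trajectory's first stored hidden state, which is the part I am unsure is needed if the per-step hidden states are already stored.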


