Description
I am running CleanRL's PPO code on a custom PettingZoo environment using the code here. In line 163, the environments are wrapped with Gymnasium's `RecordEpisodeStatistics` wrapper, which is then used in lines 210-215 to log each player's return after the episode has ended.
It turns out that when we invoke `pettingzoo_env_to_vec_env_v1`, it uses the `MarkovVectorEnv` class. There, in line 59 and also in lines 92 and 101, the infos are cast as a `list` instead of the usual `dict`.
Consequently, the aforementioned Gymnasium wrapper throws an error (tested on PZ's Pistonball environment):
----> 6 observations, rewards, terminations, truncations, infos = env.step(actions)
7 env.close()
File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/gymnasium/wrappers/record_episode_statistics.py:95, in RecordEpisodeStatistics.step(self, action)
87 """Steps through the environment, recording the episode statistics."""
88 (
89 observations,
90 rewards,
(...)
93 infos,
94 ) = self.env.step(action)
---> 95 assert isinstance(
96 infos, dict
97 ), f"`info` dtype is {type(infos)} while supported dtype is `dict`. This may be due to usage of other wrappers in the wrong order."
98 self.episode_returns += rewards
99 self.episode_lengths += 1
AssertionError: `info` dtype is <class 'list'> while supported dtype is `dict`. This may be due to usage of other wrappers in the wrong order.
Can this please be fixed? If it matters, I am running the code on Lightning Studio with Python 3.10.
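As a possible stopgap, the list of per-agent info dicts could be merged into a single `dict` before it reaches `RecordEpisodeStatistics`. The following is only a minimal sketch, assuming each element of the list is a plain per-agent `dict`; `infos_list_to_dict` is a hypothetical helper of my own, not part of any of the libraries mentioned above, and I have not verified it against the actual training loop:

```python
from typing import Any


def infos_list_to_dict(infos: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Merge a list of per-agent info dicts into one dict.

    Hypothetical helper (not a library function). Each key maps to a list
    with one slot per agent; agents that did not report a key get None.
    """
    merged: dict[str, list[Any]] = {}
    for idx, info in enumerate(infos):
        for key, value in info.items():
            # Create the per-agent list on first sight of a key, then fill this agent's slot.
            merged.setdefault(key, [None] * len(infos))[idx] = value
    return merged


# Example: three agents, only some of which report values.
example = infos_list_to_dict([{}, {"a": 1}, {"a": 2, "b": 3}])
```

The result is a `dict`, so the `isinstance(infos, dict)` assertion shown in the traceback would pass. The conversion would presumably need to live in a thin wrapper placed between the vectorized environment and `RecordEpisodeStatistics`.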