So far, whenever I set random_timesteps to anything other than 0 while training a Go2 robot with SAC, no robots appear in the training videos and no plots appear in TensorBoard. In sac.py there are these lines:
# sample random actions
# TODO, check for stochasticity
if timestep < self._random_timesteps:
    return self.policy.random_act({"states": self._state_preprocessor(states)}, role="policy")

# sample stochastic actions
with torch.autocast(device_type=self._device_type, enabled=self._mixed_precision):
    actions, _, outputs = self.policy.act({"states": self._state_preprocessor(states)}, role="policy")

return actions, None, outputs
and in base.py, the model's random_act is defined as follows:
def random_act(
    self, inputs: Mapping[str, Union[torch.Tensor, Any]], role: str = ""
) -> Tuple[torch.Tensor, None, Mapping[str, Union[torch.Tensor, Any]]]:
    """Act randomly according to the action space

    :param inputs: Model inputs. The most common keys are:

                   - ``"states"``: state of the environment used to make the decision
                   - ``"taken_actions"``: actions taken by the policy for the given states
    :type inputs: dict where the values are typically torch.Tensor
    :param role: Role play by the model (default: ``""``)
    :type role: str, optional

    :raises NotImplementedError: Unsupported action space

    :return: Model output. The first component is the action to be taken by the agent
    :rtype: tuple of torch.Tensor, None, and dict
    """
    # discrete action space (Discrete)
    if isinstance(self.action_space, gymnasium.spaces.Discrete):
        return torch.randint(self.action_space.n, (inputs["states"].shape[0], 1), device=self.device), None, {}
    # continuous action space (Box)
    elif isinstance(self.action_space, gymnasium.spaces.Box):
        if self._random_distribution is None:
            self._random_distribution = torch.distributions.uniform.Uniform(
                low=torch.tensor(self.action_space.low[0], device=self.device, dtype=torch.float32),
                high=torch.tensor(self.action_space.high[0], device=self.device, dtype=torch.float32),
            )
        return (
            self._random_distribution.sample(sample_shape=(inputs["states"].shape[0], self.num_actions)),
            None,
            {},
        )
    else:
        raise NotImplementedError(f"Action space type ({type(self.action_space)}) not supported")
I am currently using IsaacLab 2.10 and IsaacSim 4.5.0. I do not know exactly what the issue is, but I suspect it is related to the lines of code quoted above.
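One thing that may be worth checking (an unconfirmed guess, not a diagnosis): random_act builds its uniform distribution from action_space.low[0] and action_space.high[0]. If the Go2 task exposes an unbounded Box (that is, bounds of -inf and inf, which I have not verified for this environment), a uniform draw computed as low + u * (high - low) is NaN, and NaN actions could plausibly explain the missing robots and the empty plots. The sketch below uses only the standard library to show that arithmetic; `affine_uniform_sample` and `bounds_are_finite` are hypothetical helpers written for illustration and are not part of skrl or IsaacLab.

```python
import math
import random


def affine_uniform_sample(low: float, high: float) -> float:
    """Draw from U(low, high) by rescaling a unit-interval draw.

    Illustrative stand-in for the usual low + u * (high - low)
    construction; this is not the library's own code.
    """
    u = random.random()  # u lies in [0, 1)
    return low + u * (high - low)


def bounds_are_finite(low: float, high: float) -> bool:
    """Return True when both bounds are usable for uniform sampling."""
    return math.isfinite(low) and math.isfinite(high)


# Bounded case: samples stay inside the interval.
bounded = [affine_uniform_sample(-1.0, 1.0) for _ in range(1000)]

# Unbounded case (assumed Box(-inf, inf)): -inf + u * inf is NaN for every u.
unbounded = [affine_uniform_sample(-math.inf, math.inf) for _ in range(1000)]

print("bounded samples finite:", all(math.isfinite(x) for x in bounded))
print("unbounded samples NaN:", all(math.isnan(x) for x in unbounded))
print("safe to sample unbounded space:", bounds_are_finite(-math.inf, math.inf))
```

If printing env.action_space shows infinite bounds, that would support this explanation; finite bounds would rule it out. Note also that only the first element of low and high is used, so per-dimension bounds that differ would not be respected by random_act either.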