Issue with Action Output Range in MultiDiscrete Action Learning using Direct_rl_env and RSL-RL in IsaacLab #2057
Unanswered · JeonHaneul asked this question in Q&A
Replies: 1 comment
-
Hi @JeonHaneul, alternatively you can take a look at this project: Spaces showcase tasks for Isaac Lab, particularly the …
-
Hello,
I am trying to train a policy in IsaacLab using Direct_rl_env and RSL-RL with a MultiDiscrete action space. In my case, the action output needs to produce one value between 0 and 2 and another value between 0 and 3 simultaneously. To define this, I set the action space as follows:
action_space = [{3}, {4}]
However, I noticed an issue where the second action value correctly outputs within the range of 0 to 3, but the first action value falls within the range of -3 to 0, as shown in Figure 1.
To debug this, I printed the low and high values of MultiDiscrete from spaces.py, and as shown in Figure 2, I confirmed that low = [0, 0] and high = [2, 3]. Despite this, the issue persists, and I am unsure what is causing it.
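For reference, here is a standalone sanity check of what a MultiDiscrete-style space with nvec = [3, 4] should sample. This is a minimal numpy sketch independent of IsaacLab and RSL-RL, just to confirm the expected bounds; if the space itself is sound, the negative values would have to come from the policy output side rather than the space definition:

```python
import numpy as np

# Minimal sanity check, independent of IsaacLab: a MultiDiscrete-style
# space with nvec = [3, 4] should only ever yield integers in [0, 2]
# for the first component and [0, 3] for the second.
nvec = np.array([3, 4])
rng = np.random.default_rng(0)
samples = rng.integers(low=0, high=nvec, size=(1000, 2))  # high is exclusive

print(samples.min(axis=0))  # expected: [0 0]
print(samples.max(axis=0))  # expected: [2 3]
```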
As a workaround, I attempted to apply torch.clamp in _pre_physics_step to force the first action value to stay within the correct range:
self.actions[:, 0] = torch.clamp(self.actions[:, 0], min=0, max=2)
While this ensures values remain in the expected range, I noticed that the output is mostly 0, with occasional occurrences of 1 and 2.
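That outcome is consistent with what clamping does: if the raw first-component outputs mostly lie in [-3, 0], clamping to [0, 2] collapses every negative value to 0, so 0 dominates. A minimal numpy illustration with hypothetical values (not taken from the actual run):

```python
import numpy as np

# Hypothetical raw first-component outputs in the observed [-3, 0] range,
# plus one positive value for contrast.
raw = np.array([-3, -2, -1, 0, -2, 1])

# np.clip behaves like torch.clamp here: every negative value collapses
# to the lower bound 0, which matches the "mostly 0" observation.
clamped = np.clip(raw, 0, 2)
print(clamped)  # [0 0 0 0 0 1]
```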
I am not sure how to resolve this issue. Could someone please provide guidance on what might be causing this behavior and how to properly fix it?
Figure 1
Figure 2