Issue with Action Output Range in MultiDiscrete Action Learning using Direct_rl_env and RSL-RL in IsaacLab #2057
Unanswered
JeonHaneul asked this question in Q&A
Replies: 1 comment
Hi @JeonHaneul. Alternatively, you can take a look at this project: Spaces showcase tasks for Isaac Lab, particularly the
Hello,
I am trying to train a policy using Direct_rl_env and RSL-RL in IsaacLab with a MultiDiscrete action space. In my case, the policy needs to output one value between 0 and 2 and another value between 0 and 3 simultaneously. To define this, I set the action space as follows:
action_space = [{3}, {4}]
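For reference, a space defined this way should sample the first component from {0, 1, 2} and the second from {0, 1, 2, 3}. Here is a minimal stdlib sketch of the expected sampling behavior (the `sample_multidiscrete` helper is hypothetical, written only to illustrate the intended ranges, not IsaacLab API):

```python
import random

# Hypothetical sampler mimicking what a MultiDiscrete([3, 4]) space
# should produce: component i is a non-negative integer in [0, nvec[i] - 1].
def sample_multidiscrete(nvec):
    return [random.randrange(n) for n in nvec]

nvec = [3, 4]
samples = [sample_multidiscrete(nvec) for _ in range(1000)]

# Every sample should satisfy the low/high bounds: low = [0, 0], high = [2, 3].
assert all(0 <= a <= 2 and 0 <= b <= 3 for a, b in samples)
```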
However, I noticed an issue where the second action value correctly outputs within the range of 0 to 3, but the first action value falls within the range of -3 to 0, as shown in Figure 1.
To debug this, I printed the low and high values of MultiDiscrete from spaces.py, and as shown in Figure 2, I confirmed that low = [0, 0] and high = [2, 3]. Despite this, the issue persists, and I am unsure what is causing it.
As a workaround, I attempted to apply torch.clamp in _pre_physics_step to force the first action value to stay within the correct range:
self.actions[:, 0] = torch.clamp(self.actions[:, 0], min=0, max=2)
While this ensures values remain in the expected range, I noticed that the output is mostly 0, with occasional occurrences of 1 and 2.
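That collapse toward 0 is expected: clamping a value that lies in [-3, 0] into [0, 2] maps every negative value to 0, so the clamp hides the underlying range bug rather than fixing it. A stdlib sketch of the effect:

```python
def clamp(x, lo, hi):
    """Clamp x into [lo, hi], mirroring torch.clamp on a single value."""
    return max(lo, min(hi, x))

raw = [-3, -2, -1, 0]                    # observed range of the first action
clamped = [clamp(v, 0, 2) for v in raw]
print(clamped)                           # -> [0, 0, 0, 0]
```

Since every value in the observed range is at or below the lower bound, almost all clamped outputs are 0, matching the behavior described above.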
I am not sure how to resolve this issue. Could someone please provide guidance on what might be causing this behavior and how to properly fix it?
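Until the root cause is found, one possible workaround (an assumption on my part, not confirmed IsaacLab or RSL-RL behavior) is to treat the raw policy output as a continuous value and bucketize it into the discrete range instead of clamping, which preserves the spread of values rather than collapsing negatives to 0. The `to_discrete` helper below is hypothetical:

```python
def to_discrete(x, n):
    """Map a continuous value x in [-1, 1] to an integer bin in [0, n - 1]."""
    x = max(-1.0, min(1.0, x))               # clamp into the assumed output range
    return min(n - 1, int((x + 1.0) / 2.0 * n))

# First action has 3 bins, second has 4.
assert to_discrete(-1.0, 3) == 0
assert to_discrete(0.0, 3) == 1
assert to_discrete(1.0, 3) == 2
assert to_discrete(1.0, 4) == 3
```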
Figure 1: plot of the action outputs (the first action value falls in the range -3 to 0).
Figure 2: printed MultiDiscrete bounds from spaces.py (low = [0, 0], high = [2, 3]).