[Question] Clarification on Curriculum Update Timing #2335

ozhanozen · 2025-03-14T11:23:37Z

Hello,

I have observed that curriculum elements in IsaacLab are only computed when at least one environment resets. Is this the intended behavior?

This seems counterintuitive to me because curriculum term updates do not necessarily rely on reset information and could be applied across all environments simultaneously. For example, consider the following function:

IsaacLab/source/isaaclab/isaaclab/envs/mdp/curriculums.py

Lines 21 to 36 in 91f53e2

    
           def modify_reward_weight(env: ManagerBasedRLEnv, env_ids: Sequence[int], term_name: str, weight: float, num_steps: int): 
        
               """Curriculum that modifies a reward weight a given number of steps. 
        
               Args: 
        
                   env: The learning environment. 
        
                   env_ids: Not used since all environments are affected. 
        
                   term_name: The name of the reward term. 
        
                   weight: The weight of the reward term. 
        
                   num_steps: The number of steps after which the change should be applied. 
        
               """ 
        
               if env.common_step_counter > num_steps: 
        
                   # obtain term settings 
        
                   term_cfg = env.reward_manager.get_term_cfg(term_name) 
        
                   # update term settings 
        
                   term_cfg.weight = weight 
        
                   env.reward_manager.set_term_cfg(term_name, term_cfg)

I would expect the term weight to be updated as soon as env.common_step_counter > num_steps. However, if no environment resets for a while, the update does not occur.

This issue becomes particularly noticeable when using custom curriculum terms that modify elements progressively based on env.common_step_counter. The effective task difficulty differs across learning libraries:
• rsl_rl sets init_at_random_ep_len=True, which staggers resets across environments, so the curriculum is effectively updated at every environment step.
• skrl has no such randomization, so resets are less spread out and updates are less frequent (depending on decimation and episode length).
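
To make the difference concrete, here is a minimal toy simulation in plain Python. It is not Isaac Lab code: the function name, the fixed episode length, and the environment count are assumptions for illustration, and it only models time-out resets. It returns the first step at which a reset-gated curriculum would see the counter exceed num_steps.

```python
import random


def first_update_step(num_envs, episode_len, num_steps, randomize_start, seed=0):
    """Return the first step at which a reset-gated curriculum sees counter > num_steps."""
    rng = random.Random(seed)
    # Per-environment episode progress; randomizing it staggers the resets.
    progress = [rng.randrange(episode_len) if randomize_start else 0 for _ in range(num_envs)]
    step = 0
    while True:
        step += 1
        progress = [p + 1 for p in progress]
        reset_ids = [i for i, p in enumerate(progress) if p >= episode_len]
        for i in reset_ids:
            progress[i] = 0
        # The curriculum is only evaluated when at least one environment resets.
        if reset_ids and step > num_steps:
            return step


# Synchronized resets: all environments time out together every 500 steps.
synced = first_update_step(num_envs=64, episode_len=500, num_steps=1200, randomize_start=False)
# Staggered resets: random initial episode progress, as with init_at_random_ep_len=True.
staggered = first_update_step(num_envs=64, episode_len=500, num_steps=1200, randomize_start=True)
print(synced, staggered)
```

With synchronized resets the update lands at step 1500, the first reset after the threshold. With staggered starts it can land anywhere between 1201 and 1700, and with many environments it is typically close to the threshold.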

Suggested Fix (if unintended behavior)

If this behavior is unintended, I suggest modifying the curriculum update logic by moving:

IsaacLab/source/isaaclab/isaaclab/envs/manager_based_rl_env.py

Line 354 in 91f53e2

self.curriculum_manager.compute(env_ids=env_ids)

out of _reset_idx() and out of the following if block:

IsaacLab/source/isaaclab/isaaclab/envs/manager_based_rl_env.py

Lines 215 to 216 in 91f53e2

    
           reset_env_ids = self.reset_buf.nonzero(as_tuple=False).squeeze(-1) 
        
           if len(reset_env_ids) > 0:

This way, curriculum updates will happen at every step by default. Reset-dependent changes can still be handled within specific term functions if necessary. I haven't tested this myself, though.
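
As a rough illustration of the proposed change, the toy class below contrasts the two call sites. It is a stand-in written for this post, not the ManagerBasedRLEnv implementation: ToyEnv, its attributes, and the single-environment loop are all assumptions. The only thing it borrows from the real code is the common_step_counter > num_steps condition.

```python
class ToyEnv:
    """Toy stand-in for the environment loop; not the Isaac Lab implementation."""

    def __init__(self, episode_len, num_steps, curriculum_every_step):
        self.episode_len = episode_len
        self.num_steps = num_steps
        self.curriculum_every_step = curriculum_every_step
        self.common_step_counter = 0
        self.episode_progress = 0
        self.reward_weight = 1.0
        self.weight_changed_at = None

    def _curriculum(self):
        # Mirrors the step-count condition of modify_reward_weight.
        if self.common_step_counter > self.num_steps and self.weight_changed_at is None:
            self.reward_weight = 0.1
            self.weight_changed_at = self.common_step_counter

    def step(self):
        self.common_step_counter += 1
        self.episode_progress += 1
        if self.curriculum_every_step:
            self._curriculum()  # proposed: evaluate on every step
        if self.episode_progress >= self.episode_len:
            self.episode_progress = 0
            if not self.curriculum_every_step:
                self._curriculum()  # current: evaluate only on reset


def run(curriculum_every_step):
    env = ToyEnv(episode_len=500, num_steps=1200, curriculum_every_step=curriculum_every_step)
    for _ in range(2000):
        env.step()
    return env.weight_changed_at


reset_gated = run(False)  # weight changes at step 1500, the first reset after 1200
every_step = run(True)    # weight changes at step 1201
```

In this toy setting the reset-gated variant applies the new weight 299 steps later than the per-step variant. The gap grows with episode length.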

Looking forward to your insights!

RandomOakForest (Maintainer) · 2025-03-14T15:12:05Z

Thank you for posting this. I'm checking with the team whether this is the intended behavior.

Mayankm96 (Maintainer) · 2025-03-19T17:57:18Z

This is intended: the implemented curriculum is game-based, meaning that at the end of each episode the agent's performance is evaluated and the difficulty is adjusted accordingly.

You are right that a curriculum meant to trigger on the simulation step counter does not fire when expected. The proper solution might be to introduce different curriculum modes, similar to event modes, so that each term is called according to the desired type of trigger.
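
One possible shape for such modes, as a minimal sketch only. Nothing here exists in Isaac Lab: CurriculumTerm, ModalCurriculumManager, and the mode names "reset" and "step" are hypothetical, and the term signature is simplified to a step counter plus environment ids.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class CurriculumTerm:
    """Hypothetical term config: a function plus the trigger mode it runs under."""

    func: Callable[[int, Sequence[int]], None]
    mode: str  # "reset" or "step" (assumed mode names)


class ModalCurriculumManager:
    """Hypothetical manager that dispatches terms by trigger mode."""

    def __init__(self, terms: dict[str, CurriculumTerm]):
        for name, term in terms.items():
            if term.mode not in ("reset", "step"):
                raise ValueError(f"Unknown mode '{term.mode}' for term '{name}'")
        self._terms = terms

    def apply(self, mode, step_counter, env_ids):
        """Run only the terms registered for the given trigger mode."""
        for term in self._terms.values():
            if term.mode == mode:
                term.func(step_counter, env_ids)


calls = []
manager = ModalCurriculumManager({
    "terrain_level": CurriculumTerm(lambda step, ids: calls.append(("reset", step, list(ids))), mode="reset"),
    "reward_decay": CurriculumTerm(lambda step, ids: calls.append(("step", step, list(ids))), mode="step"),
})
manager.apply("step", step_counter=1, env_ids=range(4))  # every step, all environments
manager.apply("reset", step_counter=1, env_ids=[2])      # only environments that just reset
```

The environment would call apply("step", ...) once per step and apply("reset", ...) inside the reset path, so existing game-based terms keep their current behavior.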

What do you think?

ozhanozen (Author) · 2025-03-19T22:05:28Z

Yes, having different curriculum modes is a great idea!

In fact, beyond episode-based or simulation-step-based triggers, I found the need for two additional types of triggers in my work:

• Performance-based triggers: These would adjust the curriculum based on the episode's mean or sum of a specific reward term. Implementing this should be feasible, but helper functions in the reward manager could make it more convenient and standardized.
• Algorithm-centric (e.g., PPO) triggers: For example, in an insertion task with low tolerances, a high standard deviation of the action distribution can aid initial exploration for reaching but hinder precision during insertion. A curriculum could be designed to introduce the insertion phase only when the action distribution’s standard deviation falls below a certain threshold. However, integrating such an agent-centric curriculum isn’t straightforward, as the agent and environment are currently quite decoupled.
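
To illustrate the performance-based case, here is a minimal sketch of a threshold trigger over a sliding window of episode returns. PerformanceTrigger and its parameters are hypothetical names, not part of Isaac Lab, and how the per-term episode return would be obtained from the reward manager is left open. The same pattern could take the action standard deviation as its metric with the comparison reversed, if the agent exposed it.

```python
from collections import deque


class PerformanceTrigger:
    """Advance a difficulty level when the mean of recent episode returns reaches a threshold."""

    def __init__(self, threshold, window, max_level):
        self.threshold = threshold
        self.max_level = max_level
        self.returns = deque(maxlen=window)  # sliding window of episode returns
        self.level = 0

    def on_episode_end(self, episode_return):
        self.returns.append(episode_return)
        window_full = len(self.returns) == self.returns.maxlen
        if window_full and self.level < self.max_level:
            if sum(self.returns) / len(self.returns) >= self.threshold:
                self.level += 1
                self.returns.clear()  # require a fresh window at the new level
        return self.level


trigger = PerformanceTrigger(threshold=0.8, window=3, max_level=2)
levels = [trigger.on_episode_end(r) for r in [0.5, 0.9, 0.9, 0.9, 0.9]]
print(levels)  # [0, 0, 0, 1, 1]
```

Clearing the window after each promotion prevents returns earned at the easier level from immediately triggering a second promotion.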

What are your thoughts on these?

RandomOakForest (Maintainer) · 2025-04-18T13:40:02Z

Thanks for following up. I will move this to our Discussions so the team can continue there if they have additional feedback. Thanks for your interest in Isaac Lab. @Mayankm96 for visibility.
