[Question] PBT config objective definition #3638

@elvisbreit

Question

I want to use the newly added HPO method Population Based Training (PBT) for my task and set the objective to the total reward over time for one episode, which rl_games logs under the metric rewards/time, as can be seen in the example:

parser.add_argument("--metric", type=str, default="rewards/time", help="What metric to tune for.")

I have changed the objective parameter, which is located in the following part of the config:

pbt:
  enabled: False
  policy_idx: 0  # policy index in a population
  num_policies: 8  # total number of policies in the population
  directory: .
  workspace: "pbt_workspace"  # suffix of the workspace dir name inside train_dir
  objective: episode.Curriculum/adr

However, when I start training with objective: rewards/time, I get the following error:

Error executing job with overrides: ['agent.pbt.enabled=True', 'agent.pbt.num_policies=4', 'agent.pbt.policy_idx=0']
Traceback (most recent call last):
  File "/workspace/isaaclab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
    func(env_cfg, agent_cfg, *args, **kwargs)
  File "/workspace/isaaclab/scripts/reinforcement_learning/rl_games/train.py", line 239, in main
    runner.run({"train": True, "play": False, "sigma": train_sigma})
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/torch_runner.py", line 178, in run
    self.run_train(args)
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/torch_runner.py", line 149, in run_train
    agent.train()
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 1351, in train
    step_time, play_time, update_time, sum_time, a_losses, c_losses, b_losses, entropies, kls, last_lr, lr_mul = self.train_epoch()
                                                                                                                 ^^^^^^^^^^^^^^^^^^
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 1207, in train_epoch
    batch_dict = self.play_steps()
                 ^^^^^^^^^^^^^^^^^
  File "/workspace/isaaclab/_isaac_sim/kit/python/lib/python3.11/site-packages/rl_games/common/a2c_common.py", line 792, in play_steps
    self.algo_observer.process_infos(infos, env_done_indices)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 259, in process_infos
    self._call_multi("process_infos", infos, done_indices)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 250, in _call_multi
    getattr(o, method)(*args_, **kwargs_)
  File "/workspace/isaaclab/source/isaaclab_rl/isaaclab_rl/rl_games/pbt/pbt.py", line 75, in process_infos
    score = score[part]
            ~~~~~^^^^^^
KeyError: 'rewards/time'

How can I set the objective to the rewards/time metric, and more generally, how can I access terms other than episode.Curriculum/adr from the example config?
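From the traceback, the failing line `score = score[part]` in pbt.py suggests the objective string is split on dots and each part is used to index into the nested infos dict, so `episode.Curriculum/adr` would resolve to `infos["episode"]["Curriculum/adr"]`. The following is a minimal sketch of that apparent lookup, assuming this dot-path interpretation and an illustrative infos layout; the actual structure of infos in pbt.py may differ:

```python
# Sketch of how a dotted PBT objective string appears to be resolved,
# based on the `score = score[part]` loop in the traceback.
# The infos layout below is an assumption for illustration only.

def resolve_objective(infos: dict, objective: str):
    """Walk the infos dict using each dot-separated part of the objective."""
    score = infos
    for part in objective.split("."):
        score = score[part]  # raises KeyError if the key is missing at this level
    return score

infos = {"episode": {"Curriculum/adr": 0.75}}

# Resolves to infos["episode"]["Curriculum/adr"]:
print(resolve_objective(infos, "episode.Curriculum/adr"))  # 0.75

# "rewards/time" has no dot, so it is looked up as a single top-level key,
# which would reproduce the KeyError: 'rewards/time' seen above:
try:
    resolve_objective(infos, "rewards/time")
except KeyError as exc:
    print("KeyError:", exc)
```

If this reading is right, rewards/time fails because it is not a top-level key of infos, while episode.Curriculum/adr works because both path components exist in the nested dict.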

@ooctipus, maybe you can help out, since I have seen that you made the commit for this functionality.

Labels: question (Further information is requested)