Issues Creating Training Script for DCE RL Navigation Example

## Background
I'm trying to train a policy for the **DCE RL navigation** task, but there’s no training script supplied (only the inference script `dce_nn_navigation.py`).  
I attempted to craft one based on the Sample Factory examples, yet several problems cropped up.

---

## Training Script I’m Using
```python
# isaacgym must be imported before PyTorch, so keep isort from reordering it
# isort: off
# noinspection PyUnresolvedReferences
import isaacgym

# isort: on

import sys
from typing import Dict, Optional, Tuple

import gymnasium as gym
import torch
from torch import Tensor
from sample_factory.algo.utils.context import global_model_factory
from sample_factory.model.encoder import *
from sample_factory.algo.utils.gymnasium_utils import convert_space
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.envs.env_utils import register_env
from sample_factory.train import run_rl
from sample_factory.utils.typing import Config, Env
from sample_factory.utils.utils import str2bool

from aerial_gym.registry.task_registry import task_registry

import numpy as np


class AerialGymVecEnv(gym.Env):
    """
    Wrapper for isaacgym environments to make them compatible with the sample factory.
    """

    def __init__(self, aerialgym_env, obs_key):
        self.env = aerialgym_env
        self.num_agents = self.env.num_envs
        self.action_space = convert_space(self.env.action_space)

        # Aerial Gym examples environments actually return dicts
        if obs_key == "obs":
            self.observation_space = gym.spaces.Dict(convert_space(self.env.observation_space))
        else:
            raise ValueError(f"Unknown observation key: {obs_key}")

        self._truncated: Tensor = torch.zeros(self.num_agents, dtype=torch.bool)

    def reset(self, *args, **kwargs) -> Tuple[Dict[str, Tensor], Dict]:
        # some IGE envs return all zeros on the first timestep, but this is probably okay
        obs, rew, terminated, truncated, infos = self.env.reset()
        return obs, infos

    def step(self, action) -> Tuple[Dict[str, Tensor], Tensor, Tensor, Tensor, Dict]:
        obs, rew, terminated, truncated, infos = self.env.step(action)
        return obs, rew, terminated, truncated, infos

    def render(self):
        pass


def make_aerialgym_env(
    full_task_name: str,
    cfg: Config,
    _env_config=None,
    render_mode: Optional[str] = None,
) -> Env:
    
    # Import task_registry for this function
    from aerial_gym.registry.task_registry import task_registry
    
    # Ensure DCE navigation task is registered in this subprocess
    if full_task_name == "dce_navigation_task":
        try:
            # Check if task is already registered
            task_registry.get_task_class(full_task_name)
        except KeyError:
            # Task not registered, register it now
            try:
                from aerial_gym.examples.dce_rl_navigation.dce_navigation_task import DCE_RL_Navigation_Task
                from aerial_gym.config.task_config.navigation_task_config import task_config
                
                dce_config = task_config()
                task_registry.register_task("dce_navigation_task", DCE_RL_Navigation_Task, dce_config)
                print("Registered dce_navigation_task in subprocess")
            except Exception as e:
                print(f"Failed to register dce_navigation_task in subprocess: {e}")

    return AerialGymVecEnv(task_registry.make_task(task_name=full_task_name), "obs")


def add_extra_params_func(parser):
    """
    Specify any additional command line arguments for this family of custom environments.
    """
    p = parser
    p.add_argument(
        "--env_agents",
        default=-1,
        type=int,
        help="Num agents in each env (default: -1, means use default value from isaacgymenvs env yaml config file)",
    )
    p.add_argument(
        "--obs_key",
        default="obs",
        type=str,
        help='IsaacGym envs return dicts, some envs return just "obs", and some return "obs" and "states".'
        "States key denotes the full state of the environment, and obs key corresponds to limited observations "
        'available in real world deployment. If we use "states" here we can train with full information '
        "(although the original idea was to use asymmetric training - critic sees full state and policy only sees obs).",
    )
    p.add_argument(
        "--subtask",
        default=None,
        type=str,
        help="Subtask for envs that support it (i.e. AllegroKuka regrasping or manipulation or throw).",
    )
    p.add_argument(
        "--ige_api_version",
        default="preview4",
        type=str,
        choices=["preview3", "preview4"],
        help="We can switch between different versions of IsaacGymEnvs API using this parameter.",
    )
    p.add_argument(
        "--eval_stats",
        default=False,
        type=str2bool,
        help="Whether to collect env stats during evaluation.",
    )


def override_default_params_func(env, parser):
    """Most of these parameters are taken from IsaacGymEnvs default config files."""

    parser.set_defaults(
        # we're using a single very vectorized env, no need to parallelize it further
        batched_sampling=True,
        num_workers=1,
        num_envs_per_worker=1,
        worker_num_splits=1,
        actor_worker_gpus=[0],  # obviously need a GPU
        train_for_env_steps=10000000,
        use_rnn=False,
        adaptive_stddev=True,
        policy_initialization="torch_default",
        env_gpu_actions=True,
        env_gpu_observations=True,  # Critical: Tell Sample Factory we're providing GPU tensors
        reward_scale=0.1,
        rollout=24,
        max_grad_norm=0.0,
        batch_size=2048,
        num_batches_per_epoch=2,
        num_epochs=4,
        ppo_clip_ratio=0.2,
        value_loss_coeff=2.0,
        exploration_loss_coeff=0.0,
        nonlinearity="elu",
        learning_rate=3e-4,
        lr_schedule="kl_adaptive_epoch",
        lr_schedule_kl_threshold=0.016,
        shuffle_minibatches=True,
        gamma=0.98,
        gae_lambda=0.95,
        with_vtrace=False,
        value_bootstrap=True,  # assuming reward from the last step in the episode can generally be ignored
        normalize_input=True,
        normalize_returns=True,  # does not improve results on all envs, but with return normalization we don't need to tune reward scale
        save_best_after=int(1e6),
        serial_mode=True,  # it makes sense to run isaacgym envs in serial mode since most of the parallelism comes from the env itself (although async mode works!)
        async_rl=True,
        use_env_info_cache=False,  # speeds up startup
        kl_loss_coeff=0.1,
        restart_behavior="overwrite",
    )

    # override default config parameters for specific envs
    if env in env_configs:
        parser.set_defaults(**env_configs[env])


# custom default configuration parameters for specific envs
# add more envs here analogously (env names should match config file names in IGE)
env_configs = dict(
    position_setpoint_task=dict(
        train_for_env_steps=131000000000,
        encoder_mlp_layers=[256, 128, 64],
        gamma=0.99,
        rollout=16,
        learning_rate=1e-4,
        lr_schedule_kl_threshold=0.016,
        batch_size=16384,
        num_epochs=4,
        max_grad_norm=1.0,
        num_batches_per_epoch=4,
        exploration_loss_coeff=0.0,
        with_wandb=False,
        wandb_project="quad",
        wandb_user="mihirkulkarni",
    ),
    navigation_task=dict(
        train_for_env_steps=131000000000,
        encoder_mlp_layers=[256, 128, 64],
        use_rnn=True,
        encoder_conv_architecture="convnet_simple",  # "resnet_impala_mihirk",
        rnn_num_layers=1,
        rnn_size=64,
        rnn_type="gru",
        gamma=0.98,
        rollout=32,
        learning_rate=1e-4,
        lr_schedule_kl_threshold=0.016,
        batch_size=1024,
        num_epochs=4,
        max_grad_norm=1.0,
        num_batches_per_epoch=4,
        exploration_loss_coeff=0.0,
        with_wandb=False,
        wandb_project="quad",
        wandb_user="mihirkulkarni",
    ),
    dce_navigation_task=dict(
        train_for_env_steps=10000000,  # 10M steps for DCE navigation
        encoder_mlp_layers=[256, 128, 64],
        use_rnn=True,
        encoder_conv_architecture="convnet_simple",
        rnn_num_layers=1,
        rnn_size=64,
        rnn_type="gru",
        gamma=0.98,
        rollout=32,
        learning_rate=1e-4,
        lr_schedule_kl_threshold=0.016,
        batch_size=1024,
        num_epochs=4,
        max_grad_norm=1.0,
        num_batches_per_epoch=4,
        exploration_loss_coeff=0.0,
        with_wandb=False,
        wandb_project="dce_navigation",
        wandb_user="aerial_gym",
    ),
)


class CustomEncoder(Encoder):
    """Just an example of how to use a custom model component."""

    def __init__(self, cfg, obs_space):
        super().__init__(cfg)

        out_size = 0
        out_size_cnn = 0
        self.encoders = nn.ModuleDict()
        out_size += obs_space["observations"].shape[0]

        encoder_fn_image = make_img_encoder
        self.encoders["image_obs"] = encoder_fn_image(cfg, obs_space["image_obs"])
        out_size += self.encoders["image_obs"].get_out_size()

        obs_space_custom = spaces.Box(np.ones(out_size) * -np.inf, np.ones(out_size) * np.inf)
        mlp_layers: List[int] = cfg.encoder_mlp_layers
        self.mlp_head_custom = create_mlp(mlp_layers, obs_space_custom.shape[0], nonlinearity(cfg))
        if len(mlp_layers) > 0:
            self.mlp_head_custom = torch.jit.script(self.mlp_head_custom)
        self.encoder_out_size = calc_num_elements(self.mlp_head_custom, obs_space_custom.shape)

    def forward(self, obs_dict):
        x_image_encoding = self.encoders["image_obs"](obs_dict["image_obs"])
        encoding = self.mlp_head_custom(torch.cat((obs_dict["observations"], x_image_encoding), 1))
        return encoding

    def get_out_size(self) -> int:
        return self.encoder_out_size


def make_custom_encoder(cfg: Config, obs_space: ObsSpace) -> Encoder:
    """Factory function as required by the API."""
    return CustomEncoder(cfg, obs_space)


def register_aerialgym_custom_components():
    # Register DCE navigation task
    try:
        from aerial_gym.examples.dce_rl_navigation.dce_navigation_task import DCE_RL_Navigation_Task
        from aerial_gym.config.task_config.navigation_task_config import task_config
        from aerial_gym.registry.task_registry import task_registry
        
        # Use navigation task config as base for DCE navigation
        dce_config = task_config()
        task_registry.register_task("dce_navigation_task", DCE_RL_Navigation_Task, dce_config)
        print("Successfully registered dce_navigation_task")
    except Exception as e:
        print(f"Warning: Could not register dce_navigation_task: {e}")
    
    for env_name in env_configs:
        register_env(env_name, make_aerialgym_env)

    global_model_factory().register_encoder_factory(make_custom_encoder)


def parse_aerialgym_cfg(evaluation=False):
    parser, partial_cfg = parse_sf_args(evaluation=evaluation)
    add_extra_params_func(parser)
    override_default_params_func(partial_cfg.env, parser)
    final_cfg = parse_full_cfg(parser)
    return final_cfg


def main():
    """Script entry point."""
    register_aerialgym_custom_components()
    cfg = parse_aerialgym_cfg()
    status = run_rl(cfg)
    return status


if __name__ == "__main__":
    sys.exit(main())

# The rest follows the standard Sample Factory training pipeline...

```

## Command I’m Running
```bash
python train_aerialgym_custom_net.py \
  --env=dce_navigation_task \
  --train_for_env_steps=100000000 \
  --experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew \
  --async_rl=True \
  --use_env_info_cache=False \
  --normalize_input=True
```

### Fixed: Environment Registration Issue
**Error**  
> Env name `quad_with_obstacles` is not registered  

**Solution**  
Switched to `dce_navigation_task`, which internally creates the `env_with_obstacles` environment.

---

## Current Issue: Sample Factory Tensor-Conversion Error
The training script initializes every component successfully: the Isaac Gym environment loads, the neural network is built, assets are loaded, and GPU setup completes. The process then crashes during the environment reset phase with `ValueError: only one element tensors can be converted to Python scalars`. The error occurs in Sample Factory's `make_env.py` at line 191, where `torch.tensor(x_)` is called.

The DCE navigation task returns observations as GPU tensors (`observations` with shape `(81,)`, `image_obs` with shape `(1, 135, 240)`), but Sample Factory's observation-conversion pipeline appears to convert these existing PyTorch tensors again, which causes the failure. Setting `env_gpu_observations=True` does not resolve the incompatibility.
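
For what it's worth, PyTorch raises this kind of error when `torch.tensor(...)` is handed a Python list of multi-element tensors. One plausible reading, which I have not verified against the Sample Factory source, is that `x_` at that line is a list of per-agent observations rather than a single tensor. The sketch below reproduces the failure in isolation and shows that `torch.stack` accepts the same input. The names `per_agent_obs` and `try_convert` are hypothetical, and the shape `(81,)` simply mirrors the `observations` entry above.

```python
import torch

# Hypothetical stand-in for a list of per-agent observations.
# CPU tensors are used here so the snippet runs without a GPU.
per_agent_obs = [torch.zeros(81), torch.zeros(81)]


def try_convert(obs_list):
    """Return the message raised by torch.tensor on a list of tensors, or None."""
    try:
        torch.tensor(obs_list)
    except (ValueError, TypeError, RuntimeError) as exc:
        return str(exc)
    return None


# torch.tensor tries to turn each list element into a Python scalar and fails.
error_message = try_convert(per_agent_obs)
print(error_message)

# torch.stack builds the batch from the existing tensors instead.
stacked = torch.stack(per_agent_obs)
print(stacked.shape)
```

If this is what is happening, the fix would probably belong in how the wrapper batches observations (for example, returning tensors with a leading `num_agents` dimension) rather than in `env_gpu_observations`, but that is a guess until the value of `x_` is inspected.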

```text
(aerialgym) ziyar@ziyar:~/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples$ python train_aerialgym_custom_net.py --env=dce_navigation_task --train_for_env_steps=100000000 --experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew --async_rl=True --use_env_info_cache=False --normalize_input=True
Importing module 'gym_38' (/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/torch/utils/cpp_extension.py:25: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import packaging  # type: ignore[attr-defined]
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/pkg_resources/__init__.py:3154: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/pkg_resources/__init__.py:3154: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
PyTorch version 1.13.1
Device count 1
/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/ziyar/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...
Emitting ninja build file /home/ziyar/.cache/torch_extensions/py38_cu117/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/classes/graph.py:23: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  from collections import Mapping
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/classes/reportviews.py:95: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  from collections import Mapping, Set, Iterable
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/readwrite/graphml.py:346: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  (np.int, "int"), (np.int8, "int"),
Warp 1.0.0-beta.5 initialized:
   CUDA Toolkit: 11.5, Driver: 12.4
   Devices:
     "cpu"    | x86_64
     "cuda:0" | NVIDIA GeForce RTX 4080 Laptop GPU (sm_89)
   Kernel cache: /home/ziyar/.cache/warp/1.0.0-beta.5
/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/torch_utils.py:135: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def get_axis_params(value, axis_idx, x_value=0., dtype=np.float, n_dims=3):
Successfully registered dce_navigation_task
[2025-06-29 09:35:42,048][15322] register_encoder_factory: <function make_custom_encoder at 0x77d9b83c5280>
[2025-06-29 09:35:42,118][15322] Experiment dir /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples/train_dir/13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew already exists!
[2025-06-29 09:35:42,119][15322] Overwriting the existing experiment dir /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples/train_dir/13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew...
[2025-06-29 09:35:42,119][15322] Starting training in /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples/train_dir/13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew...
[2025-06-29 09:35:42,119][15322] Weights and Biases integration disabled
[2025-06-29 09:35:43,031][15322] Queried available GPUs: 0

[2025-06-29 09:35:43,032][15322] Environment var CUDA_VISIBLE_DEVICES is 0

Importing module 'gym_38' (/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/torch/utils/cpp_extension.py:25: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import packaging  # type: ignore[attr-defined]
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/pkg_resources/__init__.py:3154: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/pkg_resources/__init__.py:3154: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
PyTorch version 1.13.1
Device count 1
/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/ziyar/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...
Emitting ninja build file /home/ziyar/.cache/torch_extensions/py38_cu117/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/classes/graph.py:23: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  from collections import Mapping
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/classes/reportviews.py:95: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  from collections import Mapping, Set, Iterable
/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/networkx/readwrite/graphml.py:346: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  (np.int, "int"), (np.int8, "int"),
Warp 1.0.0-beta.5 initialized:
   CUDA Toolkit: 11.5, Driver: 12.4
   Devices:
     "cpu"    | x86_64
     "cuda:0" | NVIDIA GeForce RTX 4080 Laptop GPU (sm_89)
   Kernel cache: /home/ziyar/.cache/warp/1.0.0-beta.5
/home/ziyar/aerialgym/IsaacGym_Preview_4_Package/isaacgym/python/isaacgym/torch_utils.py:135: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def get_axis_params(value, axis_idx, x_value=0., dtype=np.float, n_dims=3):
Registered dce_navigation_task in subprocess
[2306 ms][aerial_gym.examples.dce_rl_navigation.dce_navigation_task] - CRITICAL : Setting number of environments to 1. (dce_navigation_task.py:18)
[2306 ms][base_task] - INFO : Setting seed: 1840402265 (base_task.py:38)
[2307 ms][navigation_task] - INFO : Building environment for navigation task. (navigation_task.py:44)
[2307 ms][navigation_task] - INFO : Sim Name: base_sim, Env Name: env_with_obstacles, Robot Name: lmf2, Controller Name: lmf2_velocity_control (navigation_task.py:45)
[2307 ms][env_manager] - INFO : Populating environments. (env_manager.py:73)
[2307 ms][env_manager] - INFO : Creating simulation instance. (env_manager.py:87)
[2307 ms][env_manager] - INFO : Instantiating IGE object. (env_manager.py:88)
[2307 ms][IsaacGymEnvManager] - INFO : Creating Isaac Gym Environment (IGE_env_manager.py:41)
[2307 ms][IsaacGymEnvManager] - INFO : Acquiring gym object (IGE_env_manager.py:73)
[2307 ms][IsaacGymEnvManager] - INFO : Acquired gym object (IGE_env_manager.py:75)
[isaacgym:gymutil.py] Unknown args:  ['--env=dce_navigation_task', '--train_for_env_steps=100000000', '--experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew', '--async_rl=True', '--use_env_info_cache=False', '--normalize_input=True']
[2308 ms][IsaacGymEnvManager] - INFO : Fixing devices (IGE_env_manager.py:89)
[2308 ms][IsaacGymEnvManager] - INFO : Using GPU pipeline for simulation. (IGE_env_manager.py:102)
[2308 ms][IsaacGymEnvManager] - INFO : Sim Device type: cuda, Sim Device ID: 0 (IGE_env_manager.py:105)
[2308 ms][IsaacGymEnvManager] - CRITICAL : 
 Setting graphics device to -1.
 This is done because the simulation is run in headless mode and no Isaac Gym cameras are used.
 No need to worry. The simulation and warp rendering will work as expected. (IGE_env_manager.py:112)
[2308 ms][IsaacGymEnvManager] - INFO : Graphics Device ID: -1 (IGE_env_manager.py:119)
[2308 ms][IsaacGymEnvManager] - INFO : Creating Isaac Gym Simulation Object (IGE_env_manager.py:120)
[2308 ms][IsaacGymEnvManager] - WARNING : If you have set the CUDA_VISIBLE_DEVICES environment variable, please ensure that you set it
to a particular one that works for your system to use the viewer or Isaac Gym cameras.
If you want to run parallel simulations on multiple GPUs with camera sensors,
please disable Isaac Gym and use warp (by setting use_warp=True), set the viewer to headless. (IGE_env_manager.py:127)
[2308 ms][IsaacGymEnvManager] - WARNING : If you see a segfault in the next lines, it is because of the discrepancy between the CUDA device and the graphics device.
Please ensure that the CUDA device and the graphics device are the same. (IGE_env_manager.py:132)
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
[3215 ms][IsaacGymEnvManager] - INFO : Created Isaac Gym Simulation Object (IGE_env_manager.py:136)
[3215 ms][IsaacGymEnvManager] - INFO : Created Isaac Gym Environment (IGE_env_manager.py:43)
[3439 ms][env_manager] - INFO : IGE object instantiated. (env_manager.py:109)
[3440 ms][env_manager] - INFO : Creating warp environment. (env_manager.py:112)
[3440 ms][env_manager] - INFO : Warp environment created. (env_manager.py:114)
[3440 ms][env_manager] - INFO : Creating robot manager. (env_manager.py:118)
[3440 ms][BaseRobot] - INFO : [DONE] Initializing controller (base_robot.py:26)
[3440 ms][BaseRobot] - INFO : Initializing controller lmf2_velocity_control (base_robot.py:29)
[3440 ms][base_multirotor] - WARNING : Creating 1 multirotors. (base_multirotor.py:32)
[3440 ms][env_manager] - INFO : [DONE] Creating robot manager. (env_manager.py:123)
[3440 ms][env_manager] - INFO : [DONE] Creating simulation instance. (env_manager.py:125)
[3440 ms][asset_loader] - INFO : Loading asset: model.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3441 ms][asset_loader] - INFO : Loading asset: panel.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3443 ms][asset_loader] - INFO : Loading asset: 1_x_1_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3445 ms][asset_loader] - INFO : Loading asset: gate.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3448 ms][asset_loader] - INFO : Loading asset: cuboidal_rod.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3449 ms][asset_loader] - INFO : Loading asset: 0_5_x_0_5_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3450 ms][asset_loader] - INFO : Loading asset: left_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3452 ms][asset_loader] - INFO : Loading asset: right_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3453 ms][asset_loader] - INFO : Loading asset: back_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3454 ms][asset_loader] - INFO : Loading asset: front_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3455 ms][asset_loader] - INFO : Loading asset: bottom_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3457 ms][asset_loader] - INFO : Loading asset: top_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[3458 ms][env_manager] - INFO : Populating environment 0 (env_manager.py:179)
[3887 ms][robot_manager] - WARNING : 
Robot mass: 1.2400000467896461,
Inertia: tensor([[0.0134, 0.0000, 0.0000],
        [0.0000, 0.0144, 0.0000],
        [0.0000, 0.0000, 0.0138]], device='cuda:0'),
Robot COM: tensor([[0., 0., 0., 1.]], device='cuda:0') (robot_manager.py:437)
[3887 ms][robot_manager] - WARNING : Calculated robot mass and inertia for this robot. This code assumes that your robot is the same across environments. (robot_manager.py:440)
[3887 ms][robot_manager] - CRITICAL : If your robot differs across environments you need to perform this computation for each different robot here. (robot_manager.py:443)
[3890 ms][env_manager] - INFO : [DONE] Populating environments. (env_manager.py:75)
[3897 ms][IsaacGymEnvManager] - WARNING : Headless: True (IGE_env_manager.py:424)
[3897 ms][IsaacGymEnvManager] - INFO : Headless mode. Viewer not created. (IGE_env_manager.py:434)
*** Can't create empty tensor
[3947 ms][asset_manager] - WARNING : Number of obstacles to be kept in the environment: 9 (asset_manager.py:32)
WARNING: allocation matrix is not full rank. Rank: 4
/home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/control/motor_model.py:45: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(self.min_thrust, device=self.device, dtype=torch.float32).expand(
/home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/control/motor_model.py:48: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(self.max_thrust, device=self.device, dtype=torch.float32).expand(
[4166 ms][control_allocation] - WARNING : Control allocation does not account for actuator limits. This leads to suboptimal allocation (control_allocation.py:48)
[4167 ms][WarpSensor] - INFO : Camera sensor initialized (warp_sensor.py:50)
creating render graph
Module warp.utils load on device 'cuda:0' took 1.64 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_camera_kernels load on device 'cuda:0' took 8.21 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_stereo_camera_kernels load on device 'cuda:0' took 12.99 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_lidar_kernels load on device 'cuda:0' took 6.66 ms
finishing capture of render graph
Encoder network initialized.
Defined encoder.
[ImgDecoder] Starting create_model
[ImgDecoder] Done with create_model
Defined decoder.
Loading weights from file:  /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/utils/vae/weights/ICRA_test_set_more_sim_data_kld_beta_3_LD_64_epoch_49.pth
[2025-06-29 09:35:47,711][15386] Env info: EnvInfo(obs_space=Dict('image_obs': Box(-1.0, 1.0, (1, 135, 240), float32), 'observations': Box(-1.0, 1.0, (81,), float32)), action_space=Box(-1.0, 1.0, (4,), float32), num_agents=1, gpu_actions=True, gpu_observations=True, action_splits=None, all_discrete=None, frameskip=1, reward_shaping_scheme=None, env_info_protocol_version=1)
[2025-06-29 09:35:48,658][15322] Automatically setting recurrence to 32
[2025-06-29 09:35:48,658][15322] In serial mode all components run on the same process. Only use async_rl and serial mode together for debugging.
[2025-06-29 09:35:48,659][15322] Starting experiment with the following configuration:
help=False
algo=APPO
env=dce_navigation_task
experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew
train_dir=/home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples/train_dir
restart_behavior=overwrite
device=gpu
seed=None
num_policies=1
async_rl=True
serial_mode=True
batched_sampling=True
num_batches_to_accumulate=2
worker_num_splits=1
policy_workers_per_policy=1
max_policy_lag=1000
num_workers=1
num_envs_per_worker=1
batch_size=1024
num_batches_per_epoch=4
num_epochs=4
rollout=32
recurrence=32
shuffle_minibatches=True
gamma=0.98
reward_scale=0.1
reward_clip=1000.0
value_bootstrap=True
normalize_returns=True
exploration_loss_coeff=0.0
value_loss_coeff=2.0
kl_loss_coeff=0.1
exploration_loss=entropy
gae_lambda=0.95
ppo_clip_ratio=0.2
ppo_clip_value=1.0
with_vtrace=False
vtrace_rho=1.0
vtrace_c=1.0
optimizer=adam
adam_eps=1e-06
adam_beta1=0.9
adam_beta2=0.999
max_grad_norm=1.0
learning_rate=0.0001
lr_schedule=kl_adaptive_epoch
lr_schedule_kl_threshold=0.016
lr_adaptive_min=1e-06
lr_adaptive_max=0.01
obs_subtract_mean=0.0
obs_scale=1.0
normalize_input=True
normalize_input_keys=None
decorrelate_experience_max_seconds=0
decorrelate_envs_on_one_worker=True
actor_worker_gpus=[0]
set_workers_cpu_affinity=True
force_envs_single_thread=False
default_niceness=0
log_to_file=True
experiment_summaries_interval=10
flush_summaries_interval=30
stats_avg=100
summaries_use_frameskip=True
heartbeat_interval=20
heartbeat_reporting_interval=180
train_for_env_steps=100000000
train_for_seconds=10000000000
save_every_sec=120
keep_checkpoints=2
load_checkpoint_kind=latest
save_milestones_sec=-1
save_best_every_sec=5
save_best_metric=reward
save_best_after=1000000
benchmark=False
encoder_mlp_layers=[256, 128, 64]
encoder_conv_architecture=convnet_simple
encoder_conv_mlp_layers=[512]
use_rnn=True
rnn_size=64
rnn_type=gru
rnn_num_layers=1
decoder_mlp_layers=[]
nonlinearity=elu
policy_initialization=torch_default
policy_init_gain=1.0
actor_critic_share_weights=True
adaptive_stddev=True
continuous_tanh_scale=0.0
initial_stddev=1.0
use_env_info_cache=False
env_gpu_actions=True
env_gpu_observations=True
env_frameskip=1
env_framestack=1
pixel_format=CHW
use_record_episode_statistics=False
with_wandb=False
wandb_user=aerial_gym
wandb_project=dce_navigation
wandb_group=None
wandb_job_type=SF
wandb_tags=[]
with_pbt=False
pbt_mix_policies_in_one_env=True
pbt_period_env_steps=5000000
pbt_start_mutation=20000000
pbt_replace_fraction=0.3
pbt_mutation_rate=0.15
pbt_replace_reward_gap=0.1
pbt_replace_reward_gap_absolute=1e-06
pbt_optimize_gamma=False
pbt_target_objective=true_objective
pbt_perturb_min=1.1
pbt_perturb_max=1.5
env_agents=-1
obs_key=obs
subtask=None
ige_api_version=preview4
eval_stats=False
command_line=--env=dce_navigation_task --train_for_env_steps=100000000 --experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew --async_rl=True --use_env_info_cache=False --normalize_input=True
cli_args={'env': 'dce_navigation_task', 'experiment': '13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew', 'async_rl': True, 'normalize_input': True, 'train_for_env_steps': 100000000, 'use_env_info_cache': False}
git_hash=7f35eed17f2afcde33e3a7aec669b48e9e8e34cd
git_repo_name=https://github.com/ntnu-arl/aerial_gym_simulator.git
[2025-06-29 09:35:48,659][15322] Saving configuration to /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/rl_training/sample_factory/aerialgym_examples/train_dir/13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew/config.json...
[2025-06-29 09:35:48,734][15322] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-06-29 09:35:48,735][15322] Rollout worker 0 uses device cuda:0
[2025-06-29 09:35:49,108][15322] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-06-29 09:35:49,109][15322] InferenceWorker_p0-w0: min num requests: 1
[2025-06-29 09:35:49,109][15322] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-06-29 09:35:49,109][15322] Starting seed is not provided
[2025-06-29 09:35:49,110][15322] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-06-29 09:35:49,110][15322] Initializing actor-critic model on device cuda:0
[2025-06-29 09:35:49,110][15322] RunningMeanStd input shape: (1, 135, 240)
[2025-06-29 09:35:49,110][15322] RunningMeanStd input shape: (81,)
[2025-06-29 09:35:49,110][15322] RunningMeanStd input shape: (1,)
[2025-06-29 09:35:49,119][15322] ConvEncoder: input_channels=1
[2025-06-29 09:35:49,207][15322] Conv encoder output size: 512
[2025-06-29 09:35:49,224][15322] Created Actor Critic model with architecture:
[2025-06-29 09:35:49,224][15322] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (image_obs): RunningMeanStdInPlace()
        (observations): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): CustomEncoder(
    (encoders): ModuleDict(
      (image_obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ELU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ELU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ELU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ELU)
          )
        )
      )
    )
    (mlp_head_custom): RecursiveScriptModule(
      original_name=Sequential
      (0): RecursiveScriptModule(original_name=Linear)
      (1): RecursiveScriptModule(original_name=ELU)
      (2): RecursiveScriptModule(original_name=Linear)
      (3): RecursiveScriptModule(original_name=ELU)
      (4): RecursiveScriptModule(original_name=Linear)
      (5): RecursiveScriptModule(original_name=ELU)
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(64, 64)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=64, out_features=8, bias=True)
  )
)
[2025-06-29 09:35:49,724][15322] Using optimizer <class 'torch.optim.adam.Adam'>
[2025-06-29 09:35:49,724][15322] No checkpoints found
[2025-06-29 09:35:49,724][15322] Did not load from checkpoint, starting from scratch!
[2025-06-29 09:35:49,724][15322] Initialized policy 0 weights for model version 0
[2025-06-29 09:35:49,724][15322] LearnerWorker_p0 finished initialization!
[2025-06-29 09:35:49,725][15322] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-06-29 09:35:50,081][15322] Inference worker 0-0 is ready!
[2025-06-29 09:35:50,081][15322] All inference workers are ready! Signal rollout workers to start!
[2025-06-29 09:35:50,081][15322] EnvRunner 0-0 uses policy 0
[11808 ms][aerial_gym.examples.dce_rl_navigation.dce_navigation_task] - CRITICAL : Setting number of environments to 1. (dce_navigation_task.py:18)
[11808 ms][base_task] - INFO : Setting seed: 2067188407 (base_task.py:38)
[11809 ms][navigation_task] - INFO : Building environment for navigation task. (navigation_task.py:44)
[11809 ms][navigation_task] - INFO : Sim Name: base_sim, Env Name: env_with_obstacles, Robot Name: lmf2, Controller Name: lmf2_velocity_control (navigation_task.py:45)
[11809 ms][env_manager] - INFO : Populating environments. (env_manager.py:73)
[11809 ms][env_manager] - INFO : Creating simulation instance. (env_manager.py:87)
[11809 ms][env_manager] - INFO : Instantiating IGE object. (env_manager.py:88)
[11809 ms][IsaacGymEnvManager] - INFO : Creating Isaac Gym Environment (IGE_env_manager.py:41)
[11809 ms][IsaacGymEnvManager] - INFO : Acquiring gym object (IGE_env_manager.py:73)
[11809 ms][IsaacGymEnvManager] - INFO : Acquired gym object (IGE_env_manager.py:75)
[isaacgym:gymutil.py] Unknown args:  ['--env=dce_navigation_task', '--train_for_env_steps=100000000', '--experiment=13_09_23_panel_env_randomized_controllers_fraction_penalty_fraction_rew_multiplier_high_pos_low_neg_rew', '--async_rl=True', '--use_env_info_cache=False', '--normalize_input=True']
[11810 ms][IsaacGymEnvManager] - INFO : Fixing devices (IGE_env_manager.py:89)
[11810 ms][IsaacGymEnvManager] - INFO : Using GPU pipeline for simulation. (IGE_env_manager.py:102)
[11810 ms][IsaacGymEnvManager] - INFO : Sim Device type: cuda, Sim Device ID: 0 (IGE_env_manager.py:105)
[11810 ms][IsaacGymEnvManager] - CRITICAL : 
 Setting graphics device to -1.
 This is done because the simulation is run in headless mode and no Isaac Gym cameras are used.
 No need to worry. The simulation and warp rendering will work as expected. (IGE_env_manager.py:112)
[11810 ms][IsaacGymEnvManager] - INFO : Graphics Device ID: -1 (IGE_env_manager.py:119)
[11810 ms][IsaacGymEnvManager] - INFO : Creating Isaac Gym Simulation Object (IGE_env_manager.py:120)
[11810 ms][IsaacGymEnvManager] - WARNING : If you have set the CUDA_VISIBLE_DEVICES environment variable, please ensure that you set it
to a particular one that works for your system to use the viewer or Isaac Gym cameras.
If you want to run parallel simulations on multiple GPUs with camera sensors,
please disable Isaac Gym and use warp (by setting use_warp=True), set the viewer to headless. (IGE_env_manager.py:127)
[11810 ms][IsaacGymEnvManager] - WARNING : If you see a segfault in the next lines, it is because of the discrepancy between the CUDA device and the graphics device.
Please ensure that the CUDA device and the graphics device are the same. (IGE_env_manager.py:132)
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
[12711 ms][IsaacGymEnvManager] - INFO : Created Isaac Gym Simulation Object (IGE_env_manager.py:136)
[12712 ms][IsaacGymEnvManager] - INFO : Created Isaac Gym Environment (IGE_env_manager.py:43)
[12938 ms][env_manager] - INFO : IGE object instantiated. (env_manager.py:109)
[12938 ms][env_manager] - INFO : Creating warp environment. (env_manager.py:112)
[12938 ms][env_manager] - INFO : Warp environment created. (env_manager.py:114)
[12938 ms][env_manager] - INFO : Creating robot manager. (env_manager.py:118)
[12938 ms][BaseRobot] - INFO : [DONE] Initializing controller (base_robot.py:26)
[12938 ms][BaseRobot] - INFO : Initializing controller lmf2_velocity_control (base_robot.py:29)
[12938 ms][base_multirotor] - WARNING : Creating 1 multirotors. (base_multirotor.py:32)
[12938 ms][env_manager] - INFO : [DONE] Creating robot manager. (env_manager.py:123)
[12938 ms][env_manager] - INFO : [DONE] Creating simulation instance. (env_manager.py:125)
[12938 ms][asset_loader] - INFO : Loading asset: model.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12939 ms][asset_loader] - INFO : Loading asset: panel.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12941 ms][asset_loader] - INFO : Loading asset: 0_5_x_0_5_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12942 ms][asset_loader] - INFO : Loading asset: small_cube.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12943 ms][asset_loader] - INFO : Loading asset: gate.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12946 ms][asset_loader] - INFO : Loading asset: 1_x_1_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12948 ms][asset_loader] - INFO : Loading asset: left_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12949 ms][asset_loader] - INFO : Loading asset: right_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12951 ms][asset_loader] - INFO : Loading asset: back_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12952 ms][asset_loader] - INFO : Loading asset: front_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12953 ms][asset_loader] - INFO : Loading asset: bottom_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12954 ms][asset_loader] - INFO : Loading asset: top_wall.urdf for the first time. Next use of this asset will be via the asset buffer. (asset_loader.py:71)
[12956 ms][env_manager] - INFO : Populating environment 0 (env_manager.py:179)
[12976 ms][robot_manager] - WARNING : 
Robot mass: 1.2400000467896461,
Inertia: tensor([[0.0134, 0.0000, 0.0000],
        [0.0000, 0.0144, 0.0000],
        [0.0000, 0.0000, 0.0138]], device='cuda:0'),
Robot COM: tensor([[0., 0., 0., 1.]], device='cuda:0') (robot_manager.py:437)
[12976 ms][robot_manager] - WARNING : Calculated robot mass and inertia for this robot. This code assumes that your robot is the same across environments. (robot_manager.py:440)
[12976 ms][robot_manager] - CRITICAL : If your robot differs across environments you need to perform this computation for each different robot here. (robot_manager.py:443)
[12978 ms][env_manager] - INFO : [DONE] Populating environments. (env_manager.py:75)
[12985 ms][IsaacGymEnvManager] - WARNING : Headless: True (IGE_env_manager.py:424)
[12985 ms][IsaacGymEnvManager] - INFO : Headless mode. Viewer not created. (IGE_env_manager.py:434)
*** Can't create empty tensor
[13002 ms][asset_manager] - WARNING : Number of obstacles to be kept in the environment: 9 (asset_manager.py:32)
WARNING: allocation matrix is not full rank. Rank: 4
/home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/control/motor_model.py:45: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(self.min_thrust, device=self.device, dtype=torch.float32).expand(
/home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/control/motor_model.py:48: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(self.max_thrust, device=self.device, dtype=torch.float32).expand(
[13218 ms][control_allocation] - WARNING : Control allocation does not account for actuator limits. This leads to suboptimal allocation (control_allocation.py:48)
[13219 ms][WarpSensor] - INFO : Camera sensor initialized (warp_sensor.py:50)
creating render graph
Module warp.utils load on device 'cuda:0' took 1.65 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_camera_kernels load on device 'cuda:0' took 8.65 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_stereo_camera_kernels load on device 'cuda:0' took 12.46 ms
Module aerial_gym.sensors.warp.warp_kernels.warp_lidar_kernels load on device 'cuda:0' took 6.68 ms
finishing capture of render graph
Encoder network initialized.
Defined encoder.
[ImgDecoder] Starting create_model
[ImgDecoder] Done with create_model
Defined decoder.
Loading weights from file:  /home/ziyar/aerialgym/aerialgym_ws/src/aerial_gym_simulator/aerial_gym/utils/vae/weights/ICRA_test_set_more_sim_data_kld_beta_3_LD_64_epoch_49.pth
[2025-06-29 09:35:51,791][15322] EvtLoop [Runner_EvtLoop, process=main process 15322] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/sampling/batched_sampling.py", line 181, in init
    self.last_obs, info = self.vec_env.reset()  # anything we need to do with info? Currently we ignore it
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 219, in reset
    return self._convert(obs), info
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 175, in _convert
    result[key] = self._convert_obs_func[key](value)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 191, in <lambda>
    return lambda x_: torch.tensor(x_)
ValueError: only one element tensors can be converted to Python scalars
[2025-06-29 09:35:51,791][15322] Unhandled exception only one element tensors can be converted to Python scalars in evt loop Runner_EvtLoop
[2025-06-29 09:35:51,791][15322] Uncaught exception in Runner evt loop
Traceback (most recent call last):
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/runners/runner.py", line 770, in run
    evt_loop_status = self.event_loop.exec()
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 403, in exec
    raise exc
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 399, in exec
    while self._loop_iteration():
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 383, in _loop_iteration
    self._process_signal(s)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 358, in _process_signal
    raise exc
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/sampling/batched_sampling.py", line 181, in init
    self.last_obs, info = self.vec_env.reset()  # anything we need to do with info? Currently we ignore it
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 219, in reset
    return self._convert(obs), info
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 175, in _convert
    result[key] = self._convert_obs_func[key](value)
  File "/home/ziyar/miniforge3/envs/aerialgym/lib/python3.8/site-packages/sample_factory/algo/utils/make_env.py", line 191, in <lambda>
    return lambda x_: torch.tensor(x_)
ValueError: only one element tensors can be converted to Python scalars
[2025-06-29 09:35:51,792][15322] Runner profile tree view:
main_loop: 2.6828
[2025-06-29 09:35:51,792][15322] Collected {0: 0}, FPS: 0.0
```
---

## Environment Details

| Item              | Value                                                                                                                                                                   |
|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Observation space | `Dict({'image_obs': Box(-1.0, 1.0, (1, 135, 240), float32), 'observations': Box(-1.0, 1.0, (81,), float32)})`                                                            |
| GPU mode          | `env_gpu_observations=True, env_gpu_actions=True`                                                                                                                        |
| Assets            | Load successfully                                                                                                                                                       |

---

## What I’ve Already Tried
- Followed the Sample Factory wrapper guidelines exactly
- Verified that `env_gpu_observations=True`
- Confirmed that the DCE task returns observations in the expected format
- Training still hits the tensor-conversion error shown in the traceback above
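
To make the observation-format check concrete, here is a minimal sketch of the kind of diagnostic I can run on whatever the wrapper's `reset()` returns. `describe_obs` is a hypothetical helper of my own, not part of Sample Factory or Aerial Gym. It is duck-typed so it works on tensors, arrays, and plain containers. The traceback shows Sample Factory calling `torch.tensor(x_)` on an observation value, so my working assumption (not confirmed) is that at least one entry is not a single tensor, for example a list or tuple of tensors.

```python
from typing import Any, Dict


def describe_obs(obs: Dict[str, Any]) -> Dict[str, str]:
    """Summarise every observation entry by type, shape, dtype, and device.

    Hypothetical debugging helper: it only reads attributes that exist,
    so it accepts torch tensors, NumPy arrays, lists, tuples, and so on.
    """
    summary = {}
    for key, value in obs.items():
        parts = [type(value).__name__]
        if hasattr(value, "shape"):
            parts.append(f"shape={tuple(value.shape)}")
        for attr in ("dtype", "device"):
            if hasattr(value, attr):
                parts.append(f"{attr}={getattr(value, attr)}")
        if isinstance(value, (list, tuple)):
            # A container here means the entry is not one batched tensor.
            first = type(value[0]).__name__ if value else None
            parts.append(f"len={len(value)} first_item={first}")
        summary[key] = " ".join(parts)
    return summary


if __name__ == "__main__":
    # Stand-in values; in the training script this would be the dict
    # returned by the wrapper's reset().
    example = {"observations": [[0.0] * 81], "image_obs": ()}
    for key, line in describe_obs(example).items():
        print(f"{key}: {line}")
```

Calling `describe_obs(obs)` inside the wrapper's `reset()` just before it returns should show whether `image_obs` and `observations` are each a single tensor on `cuda:0` or something else.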

---

## Key Questions
1. **Official Training Script** – Is there a correct/official training script for DCE navigation that I missed?

2. **GPU Tensor Handling** – How should GPU tensor observations be handled in the Sample Factory wrapper for Aerial Gym, particularly in my example?

3. **Task-Specific Quirks** – Are there any special considerations for the DCE navigation task specifically?

4. **Alternative Frameworks** – Should I be using a different training framework or approach for this image-based navigation task?

---

*Any guidance is appreciated! The DCE navigation example looks promising, but without an official training pipeline it’s hard to replicate the published results.*

Some links I have looked into:
- https://ntnu-arl.github.io/aerial_gym_simulator/6_rl_training/?utm_source=chatgpt.com#sample-factory
- https://github.com/alex-petrenko/sample-factory
- https://gymnasium.farama.org



