### 🐛 Bug
I am developing a custom features extractor type (based on DeepSets) for SB3 and want to train and optimize it with the RL Zoo. To do so, I add the following to a custom `config.py` file:
```python
import gymnasium as gym

# FeatureExtractorSet is the custom DeepSets features extractor defined elsewhere
gym.register(
    "env-name",
    entry_point=...,  # custom environment class
    kwargs=...,       # environment kwargs
)

hyperparams = {
    "env-name": dict(
        policy="MultiInputPolicy",
        policy_kwargs={
            "features_extractor_class": FeatureExtractorSet,
            "features_extractor_kwargs": {
                "features_dim": 10,
            },
        },
    ),
}
```
This works fine with the standard `train.py` (arguments: `--algo a2c --conf-file path/to/config.py --gym-packages path.to.config --n-timesteps 100 --device cpu -P --env env-name ...`).
When adding `-optimize`, the training fails (the actions contain NaN, since I encode invalid observations with NaN and rely on the custom `FeatureExtractorSet` to discard them). Closer investigation shows that the objective function updates `self._hyperparams`, which contains the sub-dict `{'policy_kwargs': {'features_extractor_class': FeatureExtractorSet}}`, with the sampled hyperparameters, which also set `policy_kwargs` entries other than `features_extractor_class`. Because the update is shallow, the sampled `policy_kwargs` replace the preconfigured dict entirely, so the custom features extractor is silently dropped.
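A minimal sketch of the suspected mechanism (names are stand-ins, not the actual `exp_manager.py` code): a shallow `dict.update` with the sampled hyperparameters replaces the whole nested `policy_kwargs` dict instead of merging into it.

```python
class FeatureExtractorSet:  # stand-in for the custom DeepSets extractor
    pass

# Preconfigured hyperparams from config.py
hyperparams = {
    "policy_kwargs": {"features_extractor_class": FeatureExtractorSet},
}

# Hypothetical hyperparameters sampled during -optimize
# that also touch policy_kwargs
sampled_hyperparams = {
    "policy_kwargs": {"net_arch": [64, 64]},
}

# Shallow update: the entire nested dict is replaced, not merged
hyperparams.update(sampled_hyperparams)
print(hyperparams["policy_kwargs"])
# {'net_arch': [64, 64]}  <- features_extractor_class is gone
```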
I would suggest replacing the hyperparameter update in `rl_zoo3/exp_manager.py` (line 741 at commit `28dc228`) so that the nested `policy_kwargs` from the config are merged with the sampled ones rather than overwritten.
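As a hedged sketch of the direction I have in mind (this is not the actual zoo code; `merge_sampled_hyperparams` is a hypothetical helper), the sampled `policy_kwargs` could be merged into the preconfigured ones instead of replacing them:

```python
from typing import Any, Dict


def merge_sampled_hyperparams(
    base: Dict[str, Any], sampled: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge sampled hyperparameters into the base config,
    combining nested policy_kwargs instead of overwriting them."""
    merged = base.copy()
    for key, value in sampled.items():
        if key == "policy_kwargs" and isinstance(merged.get(key), dict):
            # Keep preconfigured entries (e.g. features_extractor_class)
            # and add/override only the sampled keys
            merged[key] = {**merged[key], **value}
        else:
            merged[key] = value
    return merged
```

With a merge like this, sampled entries (e.g. `net_arch`) would be added alongside the preconfigured `features_extractor_class` instead of discarding it.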
### To Reproduce

_No response_

### Relevant log output / Error message

_No response_
### System Info
- OS: Linux-5.15.0-91-generic-x86_64-with-glibc2.31 # 101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023
- Python: 3.9.18
- Stable-Baselines3: 2.2.1
- PyTorch: 2.1.1+cu121
- GPU Enabled: True
- Numpy: 1.26.2
- Cloudpickle: 3.0.0
- Gymnasium: 0.29.1
### Checklist

- [x] I have checked that there is no similar issue in the repo
- [x] I have read the SB3 documentation
- [x] I have read the RL Zoo documentation
- [x] I have provided a minimal and working example to reproduce the bug
- [x] I've used the markdown code blocks for both code and stack traces.