
[Enhancement] Multiple model iterations per Optuna trial and mean performance objective #204

Open
@seawee1


I currently have the problem that the results produced by the Optuna optimization are often far from optimal, due to the stochastic nature of RL training. For example, training 3 agents with the same set of hyperparameters can result in 3 completely different learning curves (at least for the environment I'm training on).
Would it make sense to implement the optimization code in such a way that multiple agents are trained for each trial, and the mean or median performance is reported to Optuna instead?
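
A minimal sketch of the idea, not the zoo's actual implementation: the objective trains several agents per trial with different seeds and returns an aggregate score. `train_and_evaluate` and `make_objective` are hypothetical names, and the toy scoring function only stands in for a real training run. The only Optuna call assumed is `trial.suggest_float`.

```python
import math
import random
import statistics
from typing import Callable, Dict, List


def train_and_evaluate(hyperparams: Dict[str, float], seed: int) -> float:
    """Hypothetical stand-in for training one agent and returning its mean reward."""
    rng = random.Random(seed)
    # Toy score: best near learning_rate = 1e-3, plus seed-dependent noise
    # that mimics the run-to-run variance of RL training.
    signal = -abs(math.log10(hyperparams["learning_rate"]) + 3.0)
    return signal + rng.gauss(0.0, 0.5)


def make_objective(
    n_runs: int = 3,
    aggregate: Callable[[List[float]], float] = statistics.mean,
) -> Callable:
    """Build an objective that evaluates each hyperparameter set n_runs times."""

    def objective(trial) -> float:
        # Sample the hyperparameters once per trial.
        hyperparams = {
            "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
        }
        # Train one agent per seed with the same hyperparameters.
        scores = [train_and_evaluate(hyperparams, seed=seed) for seed in range(n_runs)]
        # Report the aggregate (mean by default) instead of a single noisy run.
        return aggregate(scores)

    return objective
```

With Optuna installed, this could be passed to `study.optimize(make_objective(n_runs=3), n_trials=20)` on a study created with `optuna.create_study(direction="maximize")`. Passing `aggregate=statistics.median` makes the objective less sensitive to a single diverged run, at the same cost of `n_runs` trainings per trial.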

In utils/exp_manager.py, inside hyperparameter_optimization (line 713), I saw your comment "# TODO: eval each hyperparams several times to account for noisy evaluation". Is that perhaps exactly what you had in mind there?

I already had a look at the code and thought a bit about how this could be done. If anyone is interested, I could implement it and open a pull request!


Labels: duplicate, enhancement
