fidelity_index doesn't support nested param #1125

@FrancoisPgm

Describe the bug
I am running Orion with the Hydra plugin. When I use a nested param of the config as the fidelity dimension for BOHB, e.g. hydra.sweeper.params.model.trainer.max_epochs: "fidelity(low=1, high=2)", the fidelity_index gets set to "model.trainer.max_epochs", but the trial.params dict keeps the nested structure:

{'model': {'params': {'lr': 0.0001783,
                      'lr_scheduler_args': {'T_max': 72312},
                      'weight_decay': 0.01001},
           'trainer': {'max_epochs': 1.0}}}

So I get:

  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/algo/base.py", line 308, in has_suggested_all_possible_values
    fidelity_value = trial.params[fidelity_index]
KeyError: 'model.trainer.max_epochs'
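
The mismatch can be illustrated with plain dictionaries, without Orion itself. This is a minimal sketch, assuming trial.params behaves like an ordinary nested dict; `get_by_dotted_key` is a hypothetical helper for illustration, not part of Orion's API.

```python
from functools import reduce

# Nested params, shaped like the trial.params dict shown above.
params = {
    "model": {
        "params": {
            "lr": 0.0001783,
            "lr_scheduler_args": {"T_max": 72312},
            "weight_decay": 0.01001,
        },
        "trainer": {"max_epochs": 1.0},
    }
}
fidelity_index = "model.trainer.max_epochs"


def get_by_dotted_key(nested, dotted_key):
    """Hypothetical helper: walk a nested dict along a dot-separated path."""
    return reduce(lambda node, part: node[part], dotted_key.split("."), nested)


# A direct lookup with the dotted key fails, as in the traceback.
try:
    params[fidelity_index]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Walking the path one component at a time finds the value.
fidelity_value = get_by_dotted_key(params, fidelity_index)
```

Resolving the dotted key against the nested dict would be one way to keep the nested structure while still locating the fidelity value.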

Expected behavior
I'd expect either the fidelity_index to keep the nested structure somehow, or the trial.params dict to get flattened keys, something like:

{
    'model.params.lr': 0.0001783,
    'model.params.lr_scheduler_args.T_max': 72312,
    'model.params.weight_decay': 0.01001,
    'model.trainer.max_epochs': 1.0
}
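
The flattened form could be produced by a small recursive function. This is only a sketch of the idea, assuming the params are plain nested dicts with string keys; `flatten_params` is a hypothetical name and not necessarily how Orion would implement it.

```python
def flatten_params(nested, parent_key="", sep="."):
    """Hypothetical helper: turn a nested dict into a flat dict with dotted keys."""
    flat = {}
    for key, value in nested.items():
        # Prefix the key with its parent path, if any.
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-dicts and merge their flattened entries.
            flat.update(flatten_params(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


nested_params = {
    "model": {
        "params": {
            "lr": 0.0001783,
            "lr_scheduler_args": {"T_max": 72312},
            "weight_decay": 0.01001,
        },
        "trainer": {"max_epochs": 1.0},
    }
}

flat_params = flatten_params(nested_params)
# With flattened keys, the dotted fidelity_index is a valid key.
fidelity_value = flat_params["model.trainer.max_epochs"]
```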

For now I can easily avoid the issue by using a non-nested param in my config file:
hydra.sweeper.params.max_epochs: "fidelity(low=1, high=2)"

Steps to reproduce
Define a fidelity dimension with a nested param.

Environment (please complete the following information):

  • OS: MacOS Sonoma 14.1.1
  • Python version: 3.9
  • Oríon version: 0.2.4.post1+computecanada
  • Database: PickleDB

Additional context
The full error log:

[2023-12-05 08:13:00,956][HYDRA] Orion Optimizer {'type': 'bohb', 'config': {'seed': 1, 'min_points_in_model': 4, 'top_n_percent': 40, 'num_samples': 5}}
[2023-12-05 08:13:00,956][HYDRA] with parametrization {'model.params.lr': 'loguniform(1e-05, 0.01)', 'model.params.lr_scheduler_args.T_max': 'uniform(1000, 100000, discrete=True)', 'model.params.weight_decay': 'loguniform(0.01, 100)', 'model.trainer.max_epochs': 'fidelity(1, 2)'}
Traceback (most recent call last):
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 353, in clientctx
    yield client
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 510, in sweep
    raise e
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 507, in sweep
    self.optimize(self.client)
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 525, in optimize
    trials = self.sample_trials()
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 555, in sample_trials
    trials = self.suggest_trials(self.n_workers())
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 434, in suggest_trials
    trial = self.client.suggest(pool_size=count)
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/client/experiment.py", line 563, in suggest
    if self.is_done:
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/client/experiment.py", line 167, in is_done
    return self._experiment.is_done
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/core/worker/experiment.py", line 541, in is_done
    self.algorithms.is_done and num_pending_trials == 0
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/core/worker/primary_algo.py", line 277, in is_done
    return super().is_done or self.algorithm.is_done
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/algo/base.py", line 293, in is_done
    return self.has_completed_max_trials or self.has_suggested_all_possible_values()
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/algo/base.py", line 308, in has_suggested_all_possible_values
    fidelity_value = trial.params[fidelity_index]
KeyError: 'model.trainer.max_epochs'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra/_internal/utils.py", line 466, in <lambda>
    lambda: hydra.multirun(
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 162, in multirun
    ret = sweeper.sweep(arguments=task_overrides)
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/orion_sweeper.py", line 79, in sweep
    return self.sweeper.sweep(arguments)
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 510, in sweep
    raise e
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.9.6/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/hydra_plugins/hydra_orion_sweeper/implementation.py", line 355, in clientctx
    client.close()
  File "/scratch/fpaugam/test_orion_env39/lib/python3.9/site-packages/orion/client/experiment.py", line 828, in close
    raise RuntimeError(
RuntimeError: There is still reserved trials: dict_keys(['7ba7eed37ff08c60dc9bad9341405be4'])
Release all trials before closing the client, using client.release(trial).

Labels: bug
