
Minimal modifications to params.yaml when running --set-param #8720

Open
@lucaerculiani

Description

dvc version: 2.38.1

Hi!

Deep learning guy here.
In our experiments we use configuration files to define the structure of the network.
In particular, we have some parameters that are lists where each element is a list identifying a particular submodule of the network plus its initialization arguments and its inputs.
This is a quite powerful way to define networks, but it comes at the cost of having some parameters with complex values. To keep them manageable, we use indentation, inlining
and comments to clarify their structure.
Aside from these, we have other parameters that control other aspects of the network and are usually simple values. These are
the ones we want to change during our batches of experiments to find their optimal values.

Here is an example:

activation: "relu"
batch_nomalization: True

network: [ # list of modules : [ name, [*parameters]]
[convolution2d, [16, 2, 2]],
[submodule_x, [32, True, 0.5]],
[submodule_y, [32, False, 1.0, 16, True]],
... 
]

In this case, network is the complex parameter whose indentation/inlining is useful, while activation and batch_nomalization are the parameters that are usually explored via batches of experiments.
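
For context, a spec like this is typically consumed by a small builder that looks up each module by name and instantiates it with the given arguments. A minimal sketch (the registry, module names and argument meanings below are hypothetical, just to show how the nested lists are used):

import torch.nn as nn

# Hypothetical registry mapping the module names used in params.yaml to
# constructors; the real registry and the meaning of each argument are
# project-specific.
MODULE_REGISTRY = {
    "convolution2d": lambda out_ch, k, s: nn.LazyConv2d(out_ch, kernel_size=k, stride=s),
    # "submodule_x": ..., "submodule_y": ... (custom blocks)
}

def build_network(spec):
    # spec is the "network" value: a list of [name, [*parameters]] entries
    layers = [MODULE_REGISTRY[name](*args) for name, args in spec]
    return nn.Sequential(*layers)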

The problem is that when I try to run an experiment using dvc exp run --set-param activation=swish, the resulting
params.yaml file is something like:

activation: swish
batch_nomalization: True

network: 
- -  convolution2d
- - - 16
- - - 2
- - - 2
- - submodule_x
- - - 32
- - - True
- - - 0.5
- - submodule_y
- - - 32
- - - False
- - - 1.0
- - - 16
- - - True
...

The reformatting and dropping of comments make reading and modifying the network after the fact (while preserving the best set of parameter values discovered during past experiments) more complicated.
What I would prefer is to localize the modifications to the params file to only the keys that actually need to be changed, thus obtaining (in this case):

activation: swish
batch_nomalization: True

network: [ # list of modules : [ name, [*parameters]]
[convolution2d, [16, 2, 2]],
[submodule_x, [32, True, 0.5]],
[submodule_y, [32, False, 1.0, 16, True]],
...
]
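
Just to illustrate the kind of behavior I have in mind (not a claim about how DVC should implement it internally): a round-trip YAML library such as ruamel.yaml can already update a single key while leaving the rest of the file, including comments and inline/flow style, untouched. A minimal sketch:

from ruamel.yaml import YAML

yaml = YAML()  # round-trip mode: preserves comments and flow/block style
yaml.preserve_quotes = True

with open("params.yaml") as f:
    params = yaml.load(f)

# Only the overridden key is touched; the untouched "network" entry keeps
# its original formatting and comments when dumped back.
params["activation"] = "swish"

with open("params.yaml", "w") as f:
    yaml.dump(params, f)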

What do you think about it?


    Labels

    A: hydra (Related to hydra integration)
    p2-medium (Medium priority, should be done, but less important)
    regression (Ohh, we broke something :-()
