Feat: Proposal config builder system #904

Draft
melisande-c wants to merge 24 commits into main from mc/feat/config-builder

@melisande-c melisande-c commented Apr 17, 2026

Disclaimer

  • I am an AI agent.
  • I have used AI and I thoroughly reviewed every line.
  • I have not used AI extensively.

Description

Note

tldr: Draft PR as a proposal for a config builder system that will hopefully be easier to maintain (and use) as we add new algorithms and features. There are no docs or tests yet; this is just to discuss whether we like the idea and, if we do, the implementation.

Background - why do we need this PR?

I wanted to make the code for setting certain groups of parameters in the config more easily reusable and modifiable between algorithms.

Overview - what changed?

Introduced a builder class for each algorithm (N2VConfigBuilder, CAREConfigBuilder and N2NConfigBuilder) that can be used to replace create_n2v_config, create_advanced_n2v_config, etc.

Hopefully the mechanism will allow introducing new algorithm config builders, e.g. MicrosplitConfigBuilder etc., more easily.

Implementation - how did you implement the changes?

Builder classes

The builder classes can be instantiated with a basic set of parameters, similar to those in create_n2v_config, and this will produce a valid configuration with all the other parameters set by the pydantic defaults. In __init__, concrete builder classes must create a minimal config_dict attribute, defined by the TypedDict ConfigDict. If a user wants to change any parameters from the defaults, they have to call extra "builder methods".

The "builder methods" build up the internal config dictionary incrementally until build is called, at which point the final configuration is validated by pydantic and returned.

The user can chain these methods by calling one after the other (see the examples at the bottom); this works because each method returns a reference to self.
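To make the chaining mechanism concrete, here is a minimal sketch. ToyConfigBuilder, its two set_* methods and the dictionary keys are hypothetical stand-ins, not the classes in this PR, and build skips the pydantic validation step.

```python
from typing import Any


class ToyConfigBuilder:
    """Hypothetical stand-in for a config builder; not the PR's implementation."""

    def __init__(self, experiment_name: str, batch_size: int) -> None:
        # Minimal config that is already usable without any further calls.
        self.config_dict: dict[str, Any] = {
            "experiment_name": experiment_name,
            "data_config": {"batch_size": batch_size},
        }

    def set_data_loader_params(self, num_workers: int) -> "ToyConfigBuilder":
        self.config_dict["data_config"]["num_workers"] = num_workers
        return self  # returning self is what makes chaining possible

    def set_logger(self, logger: str) -> "ToyConfigBuilder":
        self.config_dict["logger"] = logger
        return self

    def build(self) -> dict[str, Any]:
        # The real builders validate with pydantic here; this toy returns a shallow copy.
        return dict(self.config_dict)


config = (
    ToyConfigBuilder(experiment_name="demo", batch_size=8)
    .set_data_loader_params(num_workers=3)
    .set_logger("wandb")
    .build()
)
```

Calling build directly after the constructor also works, which is the "basic set of parameters" case described above.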

Mixins

I have decided to use a mixin mechanism to share sets of "builder methods" between multiple ConfigBuilder classes. Mixins can get confusing if you're not careful and can sometimes be tricky to debug, but I came to the conclusion that this would give the best trade-off between code reuse and user experience. The inherited methods are recognised by IDEs, and mkdocstrings can also render inherited members in the API reference.

Sets of "builder methods" that will likely be universal across all algorithms are included in the TrainingParamsMixin and OptimizerParamsMixin. It is likely that most of the DataParamsMixin methods will also be relevant for microsplit, but we will probably need some extra parameters too.
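As a rough illustration of how mixins let several builders share the same "builder methods", consider the sketch below. All class names are prefixed with Toy and are hypothetical, as are the dictionary keys; the real mixins in this PR may store parameters differently.

```python
from typing import Any


class ToyBase:
    """Hypothetical base class that owns the config dictionary."""

    def __init__(self) -> None:
        self.config_dict: dict[str, Any] = {"training_config": {}}

    def build(self) -> dict[str, Any]:
        return self.config_dict


class ToyTrainingParamsMixin:
    config_dict: dict[str, Any]  # provided by the builder class

    def set_trainer_params(self, **kwargs: Any):
        training = self.config_dict["training_config"]
        training.setdefault("trainer_params", {}).update(kwargs)
        return self


class ToyOptimizerParamsMixin:
    config_dict: dict[str, Any]  # provided by the builder class

    def set_optimizer(self, name: str, **kwargs: Any):
        self.config_dict["optimizer"] = {"name": name, "parameters": kwargs}
        return self


# Each algorithm builder picks the mixins it needs; the methods are written once.
class ToyN2VBuilder(ToyTrainingParamsMixin, ToyOptimizerParamsMixin, ToyBase):
    pass


class ToyCAREBuilder(ToyTrainingParamsMixin, ToyOptimizerParamsMixin, ToyBase):
    pass


n2v = (
    ToyN2VBuilder()
    .set_optimizer("Adam", lr=0.0001)
    .set_trainer_params(limit_train_batches=128)
    .build()
)
care = ToyCAREBuilder().set_optimizer("SGD").build()
```

A new algorithm builder would then mostly consist of its own __init__ plus whichever algorithm-specific methods the shared mixins do not cover.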

BaseConfigBuilder

The config builder classes should inherit from the BaseConfigBuilder. There is also a protocol ConfigBuilder and a type variable, ConfigBuilderT. The mixin's self argument has to be typed as ConfigBuilderT for IDEs to work properly.
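The typing pattern can be sketched roughly as follows. The names ConfigBuilder and ConfigBuilderT come from this PR, but the protocol members shown here are assumptions, and ToyLoggerMixin and ToyBuilder are hypothetical.

```python
from typing import Any, Protocol, TypeVar


class ConfigBuilder(Protocol):
    """Assumed shape of the protocol: what a mixin may rely on."""

    config_dict: dict[str, Any]

    def build(self) -> Any: ...


ConfigBuilderT = TypeVar("ConfigBuilderT", bound=ConfigBuilder)


class ToyLoggerMixin:
    # Typing self as ConfigBuilderT tells type checkers and IDEs that the
    # method returns the concrete builder type, not just the mixin.
    def set_logger(self: ConfigBuilderT, logger: str) -> ConfigBuilderT:
        self.config_dict["logger"] = logger
        return self


class ToyBuilder(ToyLoggerMixin):
    def __init__(self) -> None:
        self.config_dict: dict[str, Any] = {}

    def set_seed(self, seed: int) -> "ToyBuilder":
        self.config_dict["seed"] = seed
        return self

    def build(self) -> dict[str, Any]:
        return self.config_dict


# After set_logger the inferred type is still ToyBuilder, so set_seed
# shows up in autocompletion and passes type checking.
builder = ToyBuilder().set_logger("wandb").set_seed(42)
```

Without the annotation on self, a checker would only know that set_logger returns the mixin type, and the chained set_seed call would be flagged.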

Additional hook

The base config builder also provides an additional hook _before_build. This can be used by the mixins or the child config builders to make any last modifications to the config dict before pydantic validates it. However, we should probably try to use this sparingly to avoid any confusing behaviour. It will be called in the order that the child classes are subclassed.

The _before_build hook is currently only used by the N2VConfigBuilder to set the monitor_metric in the checkpoint and early-stopping callback parameters, because it has to wait until after set_checkpoint_params and set_early_stopping have been called.
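One possible way to implement such a hook is sketched below. This is an assumption about the mechanism, not the code in this PR: the Toy classes, the "monitor" key and the base-classes-first call order are all hypothetical.

```python
from typing import Any


class ToyBaseBuilder:
    def __init__(self) -> None:
        self.config_dict: dict[str, Any] = {"training_config": {}}
        self.monitor_metric = "val_loss"

    def _before_build(self) -> None:
        """Hook for last-minute changes; the base version does nothing."""

    def build(self) -> dict[str, Any]:
        # Call every _before_build defined along the MRO, base classes first
        # (assumed order), so the most derived class gets the final say.
        for cls in reversed(type(self).__mro__):
            hook = cls.__dict__.get("_before_build")
            if hook is not None:
                hook(self)
        return self.config_dict


class ToyTrainingMixin:
    config_dict: dict[str, Any]  # provided by the builder class

    def set_checkpoint_params(self, **kwargs: Any):
        self.config_dict["training_config"]["checkpoint_params"] = dict(kwargs)
        return self


class ToyN2VBuilder(ToyTrainingMixin, ToyBaseBuilder):
    def set_monitor_metric(self, metric: str):
        self.monitor_metric = metric
        return self

    def _before_build(self) -> None:
        # Runs at build time, so it sees whatever set_checkpoint_params stored,
        # regardless of the order in which the set_* methods were called.
        params = self.config_dict["training_config"].get("checkpoint_params")
        if params is not None:
            params["monitor"] = self.monitor_metric


config = (
    ToyN2VBuilder()
    .set_monitor_metric("train_loss_epoch")
    .set_checkpoint_params(every_n_epochs=5)
    .build()
)
```

Here set_checkpoint_params replaces the whole checkpoint dictionary after the metric was chosen, yet the metric still ends up in the final config because it is only written at build time.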

Things to note

  • I have decided to set all defaults (not including initialisation) to None. This way, if a parameter is None we fall back to the defaults in the pydantic classes, and we don't have to maintain two sets of defaults in the ConfigBuilder classes and the pydantic classes.
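The None-as-default idea can be sketched like this. A plain dataclass stands in for the pydantic model, and ToyDataConfig, drop_none and make_data_config are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyDataConfig:
    """Stand-in for a pydantic model; the only place where defaults live."""

    batch_size: int
    num_workers: int = 0
    n_val_patches: int = 8


def drop_none(params: dict[str, Any]) -> dict[str, Any]:
    # Compare against None explicitly so that falsy values such as 0 survive.
    return {key: value for key, value in params.items() if value is not None}


def make_data_config(
    batch_size: int,
    num_workers: Optional[int] = None,
    n_val_patches: Optional[int] = None,
) -> ToyDataConfig:
    overrides = drop_none(
        {"num_workers": num_workers, "n_val_patches": n_val_patches}
    )
    # Anything left out of overrides falls back to the model's own default.
    return ToyDataConfig(batch_size=batch_size, **overrides)


default_config = make_data_config(batch_size=8)
no_validation = make_data_config(batch_size=8, n_val_patches=0)
```

The explicit None check matters for cases like n_val_patches=0, which a truthiness test would silently discard.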

Changes Made

New features or files

Builders:

  • N2VConfigBuilder
  • N2NConfigBuilder
  • CAREConfigBuilder
  • BaseConfigBuilder
  • ConfigBuilder (Protocol)

Mixins:

  • DataParamsMixin
  • TrainingParamsMixin
  • OptimizerParamsMixin
  • UnetParamsMixin

How has this been tested?

Only experimented with in notebooks; there are probably still some bugs.

Related Issues

First proposed in #902

Additional Notes and Examples

How the API would look for some common use-cases:

# most basic, no additional build methods are called.
config = (
    N2VConfigBuilder(
        experiment_name="basic_example",
        data_type="array",
        axes="YX",
        patch_size=(64, 64),
        batch_size=8,
        num_epochs=30,
    )
    .build()
)

# care with channel options
config = (
    CAREConfigBuilder(
        experiment_name="builder_example",
        data_type="array",
        axes="CYX",
        patch_size=(64, 64),
        batch_size=8,
        num_epochs=30,
        n_channels_in=2,
        n_channels_out=2
    )
    .set_advanced_data_params(channels=[0, 2])
    .set_model_params(independent_channels=False)
    .build()
)

# using n2v2, note it would be possible to move the `use_n2v2` arg to __init__
config = (
    N2VConfigBuilder(
        experiment_name="builder_example",
        data_type="array",
        axes="YX",
        patch_size=(64, 64),
        batch_size=8,
        num_epochs=30,
    )
    .set_n2v_params(use_n2v2=True)
    .build()
)

# set the parameters required for no validation during training
config = (
    N2VConfigBuilder(
        experiment_name="builder_example",
        data_type="array",
        axes="YX",
        patch_size=(64, 64),
        batch_size=8,
        num_epochs=30,
    )
    .set_advanced_data_params(n_val_patches=0)
    .set_monitor_metric("train_loss_epoch")
    .build()
)

And an example where someone wants to change at least one parameter in each build method (hopefully a very rare use case):

# demonstrating every available build method for n2v
config = (
    N2VConfigBuilder(
        experiment_name="builder_example",
        data_type="array",
        axes="YX",
        patch_size=(64, 64),
        batch_size=8,
        num_epochs=30,
    )
    .set_n2v_params(struct_n2v_axis="horizontal")
    .set_monitor_metric("train_loss_epoch")
    .set_advanced_data_params(n_val_patches=32, augmentations=["x_flip"])
    .set_data_loader_params(num_workers=3)
    .set_normalization("quantile")
    .set_patch_filter("shannon", threshold=3.2)
    .set_optimizer("Adam", lr=0.0001)
    .set_lr_scheduler("ReduceLROnPlateau", patience=20)
    .set_trainer_params(limit_train_batches=128)
    .set_checkpoint_params(every_n_epochs=5)
    .set_early_stopping(on=True, patience=5)
    .set_logger("wandb")
    .set_model_params(use_batch_norm=False)
    .build()
)

Please ensure your PR meets the following requirements:

  • Code builds and passes tests locally, including doctests
  • New tests have been added (for bug fixes/features)
  • Documentation has been updated
  • Pre-commit passes

@melisande-c melisande-c requested a review from a team April 17, 2026 13:46

@jdeschamps jdeschamps left a comment

I didn't spend as much time as I wanted... But I am just going to write down my thoughts so we can start discussing.

I like the principle, as it may be elegant on the user side. I do worry however that it will be hard to maintain with an explosion of small classes. While they are well-defined, some parameters are broken off from their logical place (e.g. axes are in the base class, not in data), making it hard to follow. Long range interactions (e.g. disabling validation, 3D vs 2D, n_input and channels) are also a bit tricky, since many parameters interact at different levels and it is hard to ensure that they are all compatible when they are set one at a time, so the consequence is going to be complex mixins. Units of parameters working together may end up in different classes.

In addition, I am not sure that it will be intuitive for users. A function with a long list of parameters is annoying, but it is at least clear what parameters can be used. It also allows dealing with the long range interaction much more easily.

The argument in favor is obviously extensibility, which will be easier than with the current convenience functions. The long-range interactions could be mixins of their own (e.g. channels + n_input_channels); I guess the docs are a good place to see what is typically changed together. That said, redefining the same parameter in different mixins sounds like a bad idea. We should examine whether the errors raised in the convenience functions could just move to a Pydantic validation.

I did not see any errors raised, probably because it is a work in progress. I suppose they could also be added to guide users on sets of incompatible parameters.

You say the defaults are all delegated to the Pydantic model, as they should be. That is not the case for the training config though.

One idea: we could also instantiate a Builder from an existing configuration.
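A rough sketch of what that could look like, assuming the configuration is available as a plain dictionary (with the real pydantic models this would presumably go through model_dump()). ToyBuilder and the from_config name are hypothetical and not part of this PR.

```python
import copy
from typing import Any


class ToyBuilder:
    """Hypothetical builder; from_config is an assumed name."""

    def __init__(self, experiment_name: str) -> None:
        self.config_dict: dict[str, Any] = {
            "experiment_name": experiment_name,
            "training_config": {},
        }

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "ToyBuilder":
        builder = cls(experiment_name=config["experiment_name"])
        # Deep copy so that later set_* calls never mutate the source config.
        builder.config_dict = copy.deepcopy(config)
        return builder

    def set_trainer_params(self, **kwargs: Any) -> "ToyBuilder":
        training = self.config_dict["training_config"]
        training.setdefault("trainer_params", {}).update(kwargs)
        return self

    def build(self) -> dict[str, Any]:
        return copy.deepcopy(self.config_dict)


existing = ToyBuilder("first_run").build()
updated = (
    ToyBuilder.from_config(existing)
    .set_trainer_params(limit_train_batches=128)
    .build()
)
```

The existing configuration is left untouched, and only the parameters passed to the set_* methods differ in the new one.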

I am not in favor of including this in v0.2 because I'd like to go ahead with the release. We could introduce it in a later patch version and make it the default in v0.3 (alongside MSplit support), with a deprecation warning for the linear functions.

Comment on lines +19 to +31
@overload
def set_early_stopping(
    self: ConfigBuilderT, on: Literal[True], **kwargs: Any
) -> ConfigBuilderT: ...

@overload
def set_early_stopping(
    self: ConfigBuilderT, on: Literal[False]
) -> ConfigBuilderT: ...

def set_early_stopping(
    self: ConfigBuilderT, on: bool, **kwargs: Any
) -> ConfigBuilderT:

Is the overload really justified?

If set_early_stopping is not called, then it is off and not in the dict; if it is called, then the kwargs are passed.

Ok, I see it is to be able to disable the early stopping if it was there originally. I'd rather have set_early_stopping(**kwargs) and disable_early_stopping() methods.
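A minimal sketch of that two-method alternative, which would remove the need for the overloads. ToyBuilder and the dictionary layout are hypothetical; only the two method names are taken from the suggestion.

```python
from typing import Any


class ToyBuilder:
    def __init__(self) -> None:
        # No early stopping by default.
        self.config_dict: dict[str, Any] = {
            "training_config": {"early_stopping_params": None}
        }

    def set_early_stopping(self, **kwargs: Any) -> "ToyBuilder":
        # Calling this at all means "on"; kwargs override the callback defaults.
        self.config_dict["training_config"]["early_stopping_params"] = dict(kwargs)
        return self

    def disable_early_stopping(self) -> "ToyBuilder":
        self.config_dict["training_config"]["early_stopping_params"] = None
        return self


enabled = ToyBuilder().set_early_stopping(patience=5)
disabled = ToyBuilder().set_early_stopping(patience=5).disable_early_stopping()
```

Each method then has a single signature, and the on flag disappears from the API.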

Comment on lines +159 to +166
# set default checkpointing params (n2n self supervised)
# (can be overwritten with set_checkpoint_params from TrainingParamMixin)
self.config_dict["training_config"]["checkpoint_params"] = asdict(
    SelfSupervisedCheckpointing()
)

# no early stopping by default
self.config_dict["training_config"]["early_stopping_params"] = None

Wouldn't it be better to not add a training_config, only adding it if the specific methods are called, and leave it to the default constructor to set the defaults?

early_stopping_params: NotRequired[dict[str, Any] | None]


class ConfigDict(TypedDict):

Confusing naming because of Pydantic's ConfigDict.

melisande-c commented Apr 23, 2026

> I like the principle, as it may be elegant on the user side. I do worry however that it will be hard to maintain with an explosion of small classes. While they are well-defined, some parameters are broken off from their logical place (e.g. axes are in the base class, not in data), making it hard to follow. Long range interactions (e.g. disabling validation, 3D vs 2D, n_input and channels) are also a bit tricky, since many parameters interact at different levels and it is hard to ensure that they are all compatible when they are set one at a time, so the consequence is going to be complex mixins. Units of parameters working together may end up in different classes.

Yes, the reason I did not put axes into the DataParamsMixin is that I wanted to be able to initialise and build the builder classes, and for that to create a valid configuration, without having to call any additional set_* methods. None of the mixin classes can set parameters that do not have a default in the pydantic classes; they are only for changing default values. I figured it is the responsibility of the builder classes to ask for any parameters without defaults in order to create a valid configuration. (Although I did not follow this strictly, since I also included n_channels etc. in __init__.)

I agree that the long-range interaction between different levels of the config is a bit tricky, and it may be annoying to have to call multiple methods to set the configuration correctly.

> In addition, I am not sure that it will be intuitive for users. A function with a long list of parameters is annoying, but it is at least clear what parameters can be used. It also allows dealing with the long range interaction much more easily.

Yeah, I think I would like to see some user testing to find out whether it is intuitive or not. In VS Code, typing builder. will immediately show all the builder methods, so if they have descriptive enough names maybe it will be ok.

> The argument in favor is obviously extensibility, which will be easier than with the current convenience functions. The long-range interactions could be mixins of their own (e.g. channels + n_input_channels); I guess the docs are a good place to see what is typically changed together. That said, redefining the same parameter in different mixins sounds like a bad idea. We should examine whether the errors raised in the convenience functions could just move to a Pydantic validation.

Yes, I was torn about how to group parameters in a way that made sense. While I was originally thinking there should not be any overlap between parameters set by different methods, I am now considering that maybe it would be ok: if you change the same parameter in two different methods, then the last method called wins.

> I did not see any errors raised, probably because it is a work in progress. I suppose they could also be added to guide users on sets of incompatible parameters.

I identified that some of the errors raised in the convenience functions duplicate validation that is now present in the Configuration classes with the introduction of ModelConstraints. These two checks are currently not covered, but they could be added as validation in the root Configuration class:

if channels_present and (n_channels is None and channels is None):
    raise ValueError(
        "`n_channels` or `channels` must be specified when using channels."
    )

if n_val_patches == 0 and monitor_metric == "val_loss":
    raise ValueError(
        "When disabling validation (`n_val_patches==0`), set `monitor_metric` to "
        '`"train_loss"` or `"train_loss_epoch"`.'
    )

> You say the defaults are all delegated to the Pydantic model, as they should be. That is not the case for the training config though.

Do you mean num_epochs: int = 30? Yeah, I did think I should change the default training config to be {"num_epochs": 30} instead. Or, if you mean the algorithm-specific defaults for checkpointing and early stopping, then I think it's ok because it is more intentional; if you don't think so, it would be possible to set this with a validator in N2VConfiguration instead.

> One idea: we could also instantiate a Builder from an existing configuration.

> I am not in favor of including this in v0.2 because I'd like to go ahead with the release. We could introduce it in a later patch version and make it the default in v0.3 (alongside MSplit support), with a deprecation warning for the linear functions.

I agree, it definitely needs more discussion and refining.

@jdeschamps

Alright, so plan of action could be:

  • add tests and finalize the PR
  • we merge silently as an experimental feature and play around with it; I think writing docs would help (at least me)
  • we give ourselves until v0.3 to decide whether it should replace the body of the convenience functions, which could then be deprecated for a little while

> I identified that some of the errors raised in the convenience functions duplicate validation that is now present in the Configuration classes with the introduction of ModelConstraints. These two checks are currently not covered, but they could be added as validation in the root Configuration class:
>
> if channels_present and (n_channels is None and channels is None):
>     raise ValueError(
>         "`n_channels` or `channels` must be specified when using channels."
>     )
>
> if n_val_patches == 0 and monitor_metric == "val_loss":
>     raise ValueError(
>         "When disabling validation (`n_val_patches==0`), set `monitor_metric` to "
>         '`"train_loss"` or `"train_loss_epoch"`.'
>     )

The first one not really, since it corresponds to the number of inputs in the model, and there is no way to tell apart the default value from what the user set (as opposed to a function with a default equal to None). The second one should indeed be a validation in Pydantic.
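As a sketch of that second check living inside the configuration object rather than in a convenience function: the example below uses a dataclass with __post_init__ as a stand-in, whereas in the real Configuration class it would presumably be a pydantic validator. ToyConfiguration and its flat field layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyConfiguration:
    """Flattened stand-in; the real fields sit at different config levels."""

    n_val_patches: int = 8
    monitor_metric: str = "val_loss"

    def __post_init__(self) -> None:
        # Cross-field check: without validation data there is no val_loss to monitor.
        if self.n_val_patches == 0 and self.monitor_metric == "val_loss":
            raise ValueError(
                "When disabling validation (`n_val_patches==0`), set "
                '`monitor_metric` to `"train_loss"` or `"train_loss_epoch"`.'
            )


valid = ToyConfiguration(n_val_patches=0, monitor_metric="train_loss_epoch")

try:
    ToyConfiguration(n_val_patches=0)
    error_message = None
except ValueError as error:
    error_message = str(error)
```

With the check on the configuration itself, it applies no matter whether the config comes from a builder, a convenience function, or a file.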

Base automatically changed from dev/v0.2 to main May 13, 2026 12:38