Commit bfe28f8

Merge pull request #39 from cyber-physical-systems-group/experiment/configs
Experiment/configs
2 parents 64777c3 + bfab7bd commit bfe28f8

File tree: 11 files changed, +386 −20 lines
Lines changed: 106 additions & 19 deletions

# Experiment

This directory contains experiment utils, including entrypoints, which can be used to run W&B experiments. They are not an integral part of the library, so they need additional code defining the experiment settings in order to run.

## How to Use

To use the entrypoints and utils provided here, `RuntimeContext` needs to be implemented, which is used to parametrize the experiment. The interface is given by a stateless class, so it can be defined as a single namespace, which is then passed to the entrypoint function.
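As a sketch, a hypothetical `RuntimeContext` implementation might look like the following. Lightning objects are replaced with plain placeholders so only the shape of the interface is visible; all names here are illustrative and not part of the library:

```python
from typing import Any


class MyContext:
    """Hypothetical stateless namespace implementing the RuntimeContext interface."""

    @staticmethod
    def input_fn(config: dict[str, Any], parameters: dict[str, Any]):
        # real code would construct and return a pl.LightningDataModule here
        return {"path": config["path"], "batch_size": parameters["batch_size"]}

    @staticmethod
    def model_fn(name: str, config: dict[str, Any], parameters: dict[str, Any]):
        # real code would return (pl.LightningModule, pl.Trainer)
        return {"model_name": parameters["model_name"]}, {"project": name}

    @staticmethod
    def train_fn(model, trainer, dm):
        # real code would call trainer.fit(model, datamodule=dm)
        return model, trainer

    @staticmethod
    def report_fn(model, trainer, dm):
        pass  # real code would run predictions and log metrics to W&B

    @staticmethod
    def save_fn(name: str, model):
        pass  # real code would save the model to disk and upload it to W&B
```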

To run an experiment (assuming no sweep for now), the following code needs to be implemented. This additionally allows using the `click` library to pass the config files as options.

```python
import click

from pydentification.experiment.entrypoints import run

from src import runtime  # assume this is project-specific code


@click.command()
@click.option("--data", type=click.Path(exists=True), required=True)
@click.option("--experiment", type=click.Path(exists=True), required=True)
def main(data, experiment):
    run(data=data, experiment=experiment, runtime=runtime)


if __name__ == "__main__":
    main()
```

To run, simply execute the script with flags passing the configs:

```bash
python main.py --data data.yaml --experiment experiment.yaml
```
## Parametrization

Each training run is parametrized by five functions, which take two configurations. The functions are used to define the input, the model architecture, the training logic, the reporting to W&B and the storage logic. For the last three, we provide useful defaults in the `defaults` package. The functions need to be implemented as part of a single namespace, called `context`, which is passed to the entrypoint for running an experiment or sweep. The interface for `context` is given by `RuntimeContext`.

Additionally, two config files can be used, one for the data settings and one for the model parameters. They are used to abstract dataset loading and model construction away from the code, in order to quickly iterate through different configurations. Otherwise, the `pydentification` library can be used as a collection of standalone components, which can be useful for various projects related to neural system identification.
### Functions

* `input_fn` - takes the configuration files and returns a `pl.LightningDataModule` subclass, typically one of the data-modules provided by `pydentification`.
* `model_fn` - takes the configuration files and returns `pl.LightningModule` and `pl.Trainer` instances.
* `train_fn` - takes the model, trainer and data-module and executes the training code, typically inside the `Trainer`.
* `report_fn` - takes the model, trainer and data-module; it should run predictions and store relevant metrics in the W&B dashboard.
* `save_fn` - takes the run name (given by W&B `id` or `name`) and the model, and saves the model to disk.
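Conceptually, the entrypoint wires these five functions together in the order listed above. The following is a simplified sketch of that call order only (not the actual entrypoint code, which additionally handles W&B logging and error reporting):

```python
def run_once(context, dataset_config, training_config, model_config, run_name):
    # merge static training settings with dynamic model parameters for the data-module
    dm = context.input_fn(dataset_config, {**model_config, **training_config})
    model, trainer = context.model_fn(run_name, training_config, model_config)
    model, trainer = context.train_fn(model, trainer, dm)
    context.report_fn(model, trainer, dm)
    context.save_fn(run_name, model)
    return model
```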
### Configurations

The entrypoints (both training and sweep) are parametrized by two configs: one for the data settings and the other for the experiment and model, which contains hyperparameters and training settings.

The data config is stored in `YAML` and passed to the `input_fn` function. Not all parameters of the data-module are stored in the data config, only the static values. An example config looks as follows:

```yaml
name: Dataset
path: data/dataset.csv
test_size: 10000
input_columns: [x]
output_columns: [y]
```
The experiment config looks as follows:

```yaml
general:
  project: project-name
  n_runs: 1
  name: placeholder
training:
  n_epochs: 10
  patience: 1
  timeout: "00:00:01:00"
  batch_size: 32
  shift: 1
  validation_size: 0.1
model:
  model_name: MLP
  # generic parameter convention
  n_input_time_steps: 64
  n_output_time_steps: 1
  n_input_state_variables: 1
  n_output_state_variables: 1
  # neural network parameters
  n_hidden_layers: 2
  activation: relu
  n_hidden_time_steps: 32
  n_hidden_state_variables: 4
```
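After loading with `yaml.safe_load`, the experiment config is a plain nested dictionary. For illustration, here is the same kind of structure written as a Python literal, together with the lookups the entrypoints perform on it (a sketch mirroring the YAML above, not library code):

```python
# mirrors the experiment YAML above, as yaml.safe_load would return it
experiment_config = {
    "general": {"project": "project-name", "n_runs": 1, "name": "placeholder"},
    "training": {"n_epochs": 10, "patience": 1, "batch_size": 32, "validation_size": 0.1},
    "model": {"model_name": "MLP", "n_hidden_layers": 2, "activation": "relu"},
}

# the entrypoints read project metadata and pass the sub-configs on
project = experiment_config["general"]["project"]
model_config = experiment_config["model"]
training_config = experiment_config["training"]
```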
To use a sweep, add the following sections to the experiment config:

```yaml
sweep:
  name: sweep
  method: random
  metric: {name: test/root_mean_squared_error, goal: minimize}
sweep_parameters:
  # neural network
  n_hidden_layers: [1, 2, 3, 4, 5]
  n_hidden_time_steps: [32, 16, 8]
  n_hidden_state_variables: [1, 4, 8, 16]
  activation: [leaky_relu, relu, gelu, sigmoid, tanh]
```
Lines changed: 68 additions & 0 deletions

```python
from abc import ABC, abstractmethod
from typing import Any

import lightning.pytorch as pl


class RuntimeContext(ABC):
    """
    This interface defines the runtime context needed for experiment execution by the provided entrypoints.
    It can be used to define a custom experiment execution flow.

    The interface can be implemented as a module or a namespace.
    """

    @staticmethod
    @abstractmethod
    def input_fn(config: dict[str, Any], parameters: dict[str, Any]) -> pl.LightningDataModule:
        """
        :param config: static dataset configuration
        :param parameters: dynamic experiment configuration, for example delay-line length for dynamical systems

        :return: LightningDataModule instance, which is used to load and prepare data for training
        """
        ...

    @staticmethod
    @abstractmethod
    def model_fn(
        name: str, config: dict[str, Any], parameters: dict[str, Any]
    ) -> tuple[pl.LightningModule, pl.Trainer]:
        """
        :param name: name of the W&B project, will be used for logging with callbacks
        :param config: static configuration, for example timeout, validation-size etc.
        :param parameters: dynamic experiment configuration, for example model settings or batch-size

        :return: LightningModule and Trainer instances ready to be used in train_fn
        """
        ...

    @staticmethod
    @abstractmethod
    def train_fn(
        model: pl.LightningModule, trainer: pl.Trainer, dm: pl.LightningDataModule
    ) -> tuple[pl.LightningModule, pl.Trainer]:
        """
        :param model: LightningModule instance, returned from model_fn
        :param trainer: Trainer instance, returned from model_fn
        :param dm: LightningDataModule instance, returned from input_fn

        :return: trained model and trainer
        """
        ...

    @staticmethod
    @abstractmethod
    def report_fn(model: pl.LightningModule, trainer: pl.Trainer, dm: pl.LightningDataModule):
        """
        :param model: LightningModule instance, returned from train_fn (needs to be trained)
        :param trainer: Trainer instance, returned from train_fn, can be used for easier prediction
        :param dm: LightningDataModule instance, returned from input_fn, used for prediction on test data
        """
        ...

    @staticmethod
    @abstractmethod
    def save_fn(name: str, model: pl.LightningModule):
        """
        :param name: name of the run, returned from wandb.run.id or config
        :param model: trained LightningModule instance to be saved, returned from train_fn
        """
        ...
```

pydentification/experiment/defaults/__init__.py

Whitespace-only changes.
Lines changed: 18 additions & 0 deletions

```python
import lightning.pytorch as pl
import torch

from pydentification.experiment.reporters import report_metrics, report_prediction_plot, report_trainable_parameters
from pydentification.metrics import regression_metrics


def report_fn(model: pl.LightningModule, trainer: pl.Trainer, dm: pl.LightningDataModule) -> None:
    """Logs the experiment results to W&B"""
    y_hat = trainer.predict(model, datamodule=dm)
    y_pred = torch.cat(y_hat).numpy()
    y_true = torch.cat([y for _, y in dm.test_dataloader()]).numpy()

    metrics = regression_metrics(y_pred=y_pred.flatten(), y_true=y_true.flatten())  # type: ignore

    report_metrics(metrics, prefix="test")  # type: ignore
    report_trainable_parameters(model, prefix="config")
    report_prediction_plot(predictions=y_pred, targets=y_true, prefix="test")
```
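`regression_metrics` comes from `pydentification.metrics` and its exact output is not shown in this diff, but the metric named in the sweep config, `root_mean_squared_error`, can be illustrated with a pure-Python sketch over flattened sequences (the helper name here is a hypothetical stand-in, not the library function):

```python
import math


def root_mean_squared_error(y_pred, y_true):
    # hypothetical stand-in for one of the metrics computed by regression_metrics
    assert len(y_pred) == len(y_true), "predictions and targets must have equal length"
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))
```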
Lines changed: 11 additions & 0 deletions

```python
import os

import torch
import wandb


def save_fn(name: str, model: torch.nn.Module):
    path = f"models/{name}/trained-model.pt"
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(model, path)
    wandb.save(path)
```
Lines changed: 12 additions & 0 deletions

```python
import lightning.pytorch as pl


def train_fn(
    model: pl.LightningModule, trainer: pl.Trainer, dm: pl.LightningDataModule
) -> tuple[pl.LightningModule, pl.Trainer]:
    """
    Runs training using pl.Trainer and pl.LightningModule
    with the given LightningDataModule; returns both model and trainer
    """
    trainer.fit(model, datamodule=dm)
    return model, trainer
```
Lines changed: 107 additions & 0 deletions

```python
import logging
from functools import partial
from typing import Any

import wandb
import yaml

from .context import RuntimeContext
from .parameters import left_dict_join, prepare_config_for_sweep


def run_training(
    runtime: RuntimeContext,
    project_name: str,
    dataset_config: dict[str, Any],
    training_config: dict[str, Any],
    model_config: dict[str, Any],
):
    """
    This function is used to run a single training experiment with the given configuration. It contains the main
    experimentation logic and parameter passing.
    """
    for config in (dataset_config, model_config, training_config):
        if isinstance(config, dict):  # prevents logging parameters twice in sweep mode
            wandb.log(config)

    try:
        # merge static and dynamic parameters of the dataset
        data_parameters = left_dict_join(training_config, model_config)
        # experiment flow
        dm = runtime.input_fn(dataset_config, data_parameters)
        model, trainer = runtime.model_fn(project_name, training_config, model_config)
        model, trainer = runtime.train_fn(model, trainer, dm)
        runtime.report_fn(model, trainer, dm)
        runtime.save_fn(wandb.run.id, model)

    except Exception as e:
        logging.exception(e)  # log the traceback, since W&B can sometimes lose information
        raise ValueError("Experiment failed.") from e


def run_sweep_step(
    runtime: RuntimeContext, project_name: str, dataset_config: dict[str, Any], experiment_config: dict[str, Any]
):
    with wandb.init(reinit=True):
        parameters = wandb.config  # model settings dynamically generated by the W&B sweep

        run_training(
            runtime=runtime,
            project_name=project_name,
            dataset_config=dataset_config,
            model_config=parameters,
            training_config=experiment_config["training"],
        )


def run(data: str, experiment: str, runtime: RuntimeContext):
    """
    Run a single experiment with the given configuration.

    :param data: path to the dataset configuration
    :param experiment: path to the experiment configuration
    :param runtime: runtime context with the code executing the training and all preparations
    """
    with open(data) as f:
        dataset_config = yaml.safe_load(f)
    with open(experiment) as f:
        experiment_config = yaml.safe_load(f)

    project = experiment_config["general"]["project"]
    name = experiment_config["general"]["name"]

    with wandb.init(project=project, name=name):
        model_config = experiment_config["model"]
        training_config = experiment_config["training"]

        run_training(
            runtime=runtime,
            project_name=project,
            dataset_config=dataset_config,
            model_config=model_config,
            training_config=training_config,
        )


def sweep(data: str, experiment: str, runtime: RuntimeContext):
    """
    Run a sweep experiment with the given configuration.

    :param data: path to the dataset configuration
    :param experiment: path to the experiment configuration
    :param runtime: runtime context with the code executing the training and all preparations
    """
    with open(data) as f:
        dataset_config = yaml.safe_load(f)
    with open(experiment) as f:
        experiment_config = yaml.safe_load(f)

    sweep_parameters = left_dict_join(experiment_config["sweep_parameters"], experiment_config["model"])
    sweep_config = prepare_config_for_sweep(experiment_config["sweep"], sweep_parameters)
    sweep_id = wandb.sweep(sweep_config, project=experiment_config["general"]["project"])

    run_sweep_fn = partial(
        run_sweep_step, runtime, experiment_config["general"]["project"], dataset_config, experiment_config
    )
    wandb.agent(
        sweep_id,
        function=run_sweep_fn,
        count=experiment_config["general"]["n_runs"],
        project=experiment_config["general"]["project"],
    )
```
Lines changed: 25 additions & 0 deletions

```python
from itertools import chain
from typing import Any


def left_dict_join(main: dict, other: dict) -> dict:
    """Merges two dictionaries into a single one, where values from main take precedence on duplicate keys"""
    return dict(chain(other.items(), main.items()))


def prepare_config_for_sweep(config: dict[str, Any], parameters: dict[str, Any]) -> dict[str, Any]:
    """
    Prepares the W&B config for running a sweep, based on two distinct configs

    :param config: general sweep config with values such as name, method or metric,
                   for more details see: https://docs.wandb.ai/guides/sweeps/define-sweep-configuration
    :param parameters: parameters to sweep over, each given as a list

    :return: configuration dictionary ready for the sweep
    """
    parameters = {
        key: {"values": values if isinstance(values, list) else [values]} for key, values in parameters.items()
    }
    config.update({"parameters": parameters})

    return config
```
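The behaviour of these two helpers can be illustrated on small inputs. The one-liners are restated here so the snippet is self-contained:

```python
from itertools import chain
from typing import Any


def left_dict_join(main: dict, other: dict) -> dict:
    # values from main win on duplicate keys, since later items override earlier ones in dict()
    return dict(chain(other.items(), main.items()))


def prepare_config_for_sweep(config: dict[str, Any], parameters: dict[str, Any]) -> dict[str, Any]:
    # scalars are wrapped into single-element lists, so W&B treats them as fixed values
    parameters = {
        key: {"values": values if isinstance(values, list) else [values]} for key, values in parameters.items()
    }
    config.update({"parameters": parameters})
    return config


merged = left_dict_join({"a": 1}, {"a": 2, "b": 3})
sweep_config = prepare_config_for_sweep({"method": "random"}, {"n_hidden_layers": [1, 2], "activation": "relu"})
```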
