# Experiment

This directory contains experiment utilities, including entrypoints, for running experiments with W&B. They are not an integral part of the library, so additional code defining the experiment settings is needed to run them.

## How to Use

To use the entrypoints and utilities provided here, a `RuntimeContext` needs to be implemented, which is used to parametrize the experiment. Its interface is given by a stateless class, so it can be defined as a single namespace, which is then passed to the entrypoint function.

To run an experiment (without a sweep, for now), the following code needs to be implemented. It additionally uses the `click` library to pass the config files as command-line options.

```python
import click

from pydentification.experiment.entrypoints import run

from src import runtime  # assume this is project-specific code


@click.command()
@click.option("--data", type=click.Path(exists=True), required=True)
@click.option("--experiment", type=click.Path(exists=True), required=True)
def main(data, experiment):
    run(data=data, experiment=experiment, runtime=runtime)


if __name__ == "__main__":
    main()
```

To run it, simply execute the script, passing the configs as flags:

```bash
python main.py --data data.yaml --experiment experiment.yaml
```

## Parametrization

Each training run is parametrized by five functions, which take two configuration files. The functions define the input data, the model architecture, the training logic, the reporting to W&B, and the storage logic. For the last three, useful defaults are provided in the `defaults` package. The functions need to be implemented as part of a single namespace, called `context`, which is passed to the entrypoint for running an experiment or a sweep. The interface for `context` is given by `RuntimeContext`.

Additionally, two config files can be used: one for the data settings and one for the model parameters. They abstract dataset loading and model creation away from the code, making it quick to iterate through different configurations. Otherwise, the `pydentification` library can be used as a collection of standalone components, which can be useful for various projects related to neural system identification.

### Functions

* `input_fn` - takes the data configuration file and returns a `pl.LightningDataModule` subclass, typically one of the data modules provided by `pydentification`.
* `model_fn` - takes the experiment configuration file and returns `pl.LightningModule` and `pl.Trainer` instances.
* `train_fn` - takes the model, trainer and data module and executes the training code, typically via the `Trainer`.
* `report_fn` - takes the model, trainer and data module; it should run predictions and store the relevant metrics in the W&B dashboard.
* `save_fn` - takes the run name (given by the W&B `id` or `name`) and the model, and saves the model to disk.
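
The five functions above can be collected into a single namespace, as described earlier. A minimal sketch of such a `context` namespace is shown below; the stub bodies and return values are placeholders for illustration only, and the exact `RuntimeContext` signatures may differ from this assumption:

```python
class context:
    """Stateless namespace sketching the RuntimeContext interface.

    The bodies below are placeholders; a real project would construct
    Lightning data modules, models and trainers here.
    """

    @staticmethod
    def input_fn(data_config: dict):
        # would return a Lightning data module built from the data config
        return {"datamodule": data_config["name"]}

    @staticmethod
    def model_fn(experiment_config: dict):
        # would return (LightningModule, Trainer) built from the config
        return "model", "trainer"

    @staticmethod
    def train_fn(model, trainer, datamodule):
        # would call trainer.fit(model, datamodule) and return the trained model
        return model

    @staticmethod
    def report_fn(model, trainer, datamodule):
        # would run predictions and log metrics to the W&B dashboard
        return {}

    @staticmethod
    def save_fn(name: str, model):
        # would serialize the model to disk under the given run name
        return f"models/{name}.pt"
```

Because the class is stateless, the entrypoint can call its functions directly without instantiating it.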

### Configurations

The entrypoints (both training and sweep) are parametrized by two configs: one for the data settings and the other for the experiment and model, which contains the hyperparameters and training settings.

The data config is stored in `YAML` and is passed to the `input_fn` function. Not all parameters of the data module are stored in the data config, only the static values. An example config looks as follows:

```yaml
name: Dataset
path: data/dataset.csv
test_size: 10000
input_columns: [x]
output_columns: [y]
```
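
An `input_fn` consuming this config might first validate that the static values are present. The sketch below assumes the `YAML` file has already been parsed into a dict (e.g. with PyYAML's `yaml.safe_load`); the key names are taken from the example above, and the returned dict merely stands in for the data module a real implementation would build:

```python
REQUIRED_KEYS = {"name", "path", "test_size", "input_columns", "output_columns"}


def input_fn(data_config: dict):
    # validate that the static values from the data config are present
    missing = REQUIRED_KEYS - data_config.keys()
    if missing:
        raise KeyError(f"data config is missing keys: {sorted(missing)}")
    # a real implementation would return a Lightning data module here;
    # this sketch just echoes the settings it would use
    return {
        "csv_path": data_config["path"],
        "inputs": list(data_config["input_columns"]),
        "outputs": list(data_config["output_columns"]),
        "test_size": int(data_config["test_size"]),
    }
```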

The experiment config looks as follows:

```yaml
general:
  project: project-name
  n_runs: 1
  name: placeholder
training:
  n_epochs: 10
  patience: 1
  timeout: "00:00:01:00"
  batch_size: 32
  shift: 1
  validation_size: 0.1
model:
  model_name: MLP
  # generic parameter convention
  n_input_time_steps: 64
  n_output_time_steps: 1
  n_input_state_variables: 1
  n_output_state_variables: 1
  # neural network parameters
  n_hidden_layers: 2
  activation: relu
  n_hidden_time_steps: 32
  n_hidden_state_variables: 4
```

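When logging such a nested config to a W&B run, it is common to flatten it into dotted keys first. The helper below is a hypothetical utility, not part of `pydentification`, shown only to illustrate how the sections of the config above map to individual parameters:

```python
def flatten_config(config: dict, prefix: str = "") -> dict:
    # flatten {"training": {"n_epochs": 10}} into {"training.n_epochs": 10}
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_config(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat
```

The flattened dict can then be passed, for example, as the `config` argument of `wandb.init`.
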
To use a sweep, add the following section to the experiment config:

```yaml
sweep:
  name: sweep
  method: random
  metric: {name: test/root_mean_squared_error, goal: minimize}
sweep_parameters:
  # neural network
  n_hidden_layers: [1, 2, 3, 4, 5]
  n_hidden_time_steps: [32, 16, 8]
  n_hidden_state_variables: [1, 4, 8, 16]
  activation: [leaky_relu, relu, gelu, sigmoid, tanh]
```
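
W&B sweep definitions expect each swept parameter as a `{values: [...]}` entry under a `parameters` key. A sketch of how the two sections above could be assembled into such a definition is shown below; the function name is hypothetical, and how the sweep entrypoint actually performs this step is an assumption:

```python
def build_sweep_config(experiment_config: dict) -> dict:
    # start from the sweep section: name, method and metric
    sweep = dict(experiment_config["sweep"])
    # wrap each sweep_parameters list in the {"values": [...]} form W&B expects
    sweep["parameters"] = {
        key: {"values": values}
        for key, values in experiment_config["sweep_parameters"].items()
    }
    return sweep
```

The resulting dict has the shape accepted by `wandb.sweep`.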